Thursday, 20 May 2010

Replacing Text In Html (Outside Html Tags)

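using System.Linq;
using System.Text.RegularExpressions;

// Negative lookahead: fails when the text ahead reaches a '>' before any '<' (i.e. inside a tag).
private const string OUTSIDE_TAG_LOOKAHEAD = "(?![^<]+>)";

public static string HighlightWordsInHtmlText( string htmlText, params string[] words ){
  // An empty word list would build the pattern "()" and match at every position, so bail out early.
  if (words == null || words.Length == 0 || string.IsNullOrEmpty(htmlText) ) return htmlText;
  // Escape each word so characters such as '.' or '+' are matched literally.
  string alternatives = string.Join("|", words.Select(w => Regex.Escape(w)).ToArray());
  Regex regex = new Regex(OUTSIDE_TAG_LOOKAHEAD + "(" + alternatives + ")", RegexOptions.IgnoreCase);
  return regex.Replace(htmlText, "<span class=\"highlight\">$&</span>" );
}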
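OUTSIDE_TAG_LOOKAHEAD - a negative lookahead. The inner pattern [^<]+> matches when the text ahead reaches a closing > before any <, which is what happens inside a tag. Because the lookahead is negated, a word is matched only when it sits outside HTML tags. A literal > in the text itself (for example in "a > b") also trips the lookahead, so words that precede it in the same run of text are skipped.
$& - substitutes the whole current match. We cannot hard-code a word here, because we don't know which of the words was found or in what letter case.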
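To see both pieces at work, here is a minimal sketch. The class LookaheadDemo and the helper Mark are hypothetical names introduced only for this illustration, and the square brackets stand in for the span markup so the result is easier to read.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Same pattern as OUTSIDE_TAG_LOOKAHEAD in the post.
    private const string OutsideTagLookahead = "(?![^<]+>)";

    // Hypothetical helper: wraps every occurrence of one literal word
    // that lies outside a tag in square brackets.
    public static string Mark(string html, string word)
    {
        Regex regex = new Regex(OutsideTagLookahead + Regex.Escape(word), RegexOptions.IgnoreCase);
        // $& re-inserts the matched text exactly as it appeared, case included.
        return regex.Replace(html, "[$&]");
    }

    public static void Main()
    {
        // The attribute value "link" is inside the tag and stays untouched;
        // "Link" and "LINK" in the text are wrapped with their original case.
        Console.WriteLine(Mark("<a class=\"link\">Link here, LINK there</a>", "link"));
        // prints: <a class="link">[Link] here, [LINK] there</a>
    }
}
```

The attribute value is skipped because, from that position, the text ahead runs into the tag's closing > before any <.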
This pattern also matches tron inside strong. If you need whole-word matches, add word boundaries like this:
OUTSIDE_TAG_LOOKAHEAD + "\\b("+ string.Join("|", words) +")\\b"
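Put together, the whole-word variant might look like the sketch below. The class name WholeWordHighlighter and the method name Highlight are assumptions made for this example, and the Regex.Escape call is an addition that keeps special characters in the words from being read as regex syntax.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class WholeWordHighlighter
{
    private const string OutsideTagLookahead = "(?![^<]+>)";

    // Highlights only whole-word occurrences that lie outside HTML tags.
    public static string Highlight(string htmlText, params string[] words)
    {
        if (words == null || words.Length == 0 || string.IsNullOrEmpty(htmlText)) return htmlText;

        // Escape each word, then join them into one alternation.
        string alternatives = string.Join("|", words.Select(w => Regex.Escape(w)).ToArray());
        Regex regex = new Regex(
            OutsideTagLookahead + "\\b(" + alternatives + ")\\b",
            RegexOptions.IgnoreCase);
        return regex.Replace(htmlText, "<span class=\"highlight\">$&</span>");
    }

    public static void Main()
    {
        // "Tron" is a whole word and gets wrapped; "tron" inside "strong" does not.
        Console.WriteLine(Highlight("<strong>Tron</strong> is strong", "tron"));
        // prints: <strong><span class="highlight">Tron</span></strong> is strong
    }
}
```

Note that \b only marks a boundary between a word character and a non-word character, so a search word that starts or ends with punctuation may not match as expected.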