Create Google-Style Text Snippets

This function accepts a large string, such as a text node in an XML file, and creates a Google-style text snippet from it. The search keywords are read from the txt1 input. The GetSb function identifies stop words in the search keywords so they can be skipped, while the ReplaceKeyWords function formats the search keywords in bold within the snippet.

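' Requires at the top of the file:
'   Imports System.Text.RegularExpressions

Public Function Createsnippets(snippet As String) As String

    Dim strResults As String = ""

    ' Split the search box text into keywords, dropping empty entries caused by repeated spaces.
    Dim strArr() As String = txt1.Text.Trim().Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)

    For count As Integer = 0 To strArr.Length - 1

        ' GetSb returns the keyword when it is a stop word, otherwise an empty string.
        Dim strSbMatch As String = GetSb(strArr(count))

        ' Build snippets only for keywords that are not stop words.
        If strSbMatch = "" Then

            ' Escape the keyword so regex metacharacters typed by the user are matched literally.
            Dim strKeyword As String = Regex.Escape(strArr(count))

            ' Capture up to 20 words of context on either side of the keyword.
            Dim strSearch As String = "((\w+\s*\b){0,20})" & strKeyword & "((\b\s*\w+){0,20})"
            Dim reg_exp As New Regex(strSearch, RegexOptions.IgnoreCase)
            Dim regKeyword As New Regex(strKeyword, RegexOptions.IgnoreCase)

            For Each a_match As Match In reg_exp.Matches(snippet)

                ' Bold the keyword in this match only, so text that is already tagged is not wrapped again.
                strResults &= regKeyword.Replace(a_match.Value, New MatchEvaluator(AddressOf ReplaceKeyWords)) & " ... "

            Next a_match

        End If

    Next count

    Return strResults

End Function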
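
To see what the context pattern captures, here is a minimal console sketch. It assumes the keyword is supplied directly rather than read from txt1, and it shrinks the window from 20 words to 2 so the effect is visible on a short sentence. The module name SnippetDemo and the sample text are hypothetical and not part of the original code.

Imports System
Imports System.Text.RegularExpressions

Module SnippetDemo

    Sub Main()
        ' Hypothetical sample input standing in for the XML text node.
        Dim sample As String = "The quick brown fox jumps over the lazy dog near the river bank"
        Dim keyword As String = "fox"

        ' Same shape as the pattern in Createsnippets, with a 2-word window on each side.
        Dim pattern As String = "((\w+\s*\b){0,2})" & Regex.Escape(keyword) & "((\b\s*\w+){0,2})"

        ' Print each match followed by the " ... " separator used in the snippet.
        For Each m As Match In Regex.Matches(sample, pattern, RegexOptions.IgnoreCase)
            Console.WriteLine(m.Value & " ... ")
        Next
    End Sub

End Module

Each match contains the keyword together with at most two words of surrounding context on either side. Widening the {0,2} quantifiers back to {0,20} gives the longer snippets produced by Createsnippets.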

The ReplaceKeyWords function wraps each search keyword found in the snippet in bold tags.

Public Function ReplaceKeyWords(ByVal m As Match) As String

' Wrap the matched keyword in opening and closing bold tags.
Return "<b>" & m.Value & "</b>"

End Function
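
Because the evaluator receives the actual Match, the text keeps its original casing even though the search ignores case. The following is a small console sketch of that behavior; the module name HighlightDemo and the sample sentence are hypothetical.

Imports System
Imports System.Text.RegularExpressions

Module HighlightDemo

    ' Wraps whatever text was matched in bold tags.
    Function ReplaceKeyWords(ByVal m As Match) As String
        Return "<b>" & m.Value & "</b>"
    End Function

    Sub Main()
        ' Case-insensitive search for the keyword "fox".
        Dim reg As New Regex(Regex.Escape("fox"), RegexOptions.IgnoreCase)

        ' Prints: A <b>Fox</b> and a <b>fox</b>
        Console.WriteLine(reg.Replace("A Fox and a fox", New MatchEvaluator(AddressOf ReplaceKeyWords)))
    End Sub

End Module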


The GetSb function adds a nice touch by filtering out stop words (noise words). It returns the keyword if it appears in the stop-word list and an empty string otherwise. Only a few stop words are included here to save space; complete lists are available on the Internet.

Public Function GetSb(strInp As String) As String

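    ' A short sample list of stop words; extend it as needed.
    Dim strNoiseWords As String = "a, about, and, another, afterwards, going"

    ' Replace the commas with spaces so every stop word is delimited by word boundaries.
    Dim strFinal As String = strNoiseWords.Replace(",", " ")

    ' Match the keyword as a whole word, ignoring case and escaping regex metacharacters.
    Dim strSearch As String = "\b" & Regex.Escape(strInp) & "\b"
    Dim reg_exp As New Regex(strSearch, RegexOptions.IgnoreCase)

    ' Returns the stop word when strInp is one, otherwise an empty string.
    Return reg_exp.Match(strFinal).Value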

End Function
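
If the stop-word list grows to a few hundred entries, a set lookup may be simpler than a regular expression. The sketch below is an alternative technique, not the approach used above: it assumes a HashSet with a case-insensitive comparer, and the names StopWordDemo and IsStopWord are hypothetical.

Imports System
Imports System.Collections.Generic

Module StopWordDemo

    ' Case-insensitive set holding the same sample stop words as GetSb.
    Private ReadOnly StopWords As New HashSet(Of String)(
        New String() {"a", "about", "and", "another", "afterwards", "going"},
        StringComparer.OrdinalIgnoreCase)

    ' Hypothetical helper: True when the word is in the stop-word set.
    Function IsStopWord(word As String) As Boolean
        Return StopWords.Contains(word.Trim())
    End Function

    Sub Main()
        Console.WriteLine(IsStopWord("About"))   ' True
        Console.WriteLine(IsStopWord("snippet")) ' False
    End Sub

End Module

A Boolean result also makes the calling code easier to read than comparing a returned string against an empty string.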