Create Google-Style Text Snippets

This function accepts a large string, such as a text node in an XML file, and creates a Google-style text snippet: for every occurrence of a search keyword, it extracts up to 20 words of context on either side. The search keywords are supplied via the txt1 input. The GetSb function identifies stop words so that they can be skipped, while the ReplaceKeyWords function formats the search keywords in bold within the snippet.


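' These Imports belong at the top of the code file.
Imports System.Text
Imports System.Text.RegularExpressions

Public Function Createsnippets(snippet As String) As String
    Dim strResults As String = Nothing
    ' Split the search text into keywords, dropping empty entries left by repeated spaces.
    Dim strArr() As String = txt1.Text.Trim().Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)
    For count As Integer = 0 To strArr.Length - 1
        ' GetSb returns the matching stop word, or an empty string for a real keyword.
        Dim strSbMatch As String = GetSb(strArr(count))
        If strSbMatch = "" Then
            ' Escape the keyword so regex metacharacters in user input are matched literally.
            Dim strKeyword As String = Regex.Escape(strArr(count))
            ' Capture up to 20 words on either side of the keyword.
            Dim strSearch As String = "((\w+\s*\b){0,20})" & strKeyword & "((\b\s*\w+){0,20})"
            Dim reg_exp As New Regex(strSearch, RegexOptions.IgnoreCase)
            Dim regKeyword As New Regex(strKeyword, RegexOptions.IgnoreCase)
            For Each a_match As Match In reg_exp.Matches(snippet)
                ' Bold the keyword in this fragment only, so earlier fragments are not wrapped again.
                strResults &= regKeyword.Replace(a_match.Value, New MatchEvaluator(AddressOf ReplaceKeyWords)) & " ... "
            Next a_match
        End If
    Next
    Return strResults
End Function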
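
The context-window pattern used above can be tried on its own. The following is a minimal console sketch, not part of the original code: the module name, the sample sentence, and the window of 3 words (instead of 20, to keep the output short) are assumptions chosen for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ContextWindowDemo
    Sub Main()
        ' Hypothetical sample text and keyword, for illustration only.
        Dim text As String = "The quick brown fox jumps over the lazy dog"
        Dim keyword As String = Regex.Escape("fox")

        ' Same shape as the pattern in Createsnippets, with a window of 3 words.
        Dim pattern As String = "((\w+\s*\b){0,3})" & keyword & "((\b\s*\w+){0,3})"
        Dim m As Match = Regex.Match(text, pattern, RegexOptions.IgnoreCase)

        Console.WriteLine(m.Value)
        ' Prints: The quick brown fox jumps over the
    End Sub
End Module
```

The two quantified groups are what limit the amount of context: raising or lowering the upper bound changes how many words appear around the keyword.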

Inside the snippet, the ReplaceKeyWords function puts the search keywords in bold. It is passed to Regex.Replace as a MatchEvaluator, so it is called once for each keyword match.


Public Function ReplaceKeyWords(ByVal m As Match) As String
    ' Wrap the matched keyword in an opening and a closing bold tag.
    Return "<b>" & m.Value & "</b>"
End Function
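
To see the effect of a MatchEvaluator in isolation, here is a small self-contained sketch. The module name, the helper name WrapInBold, and the sample fragment are assumptions for illustration; the helper simply mirrors what ReplaceKeyWords is meant to do.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module BoldDemo
    ' Hypothetical stand-in for ReplaceKeyWords: wrap each match in bold tags.
    Function WrapInBold(ByVal m As Match) As String
        Return "<b>" & m.Value & "</b>"
    End Function

    Sub Main()
        Dim fragment As String = "Snippets show the search term in context"
        Dim keyword As New Regex(Regex.Escape("search"), RegexOptions.IgnoreCase)

        ' Regex.Replace calls WrapInBold once for every match it finds.
        Console.WriteLine(keyword.Replace(fragment, New MatchEvaluator(AddressOf WrapInBold)))
        ' Prints: Snippets show the <b>search</b> term in context
    End Sub
End Module
```

Because the evaluator receives the Match itself, the original casing of the keyword in the text is preserved in the output.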
        

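The GetSb function adds a nice touch by filtering out stop words, also called noise words. It returns the matching stop word, or an empty string if the input is not a stop word. I have included only a few for the sake of space. You can find complete lists of such words on the Internet.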


Public Function GetSb(strInp As String) As String
    ' Sample stop-word list; extend it as needed.
    Dim strNoiseWords As String = "a, about, and, another, afterwards, going"
    Dim strFinal As String = strNoiseWords.Replace(",", " ")
    ' Escape the input and match it as a whole word, ignoring case.
    Dim strSearch As String = "\b" & Regex.Escape(strInp) & "\b"
    Dim reg_exp As New Regex(strSearch, RegexOptions.IgnoreCase)
    ' Return the matched stop word, or an empty string when strInp is not a stop word.
    Return reg_exp.Match(strFinal).Value
End Function
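
If the stop-word list grows long, a set lookup may be simpler than a regular expression. The sketch below is an alternative technique, not the approach used by GetSb: the module name, the IsStopWord helper, and the use of a HashSet are assumptions for illustration, using the same sample words.

```vbnet
Imports System
Imports System.Collections.Generic

Module StopWordDemo
    ' Hypothetical case-insensitive set holding the same sample stop words.
    Private ReadOnly StopWords As New HashSet(Of String)(
        New String() {"a", "about", "and", "another", "afterwards", "going"},
        StringComparer.OrdinalIgnoreCase)

    ' Returns True when the word is in the stop-word set.
    Function IsStopWord(ByVal word As String) As Boolean
        Return StopWords.Contains(word)
    End Function

    Sub Main()
        Console.WriteLine(IsStopWord("About"))   ' Prints: True
        Console.WriteLine(IsStopWord("snippet")) ' Prints: False
    End Sub
End Module
```

A set avoids building a new Regex for every keyword and sidesteps escaping altogether, at the cost of returning a Boolean rather than the matched word.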