The next string search algorithm I am going to cover seems pretty cool since the match is made in a reverse fashion. We check if the last character in the keyword equals some character in the text and decrease from that point on.As a result, a mismatch is detected extremely fast and this is also the reason why this algorithm is performs beautifully when the keyword is longer.
Now, when a mismatch occurs, the position where we look for the next character is determined by a table of values mapping each character in the keyword to an integer. Example, we find an “a” in the text which at a particular state is sitting atop an M in the keyword, we shift the position in the text by the corresponding number in our table and look again.
Here is how we generate the table:
def goodSuffixShift(key):
for i in xrange(len(key)-1, -1, -1):
if key[i] not in bcs.keys():
bcs[key[i]] = len(key)-i-1
Next, comes the actual search:
i = len(key)-1
index = len(key) -1
j = i
while True:
print str(i) + ", " + str(j)
if i == 0:
return j #if every character in the keyword matches, return current position in text
elif j > len(text):
return "not found" #if we run out of the string while searching, there is no chance…
elif text[j] != key[i] and text[j] not in bcs.keys():
j += len(key) #If there isn’t a match and we don’t have a switch value in our table, don’t bother looking for characters in the middle and start matching again
i = index
elif text[j] != key[i] and text[j] in bcs.keys():
j += bcs[text[j]] #move forward as table says and start looking again
i = index
else:
j -= 1 #in case a match is found, check for the character to the left
i -= 1
Here are some sample runs:
text = "SHRIPHANI IS A COMPUTER SCIENCE POWERHOUSE"
for key in key_list:
goodSuffixShift(key)
print key + " location: " + str(search(text, key))
bcs = {}
The output:
tark-b-120:scripts shriphani$ python boyer_moore.py POWER location: 32 HOUSE location: 37 COMP location: 15 SCIENCE location: 24 SHRIPHANI location: 0 BRUAH location: not found
You can get the entire script at http://shriphani.com/scripts/boyer_moore.py.
Got to get back to Psychology and GUI stuff in Java. Anyway, GUIs are for losers, CLI FTW !!!



There’s a problem with the code if you search for a single character, ‘A’ for example. I changed the following to make it work:
while True:
if i < 0: #changed
return j + 1 #changed
in the search function