Boyer-Moore

Features

  • Time: O(m+n), where m is the length of pattern, and n is the length of input string.
  • Space: O(N), right array holds, if finite number of different chars as pattern.
  • Times of compare: It will take O(m/n) times compares. However, it is normally less than that. You may apply some rules to define a better right array in some case.

Code

Code in English

  1. Build a right array to hold all chars.
  2. For each different char in pattern string, set each char in right array to -1.
  3. For each different char in pattern string, loop every char and put each one in the right array to find the most-right one.
  4. We have done pre-caculate part.
  5. A variable named skip is initialised. It use to hold the number of char we skip in input string. So it will be initialised to 0, which means there is no need to skip (found the positiion of substring).
  6. Loop input String with i, from 0 to N-M, where N is the length of input string, and M is the length of parttern string. Simply loop from
  7. Loop pattren string from back to start with j, from M-1(M is the length of pattern string) to 0
  8. we compare j to i+j (i+j is match position in input string), if they are not same, we find the char in our right array, and use j minus it to get our number of skips.
  9. if skips is less than 1, it means that we have to go back to find again. At this point, we don’t want this happen. we have to set it to 1, to make sure our algorithm will not loop back.
  10. Finnally, becase we don’t get match, we break the inner loop.
  11. After we jump out inner loop, we need to check variable skip, if it equals to 0, which means, after pass through all chars in pattern, we have all chars matchs with input string, we simply return i.
  12. if our loops didn’t find anything, return N.