The first example is a function that finds all occurrences of a word in a string (a sequence of characters). The function string_search(w,s) (see Figure 6) takes a search word w and a string s, and returns the starting position of all substrings in s that match w. For example,
Figure: Finding all occurrences of the word w in
the string s.
The algorithm starts by considering all positions between 0 and #s-#w as candidates for a match (no candidate could be greater than this since it would have to match past the end of the string). The candidates are stored as pointers (indices) into s of the beginning of each match. The algorithm then progresses through the search string, using recursive calls to next_char, narrowing the set of candidate matches on each step.
Based on the current candidates, next_char narrows the set of
candidates by only keeping the candidates that match on the next
character of w. To do this, each candidate checks whether the
i
character in w matches the i
position past the candidate index. All candidates that do match are
packed and passed into the recursive call of next_char. The
recursion completes when the algorithm reaches the end of w.
The progression of candidates in the "foo" example would be:
20pt
![]()
Lets consider the complexity of the algorithm. We assume #w =
m and #s = n. The depth complexity of the algorithm is some
constant times the number of recursive calls, which is simply
.
The work complexity of the algorithm is the sum over the recursive
calls, of the number of candidates in each recursive call. In
practice, this is usually
, but in the worst case this can be
the product of the two lengths
(the worst case can only happen
if most of the characters in w are repeated). There are
parallel string-searching algorithms that give better bounds on the
parallel time (depth complexity), and that bound the worst case work
complexity to be linear in the length of the search
string [15, 44], but these algorithms are
somewhat more complicated.