Algorithm engineering: string processing
The string matching problem has attracted a lot of interest throughout the history of
computer science, and is crucial to the computing industry. The theoretical computer science
community has developed a rich literature on the design and analysis of string
matching algorithms. To date, most of this work has been based on the asymptotic
analysis of the algorithms. Such analysis rarely tells us how an algorithm will perform
in practice, and considerable experimentation and fine-tuning are typically required to
get the most out of a theoretical idea.
In this thesis, promising string matching algorithms discovered by the theoretical community
are implemented, tested and refined to the point where they can be usefully
applied in practice. In the course of this work we have presented the following new
algorithms. We proved that the average-case time complexity of the new algorithms
is linear. We also compared the new algorithms with the existing algorithms by
• We implemented the existing one-dimensional string matching algorithms for English
texts. From the experimental results we identified the two best
algorithms. We combined these two algorithms and introduced a new algorithm.
• We developed a new two-dimensional string matching algorithm. This algorithm
uses the structure of the pattern to reduce the number of comparisons required to
search for the pattern.
• We described a method for efficiently storing text. Although this reduces the size
of the storage space, it is not a compression method in the sense used in the literature. Our aim
is to improve both the space and the time taken by a string matching algorithm. Our new
algorithm searches for patterns in the efficiently stored text without decompressing it.
• We illustrated that by pre-processing the text we can improve the speed of the
string matching algorithm when we search for a large number of patterns in a
• We proposed a hardware solution for searching in an efficiently stored DNA text.
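
The idea of searching a compactly stored DNA text directly, without first expanding it back to characters, can be illustrated with a small sketch. This is not the storage scheme or the algorithm developed in the thesis, which the abstract does not specify. It assumes a hypothetical fixed 2-bit code per base (the CODE table below), and the function names pack and find_all_packed are illustrative only. Python's arbitrary-precision integers stand in for the packed storage; a practical implementation would more likely use fixed-width machine words.

```python
# Hypothetical 2-bit code for the DNA alphabet (an assumption for this sketch,
# not taken from the thesis).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def pack(seq):
    """Pack a DNA string into an integer, 2 bits per base.

    The first base ends up in the most significant bits.
    """
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value


def find_all_packed(packed_text, n, packed_pattern, m):
    """Return all start positions of the pattern in the packed text.

    n and m are the lengths (in bases) of the text and the pattern.
    Only packed windows are compared; the text is never unpacked.
    """
    if m == 0 or m > n:
        return []
    mask = (1 << (2 * m)) - 1  # keeps the low 2*m bits, i.e. m bases
    hits = []
    for i in range(n - m + 1):
        # Shift so that bases i .. i+m-1 sit in the low 2*m bits.
        shift = 2 * (n - m - i)
        if (packed_text >> shift) & mask == packed_pattern:
            hits.append(i)
    return hits


if __name__ == "__main__":
    text, pattern = "ACGTACGT", "CGT"
    print(find_all_packed(pack(text), len(text), pack(pattern), len(pattern)))
```

Each window comparison here is a single integer equality test on the packed representation, which is the property the sketch is meant to show. The sketch makes no claim about the running time of the thesis's algorithms.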