#String #Pattern Matching

🔗 detailed implementation in Github

The Knuth-Morris-Pratt (KMP) algorithm is an efficient algorithm used for pattern matching in a string of characters. It was developed by Donald Knuth, Vaughan Pratt, and James Morris in 1977. The KMP algorithm is used to find the occurrence of a pattern string in a given string.

How KMP Algorithm Works

The KMP algorithm works by comparing the pattern string with the given string character by character. It uses a table called the 'failure function' to determine where to start comparing the pattern string with the given string.

The failure function is created by finding the longest proper prefix of the pattern string that is also a suffix of the pattern string. The prefix and suffix should not be the same. For example, if the pattern string is 'ABA', then the longest proper prefix that is also a suffix is 'A'. Therefore, the failure function for 'ABA' would be [-1, -1, 0].

Once the failure function is created, the KMP algorithm starts comparing the pattern string with the given string character by character. If there is a match, it continues comparing the next character in the pattern string with the next character in the given string. However, if there is a mismatch, the algorithm uses the failure function to determine where to start comparing again. It does not start comparing from the beginning of the pattern string.

Time Complexity

Failure Function

Formula of Failure Function

$$ f(j)= \begin{cases} \text{largest }i<j,\text{ such that }p_pp_1\dotsi p_i=p_{j-i}p_{j-i+1}\dotsi p_j & \text{if such an } i\geq 0\text{ exists}\\ -1 & \text{otherwise} \end{cases} $$

Example of Failure Function

Screenshot 2023-03-17 at 8.45.42 PM.png

Code and Example of KMP Algorithm