KMP algorithm
Mainly used for string matching
The main idea of KMP is that when there is a string mismatch, you can know part of the text content that has been matched before, and you can use this information to avoid having to do the matching from scratch.
So how to record the content of the text has been matched, is the focus of KMP, but also the next array shoulders the heavy responsibility.
The next array is a prefix table.
What does the prefix list do?
The prefix table is used for backtracking, it records where the pattern string should be re-matched from when it does not match the main string (text string).
At this point it is important to ask how the prefix table is recorded?
The first thing to know is that the task of the prefix table is to fail to match the current position, find the position that has been matched before, and then re-match, this also means that in the case of a character mismatch, the prefix table will tell you the next step in the match, the pattern of the string should be jumped to which position.
So what is a prefix table: a record of how many strings up to and including subscript i have the same prefix suffix of the same length.
The longest equal prefix and suffix strings for the part of the string before subscript 5 (i.e., the string aabaa) are the substring aa, because the longest equal prefix and suffix are found, and the match fails at the back of the suffix substring, so we find the back of the prefix that is the same as it and re-match it.
So the prefix table has the ability to tell us that a match has failed at the current location and to jump to a place that has been matched before.
Many implementations of the KMP algorithm use next arrays for backoff operations, so what is the relationship between next arrays and prefix tables?
The next array can then be a prefix table, but many implementations uniformly decrement the prefix table by one (shifting it one place to the right, with an initial position of -1) and then use it as the next array.
Where n is the length of the text string, m is the length of the pattern string, because in the process of matching, according to the prefix table constantly adjust the position of the match, it can be seen that the process of matching is O(n), and before that, we have to generate a separate next array, the time complexity is O(m). So the time complexity of the whole KMP algorithm is O(n+m).
The violent solution is obviously O(n × m), so KMP greatly improves the efficiency of search in string matching.
28. Implementation of strStr()
Given two strings, haystack and needle, find the subscript (starting at 0) of the first match of the needle string in the haystack string. If needle is not part of haystack, return -1.
Example 1:
Input: haystack = "sadbutsad", needle = "sad"
Output: 0
Explanation: "sad" matches at subscripts 0 and 6.
The first match is at subscript 0, so it returns 0.
Example 2:
Input: haystack = "leetcode", needle = "leeto"
Output: -1
Explanation: "leeto" does not appear in "leetcode", so -1 is returned.
Tip:
1 <= , <= 104
haystack and needle consist of lowercase English characters only.
Positive solution (KMP)
Apparently a KMP template question
Upcode (●'◡'●)
class Solution {
public.
void getNext(int* next, const string& s) {
next[0] = j; next[0] = s; next[0] = j; next[0] = s; s
next[0] = j.
for(int i = 1; i < (); i++) { // note that i starts at 1
while (j >= 0 && s[i] ! = s[j + 1]) { // the prefix and suffix are not the same anymore
j = next[j]; // go backward in time
}
if (s[i] == s[j + 1]) { // find identical prefixes and suffixes
j++;
}
next[i] = j; // assign j (the length of the prefix) to next[i]
}
}
int strStr(string haystack, string needle) {
if (() == 0) {
return 0; }
}
vector<int> next(()).
getNext(& next[0], needle); int j = -1; // because the next string is the same as the next string, but the next string is the same as the next.
int j = -1; // // because the start position of the record in the next array is -1
for (int i = 0; i < (); i++) { // Note that i starts at 0
while(j >= 0 && haystack[i] ! = needle[j + 1]) { // mismatch
j = next[j]; // j looks for the previous match
}
if (haystack[i] == needle[j + 1]) { // match, j and i move back at the same time
j++; // i's incremented in the for loop
}
if (j == (() - 1) ) { // pattern string t appears in text string s
return (i - () + 1); }
}
}
return -1; }
}
}; }
459. Repeated substrings
Given a non-empty string s, check whether it can be composed by repeating one of its substrings many times.
Example 1.
Input: s = "abab"
Output: true
Explanation: Can be formed by repeating the substring "ab" twice.
Example 2.
Input: s = "aba"
Output: false
Example 3.
Input: s = "abcabcabcabc"
Output: true
Explanation: It can be formed by repeating the substring "abc" four times. (Or substring "abcabc" repeated twice.)
Tip:
1 <= <= 104
s consists of lowercase letters
Positive solution (still KMP)
When a string s: abcabc, internally consists of repeated substrings
That is, it consists of substrings that are the same before and after.
Then since there is the same substring before and the same substring after, with s + s, the string thus formed must still form an s with the substring after as the pre-string and the substring before as the post-string;
So to determine whether a string s consists of a repeated substring, whenever two s's are spliced together and an s's still occurs inside, it means it consists of a repeated substring.
Of course, we have to shave off the first and last characters of s + s when determining whether an s appears in the s + s concatenated string;
This avoids searching for the original s in s+s; we're searching for the spliced-out s in the middle.
Upcode (●'◡'●)
class Solution {
public:
void getNext (int* next, const string& s){
next[0] = -1;
int j = -1;
for(int i = 1;i < (); i++){
while(j >= 0 && s[i] != s[j + 1]) {
j = next[j];
}
if(s[i] == s[j + 1]) {
j++;
}
next[i] = j;
}
}
bool repeatedSubstringPattern (string s) {
if (() == 0) {
return false;
}
int next[()];
getNext(next, s);
int len = ();
if (next[len - 1] != -1 && len % (len - (next[len - 1] + 1)) == 0) {
return true;
}
return false;
}
};
It's not easy to write a blog, so please give me some support 8~!