algorithm-base/animation-simulation/数据结构和算法/KMP.md

188 lines
8.6 KiB
Java
Raw Blame History

This file contains ambiguous Unicode characters!

This file contains ambiguous Unicode characters that may be confused with others in your current locale. If your use case is intentional and legitimate, you can safely ignore this warning. Use the Escape button to highlight these characters.

## KMPKnuth-Morris-Pratt
> **[tan45du_one](https://raw.githubusercontent.com/tan45du/tan45du.github.io/master/个人微信.15egrcgqd94w.jpg)** ,备注 github + 题目 + 问题 向我反馈
>
>
>
> <u>[****](https://raw.githubusercontent.com/tan45du/test/master/微信图片_20210320152235.2pthdebvh1c0.png)</u> 两个平台同步,想要和题友一起刷题,互相监督的同学,可以在我的小屋点击<u>[**刷题小队**](https://raw.githubusercontent.com/tan45du/test/master/微信图片_20210320152235.2pthdebvh1c0.png)</u>进入。
BM BM KMP BM KMP
![](https://img-blog.csdnimg.cn/20210319193924180.gif)
KMP KMP
****
![KMP](https://cdn.jsdelivr.net/gh/tan45du/photobed@master/photo/KMP例子.1uirbimk5fcw.png)
绿
![](https://img-blog.csdnimg.cn/20210401204019428.png)
![](https://cdn.jsdelivr.net/gh/tan45du/photobed@master/photo/原理.bghc3ecm4z4.png)
KMP
BM bc,suffix,prefix KMP next next
next
![next](https://cdn.jsdelivr.net/gh/tan45du/photobed@master/photo/next数组.3nir7pgcs9c0.png)
next KMP next
![KMP1](https://cdn.jsdelivr.net/gh/tan45du/photobed@master/photo/KMP1.j74ujxjuq1c.png)
![kmp2](https://cdn.jsdelivr.net/gh/tan45du/photobed@master/photo/kmp2.6jx846nmyd00.png)
![](https://img-blog.csdnimg.cn/20210319193924754.gif)
** next **
```java
class Solution {
public int strStr(String haystack, String needle) {
//两种特殊情况
if (needle.length() == 0) {
return 0;
}
if (haystack.length() == 0) {
return -1;
}
// char 数组
char[] hasyarr = haystack.toCharArray();
char[] nearr = needle.toCharArray();
//长度
int halen = hasyarr.length;
int nelen = nearr.length;
//返回下标
return kmp(hasyarr,halen,nearr,nelen);
}
public int kmp (char[] hasyarr, int halen, char[] nearr, int nelen) {
//获取next 数组
int[] next = next(nearr,nelen);
int j = 0;
for (int i = 0; i < halen; ++i) {
//发现不匹配的字符,然后根据 next 数组移动指针,移动到最大公共前后缀的,
//前缀的后一位,和咱们移动模式串的含义相同
while (j > 0 && hasyarr[i] != nearr[j]) {
j = next[j - 1] + 1;
//超出长度时,可以直接返回不存在
if (nelen - j + i > halen) {
return -1;
}
}
//如果相同就将指针同时后移一下,比较下个字符
if (hasyarr[i] == nearr[j]) {
++j;
}
//遍历完整个模式串,返回模式串的起点下标
if (j == nelen) {
return i - nelen + 1;
}
}
return -1;
}
//这一块比较难懂,不想看的同学可以忽略,了解大致含义即可,或者自己调试一下,看看运行情况
//我会每一步都写上注释
public int[] next (char[] needle,int len) {
//定义 next 数组
int[] next = new int[len];
// 初始化
next[0] = -1;
int k = -1;
for (int i = 1; i < len; ++i) {
//我们此时知道了 [0,i-1]的最长前后缀但是k+1的指向的值和i不相同时我们则需要回溯
//因为 next[k]就时用来记录子串的最长公共前后缀的尾坐标(即长度)
//就要找 k+1前一个元素在next数组里的值,即next[k+1]
while (k != -1 && needle[k + 1] != needle[i]) {
k = next[k];
}
// 相同情况,就是 k的下一位和 i 相同时,此时我们已经知道 [0,i-1]的最长前后缀
//然后 k + 1 又和 i 相同最长前后缀加1即可
if (needle[k+1] == needle[i]) {
++k;
}
next[i] = k;
}
return next;
}
}
```
Python Code:
```python
from typing import List
class Solution:
def strStr(self, haystack: str, needle: str)->int:
#
if len(needle) == 0:
return 0
if len(haystack) == 0:
return -1
#
halen = len(haystack)
nelen = len(needle)
#
return self.kmp(haystack, halen, needle, nelen)
def kmp(self, hasyarr: str, halen: int, nearr: str, nelen: int)->int:
# next
next = self.next(nearr, nelen)
j = 0
for i in range(0, halen):
# next
# ,
while j > 0 and hasyarr[i] != nearr[j]:
j = next[j - 1] + 1
#
if nelen - j + i > halen:
return -1
#
if hasyarr[i] == nearr[j]:
j += 1
#
if j == nelen:
return i - nelen + 1
return -1
#
#
def next(self, needle: str, len:int)->List[int]:
# next
next = [0] * len
#
next[0] = -1
k = -1
for i in range(1, len):
# [0,i-1]k+1i
# next[k]
# k+1next,next[k+1]
while k != -1 and needle[k + 1] != needle[i]:
k = next[k]
# k i [0,i-1]
# k + 1 i 1
if needle[k + 1] == needle[i]:
k += 1
next[i] = k
return next
```