1 minute read

What is KMP

KMP is an optimized algorithm to determine if one string is sub-string of another array.

Brute Force solution of above problem is we match string on each position of original string, with O(M * N) time complexity, it can be very worse if the length of matching string is very long, in extreme case, this time complexity can be O(N ^ 2)

KMP is trying to optimized above solution, with O(N) time complexity

Flow of KMP

  1. build a helper array for match string, each position record the length of “prefix == suffix”
  2. assume the value of index 0 is -1, and value of index 1 is 0
  3. then compare the original string with match string, if un-match happens, instead of starting from next position of original string, we can jump on match string based on helper array

Implementation

public class KMP {
	public static void main(String[] args) {
		String str = "aasbbtaasbbk";
		String match1 = "taab";
        String match2 = "taas";
		System.out.println("contains: " + isContain(str, match1)); // false
        System.out.println("contains: " + isContain(str, match2)); // true

	}

	private static boolean isContain(String str, String match) {
		char[] s = str.toCharArray();
		char[] m = match.toCharArray();
		int i = 0, j = 0;

		int[] next = getNextArray(m);

		while (i < s.length && j < m.length) {
			if (s[i] == m[j]) {
				i++;
				j++;
			} else if (j == 0) {
				i++;
			} else {
				j = next[j];
			}
		}

		return j == m.length;
	}

	private static int[] getNextArray(char[] m) {
		if (m.length == 1) {
			return new int[] { -1 };
		}

		int[] res = new int[m.length];
		res[0] = -1;
		res[1] = 0;

		int i = 2;
		int position = 0;

		while (i < m.length) {
			if (m[i - 1] == m[position]) {
				res[i++] = ++position;
			} else if (position == 0) {
				i++;
			} else {
				position = res[position];
			}
		}

		return res;
	}
}