Find All Anagrams in a String
Find All Anagrams in a String (window sliding alg)
update Jun 29, 2017
leetcode Given a string s and a non-empty string p, find all the start indices of p's anagrams in s.
Strings consists of lowercase English letters only and the length of both strings s and p will not be larger than 20,100.
The order of output does not matter.
Example 1:
Input:
s: "cbaebabacd" p: "abc"
Output:
[0, 6]
Explanation:
The substring with start index = 0 is "cba", which is an anagram of "abc".
The substring with start index = 6 is "bac", which is an anagram of "abc".
Example 2:
Input:
s: "abab" p: "ab"
Output:
[0, 1, 2]
Explanation:
The substring with start index = 0 is "ab", which is an anagram of "ab".
The substring with start index = 1 is "ba", which is an anagram of "ab".
The substring with start index = 2 is "ab", which is an anagram of "ab".
思路
最简单的方法就是把pattern string的char进行统计,建counting map,然后brute force,但是这样做的速度太慢。一个优化的方法是使用 sliding window algorithm ,也就是 maintain 一个长度等于len(pattern)
的 window, 每次向右移动一格,统计进入和离开 window的char,用一个 int need变量记录当前还需要match的char数量,如果need==0,则说明找到一个subString。
这里 是一个很棒的分析,有提供window slide解决substring问题的模板。
Java code:
public class Solution {
public List<Integer> findAnagrams(String s, String p) {
if (s==null || p == null || s.length() == 0 || p.length() == 0) return new ArrayList<Integer>();
// count each char in p using a counting map
int[] counter = new int[26];
for (char c : p.toCharArray()) {
counter[c - 'a']++;
}
// using a sliding window with size == len(p) to traverse s,
// counting each char goes in and out the window
List<Integer> res = new ArrayList<>();
int left = 0;
int right = 0;
int need = p.length(); // first, we need to find len(p) chars to match
while (right < s.length()) {
if (counter[s.charAt(right) - 'a'] > 0) {
need--; // means we have found a match char
}
// do these every time, need to restore counter when chars goes out of the window
counter[s.charAt(right) - 'a']--;
right++;
// when need == 0, means we got a match subString, left is its start index
if (need == 0) {
res.add(left);
}
// Then we need to restore need and counter when chars goes out.
if (right - left == p.length()) {
// because we just add 1 to right, so the window is actually from left to right - 1 now
if (counter[s.charAt(left) - 'a'] >= 0) {
need++;
}
counter[s.charAt(left) - 'a']++;
left++;
}
}
return res;
}
}
Python code:
上面的java实现采用的操作顺序是 移动right -- check need -- 移动left,下面的python实现采用 移动right -- 移动left -- check need,感觉更好理解。
这里的need其实就是九章中所讲的 “需要的字母数量减去window中相应字母数量的table的绝对值和”,维持这个need,就可以做到O(1)时间内更新因为移动window造成的所有变化。
class Solution(object):
def findAnagrams(self, s, p):
"""
:type s: str
:type p: str
:rtype: List[int]
"""
res = []
if not s or not p or len(s) == 0 or len(p) == 0:
return res
counter = collections.Counter(p)
left = 0
right = 0
need = len(p)
while right < len(s):
# move right bound
if counter[s[right]] > 0:
need -= 1
counter[s[right]] -= 1
right += 1
# move left bound
if right - left == len(p) + 1:
# 这里可以这么判断是因为p中没有的或者少于当前window中的字符都已经在从right进入的时候
# 减成了负数,等于0说明window中该字符数刚好和p中相等。
if counter[s[left]] >= 0:
need += 1
counter[s[left]] += 1
left += 1
# check if need == 0
if need == 0:
res.append(left)
return res
Contains Duplicate II这道题也可以用window sliding的方法做,可以结合在一起。
update Jan 27,2018 20:08
Update: 更新最新的 sliding window 写法
每层外循环开始时,先用一个while循环判断是否需要移动right,保证该循环结束之后,window size 是符合要求的 ,然后做事情,最后右移left;另外,为了一开始的时候可以初始化第一个位置,可以令 right 初始化为 -1;
Java Code:
class Solution { public List<Integer> findAnagrams(String s, String p) { List<Integer> res = new ArrayList<>(); // 记录p中各字母个数,用 count 跟踪剩余字母个数,当count==0时,说明我们找到一个结果 int[] counter = new int[256]; for (int i = 0; i < p.length(); ++i) { counter[p.charAt(i)]++; } int count = p.length(); // sliding window int left = 0, right = -1; final int WINDOWSIZE = p.length(); while (true) { while (right - left + 1 < WINDOWSIZE) { right++; if (right == s.length()) break; counter[s.charAt(right)]--; if (counter[s.charAt(right)] >= 0) count--; } if (right == s.length()) break; // 这行限制整个循环结束 // now window size must be p.length() if (count == 0) res.add(left); // move left pointer to right counter[s.charAt(left)]++; if (counter[s.charAt(left)] > 0) count++; left++; } return res; } }
update May 8,2018 23:24
C++ Code
更新一个C++的解法,和上面的Java解法基本思路是一致的,不同之处在于采用直接return的方法跳出循环。
class Solution {
public:
vector<int> findAnagrams(string s, string p) {
int _map[256] = {0};
for (char c : p) {
++_map[c];
}
vector<int> res;
int windowSize = p.size();
int remain = p.size();
int left = 0, right = -1;
while (true) {
while (right - left + 1 < windowSize) {
++right;
if (right >= s.size()) return res;
if (--_map[s[right]] >= 0) --remain;
}
// 此时window size 一定是p.size
if (remain == 0) res.push_back(left);
// move left pointer to right by 1
if (++_map[s[left++]] > 0) ++remain;
}
}
};