LeetCode 438. Find All Anagrams in a String

438. Find All Anagrams in a String

Given a string s and a non-empty string p, find all the start indices of p's anagrams in s.

The strings consist of lowercase English letters only, and the length of both strings s and p will not be larger than 20,100.

The order of output does not matter.

Example 1:

Input:
s: "cbaebabacd" p: "abc"

Output:
[0, 6]

Explanation:
The substring with start index = 0 is "cba", which is an anagram of "abc".
The substring with start index = 6 is "bac", which is an anagram of "abc".

Example 2:

Input:
s: "abab" p: "ab"

Output:
[0, 1, 2]

Explanation:
The substring with start index = 0 is "ab", which is an anagram of "ab".
The substring with start index = 1 is "ba", which is an anagram of "ab".
The substring with start index = 2 is "ab", which is an anagram of "ab".

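The problem asks for every substring of s that is an anagram of p, that is, a substring made of the same characters as p, with the same number of occurrences of each, in any order.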
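The anagram definition above can be checked directly by comparing letter counts. The following is only a minimal sketch: `isAnagram` is a hypothetical helper that is not part of the original solution, and it assumes both strings contain only lowercase 'a'..'z', as the problem statement guarantees.

```cpp
#include <array>
#include <string>

// Hypothetical helper (not from the original post): returns true when a and b
// contain exactly the same letters with the same multiplicities.
// Assumption: both strings contain only lowercase 'a'..'z'.
bool isAnagram(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) return false;   // different lengths can never match
    std::array<int, 26> freq{};               // zero-initialised letter counts
    for (char c : a) ++freq[c - 'a'];         // count the letters of a
    for (char c : b) --freq[c - 'a'];         // cancel them with the letters of b
    for (int f : freq)
        if (f != 0) return false;             // some letter is over- or under-represented
    return true;
}
```

For instance, `isAnagram("cba", "abc")` holds, while `isAnagram("bae", "abc")` does not. Calling this on every window of s would work but repeats a lot of counting, which is what the sliding window avoids.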

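Build a hash table map_target for p (key: character, value: number of occurrences of that character); in the code below it is simply named target. Record the size of map_target as count, which is the number of distinct characters in p. Then scan substrings of s using two position variables, begin and end, which mark the start and the end of the current substring of s.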
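As a small illustration of this setup step, the sketch below builds the table and derives the initial value of count. The helper names `buildTarget` and `distinctCount` are made up for this example and do not appear in the original code.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical helper: maps each character of p to its number of occurrences.
std::unordered_map<char, int> buildTarget(const std::string& p) {
    std::unordered_map<char, int> map_target;
    for (char c : p) ++map_target[c];   // operator[] value-initialises missing keys to 0
    return map_target;
}

// Hypothetical helper: the initial value of count, i.e. the number of
// distinct characters in p (not the length of p).
std::size_t distinctCount(const std::string& p) {
    return buildTarget(p).size();
}
```

With p = "aab" the table holds {'a': 2, 'b': 1} and the initial count is 2, even though p has three characters.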

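Advance end until the substring between begin and end contains every character that p requires. If end - begin == p.size() at that point, the substring is an anagram of p, and the current position of begin is recorded as a result.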

This raises a question: how should begin and end be updated?

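The updates of begin and end are driven by count, which is the number of distinct characters still missing from the window. Take the character 'a' in "ababa" as an example: count is decremented only when map_target['a'] has dropped to exactly 0 for the window between begin and end. When count == 0, the substring between begin and end already contains every character p requires, so begin can now be adjusted. If s[begin] is not a character of p, begin simply moves forward. If s[begin] is a character of p, map_target[s[begin]] is incremented and then checked: once it is greater than 0, the window between begin and end is short of the character s[begin] relative to p, count goes back up, and it is time to start moving end again. The process repeats until end reaches the end of s.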
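To make the role of count concrete, here is a rough sketch that replays only the expansion phase (begin never moves) and records count after each character is consumed. `countTrace` is a hypothetical helper written for this illustration, and the input p = "aab" is an assumed example, not one from the original post.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: returns the value of count after each character of
// window has been consumed. Only the "advance end" step is modelled here;
// the shrinking of the window from begin is deliberately left out.
std::vector<int> countTrace(const std::string& window, const std::string& p) {
    std::unordered_map<char, int> map_target;
    for (char c : p) ++map_target[c];
    int count = static_cast<int>(map_target.size());  // distinct characters still missing

    std::vector<int> trace;
    for (char c : window) {
        auto it = map_target.find(c);
        if (it != map_target.end()) {
            --it->second;                       // one more occurrence of c is covered
            if (it->second == 0) --count;       // c is now fully covered
        }
        trace.push_back(count);
    }
    return trace;
}
```

For window "ababa" and p = "aab", the trace is 2, 1, 0, 0, 0: the first 'a' does not change count because a second 'a' is still needed, and characters consumed after count reaches 0 push the table entries below zero without changing count.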

Reference: https://leetcode.com/problems/find-all-anagrams-in-a-string/discuss/92007/Sliding-Window-algorithm-template-to-solve-all-the-Leetcode-substring-search-problem

https://github.com/abesft/leetcode/blob/master/438FindAllAnagramsInString/438FindAllAnagramsInString.cpp

#include <iostream>
#include <vector>
#include <string>
#include <unordered_map>
using namespace std;


class Solution {
public:
	vector<int> findAnagrams(string s, string p) {

		vector<int> res;
		unordered_map<char, int> target;
		for (const auto & c : p)
			target[c]++;

		// Note: the condition is based on target.size(), not p.size(), because p may contain repeated characters.
		int count = target.size();

		size_t begin = 0;
		size_t end = 0;
		while (end < s.size())
		{
			char tmp = s[end];
			if (target.find(tmp) != target.end())
			{
				target[tmp]--;
				if (target[tmp] == 0)
					count--;
			}
			end++;

			// At this point the window [begin, end) contains every character that p requires.
			while (count == 0)
			{
				char tmp = s[begin];
				if (target.find(tmp) != target.end())
				{
					target[tmp]++;
					if (target[tmp] > 0)
						count++;
				}

				if (end - begin == p.size())
					res.push_back(begin);

				begin++;
			}
		}
		return res;
	}
};

int main()
{
	Solution sln;
	vector<int> res = sln.findAnagrams("cbaebabacd", "abc");
	for (const auto &i : res)
		cout << i << " ";
	cout << endl; // expected output: 0 6
}
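
One way to gain some confidence in the sliding-window solution above is to compare its output with a straightforward brute-force version. The sketch below is such a cross-check, not the method of the original post: `findAnagramsBrute` is a hypothetical name, and it sorts every window of length p.size(), which costs far more than the sliding window on long inputs.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical brute-force reference: sorts each window of s that has the
// same length as p and compares it with the sorted p.
std::vector<int> findAnagramsBrute(const std::string& s, const std::string& p) {
    std::vector<int> res;
    if (p.empty() || p.size() > s.size()) return res;   // no window can match

    std::string sortedP = p;
    std::sort(sortedP.begin(), sortedP.end());

    for (std::size_t i = 0; i + p.size() <= s.size(); ++i) {
        std::string window = s.substr(i, p.size());
        std::sort(window.begin(), window.end());
        if (window == sortedP) res.push_back(static_cast<int>(i));
    }
    return res;
}
```

On the two examples from the problem statement it yields [0, 6] and [0, 1, 2], the same indices the sliding-window solution is expected to report.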