187. Repeated DNA Sequences - Medium
阿新 • Published: 2019-01-03
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
Example:
Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
Output: ["AAAAACCCCC", "CCCCCAAAAA"]
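Use two hash sets, set and res. Starting from s[0], slide a 10-character window across s one position at a time. If a substring cannot be added to set, it has already appeared before, so add it to res.
Note: to avoid outputting the same sequence more than once for an input like "AAAAAAAAAAAA", either check whether the substring is already in res before adding it, or make res a hash set as well.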
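To make the duplicate-output pitfall concrete, here is a minimal sketch that compares a result list without any existence check against a result set. The class name DuplicateOutputDemo and the method names naive and deduplicated are hypothetical, chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; the names are illustrative and not from the original post.
class DuplicateOutputDemo {

    // Naive variant: repeats are collected in a List with no existence check.
    static List<String> naive(String s) {
        Set<String> seen = new HashSet<>();
        List<String> res = new ArrayList<>();
        for (int i = 0; i + 9 < s.length(); i++) {
            String window = s.substring(i, i + 10);
            if (!seen.add(window)) {
                res.add(window); // appended on every repeat, so it may appear several times
            }
        }
        return res;
    }

    // Deduplicated variant: repeats are collected in a Set, so each appears once.
    static List<String> deduplicated(String s) {
        Set<String> seen = new HashSet<>();
        Set<String> res = new HashSet<>();
        for (int i = 0; i + 9 < s.length(); i++) {
            String window = s.substring(i, i + 10);
            if (!seen.add(window)) {
                res.add(window); // adding an element already in the set is a no-op
            }
        }
        return new ArrayList<>(res);
    }

    public static void main(String[] args) {
        String s = "AAAAAAAAAAAA"; // 12 A's give three identical 10-letter windows
        System.out.println(naive(s));        // [AAAAAAAAAA, AAAAAAAAAA]
        System.out.println(deduplicated(s)); // [AAAAAAAAAA]
    }
}
```

With 12 A's, the first window is new and the next two are repeats, so the naive list reports the same sequence twice while the set reports it once.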
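time: O(n), space: O(n), where n is the length of s (each window has a fixed length of 10, so extracting and hashing it takes constant time)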
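import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        Set<String> set = new HashSet<>(); // every 10-letter window seen so far
        Set<String> res = new HashSet<>(); // windows seen more than once
        for (int i = 0; i + 9 < s.length(); i++) {
            String sub = s.substring(i, i + 10);
            if (!set.add(sub)) { // add() returns false when sub is already present
                res.add(sub);
            }
        }
        return new ArrayList<>(res);
    }
}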
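One caveat: a HashSet does not guarantee iteration order, so the returned list may not match the order shown in the example. If a stable order is wanted, one possible variation (an assumption on my part, not part of the original solution) is to collect repeats in a LinkedHashSet, which keeps them in the order they were first detected as repeated. The class name OrderedRepeatedDna below is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical variant of the solution above; only the type of res differs.
class OrderedRepeatedDna {

    static List<String> findRepeatedDnaSequences(String s) {
        Set<String> seen = new HashSet<>();
        Set<String> res = new LinkedHashSet<>(); // preserves insertion order
        for (int i = 0; i + 9 < s.length(); i++) {
            String window = s.substring(i, i + 10);
            if (!seen.add(window)) {
                res.add(window);
            }
        }
        return new ArrayList<>(res);
    }

    public static void main(String[] args) {
        String s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT";
        System.out.println(findRepeatedDnaSequences(s)); // [AAAAACCCCC, CCCCCAAAAA]
    }
}
```

The time and space bounds are unchanged; the only cost is the extra linked-list bookkeeping inside LinkedHashSet.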