lc.187 The DNA sequence is composed of a series of nucleotides abbreviated as 'A', 'C', 'G', and 'T'.
*For example, "ACGAATTCCG" is a DNA sequence.
Given a string s that represents a DNA sequence, return all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule. You may return the answer in any order.
example 1: Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT" Output: ["AAAAACCCCC","CCCCCAAAAA"]
example 2: Input: s = "AAAAAAAAAAAAA" Output: ["AAAAAAAAAA"]
classSolution{ public List<String> findRepeatedDnaSequences(String s){ int L = 10, n = s.length(); HashSet<String> seen = new HashSet(), output = new HashSet();
// iterate over all sequences of length L for (int start = 0; start < n - L + 1; ++start) { String tmp = s.substring(start, start + L); if (seen.contains(tmp)) output.add(tmp); seen.add(tmp); } returnnew ArrayList<String>(output); } }
the official solution makes me wonder the if THE TIME COMPLEXITY of contains method is as simple(O(1)) as it seems.besides,also about how HashSet works.
from the code below(jdk8) can we see that actually HashSet is realized by HashMap,
Method containsKey: /** * Returns <tt>true</tt> if this map contains a mapping for the * specified key. * * @param key The key whose presence in this map is to be tested * @return <tt>true</tt> if this map contains a mapping for the specified * key. */ publicbooleancontainsKey(Object key){ return getNode(hash(key), key) != null;//hash function maps keys to some values,but there might be code collision } Method getNode:
Up to now,we find that method HashSet.contains utilizes the HashMap.getNode to check if HashSet contains such a object, and HashMap.getNode searches the object by traversing the HashMap
HashMap is realized by a Node array which probably has Tree or LinkedList linked on it; To fully understand method getNode,please refer to:HashMap底层实现.
In a word,in function containsKey,if key can be matched,it returns true.By traversing the tree or LinkedList,the time complexity isn’t O(1), it muchly depends on how HashMap is hashed.Thus I think the time complexity of HashSet.contains is O(k·l)
k is the size of HashSet(let’s assume that the value in HashSet is String)l is the length of a String.
Actually,only under the most extreme situation can it becomes Θ(k·l),under most situations,we can regard the time complexity of contains as Θ(1) or Θ(l).
(AHHHHHHH it suddenly reminds me of Pseudo Polynomial Time,but already too much for today, might post another blog to explain it later)