# 692. Top K Frequent Words
Given a non-empty list of words, return the k most frequent elements.
Your answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.
Input: ["i", "love", "leetcode", "i", "love", "coding"], k = 2
Output: ["i", "love"]
Explanation: "i" and "love" are the two most frequent words. Note that "i" comes before "love" due to a lower alphabetical order.

Input: ["the", "day", "is", "sunny", "the", "the", "the", "sunny", "is", "is"], k = 4
Output: ["the", "is", "sunny", "day"]
Explanation: "the", "is", "sunny" and "day" are the four most frequent words, with 4, 3, 2 and 1 occurrences respectively.
You may assume k is always valid, 1 ≤ k ≤ number of unique elements.
Input words contain only lowercase letters.
Try to solve it in O(n log k) time and O(n) extra space.
Approach 1: use a hash table to count frequencies, then heapify the (frequency, word) entries and pop the top k. Building the table and heapifying are both O(N), and each of the k pops costs O(log N). Time: O(N + k log N), space: O(N). If you also count the word length L, hashing or comparing a word takes O(L) time, so the total time is O(LN + Lk log N).
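Approach 1 heapifies every distinct word, so it does not quite match the O(n log k) target from the problem statement. One way to get there is a bounded selection that keeps at most k candidates at a time. The sketch below assumes CPython's `heapq.nsmallest`, which for k smaller than the input length maintains a heap of at most k entries; the function name `top_k_frequent_bounded` is hypothetical.

```python
import heapq
from collections import Counter
from typing import List


def top_k_frequent_bounded(words: List[str], k: int) -> List[str]:
    """Return the k most frequent words, ties broken alphabetically."""
    counts = Counter(words)
    # The key (-count, word) ranks higher counts first and, among equal
    # counts, the alphabetically smaller word first.
    # nsmallest returns its results already sorted by that key.
    return heapq.nsmallest(k, counts, key=lambda word: (-counts[word], word))
```

The same composite key `(-count, word)` is what the heap in Approach 1 relies on; only the selection strategy differs.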
Approach 2: after we have the table of frequencies, instead of using a heap we can bucket sort the words by frequency (buckets from 1 to len(words)), build a trie in each bucket, and traverse the tries from the highest-frequency bucket down to output words in alphabetical order. Time: O(LN + LN + kL) = O(LN + Lk) if counting the length of words: O(LN) to count, O(LN) to fill the buckets because they hold up to N words and each takes O(L) time to insert into a trie, and O(kL) to pull out the top k.
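A minimal sketch of the bucket-plus-trie idea, to make the steps concrete. The names `TrieNode`, `insert`, `collect` and `top_k_frequent_buckets` are all hypothetical, and the trie uses a plain dict per node, which is an assumption rather than the only possible layout.

```python
from collections import Counter
from typing import Dict, List, Optional


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_word = False


def insert(root: TrieNode, word: str) -> None:
    """Insert one word into the trie in O(L) time."""
    node = root
    for ch in word:
        node = node.children.setdefault(ch, TrieNode())
    node.is_word = True


def collect(node: TrieNode, prefix: List[str], out: List[str], limit: int) -> None:
    """Pre-order DFS that appends words in alphabetical order until out has limit items."""
    if len(out) >= limit:
        return
    if node.is_word:
        out.append("".join(prefix))
    # At most 26 children per node, so sorting the keys is constant work.
    for ch in sorted(node.children):
        if len(out) >= limit:
            return
        prefix.append(ch)
        collect(node.children[ch], prefix, out, limit)
        prefix.pop()


def top_k_frequent_buckets(words: List[str], k: int) -> List[str]:
    counts = Counter(words)
    # buckets[c] holds a trie of all words that occur exactly c times.
    buckets: List[Optional[TrieNode]] = [None] * (len(words) + 1)
    for word, count in counts.items():
        if buckets[count] is None:
            buckets[count] = TrieNode()
        insert(buckets[count], word)

    result: List[str] = []
    # Walk from the highest possible frequency down to 1.
    for count in range(len(words), 0, -1):
        if buckets[count] is not None:
            collect(buckets[count], [], result, k)
        if len(result) == k:
            break
    return result
```

Visiting a node before its children means a word that is a prefix of another (for example "is" before "island") is emitted first, which matches alphabetical order.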
# Code (Python)
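    import heapq
    from collections import Counter

    class Solution:
        def topKFrequent(self, words: 'List[str]', k: 'int') -> 'List[str]':
            counts = Counter(words)
            # Negated counts make the min-heap pop the most frequent word first; ties break alphabetically.
            heap = [(-count, word) for word, count in counts.items()]
            heapq.heapify(heap)
            # Each popped entry is a (-count, word) tuple; keep only the word.
            return [heapq.heappop(heap)[1] for _ in range(k)]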
# Code (C++)