
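The Counter class in Python's collections module

Although I use Python regularly at work, it is mostly basic operations. For higher-level tools like this one, I only ever knew they existed and never actually called them, so every time I ended up building my own wheel.

Life is short, I use Python. So why on earth keep reinventing the wheel as if I were writing C?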

1. Counter

  Judging by the name, it has something to do with counting.

  Let's look at the class description in the source code:

    class Counter(dict):
        '''Dict subclass for counting hashable items.  Sometimes called a bag
        or multiset.  Elements are stored as dictionary keys and their counts
        are stored as dictionary values.'''

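  Roughly: it is a dict subclass that counts hashable items. In the resulting dictionary each element is a key and its count is the value. Keys keep the order in which they were first encountered (Python 3.7+), not a sorted order. How the contents are displayed can differ between shells, since the plain repr lists entries from most to least common.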

import collections
ret=collections.Counter("abbbbbccdeeeeeeeeeeeee")
ret
Out[12]: Counter({'a': 1, 'b': 5, 'c': 2, 'd': 1, 'e': 13})
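To make the "dict subclass" part concrete, here is a minimal sketch using an input string of my own ("mississippi" is not from the source). It assumes Python 3.7+ for the key order, and shows that a missing key reads as 0 instead of raising KeyError.

```python
from collections import Counter

# Illustrative input chosen for this sketch, not taken from the source code.
ret = Counter("mississippi")

# A Counter is a dict, so the usual dict tools work on it.
is_dict = isinstance(ret, dict)
as_plain_dict = dict(ret)

# Keys keep the order in which they were first seen (Python 3.7+).
keys_in_order = list(ret)

# A missing key returns 0 instead of raising KeyError,
# and merely looking it up does not add it to the counter.
missing = ret["z"]
still_absent = "z" not in ret

print(is_dict, keys_in_order, missing, still_absent)
```

Reading a missing key is the main behavioural difference from a plain dict, and it is what makes `c[elem] += 1` work without any setup.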

  Here are the examples given in the source code:

    >>> c = Counter('abcdeabcdabcaba')  # count elements from a string

    # return the 3 most common keys together with their counts
    >>> c.most_common(3)                # three most common elements
    [('a', 5), ('b', 4), ('c', 3)]
    >>> sorted(c)                       # list all unique elements
    ['a', 'b', 'c', 'd', 'e']
    >>> ''.join(sorted(c.elements()))   # list elements with repetitions
    'aaaaabbbbcccdde'
    >>> sum(c.values())                 # total of all counts
    15
    >>> c['a']                          # count of letter 'a'
    5

    # updating elements
    >>> for elem in 'shazam':           # update counts from an iterable
    ...     c[elem] += 1                # by adding 1 to each element's count
    >>> c['a']                          # now there are seven 'a'
    7
    >>> del c['b']                      # remove all 'b'
    >>> c['b']                          # now there are zero 'b'
    0

    >>> d = Counter('simsalabim')       # make another counter
    >>> c.update(d)                     # add in the second counter
    >>> c['a']                          # now there are nine 'a'
    9

    >>> c.clear()                       # empty the counter
    >>> c
    Counter()

    Note:  If a count is set to zero or reduced to zero, it will remain
    in the counter until the entry is deleted or the counter is cleared:

    >>> c = Counter('aaabbc')
    >>> c['b'] -= 2                     # reduce the count of 'b' by two
    >>> c.most_common()                 # 'b' is still in, but its count is zero
    [('a', 3), ('c', 1), ('b', 0)]
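The note about zero counts is easy to trip over, so here is a small sketch of two ways to get rid of such entries. The variable names are my own. The unary plus on a Counter is a standard operation (Python 3.3+) that builds a new Counter containing only the positive counts.

```python
from collections import Counter

c = Counter('aaabbc')
c['b'] -= 2                      # 'b' drops to zero but stays as a key

has_zero_entry = 'b' in c        # the key is still present

# Option 1: unary plus builds a new Counter without counts <= 0.
positive_only = +c

# Option 2: delete the entry explicitly from the original counter.
del c['b']
gone_after_del = 'b' not in c

print(has_zero_entry, dict(positive_only), gone_after_del)
```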

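Commonly used API:

most_common(num) returns the num elements with the highest counts, ordered from most to least common. If no argument is passed, it returns all elements.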

    def most_common(self, n=None):
        '''List the n most common elements and their counts from the most
        common to the least.  If n is None, then list all element counts.

        >>> Counter('abcdeabcdabcaba').most_common(3)
        [('a', 5), ('b', 4), ('c', 3)]

        '''
        # Emulate Bag.sortedByCount from Smalltalk
        if n is None:
            return sorted(self.items(), key=_itemgetter(1), reverse=True)
        return _heapq.nlargest(n, self.items(), key=_itemgetter(1))
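To see what the source above amounts to, here is a rough sketch that reproduces most_common(3) by hand with sorted and a slice. The variable names are mine. The slice for the least common elements follows the idiom from the official documentation, `c.most_common()[:-n-1:-1]`.

```python
from collections import Counter
from operator import itemgetter

counts = Counter('abcdeabcdabcaba')

top3 = counts.most_common(3)

# Same result by hand: sort (element, count) pairs by count, descending, then slice.
by_hand = sorted(counts.items(), key=itemgetter(1), reverse=True)[:3]

# Asking for more elements than exist simply returns everything.
everything = counts.most_common(100)

# The two least common elements: walk the full list backwards.
two_least_common = counts.most_common()[:-3:-1]

print(top3, by_hand, len(everything), two_least_common)
```

The heapq.nlargest branch in the source is there so that a small n does not require sorting every item.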

elements

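  Returns an iterator over all the elements, each repeated as many times as its count. The elements come out in the order they were first encountered in the original data, with equal elements grouped together.

>>> c = Counter('ABCABC')
>>> c.elements()
<itertools.chain at 0x5f10828>
>>> list(c.elements())
['A', 'A', 'B', 'B', 'C', 'C']
# This is not sorting: the argument just happens to be in A, B, C order, so the official example is easy to misread.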

d=Counter('ACBCDEFT')
list(d.elements())
['A', 'C', 'C', 'B', 'D', 'E', 'F', 'T']

So if you want the elements in sorted order, call sorted yourself:
sorted(d.elements()) => ['A', 'B', 'C', 'C', 'D', 'E', 'F', 'T']
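One more detail worth a quick sketch: elements() skips any entry whose count is less than one. The second counter below is a made-up example of my own to show that.

```python
from collections import Counter

d = Counter('ACBCDEFT')
in_first_seen_order = list(d.elements())   # both 'C's come out together, right after 'A'
in_sorted_order = sorted(d.elements())     # sort explicitly when order matters

# Hypothetical counter with a zero and a negative count:
# elements() silently leaves those entries out.
e = Counter(a=2, b=0, c=-1)
only_positive = list(e.elements())

print(in_first_seen_order, in_sorted_order, only_positive)
```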

Another one is subtract, which feels like it would be quite handy when working through LeetCode problems.

  

    def subtract(*args, **kwds):
        '''Like dict.update() but subtracts counts instead of replacing them.
        Counts can be reduced below zero.  Both the inputs and outputs are
        allowed to contain zero and negative counts.

        Source can be an iterable, a dictionary, or another Counter instance.

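  What does that mean? It updates your Counter object in place by subtracting counts based on the argument you pass in. The argument can be an iterable, a dictionary, or another Counter.

  Look at the official example: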

  

c=Counter("which")
out:>> Counter({'w': 1, 'h': 2, 'i': 1, 'c': 1})

# Pass an iterable: each of its elements is subtracted from the original object. Note that 't' is not in the original.
c.subtract('witch')
out:>> Counter({'w': 0, 'h': 1, 'i': 0, 'c': 0, 't': -1})

# Pass another Counter object.
c.subtract(Counter('watch'))
out:>> Counter({'w': -1, 'h': 0, 'i': 0, 'c': -1, 't': -2, 'a': -1})

# Pass a dictionary: each value is how many of that key to subtract.
c.subtract({'h':3,'q':5})
out:>> Counter({'w': -1, 'h': -3, 'i': 0, 'c': -1, 't': -2, 'a': -1, 'q': -5})
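As for the LeetCode angle, here is one way it might be used, as a sketch only. The helper `can_build` is a hypothetical function of my own (a "can this note be cut out of this magazine" style check), not something from the library. The second half contrasts subtract, which keeps zero and negative counts, with the `-` operator, which keeps only positive ones.

```python
from collections import Counter


def can_build(note: str, magazine: str) -> bool:
    """Hypothetical helper: True if every letter of `note` is available in `magazine`."""
    remaining = Counter(magazine)
    remaining.subtract(note)          # in place; counts may go below zero
    # A negative count means `note` needed more of that letter than `magazine` has.
    return all(count >= 0 for count in remaining.values())


c = Counter('which')
c.subtract('witch')                   # returns None and modifies c in place
kept = dict(c)                        # zero and negative counts are kept

# The - operator returns a new Counter and drops counts that are not positive.
diff = Counter('which') - Counter('witch')

print(can_build('aab', 'baa'), can_build('aa', 'ab'), kept, dict(diff))
```

Because subtract works in place and returns None, the result has to be read from the counter itself, never from the call.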

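 There does not seem to be much else of interest here. I thought I could get through everything in collections in a few minutes, but that was wishful thinking. Counter goes first; the rest will be filled in bit by bit.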