
Python extension library: pandas

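pd.qcut(x, q, retbins=False)

Bins the values of the array x into equal-frequency (quantile-based) bins. q is the number of bins, and the cut points are chosen from the distribution of x so that each bin holds roughly the same number of values. retbins controls whether the cut points are returned as well: with retbins=True the call returns a tuple whose first element is the binned result and whose second element is an array of the bin edges. The binned result records, for each value in x, the interval it falls into, followed by the ordered list of all bin intervals (the categories). Calling .value_counts() on the result shows how many values fall into each bin, and .describe() shows the counts and freqs of each interval. Note that if the input is a pd.Series, the result is also a Series, so .describe() gives the Series-style summary and does not list the intervals. To get the per-interval table, pass pd.Series.values instead.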

>>> import pandas as pd
>>> a = pd.qcut([1,1,2,3,4,4,5,6,7], 3)
>>> a
[(0.999, 2.667], (0.999, 2.667], (0.999, 2.667], (2.667, 4.333], (2.667, 4.333], (2.667, 4.333], (4.333, 7.0], (4.333, 7.0], (4.333, 7.0]]
Categories (3, interval[float64]): [(0.999, 2.667] < (2.667, 4.333] < (4.333, 7.0]]
>>> a.value_counts()
(0.999, 2.667]    3
(2.667, 4.333]    3
(4.333, 7.0]      3
dtype: int64
>>> a.describe()
                counts     freqs
categories
(0.999, 2.667]       3  0.333333
(2.667, 4.333]       3  0.333333
(4.333, 7.0]         3  0.333333
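
The retbins option and the pd.Series caveat are not covered by the session above, so here is a minimal sketch of both using the same nine values. The variable names (values, edges, binned_series, binned_array) are illustrative only, and the exact printed layout may differ between pandas versions.

```python
import pandas as pd

values = [1, 1, 2, 3, 4, 4, 5, 6, 7]

# retbins=True returns a tuple: (binned result, array of bin edges).
binned, edges = pd.qcut(values, 3, retbins=True)
print(edges)  # four edges delimit the three bins

# Series input: qcut returns a Series with a categorical dtype,
# so describe() gives the Series-style summary (count, unique, top, freq).
s = pd.Series(values)
binned_series = pd.qcut(s, 3)
print(binned_series.describe())

# Passing the underlying array instead returns a Categorical,
# whose describe() lists counts and freqs per interval.
binned_array = pd.qcut(s.values, 3)
print(binned_array.describe())
```

If the Series form is more convenient elsewhere in your code, binned_series.value_counts() also gives the per-interval counts without converting to an array first.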