pandas resample: resampling

阿新 • Published: 2018-07-08


Below is the definition of the resample method in pandas; the documentation at http://pandas.pydata.org/pandas-docs/stable/timeseries.html#resampling explains it in more detail.


    def resample(self, rule, how=None, axis=0, fill_method=None, closed=None,
                 label=None, convention='start', kind=None, loffset=None,
                 limit=None, base=0, on=None, level=None):
        """
        Convenience method for frequency conversion and resampling of time
        series.  Object must have a datetime-like index (DatetimeIndex,
        PeriodIndex, or TimedeltaIndex), or pass datetime-like values
        to the on or level keyword. (Resamples data and converts its frequency; the data must have a datetime-like index.)

        Parameters
        ----------
        rule : string
            the offset string or object representing target conversion (the offset that represents the target frequency)
        axis : int, optional, default 0 (the axis to operate on)
        closed : {'right', 'left'}
            Which side of bin interval is closed. The default is 'left'
            for all frequency offsets except for 'M', 'A', 'Q', 'BM',
            'BA', 'BQ', and 'W' which all have a default of 'right'. (which side of the interval is closed)
        label : {'right', 'left'}
            Which bin edge label to label bucket with. The default is 'left'
            for all frequency offsets except for 'M', 'A', 'Q', 'BM',
            'BA', 'BQ', and 'W' which all have a default of 'right'. (which edge of the interval supplies the label)
        convention : {'start', 'end', 's', 'e'}
            For PeriodIndex only, controls whether to use the start or end of
            `rule`
        kind: {'timestamp', 'period'}, optional
            Pass 'timestamp' to convert the resulting index to a
            ``DatetimeIndex`` or 'period' to convert it to a ``PeriodIndex``.
            By default the input representation is retained.
            By default the input representation is retained.
        loffset : timedelta
            Adjust the resampled time labels
        base : int, default 0
            For frequencies that evenly subdivide 1 day, the "origin" of the
            aggregated intervals. For example, for '5min' frequency, base could
            range from 0 through 4. Defaults to 0
        on : string, optional
            For a DataFrame, column to use instead of index for resampling.
            Column must be datetime-like.

            .. versionadded:: 0.19.0

        level : string or int, optional
            For a MultiIndex, level (name or number) to use for
            resampling.  Level must be datetime-like.

            .. versionadded:: 0.19.0

        Returns
        -------
        Resampler object

        Notes
        -----
        See the `user guide
        <http://pandas.pydata.org/pandas-docs/stable/timeseries.html#resampling>`_
        for more.

        To learn more about the offset strings, please see `this link
        <http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__. 


        Examples
        --------

        Start by creating a series with 9 one minute timestamps. (create a time series with a 1-minute frequency)

        >>> index = pd.date_range('1/1/2000', periods=9, freq='T')
        >>> series = pd.Series(range(9), index=index)
        >>> series
        2000-01-01 00:00:00    0
        2000-01-01 00:01:00    1
        2000-01-01 00:02:00    2
        2000-01-01 00:03:00    3
        2000-01-01 00:04:00    4
        2000-01-01 00:05:00    5
        2000-01-01 00:06:00    6
        2000-01-01 00:07:00    7
        2000-01-01 00:08:00    8
        Freq: T, dtype: int64

        Downsample the series into 3 minute bins and sum the values
        of the timestamps falling into a bin. (downsample to three-minute bins)

        >>> series.resample('3T').sum()
        2000-01-01 00:00:00     3
        2000-01-01 00:03:00    12
        2000-01-01 00:06:00    21
        Freq: 3T, dtype: int64

        Downsample the series into 3 minute bins as above, but label each
        bin using the right edge instead of the left. Please note that the
        value in the bucket used as the label is not included in the bucket,
        which it labels. For example, in the original series the
        bucket ``2000-01-01 00:03:00`` contains the value 3, but the summed
        value in the resampled bucket with the label ``2000-01-01 00:03:00``
        does not include 3 (if it did, the summed value would be 6, not 3).
        To include this value close the right side of the bin interval as
        illustrated in the example below this one.

        >>> series.resample('3T', label='right').sum()  # keep the right-edge label of each bin; the previous result used the left-edge label
        2000-01-01 00:03:00     3
        2000-01-01 00:06:00    12
        2000-01-01 00:09:00    21
        Freq: 3T, dtype: int64

        Downsample the series into 3 minute bins as above, but close the right
        side of the bin interval.

        >>> series.resample('3T', label='right', closed='right').sum()
        2000-01-01 00:00:00     0
        2000-01-01 00:03:00     6
        2000-01-01 00:06:00    15
        2000-01-01 00:09:00    15
        Freq: 3T, dtype: int64

        Upsample the series into 30 second bins.

        >>> series.resample('30S').asfreq()[0:5]  # select first 5 rows
        2000-01-01 00:00:00   0.0
        2000-01-01 00:00:30   NaN
        2000-01-01 00:01:00   1.0
        2000-01-01 00:01:30   NaN
        2000-01-01 00:02:00   2.0
        Freq: 30S, dtype: float64

        Upsample the series into 30 second bins and fill the ``NaN``
        values using the ``pad`` method.

        >>> series.resample('30S').pad()[0:5]
        2000-01-01 00:00:00    0
        2000-01-01 00:00:30    0
        2000-01-01 00:01:00    1
        2000-01-01 00:01:30    1
        2000-01-01 00:02:00    2
        Freq: 30S, dtype: int64

        Upsample the series into 30 second bins and fill the
        ``NaN`` values using the ``bfill`` method.

        >>> series.resample('30S').bfill()[0:5]
        2000-01-01 00:00:00    0
        2000-01-01 00:00:30    1
        2000-01-01 00:01:00    1
        2000-01-01 00:01:30    2
        2000-01-01 00:02:00    2
        Freq: 30S, dtype: int64

        Pass a custom function via ``apply``

        >>> def custom_resampler(array_like):
        ...     return np.sum(array_like)+5

        >>> series.resample('3T').apply(custom_resampler)
        2000-01-01 00:00:00     8
        2000-01-01 00:03:00    17
        2000-01-01 00:06:00    26
        Freq: 3T, dtype: int64

        For a Series with a PeriodIndex, the keyword `convention` can be
        used to control whether to use the start or end of `rule`.

        >>> s = pd.Series([1, 2], index=pd.period_range('2012-01-01',
        ...                                             freq='A',
        ...                                             periods=2))
        >>> s
        2012    1
        2013    2
        Freq: A-DEC, dtype: int64

        Resample by month using 'start' `convention`. Values are assigned to
        the first month of the period.

        >>> s.resample('M', convention='start').asfreq().head()
        2012-01    1.0
        2012-02    NaN
        2012-03    NaN
        2012-04    NaN
        2012-05    NaN
        Freq: M, dtype: float64

        Resample by month using 'end' `convention`. Values are assigned to
        the last month of the period.

        >>> s.resample('M', convention='end').asfreq()
        2012-12    1.0
        2013-01    NaN
        2013-02    NaN
        2013-03    NaN
        2013-04    NaN
        2013-05    NaN
        2013-06    NaN
        2013-07    NaN
        2013-08    NaN
        2013-09    NaN
        2013-10    NaN
        2013-11    NaN
        2013-12    2.0
        Freq: M, dtype: float64

        For DataFrame objects, the keyword ``on`` can be used to specify the
        column instead of the index for resampling.

        >>> df = pd.DataFrame(data=9*[range(4)], columns=['a', 'b', 'c', 'd'])
        >>> df['time'] = pd.date_range('1/1/2000', periods=9, freq='T')
        >>> df.resample('3T', on='time').sum()
                             a  b  c  d
        time
        2000-01-01 00:00:00  0  3  6  9
        2000-01-01 00:03:00  0  3  6  9
        2000-01-01 00:06:00  0  3  6  9

        For a DataFrame with MultiIndex, the keyword ``level`` can be used to
        specify on which level the resampling needs to take place.

        >>> time = pd.date_range('1/1/2000', periods=5, freq='T')
        >>> df2 = pd.DataFrame(data=10*[range(4)],
        ...                    columns=['a', 'b', 'c', 'd'],
        ...                    index=pd.MultiIndex.from_product([time, [1, 2]])
        ...                    )
        >>> df2.resample('3T', level=0).sum()
                             a  b   c   d
        2000-01-01 00:00:00  0  6  12  18
        2000-01-01 00:03:00  0  4   8  12
        """
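
The difference between `closed` and `label` described in the docstring is easier to see by listing the smallest and largest value that lands in each bin. What follows is a minimal sketch, not part of the pandas documentation. It rebuilds the same nine-point series and uses the `'min'` alias in place of `'T'`, on the assumption that a recent pandas release is installed (newer releases deprecate `'T'`). The helper name `bin_contents` is hypothetical.

```python
import pandas as pd

# Same data as the docstring: nine values at one-minute intervals.
index = pd.date_range("2000-01-01", periods=9, freq="min")
series = pd.Series(range(9), index=index)


def bin_contents(closed, label):
    """Hypothetical helper: min, max and sum of the values in each 3-minute bin."""
    return series.resample("3min", closed=closed, label=label).agg(["min", "max", "sum"])


# Left-closed bins: [00:00, 00:03), [00:03, 00:06), [00:06, 00:09)
left = bin_contents(closed="left", label="left")

# Right-closed bins: (23:57, 00:00], (00:00, 00:03], (00:03, 00:06], (00:06, 00:09]
right = bin_contents(closed="right", label="right")

print(left)
print(right)
```

With `closed='left'` the value stamped 00:03 opens the second bin. With `closed='right'` it closes the bin labelled 00:03, which is why the sums change from 3, 12, 21 to 0, 6, 15, 15.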

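The signature quoted above comes from the pandas release that was current in 2018. In later releases the `how`, `fill_method`, `limit`, `base` and `loffset` parameters are no longer accepted. `base` and `loffset` appear to have been deprecated in pandas 1.1 and removed in 2.0, so check the release notes for your version. The sketch below shows the usual replacements, assuming pandas 1.1 or later (needed for the `offset` keyword). The variable names are illustrative only.

```python
import pandas as pd

index = pd.date_range("2000-01-01", periods=9, freq="min")
series = pd.Series(range(9), index=index)

# Old: series.resample("3min", how="sum")
# New: call the aggregation on the Resampler object.
summed = series.resample("3min").sum()

# Old: series.resample("30s", fill_method="pad")
# New: call the fill method on the Resampler object.
filled = series.resample("30s").ffill()

# Old: base=2 with a 5-minute rule
# New: offset="2min" shifts the bin edges by two minutes.
shifted_bins = series.resample("5min", offset="2min").sum()

# Old: loffset="30s"
# New: shift the labels of the result after resampling.
relabelled = series.resample("3min").sum()
relabelled.index = relabelled.index + pd.Timedelta(seconds=30)

print(shifted_bins)
print(relabelled)
```

In this sketch `offset` moves the bin edges themselves, so different values are grouped together, whereas the index shift that replaces `loffset` changes only the labels and leaves the sums untouched.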