
[Pandas-Cookbook] 04: Grouping and Aggregation

# -*-coding:utf-8-*-

#  by kevinelstri
#  2017.2.16

# ---------------------
# Chapter 4: Find out on which weekday people bike the most with groupby and aggregate
# ---------------------

import pandas as pd
import matplotlib.pyplot as plt

"""
    4.1 Adding a 'weekday' column to our dataframe
"""
bikes = pd.read_csv('../data/bikes.csv', sep=';', encoding='latin1', index_col='Date', parse_dates=['Date'], dayfirst=True)
print(bikes.head())
bikes['Berri 1'].plot()  # plot the daily counts as a line chart
# plt.show()

berri_bikes = bikes[['Berri 1']].copy()  # copy a single column into its own DataFrame
print(berri_bikes[:5])
print(berri_bikes.index)
print(berri_bikes.index.day)
print(berri_bikes.index.weekday)  # 0 is Monday, 6 is Sunday
berri_bikes.loc[:, 'weekday'] = berri_bikes.index.weekday
print(berri_bikes[:5])

"""
    4.2 Adding up the cyclists by weekday
"""
"""
    Group the rows with the DataFrame's .groupby() method and compute the sum of each group
"""
weekday_counts = berri_bikes.groupby('weekday').aggregate('sum')
print(weekday_counts)
weekday_counts.index = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
print(weekday_counts)
weekday_counts.plot(kind='bar')
# plt.show()

"""
    4.3 Putting it together
"""
"""
    All of the code in one place
"""
bikes = pd.read_csv('../data/bikes.csv', sep=';', encoding='latin1', index_col='Date', dayfirst=True, parse_dates=['Date'])
berri_bikes = bikes[['Berri 1']].copy()
berri_bikes.loc[:, 'weekday'] = berri_bikes.index.weekday
weekday_counts = berri_bikes.groupby('weekday').aggregate('sum')
weekday_counts.index = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
weekday_counts.plot(kind='bar')
plt.show()

"""
    Analysis:
        The work is mostly date handling: group the rows by day of the week,
        then add every day's count to the total for its weekday.
    Steps:
        1. Read the CSV data
        2. Copy the column of interest
        3. Label each row with its day of the week
        4. Group by weekday, sum each group, and rename the index
        5. Show the result as a bar chart
"""
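The script above needs `../data/bikes.csv` to run. To try the same groupby-and-sum idea without that file, here is a minimal sketch on made-up data. The 14 daily counts and the `day_names` list are assumptions for illustration only, not values from the real dataset.

```python
import pandas as pd

# Synthetic stand-in for bikes.csv (an assumption, not real data):
# two full weeks of invented daily counts. 2012-01-02 is a Monday.
dates = pd.date_range('2012-01-02', periods=14, freq='D')
counts = pd.DataFrame({'Berri 1': list(range(1, 15))}, index=dates)

# DatetimeIndex.weekday numbers the days 0 (Monday) through 6 (Sunday).
counts['weekday'] = counts.index.weekday

# Group the rows by weekday number and sum the counts in each group.
weekday_counts = counts.groupby('weekday')['Berri 1'].sum()

# Relabel by looking up each weekday number, rather than assigning
# seven names by position.
day_names = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
             'Friday', 'Saturday', 'Sunday']
weekday_counts.index = [day_names[i] for i in weekday_counts.index]

print(weekday_counts)
print('Busiest day:', weekday_counts.idxmax())
```

Looking up names from the weekday numbers is a small design choice: assigning a fixed seven-item list to `.index` raises an error if the data happens to contain no rows for some weekday, while the lookup still labels whatever groups are present.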