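Plotting functions for the pandas DataFrame data structure
When doing data analysis in Canopy, we rely on the pandas library, which lets us flexibly process, transform, and plot data. One of its most important data structures is the DataFrame.
This article summarizes how to use the plot operation on a DataFrame object.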
Function name:
pandas.DataFrame.plot
Function parameter list and default values:
DataFrame.plot(data, x=None, y=None, kind='line', ax=None, subplots=False, sharex=True, sharey=False, layout=None, figsize=None, use_index=True, title=None, grid=None, legend=True, style=None, logx=False, logy=False, loglog=False,
xticks=None, yticks=None, xlim=None, ylim=None, rot=None, fontsize=None, colormap=None, table=False, yerr=None, xerr=None, secondary_y=False, sort_columns=False, **kwds)
Make plots of DataFrame using matplotlib / pylab.
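As a minimal sketch of the call described above: the DataFrame `df` and its column names are made up for illustration, and the `Agg` backend is selected only so the script runs without a display. When `plot` is called as a method, the frame itself plays the role of `data`.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import numpy as np
import pandas as pd

# Hypothetical sample data: an explicit x column plus two cumulative random walks.
rng = np.random.RandomState(0)
df = pd.DataFrame({
    "t": np.arange(50),
    "a": rng.randn(50).cumsum(),
    "b": rng.randn(50).cumsum(),
})

# Default kind='line': draw columns 'a' and 'b' against column 't'.
ax = df.plot(x="t", y=["a", "b"], title="line plot")

# kind='scatter': plot one column versus another.
ax_scatter = df.plot(x="a", y="b", kind="scatter")
```

In an interactive session, `ax.figure.savefig(...)` or `matplotlib.pyplot.show()` would typically follow to view the result.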
Parameter descriptions:
Parameters :
data : DataFrame
# The data object to plot
x : label or position, default None
# Column label (or position) to use for the x axis; default is None
y : label or position, default None
Allows plotting of one column versus another
# Column label (or position) to use for the y axis; default is None
kind : str
'line' : line plot (default)
'bar' : vertical bar plot
'barh' : horizontal bar plot
'hist' : histogram
'box' : boxplot
'kde' : Kernel Density Estimation plot
'density' : same as 'kde'
'area' : area plot
'pie' : pie plot
'scatter' : scatter plot
'hexbin' : hexbin plot
# String identifying the kind of plot to draw; default is 'line'
ax : matplotlib axes object, default None
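# When plotting with subplots, you get back the axes objects of the subplots;
pass one of them here to add further content to that subplot. Default is None [see Example 1 below]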
subplots : boolean, default False
Make separate subplots for each column
# Whether the columns of data should be drawn in separate subplots; default is False [see Example 2 below]
sharex : boolean, default True
In case subplots=True, share x axis
# When subplots is True, controls whether the subplots share the x axis; default is True
sharey : boolean, default False
In case subplots=True, share y axis
# When subplots is True, controls whether the subplots share the y axis; default is False
layout : tuple (optional)
(rows, columns) for the layout of subplots
figsize : a tuple (width, height) in inches
use_index : boolean, default True
Use index as ticks for x axis
title : string
# Title of the plot
Title to use for the plot
grid : boolean, default None (matlab style default)
Axis grid lines
# Whether to show grid lines; default is None [note that in Canopy the grid is shown by default]
legend : False/True/'reverse'
Place legend on axis subplots
# Adds a legend to the subplots; default is True
style : list or dict
matplotlib line style per column
# Sets the line style; only applies when kind is 'line' [see Example 3 below]
logx : boolean, default False
Use log scaling on x axis
# Uses a logarithmic scale on the x axis; default is False
logy : boolean, default False
Use log scaling on y axis
# Uses a logarithmic scale on the y axis; default is False
loglog : boolean, default False
Use log scaling on both x and y axes
# Uses a logarithmic scale on both the x and y axes; default is False
xticks : sequence
Values to use for the xticks
# Specifies the values (a range or stepped sequence) at which x-axis ticks are placed
yticks : sequence
Values to use for the yticks
# Specifies the values (a range or stepped sequence) at which y-axis ticks are placed
xlim : 2-tuple/list
ylim : 2-tuple/list
rot : int, default None
Rotation for ticks (xticks for vertical,
yticks for horizontal plots)
fontsize : int, default None
Font size for xticks and yticks
# Font size of the tick labels; default is None
colormap : str or matplotlib colormap object,
default None
Colormap to select colors from. If string,
load colormap with that name from matplotlib.
# Specifies the colormap to draw colors from, given either by name or as a colormap object; default is None
colorbar : boolean, optional
If True, plot colorbar (only relevant for
'scatter' and 'hexbin' plots)
# Whether to show a colorbar; setting it to True only takes effect when kind is 'scatter' or 'hexbin'
position : float
Specify relative alignments for bar plot layout.
From 0 (left/bottom-end) to 1 (right/top-end).
Default is 0.5 (center)
layout : tuple (optional)
(rows, columns) for the layout of the plot
table : boolean, Series or DataFrame, default False
If True, draw a table using the data in the
DataFrame and the data will be transposed to meet
matplotlib’s default layout. If a Series or
DataFrame is passed, use passed data to draw a table.
yerr : DataFrame, Series, array-like, dict and str
See Plotting with Error Bars for detail.
xerr : same types as yerr.
stacked : boolean, default False in line and
bar plots, and True in area plot.
If True, create stacked plot.
# When kind is 'line' or 'bar', this defaults to False;
# when kind is 'area', it defaults to True.
# When set to True, the corresponding stacked plot is produced.
sort_columns : boolean, default False
Sort column names to determine plot ordering
secondary_y : boolean or sequence, default False
Whether to plot on the secondary y-axis If a list/tuple,
which columns to plot on secondary y-axis
mark_right : boolean, default True
When using a secondary_y axis, automatically mark the
column labels with “(right)” in the legend
kwds : keywords
Options to pass to matplotlib plotting method
Returns : axes : matplotlib.AxesSubplot or np.array of them
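To tie several of the parameters above together, here is a small sketch of `subplots`, `layout`, `sharex`, `ax`, and `style`, and of the array-of-axes return value. The DataFrame and its column names are invented for illustration, and the `Agg` backend is an assumption made so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import numpy as np
import pandas as pd

# Hypothetical sample data with four numeric columns.
rng = np.random.RandomState(1)
df = pd.DataFrame(rng.rand(20, 4), columns=["w", "x", "y", "z"])

# subplots=True draws one subplot per column; layout arranges them in a 2x2 grid,
# and sharex=True makes the subplots share the x axis.
axes = df.plot(subplots=True, layout=(2, 2), sharex=True,
               figsize=(8, 6), grid=True, legend=True)

# With subplots=True the return value is a NumPy array of axes objects.
first_ax = axes[0, 0]

# Passing ax= draws onto an existing axes instead of creating a new figure;
# style gives one matplotlib line style per plotted column.
df[["x"]].plot(ax=first_ax, style=["k--"])
```

After the second call, the top-left subplot holds both the original `w` line and the dashed `x` line, which is the usual pattern for refining a single subplot after the initial `subplots=True` call.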
Examples:
Example 1:
<pre>
### 7. count the sum of each last letter of names of different 'sex' in each year
get_last_letter = lambda x: x[-1]
last_letters = names.name.map(get_last_letter)
last_letters.name = 'last_letter'
table = names.pivot_table('births',
                          rows=last_letters,
                          cols=['sex'