Pandas Cheat Sheet Study Notes

阿新 • Published: 2019-01-28

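Pandas is a Python library that provides easy-to-use data structures and is a powerful tool for data analysis.

Pandas data structures

The main pandas data structures are Series and DataFrame. A Series is a one-dimensional, array-like object made up of a sequence of values and an associated set of labels (its index). A DataFrame is a tabular data structure containing an ordered collection of columns, each of which can hold a different value type.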
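As a minimal sketch of the two structures, the snippet below builds one Series and one DataFrame. The names `s` and `df` follow the shorthand used in the rest of these notes; the values are assumed sample data chosen to agree with the outputs shown later (Brussels, New Delhi, Brasília).

```python
import pandas as pd

# A Series: one-dimensional values plus an index of labels.
s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])

# A DataFrame: a table whose columns may hold different dtypes.
# These rows are assumed sample data for illustration.
data = {
    'Country': ['Belgium', 'India', 'Brazil'],
    'Capital': ['Brussels', 'New Delhi', 'Brasília'],
    'Population': [11190846, 1303171035, 207847528],
}
df = pd.DataFrame(data, columns=['Country', 'Capital', 'Population'])

print(s)
print(df.dtypes)  # text columns are object/string, Population is integer
```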

Pandas I/O

1. Reading and writing CSV files.

>>> df = pd.read_csv('file.csv', header=None, nrows=5)
>>> df.to_csv('myDataFrame.csv')
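The effect of `header=None` and `nrows=5` is easier to see on a tiny input. The sketch below uses an in-memory buffer instead of a file on disk so that it runs anywhere; the CSV content is a made-up example.

```python
import io
import pandas as pd

# Hypothetical CSV content with no header row.
raw = "1,a\n2,b\n3,c\n4,d\n5,e\n6,f\n"

# header=None: the first line is treated as data and columns are numbered 0, 1, ...
# nrows=5: only the first five rows are read.
df = pd.read_csv(io.StringIO(raw), header=None, nrows=5)

# With no path argument, to_csv returns the CSV text as a string.
# index=False leaves the row index out of the output.
out = df.to_csv(index=False)
print(out)
```

Passing a file path to `to_csv`, as in the line above, writes the same text to disk instead of returning it.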

2. Reading and writing Excel files.

>>> pd.read_excel('file.xlsx')
>>> df.to_excel('dir/myDataFrame.xlsx', sheet_name='Sheet1')

Selecting data in pandas

1. Getting data

>>> s['b']  # Get one element

>>> df[1:]  # Get a subset of a DataFrame

2. Selecting by position or by label

>>> df.iloc[[0], [0]]  # By position

>>> df.iat[0, 0]  # By position (single value)

>>> df.loc[[0], ['Country']]  # By label

>>> df.at[0, 'Country']  # By label (single value)
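All four accessors are indexed with square brackets, not called like functions. The sketch below, on assumed sample data, shows the difference in return type: `iloc` and `loc` with list arguments return a DataFrame, while `iat` and `at` take scalar keys and return a single value.

```python
import pandas as pd

# Assumed sample data for illustration.
df = pd.DataFrame({
    'Country': ['Belgium', 'India', 'Brazil'],
    'Capital': ['Brussels', 'New Delhi', 'Brasília'],
})

by_pos_frame = df.iloc[[0], [0]]            # list keys -> 1x1 DataFrame
by_pos_scalar = df.iat[0, 0]                # scalar keys -> single value
by_label_frame = df.loc[[0], ['Country']]   # list keys -> 1x1 DataFrame
by_label_scalar = df.at[0, 'Country']       # scalar keys -> single value

print(type(by_pos_frame).__name__, by_pos_scalar)
```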

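3. ix selects data by position or by label automatically. Note that ix was deprecated in pandas 0.20 and removed in pandas 1.0; on current versions use loc (labels) or iloc (positions).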

>>> df.ix[2]  # Select a single row of a subset of rows
Country Brazil
Capital Brasília
Population 207847528
>>> df.ix[:,'Capital']  # Select a single column of a subset of columns
0 Brussels
1 New Delhi
2 Brasília
>>> df.ix[1,'Capital']  # Select rows and columns
'New Delhi'
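Because `ix` no longer exists in pandas 1.0 and later, the three selections above need `loc` or `iloc` on a current installation. The following is a sketch of label-based equivalents, using assumed sample data that matches the outputs shown; with the default integer index, labels and positions coincide.

```python
import pandas as pd

# Assumed sample data matching the outputs shown above.
df = pd.DataFrame({
    'Country': ['Belgium', 'India', 'Brazil'],
    'Capital': ['Brussels', 'New Delhi', 'Brasília'],
    'Population': [11190846, 1303171035, 207847528],
})

row = df.loc[2]               # replaces df.ix[2] (df.iloc[2] selects by position)
col = df.loc[:, 'Capital']    # replaces df.ix[:, 'Capital']
cell = df.loc[1, 'Capital']   # replaces df.ix[1, 'Capital']

print(row)
print(cell)
```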

4. Boolean indexing

>>> s[~(s > 1)]  # Series s where value is not > 1
>>> s[(s < -1) | (s > 2)]  # s where value is < -1 or > 2
>>> df[df['Population']>1200000000]  # Use a filter to select rows of the DataFrame
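A short sketch of the three filters on assumed sample data. Each comparison sits in its own parentheses because `~`, `&`, and `|` bind more tightly than `<` and `>` in Python.

```python
import pandas as pd

# Assumed sample data for illustration.
s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])
df = pd.DataFrame({
    'Country': ['Belgium', 'India', 'Brazil'],
    'Population': [11190846, 1303171035, 207847528],
})

not_above_one = s[~(s > 1)]             # keeps only the entries that are <= 1
outside = s[(s < -1) | (s > 2)]         # keeps entries below -1 or above 2
large = df[df['Population'] > 1200000000]  # keeps rows whose mask is True

print(not_above_one)
print(large)
```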

5. Setting data

>>> s['a'] = 6

Dropping data with drop

>>> s.drop(['a', 'c'])  # Drop values from rows (axis=0)
>>> df.drop('Country', axis=1)  # Drop values from columns (axis=1)
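One detail worth making explicit: `drop` returns a new object and leaves the original untouched unless the result is assigned back. A minimal sketch on assumed sample data:

```python
import pandas as pd

# Assumed sample data for illustration.
s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])
df = pd.DataFrame({
    'Country': ['Belgium', 'India', 'Brazil'],
    'Capital': ['Brussels', 'New Delhi', 'Brasília'],
})

s_dropped = s.drop(['a', 'c'])            # removes index labels 'a' and 'c'
df_dropped = df.drop('Country', axis=1)   # removes the 'Country' column

# The originals are unchanged.
print(list(s.index), list(df.columns))
```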

Sorting and ranking

>>> df.sort_index()  # Sort by labels along an axis
>>> df.sort_values(by='Country')  # Sort by the values along an axis
>>> df.rank()  # Assign ranks to entries
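The difference between sorting by label, sorting by value, and ranking can be sketched on a small frame. The data is assumed for illustration, and `rank` is applied to a single numeric column to keep the result easy to read.

```python
import pandas as pd

# Assumed sample data for illustration.
df = pd.DataFrame({
    'Country': ['Belgium', 'India', 'Brazil'],
    'Population': [11190846, 1303171035, 207847528],
})

by_country = df.sort_values(by='Country')        # alphabetical by Country
by_index_desc = df.sort_index(ascending=False)   # index labels in reverse order
pop_rank = df['Population'].rank()               # 1.0 = smallest value

print(by_country)
print(pop_rank)
```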

Retrieving Series and DataFrame information

>>> df.shape  # (rows, columns)
>>> df.index  # Describe the index
>>> df.columns  # Describe the DataFrame columns
>>> df.info()  # Info on the DataFrame
>>> df.count()  # Number of non-NA values
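`count` differs from the row count in `shape` as soon as a column has missing values. The sketch below uses an assumed frame with one deliberately missing entry to show this.

```python
import numpy as np
import pandas as pd

# Assumed sample data with one missing Population value.
df = pd.DataFrame({
    'Country': ['Belgium', 'India', 'Brazil'],
    'Population': [11190846, np.nan, 207847528],
})

shape = df.shape           # (number of rows, number of columns)
columns = list(df.columns)
counts = df.count()        # non-NA values per column

print(shape)
print(counts)
```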

Summary computations in pandas

>>> df.sum()  # Sum of values
>>> df.cumsum()  # Cumulative sum of values
>>> df.min()  # Minimum values
>>> df.max()  # Maximum values
>>> df.idxmin()  # Index label of the minimum value
>>> df.idxmax()  # Index label of the maximum value
>>> df.describe()  # Summary statistics
>>> df.mean()  # Mean of values
>>> df.median()  # Median of values
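These reductions work column by column by default. A minimal sketch on an assumed numeric frame, small enough to check by hand:

```python
import pandas as pd

# Assumed numeric sample data for illustration.
df = pd.DataFrame({'x': [1, 2, 3], 'y': [10.0, 30.0, 20.0]})

totals = df.sum()        # one total per column
running = df.cumsum()    # running total down each column
peak = df.idxmax()       # index label where each column is largest
middle = df.median()     # median of each column

print(totals)
print(running)
print(peak)
```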

Applying functions in pandas

>>> f = lambda x: x*2
>>> df.apply(f)  # Apply function
>>> df.applymap(f)  # Apply function element-wise
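With `f = lambda x: x*2` both calls give the same result, so the distinction is easy to miss: `apply` hands the function a whole column, while `applymap` hands it one value at a time. The sketch below uses two hypothetical functions where the difference is visible. It also accounts for `applymap` having been renamed to `DataFrame.map` in pandas 2.1, by picking whichever method the installed version provides.

```python
import pandas as pd

# Assumed numeric sample data for illustration.
df = pd.DataFrame({'x': [1, 2, 3], 'y': [10, 30, 20]})

# apply: the function receives each column as a Series and here
# reduces it to one number per column.
spread = df.apply(lambda col: col.max() - col.min())

# Element-wise: the function receives a single value at a time.
# DataFrame.map exists from pandas 2.1; older versions only have applymap.
elementwise = df.map if hasattr(df, 'map') else df.applymap
labels = elementwise(lambda v: f"<{v}>")

print(spread)
print(labels)
```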

Arithmetic between pandas data structures

>>> s3 = pd.Series([7, -2, 3], index=['a', 'c', 'd'])
>>> s + s3
a 10.0
b NaN
c 5.0
d 7.0

Arithmetic between pandas data structures with a fill value

>>> s.add(s3, fill_value=0)
a 10.0
b -5.0
c 5.0
d 7.0
>>> s.sub(s3, fill_value=2)
>>> s.div(s3, fill_value=4)
>>> s.mul(s3, fill_value=3)
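The outputs above follow from index alignment: a label present in only one Series yields NaN with `+`, while `fill_value` substitutes for the missing side before the operation. A minimal sketch, assuming `s` holds the values 3, -5, 7, 4 under labels a to d (which is consistent with the results shown):

```python
import pandas as pd

# s is an assumption consistent with the outputs shown above.
s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])
s3 = pd.Series([7, -2, 3], index=['a', 'c', 'd'])

plain = s + s3                     # 'b' is missing in s3, so the result is NaN
filled = s.add(s3, fill_value=0)   # 'b': -5 + 0
diff = s.sub(s3, fill_value=2)     # 'b': -5 - 2

print(plain)
print(filled)
print(diff)
```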

Reference: http://www.kdnuggets.com/2017/01/pandas-cheat-sheet.html
