Summary of basic pandas DataFrame functions

阿新 • Published: 2020-07-21

Constructor

DataFrame([data,index,columns,dtype,copy])  # Construct a DataFrame
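A minimal sketch of the constructor. The sample data, row labels and column names are hypothetical and only serve to show how the `data`, `index` and `columns` arguments fit together.

```python
import pandas as pd

# Hypothetical sample data; the names below are illustrative only.
data = {"name": ["a", "b", "c"], "score": [90, 85, 70]}

# data supplies the values, index the row labels, columns the column order.
df = pd.DataFrame(data, index=["r1", "r2", "r3"], columns=["name", "score"])

print(df)
```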

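Attributes and underlying data

DataFrame.axes                # List of axis labels: [index (row labels), columns (column labels)]
DataFrame.as_matrix([columns])        # Convert to a NumPy array (removed in later pandas versions; use DataFrame.to_numpy())
DataFrame.dtypes               # Data type of each column
DataFrame.ftypes               # Per-column dtype plus sparse/dense flag, e.g. float64:dense (removed in later pandas versions)
DataFrame.get_dtype_counts()         # Number of columns of each dtype (removed in later pandas versions; use DataFrame.dtypes.value_counts())
DataFrame.get_ftype_counts()         # Number of columns of each ftype, e.g. float64:dense (removed in later pandas versions)
DataFrame.select_dtypes([include,exclude])  # Select a subset of columns by dtype
DataFrame.values               # NumPy representation of the data
DataFrame.ndim                # Number of dimensions (2 for a DataFrame)
DataFrame.size                # Number of elements
DataFrame.shape                # Shape as a (rows, columns) tuple
DataFrame.memory_usage()           # Memory usage of each column, in bytes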
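The attributes above can be inspected on a small frame as follows. This is only a sketch; the frame and its column names are made up for illustration.

```python
import pandas as pd

# Hypothetical frame with two numeric columns and one text column.
df = pd.DataFrame({"x": [1, 2, 3], "y": [0.5, 1.5, 2.5], "label": ["a", "b", "c"]})

shape = df.shape                                  # (rows, columns)
ndim = df.ndim                                    # number of dimensions
size = df.size                                    # rows * columns
axes = df.axes                                    # [row labels, column labels]
numeric = df.select_dtypes(include=["number"])    # keep only numeric columns
mem = df.memory_usage()                           # bytes per column, plus the index

print(df.dtypes)
```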

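Type conversion

DataFrame.astype(dtype[,copy,errors])    # Cast to the given dtype
DataFrame.copy([deep])            # Copy the data; deep=True makes a deep copy
DataFrame.isnull()              # Boolean mask, True where a value is missing
DataFrame.notnull()              # Boolean mask, True where a value is present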
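A short sketch of these conversions, assuming a hypothetical frame whose column `a` holds numbers stored as strings and whose column `b` contains one missing value.

```python
import numpy as np
import pandas as pd

# Hypothetical data: numeric strings in "a", a missing value in "b".
df = pd.DataFrame({"a": ["1", "2", "3"], "b": [1.0, np.nan, 3.0]})

converted = df.astype({"a": "int64"})   # cast only column "a"
backup = df.copy(deep=True)             # independent copy of the data
backup.loc[0, "b"] = 100.0              # changing the copy leaves df untouched
null_mask = df.isnull()                 # True where a value is missing
present_mask = df.notnull()             # the complement of null_mask
```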

Indexing and iteration

DataFrame.head([n])              # Return the first n rows
DataFrame.at                 # Fast label-based scalar accessor
DataFrame.iat                 # Fast integer-position scalar accessor
DataFrame.loc                 # Label-based indexer (uses names)
DataFrame.iloc                # Integer-position-based indexer (uses numbers)
DataFrame.insert(loc,column,value)     # Insert a column named column with the given value at integer position loc
DataFrame.__iter__()             # Iterate over the info axis (the column labels)
DataFrame.iteritems()             # Iterator over (column name, Series) pairs (removed in later pandas versions; use DataFrame.items())
DataFrame.iterrows()             # Iterator over (index, Series) pairs, one per row
DataFrame.itertuples([index,name])      # Iterate over DataFrame rows as namedtuples, with the index value as the first element of the tuple.
DataFrame.lookup(row_labels,col_labels)   # Label-based "fancy indexing" function for DataFrame (removed in later pandas versions).
DataFrame.pop(item)              # Remove a column from the frame and return it
DataFrame.tail([n])              # Return the last n rows
DataFrame.xs(key[,axis,level,drop_level]) # Return a cross-section (row(s) or column(s)) from the Series/DataFrame.
DataFrame.isin(values)            # Boolean mask, True where an element is contained in values
DataFrame.where(cond[,other,inplace,…])  # Keep entries where cond is True and replace the rest with other
DataFrame.mask(cond[,…])   # Keep entries where cond is False and replace the rest with other
DataFrame.query(expr[,inplace])       # Query the columns of a frame with a boolean expression.
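The accessors and filters above can be compared side by side. This is a sketch on a hypothetical price table; the labels `apple`, `pear`, `plum` and the inserted `currency` column are assumptions for illustration.

```python
import pandas as pd

# Hypothetical table indexed by product name.
df = pd.DataFrame(
    {"price": [10, 20, 30], "qty": [1, 5, 2]},
    index=["apple", "pear", "plum"],
)

by_label = df.loc["pear", "price"]     # label-based lookup
by_position = df.iloc[1, 0]            # the same cell, by integer position
scalar = df.at["plum", "qty"]          # fast scalar access by label
kept = df.where(df > 4)                # values failing the condition become NaN
hidden = df.mask(df > 4)               # values passing the condition become NaN
expensive = df.query("price > 15")     # boolean expression over columns
in_set = df.isin([1, 30])              # True where the element is in the list

# Insert a new column at position 1 (modifies df in place).
df.insert(1, "currency", "USD")
```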

Binary operations

DataFrame.add(other[,fill_value])    # Addition, element-wise
DataFrame.sub(other[,fill_value])    # Subtraction, element-wise
DataFrame.mul(other[,fill_value])    # Multiplication, element-wise
DataFrame.div(other[,fill_value])    # Floating division, element-wise
DataFrame.truediv(other[,…])  # True division, element-wise
DataFrame.floordiv(other[,…])  # Floor (integer) division, element-wise
DataFrame.mod(other[,fill_value])    # Modulo, element-wise
DataFrame.pow(other[,fill_value])    # Exponentiation, element-wise
DataFrame.radd(other[,fill_value])   # Reverse addition (other + DataFrame), element-wise
DataFrame.rsub(other[,fill_value])   # Reverse subtraction (other - DataFrame), element-wise
DataFrame.rmul(other[,fill_value])   # Reverse multiplication (other * DataFrame), element-wise
DataFrame.rdiv(other[,fill_value])   # Reverse floating division (other / DataFrame), element-wise
DataFrame.rtruediv(other[,…])     # Reverse true division, element-wise
DataFrame.rfloordiv(other[,…])     # Reverse floor division, element-wise
DataFrame.rmod(other[,fill_value])   # Reverse modulo, element-wise
DataFrame.rpow(other[,fill_value])   # Reverse exponentiation, element-wise
DataFrame.lt(other[,level])      # Element-wise less than (<)
DataFrame.gt(other[,level])      # Element-wise greater than (>)
DataFrame.le(other[,level])      # Element-wise less than or equal to (<=)
DataFrame.ge(other[,level])      # Element-wise greater than or equal to (>=)
DataFrame.ne(other[,level])      # Element-wise not equal to (!=)
DataFrame.eq(other[,level])      # Element-wise equal to (==)
DataFrame.combine(other,func[,fill_value,…]) # Combine two DataFrames column by column using func.
DataFrame.combine_first(other)        # Combine two DataFrame objects and default to non-null values in frame calling the method.
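A sketch of how `fill_value`, the reverse operators and `combine_first` behave when values are missing. The two frames `a` and `b` are hypothetical.

```python
import numpy as np
import pandas as pd

# Two hypothetical frames with the same shape and a few missing values.
a = pd.DataFrame({"x": [1.0, np.nan], "y": [3.0, 4.0]})
b = pd.DataFrame({"x": [10.0, 20.0], "y": [np.nan, 40.0]})

total = a.add(b, fill_value=0)   # a missing value on one side is treated as 0
reversed_sub = a.rsub(1)         # computes 1 - a, element-wise
greater = b.gt(a)                # element-wise b > a; comparisons with NaN are False
patched = a.combine_first(b)     # keep a's values, fill its gaps from b
```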

Function application, GroupBy and windows

DataFrame.apply(func[,broadcast,…])  # Apply a function along an axis
DataFrame.applymap(func)           # Apply a function to every element of the DataFrame.
DataFrame.aggregate(func[,axis])       # Aggregate using a callable, string, dict, or list of strings/callables
DataFrame.transform(func,*args,**kwargs)  # Call function producing a like-indexed NDFrame
DataFrame.groupby([by,…])    # Group by key(s)
DataFrame.rolling(window[,min_periods,…])  # Rolling window
DataFrame.expanding([min_periods,freq,…])  # Expanding window
DataFrame.ewm([com,span,halflife,…])   # Exponentially weighted window
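A brief sketch of function application, grouping and window calculations. The `team` and `points` columns are hypothetical sample data.

```python
import pandas as pd

# Hypothetical scores for two teams.
df = pd.DataFrame({"team": ["a", "a", "b", "b"], "points": [1, 2, 3, 4]})
values = df[["points"]]

col_range = values.apply(lambda col: col.max() - col.min())   # one result per column
summary = df.groupby("team")["points"].agg(["sum", "mean"])   # per-group aggregates
rolling_mean = values.rolling(window=2).mean()                # mean over the last 2 rows
running_total = values.expanding().sum()                      # sum over all rows so far
```

The first value of `rolling_mean` is NaN because the window of two rows is not yet full at that point.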

Descriptive statistics

DataFrame.abs()                # Return absolute values
DataFrame.all([axis,bool_only,skipna])   # Return whether all elements are True over requested axis
DataFrame.any([axis,skipna])   # Return whether any element is True over requested axis
DataFrame.clip([lower,upper,axis])     # Trim values at input threshold(s).
DataFrame.clip_lower(threshold[,axis])    # Return copy of the input with values below given value(s) truncated (removed in later pandas versions; use clip(lower=...)).
DataFrame.clip_upper(threshold[,axis])    # Return copy of input with values above given value(s) truncated (removed in later pandas versions; use clip(upper=...)).
DataFrame.corr([method,min_periods])     # Pairwise correlation of the columns of this DataFrame
DataFrame.corrwith(other[,drop])    # Correlation with another DataFrame
DataFrame.count([axis,numeric_only]) # Number of non-null elements
DataFrame.cov([min_periods])         # Pairwise covariance of columns
DataFrame.cummax([axis,skipna])       # Return cumulative max over requested axis.
DataFrame.cummin([axis,skipna])       # Return cumulative minimum over requested axis.
DataFrame.cumprod([axis,skipna])       # Return cumulative product
DataFrame.cumsum([axis,skipna])       # Return cumulative sum
DataFrame.describe([percentiles,include,…]) # Summary statistics of the DataFrame
DataFrame.diff([periods,axis])        # First discrete difference of object
DataFrame.eval(expr[,inplace])        # Evaluate an expression in the context of the calling DataFrame instance.
DataFrame.kurt([axis,skipna,…])   # Unbiased kurtosis using Fisher's definition (kurtosis of normal == 0.0).
DataFrame.mad([axis,level])     # Mean absolute deviation (removed in later pandas versions)
DataFrame.max([axis,…])    # Maximum
DataFrame.mean([axis,…])   # Mean
DataFrame.median([axis,…])  # Median
DataFrame.min([axis,…])    # Minimum
DataFrame.mode([axis,numeric_only])     # Mode(s)
DataFrame.pct_change([periods,fill_method]) # Percentage change
DataFrame.prod([axis,…])   # Product of the values
DataFrame.quantile([q,numeric_only])  # Quantile(s)
DataFrame.rank([axis,method,numeric_only]) # Numerical rank of each value along the axis
DataFrame.round([decimals])          # Round a DataFrame to a variable number of decimal places.
DataFrame.sem([axis,ddof])  # Unbiased standard error of the mean
DataFrame.skew([axis,…])   # Unbiased skewness
DataFrame.sum([axis,…])    # Sum
DataFrame.std([axis,ddof])  # Sample standard deviation
DataFrame.var([axis,ddof])  # Unbiased variance
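A sketch of a few of the statistics above on a hypothetical two-column frame in which `b` is exactly twice `a`, so the correlation between the two columns is 1.

```python
import pandas as pd

# Hypothetical data: column "b" is exactly 2 * column "a".
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [2, 4, 6, 8]})

means = df.mean()                      # per-column mean
stds = df.std()                        # sample standard deviation (ddof=1)
cumsum = df.cumsum()                   # running sum down each column
corr = df.corr()                       # pairwise correlation matrix
clipped = df.clip(lower=2, upper=6)    # limit values to the range [2, 6]
summary = df.describe()                # count, mean, std, min, quartiles, max
```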

Reindexing, selection and label manipulation

DataFrame.add_prefix(prefix)         # Add a prefix to the labels
DataFrame.add_suffix(suffix)         # Add a suffix to the labels
DataFrame.align(other[,join,level])  # Align two objects on their axes with the specified join method.
DataFrame.drop(labels[,…])   # Return a new object with the given labels removed
DataFrame.drop_duplicates([subset,keep,…]) # Return DataFrame with duplicate rows removed, optionally only considering certain columns.
DataFrame.duplicated([subset,keep])     # Return boolean Series denoting duplicate rows, optionally only considering certain columns.
DataFrame.equals(other)            # Test whether two DataFrames are equal
DataFrame.filter([items,like,regex,axis]) # Select a subset of rows or columns by label
DataFrame.first(offset)            # Convenience method for subsetting initial periods of time series data based on a date offset.
DataFrame.head([n])              # Return the first n rows
DataFrame.idxmax([axis,skipna])       # Return index of first occurrence of maximum over requested axis.
DataFrame.idxmin([axis,skipna])       # Return index of first occurrence of minimum over requested axis.
DataFrame.last(offset)            # Convenience method for subsetting final periods of time series data based on a date offset.
DataFrame.reindex([index,columns])      # Conform DataFrame to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index.
DataFrame.reindex_axis(labels[,…])   # Conform input object to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index (removed in later pandas versions; use reindex()).
DataFrame.reindex_like(other[,…])  # Return an object with indices matching those of other.
DataFrame.rename([index,columns])      # Alter axis labels using a mapping or function.
DataFrame.rename_axis(mapper[,copy])  # Set the name of the index or columns axis (older versions also accepted a label-mapping function).
DataFrame.reset_index([level,drop,…])    # For DataFrame with multi-level index, return new DataFrame with labeling information in the columns under the index names, defaulting to 'level_0', 'level_1', etc.
DataFrame.sample([n,frac,replace,…])    # Return a random sample
DataFrame.select(crit[,axis])        # Return data corresponding to axis labels matching criteria (removed in later pandas versions)
DataFrame.set_index(keys[,append])  # Set the DataFrame index (row labels) using one or more existing columns.
DataFrame.tail([n])              # Return the last n rows
DataFrame.take(indices[,convert])   # Analogous to ndarray.take
DataFrame.truncate([before,after,axis])  # Truncates a sorted NDFrame before and/or after some particular index value.
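The label operations above can be chained as in the following sketch. The `id` and `val` columns are hypothetical, and one row is deliberately duplicated.

```python
import pandas as pd

# Hypothetical data with one duplicated row (id 2).
df = pd.DataFrame({"id": [1, 2, 2, 3], "val": [10, 20, 20, 30]})

dup_flags = df.duplicated()                    # True for the second copy of a row
deduped = df.drop_duplicates()                 # remove repeated rows
without_val = df.drop(columns=["val"])         # returns a new frame without "val"
renamed = df.rename(columns={"val": "value"})  # relabel a column
indexed = deduped.set_index("id")              # use "id" as the row labels
reindexed = indexed.reindex([1, 2, 3, 4])      # label 4 is new, so its row is NaN
restored = indexed.reset_index()               # move "id" back into a column
```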

Handling missing data

DataFrame.dropna([axis,how,thresh,…])   # Return object with labels on given axis omitted where any or all of the data are missing.
DataFrame.fillna([value,…])  # Fill missing values
DataFrame.replace([to_replace,value,…])   # Replace values given in 'to_replace' with 'value'.
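A sketch of the difference between the default `dropna()` and `how="all"`, together with `fillna` and `replace`. The frame is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 0 has one gap, row 1 is entirely missing.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, np.nan, 6.0]})

complete_rows = df.dropna()            # drop rows with any missing value
some_data = df.dropna(how="all")       # drop only rows where every value is missing
filled = df.fillna(0)                  # replace missing values with 0
replaced = df.replace(3.0, 30.0)       # swap one specific value for another
```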

Reshaping, sorting and transposing

DataFrame.pivot([index,columns,values])   # Reshape data (produce a "pivot" table) based on column values.
DataFrame.reorder_levels(order[,axis])    # Rearrange index levels using input order.
DataFrame.sort_values(by[,ascending]) # Sort by the values along either axis
DataFrame.sort_index([axis,…])    # Sort object by labels (along an axis)
DataFrame.nlargest(n,columns[,keep])    # Get the rows of a DataFrame sorted by the n largest values of columns.
DataFrame.nsmallest(n,columns[,keep])    # Get the rows of a DataFrame sorted by the n smallest values of columns.
DataFrame.swaplevel([i,j,axis])       # Swap levels i and j in a MultiIndex on a particular axis
DataFrame.stack([level,dropna])       # Pivot a level of the (possibly hierarchical) column labels, returning a DataFrame (or Series in the case of an object with a single level of column labels) having a hierarchical index with a new inner-most level of row labels.
DataFrame.unstack([level,fill_value])    # Pivot a level of the (necessarily hierarchical) index labels, returning a DataFrame having a new level of column labels whose inner-most level consists of the pivoted index labels.
DataFrame.melt([id_vars,value_vars,…])   # "Unpivots" a DataFrame from wide format to long format, optionally leaving identifier columns set.
DataFrame.T                  # Transpose index and columns
DataFrame.to_panel()             # Transform long (stacked) format (DataFrame) into wide (3D, Panel) format (Panel has been removed from later pandas versions).
DataFrame.to_xarray()             # Return an xarray object from the pandas object.
DataFrame.transpose(*args,**kwargs)     # Transpose index and columns
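A sketch of a round trip between long and wide layouts using `pivot` and `melt`, plus sorting. The `day`, `city` and `temp` columns are hypothetical sample data.

```python
import pandas as pd

# Hypothetical long-format readings: one row per (day, city) pair.
long_df = pd.DataFrame(
    {
        "day": ["mon", "mon", "tue", "tue"],
        "city": ["x", "y", "x", "y"],
        "temp": [20, 25, 21, 24],
    }
)

# Long to wide: one row per day, one column per city.
wide = long_df.pivot(index="day", columns="city", values="temp")

# Wide back to long: "day" stays as an identifier column.
back_to_long = wide.reset_index().melt(id_vars="day", var_name="city", value_name="temp")

transposed = wide.T                                       # swap rows and columns
top = long_df.nlargest(2, "temp")                         # two highest temperatures
ordered = long_df.sort_values("temp", ascending=False)    # full descending sort
```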

Combining, joining and merging

DataFrame.append(other[,ignore_index,…])  # Append rows of other to the end of this frame (removed in later pandas versions; use pandas.concat())
DataFrame.assign(**kwargs)          # Assign new columns to a DataFrame, returning a new object (a copy) with all the original columns in addition to the new ones.
DataFrame.join(other[,on,lsuffix,…]) # Join columns with other DataFrame either on index or on a key column.
DataFrame.merge(right[,left_on,…]) # Merge DataFrame objects by performing a database-style join operation by columns or indexes.
DataFrame.update(other[,overwrite,…]) # Modify DataFrame in place using non-NA values from passed DataFrame.
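A sketch of `merge`, `join` and `assign` on two hypothetical frames that share a `key` column. Because `DataFrame.append` is no longer available in recent pandas releases, the row-wise concatenation here uses `pandas.concat` instead.

```python
import pandas as pd

# Two hypothetical tables sharing a "key" column.
left = pd.DataFrame({"key": [1, 2, 3], "a": ["x", "y", "z"]})
right = pd.DataFrame({"key": [2, 3, 4], "b": [20, 30, 40]})

inner = left.merge(right, on="key")                          # keys present in both
outer = left.merge(right, on="key", how="outer")             # keys present in either
joined = left.set_index("key").join(right.set_index("key"))  # left join on the index
with_double = right.assign(double=lambda d: d["b"] * 2)      # add a derived column
stacked = pd.concat([left, left], ignore_index=True)         # row-wise concatenation
```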

Time series

DataFrame.asfreq(freq[,…])   # Convert a time series to the specified frequency
DataFrame.asof(where[,subset])        # The last row without any NaN is taken (or the last row without NaN considering only the subset of columns).
DataFrame.shift([periods,axis])    # Shift index by desired number of periods with an optional time freq
DataFrame.first_valid_index()         # Return label for first non-NA/null value
DataFrame.last_valid_index()         # Return label for last non-NA/null value
DataFrame.resample(rule[,…])   # Convenience method for frequency conversion and resampling of time series.
DataFrame.to_period([freq,copy])    # Convert DataFrame from DatetimeIndex to PeriodIndex with desired frequency.
DataFrame.to_timestamp([freq,axis])   # Cast to DatetimeIndex of timestamps, at beginning of period
DataFrame.tz_convert(tz[,copy]) # Convert tz-aware axis to target time zone.
DataFrame.tz_localize(tz[,…])  # Localize tz-naive TimeSeries to target time zone.
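A sketch of shifting, resampling and localizing a daily series. The dates and the `value` column are hypothetical.

```python
import pandas as pd

# Hypothetical daily series covering six consecutive days.
idx = pd.date_range("2020-01-01", periods=6, freq="D")
ts = pd.DataFrame({"value": [1, 2, 3, 4, 5, 6]}, index=idx)

shifted = ts.shift(1)                     # move values down by one period
first_valid = shifted.first_valid_index() # first label with a non-missing value
two_day = ts.resample("2D").sum()         # aggregate into 2-day bins
every_other = ts.asfreq("2D")             # keep only every second day
localized = ts.tz_localize("UTC")         # attach a time zone to the naive index
```

`resample` aggregates all rows that fall into each bin, while `asfreq` only selects the rows that sit exactly on the new frequency.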

Plotting

DataFrame.plot([x,y,kind,ax,…])     # DataFrame plotting accessor and method
DataFrame.plot.area([x,y])          # Area plot
DataFrame.plot.bar([x,y])          # Vertical bar plot
DataFrame.plot.barh([x,y])          # Horizontal bar plot
DataFrame.plot.box([by])           # Boxplot
DataFrame.plot.density(**kwds)        # Kernel Density Estimate plot
DataFrame.plot.hexbin(x,y[,C,…])      # Hexbin plot
DataFrame.plot.hist([by,bins])        # Histogram
DataFrame.plot.kde(**kwds)          # Kernel Density Estimate plot
DataFrame.plot.line([x,y])          # Line plot
DataFrame.plot.pie([y])            # Pie chart
DataFrame.plot.scatter(x,y[,s,c])     # Scatter plot
DataFrame.boxplot([column,by,…])    # Make a box plot from DataFrame columns, optionally grouped by some columns or other inputs
DataFrame.hist([column,by,grid,…])  # Draw histogram of the DataFrame's series using matplotlib / pylab.

Serialization, IO and conversion

DataFrame.from_csv(path[,header,sep,…])  # Read CSV file (DEPRECATED, please use pandas.read_csv() instead).
DataFrame.from_dict(data[,orient,dtype])  # Construct DataFrame from dict of array-like or dicts
DataFrame.from_items(items[,orient]) # Convert (key, value) pairs to DataFrame (removed in later pandas versions).
DataFrame.from_records(data[,…])   # Convert structured or record ndarray to DataFrame
DataFrame.info([verbose,buf,max_cols,…])  # Concise summary of a DataFrame.
DataFrame.to_pickle(path[,compression,…])  # Pickle (serialize) object to input file path.
DataFrame.to_csv([path_or_buf,na_rep]) # Write DataFrame to a comma-separated values (csv) file
DataFrame.to_hdf(path_or_buf,key,**kwargs) # Write the contained data to an HDF5 file using HDFStore.
DataFrame.to_sql(name,con[,flavor,…])   # Write records stored in a DataFrame to a SQL database.
DataFrame.to_dict([orient,into])       # Convert DataFrame to dictionary.
DataFrame.to_excel(excel_writer[,…])     # Write DataFrame to an Excel sheet
DataFrame.to_json([path_or_buf,…])  # Convert the object to a JSON string.
DataFrame.to_html([buf,col_space]) # Render a DataFrame as an HTML table.
DataFrame.to_feather(fname)          # Write out the binary feather-format for DataFrames
DataFrame.to_latex([buf,…])     # Render an object to a tabular environment table.
DataFrame.to_stata(fname[,convert_dates,…]) # Write the DataFrame to a Stata binary dta file
DataFrame.to_msgpack([path_or_buf,encoding]) # msgpack (serialize) object to input file path (removed in later pandas versions)
DataFrame.to_sparse([fill_value,kind])    # Convert to SparseDataFrame (removed in later pandas versions)
DataFrame.to_dense()             # Return dense representation of NDFrame (as opposed to sparse) (removed in later pandas versions)
DataFrame.to_string([buf,…])    # Render a DataFrame to a console-friendly tabular output.
DataFrame.to_clipboard([excel,sep])     # Attempt to write text representation of object to the system clipboard. This can be pasted into Excel, for example.
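A sketch of a few in-memory conversions. To avoid touching the file system, the CSV text is read back through `io.StringIO`; the frame contents are hypothetical.

```python
import io
import json

import pandas as pd

# Hypothetical frame to serialize.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

csv_text = df.to_csv(index=False)                  # CSV as a string (no path given)
round_trip = pd.read_csv(io.StringIO(csv_text))    # parse the text back into a frame
records = df.to_dict(orient="records")             # list of one dict per row
parsed = json.loads(df.to_json(orient="records"))  # JSON string, decoded again

# Build a frame from a dict of dicts, treating the outer keys as row labels.
from_dict = pd.DataFrame.from_dict({"r1": {"x": 1}, "r2": {"x": 2}}, orient="index")
```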
