Pandas Tutorial Series (6): Handling Missing Values in Pandas

阿新 • Published: 2020-10-21

Handling Missing Values in Pandas

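Pandas provides the following functions for handling missing values:

isnull and notnull: test whether each value is missing; both work on a DataFrame and on a Series
dropna: drop rows or columns that contain missing values
- axis: whether to drop rows or columns, {0 or 'index', 1 or 'columns'}, default 0
- how: with 'any', a row or column is dropped if any of its values is missing; with 'all', it is dropped only if all of its values are missing
- inplace: if True, modify the current df; otherwise return a new df
fillna: fill missing values
- value: the value to fill with, either a single value or a dict (key is the column name, value is the fill value)
- method: 'ffill' fills with the previous non-null value (forward fill); 'bfill' fills with the next non-null value (backward fill). This parameter is deprecated since pandas 2.1 in favor of the ffill() and bfill() methods
- axis: whether to fill along rows or columns, {0 or 'index', 1 or 'columns'}
- inplace: if True, modify the current df; otherwise return a new df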
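
The parameters above can be tried out on a small in-memory frame before touching a real file. The following is a minimal sketch; the column names and values are hypothetical sample data, not taken from the Excel file used in the example further down.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not taken from the article's Excel file.
df = pd.DataFrame({
    "name": ["Ann", None, "Bob"],
    "score": [90.0, np.nan, np.nan],
    "empty": [np.nan, np.nan, np.nan],
})

# isnull / notnull return boolean masks with the same shape as the input.
null_mask = df.isnull()
score_present = df["score"].notnull()

# dropna with how="all" removes only the columns that are entirely empty.
no_empty_cols = df.dropna(axis="columns", how="all")
# dropna with how="any" removes every row that has at least one missing value.
complete_rows = no_empty_cols.dropna(axis="index", how="any")

# fillna with a dict fills each listed column with its own value.
filled = no_empty_cols.fillna({"name": "unknown", "score": 0})

# ffill / bfill propagate the previous / next non-null value.
forward = no_empty_cols["score"].ffill()
backward = no_empty_cols["name"].bfill()

print(no_empty_cols)
print(complete_rows)
print(filled)
```

None of these calls uses inplace=True, so df itself is left unchanged and each result is a new object.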

Example: reading, cleaning, and processing an irregular Excel file

import pandas as pd

# Step 1: skip the leading blank rows when reading the Excel file
print('*' * 25, 'Step 1: skip the leading blank rows when reading the Excel file', '*' * 25)
file_path = "../datas/student_excel/student_excel.xlsx"
studf = pd.read_excel(file_path, skiprows=2)
 
print(studf)

# Step 2: detect missing values
print('*' * 25, 'Step 2: detect missing values', '*' * 25)
print(studf.isnull())
print('*' * 25, 'Mask: 分數 (score) is missing', '*' * 25)
print(studf['分數'].isnull())
print('*' * 25, 'Mask: 分數 (score) is present', '*' * 25)
print(studf['分數'].notnull())
print('*' * 25, 'All rows whose 分數 (score) is present', '*' * 25)
print(studf.loc[studf['分數'].notnull(), :])

 
# Step 3: drop columns in which every value is missing
studf.dropna(axis='columns', how='all', inplace=True)
print('*' * 25, 'Step 3: drop columns in which every value is missing', '*' * 25)
print(studf)

# Step 4: drop rows in which every value is missing
studf.dropna(axis='index', how='all', inplace=True)
print('*' * 25, 'Step 4: drop rows in which every value is missing', '*' * 25)
print(studf)

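# Step 5: fill missing values in the 分數 (score) column with 0
# studf.fillna({"分數": 0})   # not in place: returns a new DataFrame, so the result would have to be assigned back
studf.loc[:, '分數'] = studf['分數'].fillna(0)  # same fill, written back to the column
print('*' * 25, 'Step 5: fill missing values in the 分數 (score) column with 0', '*' * 25)
print(studf)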

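# Step 6: forward-fill missing values in the 姓名 (name) column
# .ffill() replaces fillna(method='ffill'), which is deprecated since pandas 2.1
studf.loc[:, '姓名'] = studf['姓名'].ffill()
print('*' * 25, 'Step 6: forward-fill missing values in the 姓名 (name) column', '*' * 25)
print(studf)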

# Step 7: save the cleaned data to a new Excel file
print('*' * 25, 'Step 7: save the cleaned data to a new Excel file', '*' * 25)
studf.to_excel("../datas/student_excel/student_excel_clean.xlsx", index=False)
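
The script above depends on an Excel file that is not reproduced here. As a rough way to follow steps 3 to 6 without it, the sketch below builds a hypothetical stand-in frame in memory. The layout (an empty first column, an empty separator row, names present only on the first row of each group), the 科目 (subject) column, and the helper name clean_scores are all assumptions made for illustration; only the 姓名 and 分數 column names come from the script.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the sheet after read_excel(..., skiprows=2).
# The values and the 科目 (subject) column are assumptions for illustration.
raw = pd.DataFrame({
    "Unnamed: 0": [np.nan] * 7,
    "姓名": ["小明", np.nan, np.nan, np.nan, "小王", np.nan, np.nan],
    "科目": ["語文", "數學", "英語", np.nan, "語文", "數學", "英語"],
    "分數": [85.0, 80.0, 90.0, np.nan, 85.0, np.nan, 90.0],
})


def clean_scores(df: pd.DataFrame) -> pd.DataFrame:
    """Apply steps 3 to 6 without mutating the input frame."""
    out = df.dropna(axis="columns", how="all")  # step 3: drop all-empty columns
    out = out.dropna(axis="index", how="all")   # step 4: drop all-empty rows
    out = out.fillna({"分數": 0})               # step 5: keep the returned frame
    out["姓名"] = out["姓名"].ffill()            # step 6: forward-fill names
    return out.reset_index(drop=True)


clean = clean_scores(raw)
print(clean)
```

This variant chains the returned frames instead of using inplace=True, which is one way to make the dict form of fillna from step 5 take effect. Whether that is preferable to in-place modification is a matter of style.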
