Deduplication in Python
阿新 • Published: 2018-12-03
I. Deduplicating a list
1. Deduplication with a loop
list_1 = [5, 5, 1, 4, 4, 6, 7, 8, 1]
new_list = []
for i in list_1:
    # Append only values that have not been seen yet
    if i not in new_list:
        new_list.append(i)
print(new_list)
Result: [5, 1, 4, 6, 7, 8]. The elements keep their original order.
2. Deduplication with set()
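The loop above checks membership in a list on every iteration, which gets slow for long inputs. As a possible alternative, the sketch below uses dict.fromkeys(), relying on the fact that dict keys are unique and keep insertion order (guaranteed since Python 3.7). The helper name dedupe_keep_order is illustrative and not from the original post, and the approach assumes the elements are hashable.

```python
def dedupe_keep_order(items):
    """Return a new list without duplicates, keeping first occurrences."""
    # dict keys are unique and preserve insertion order (Python 3.7+),
    # so converting the keys back to a list drops later duplicates.
    return list(dict.fromkeys(items))


list_1 = [5, 5, 1, 4, 4, 6, 7, 8, 1]
new_list = dedupe_keep_order(list_1)
print(new_list)  # [5, 1, 4, 6, 7, 8]
```

For short lists the plain loop is just as good; the difference only matters when the input grows large.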
list_1 = [5,5,1,4,4,6,7,8,1]
new_list = list(set(list_1))
print(new_list)
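Result: [1, 4, 5, 6, 7, 8]. The original order is lost. The output looks sorted here, but a set is unordered, so that ordering is an implementation detail rather than a guarantee.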
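Since a set makes no promise about order, it is safer to ask for the order explicitly. The following is a minimal sketch with two options: sorted() for ascending order, and sorted() with key=list_1.index to restore the order of first appearance. The variable names sorted_unique and original_order are illustrative, not from the original post.

```python
list_1 = [5, 5, 1, 4, 4, 6, 7, 8, 1]

# Ascending order, guaranteed by sorted() rather than by the set itself
sorted_unique = sorted(set(list_1))
print(sorted_unique)  # [1, 4, 5, 6, 7, 8]

# Order of first appearance: sort the unique values by their first index
original_order = sorted(set(list_1), key=list_1.index)
print(original_order)  # [5, 1, 4, 6, 7, 8]
```

Note that list_1.index is a linear search, so the second form is best kept for small lists.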
II. Deduplicating a DataFrame
1. Deduplication with unique()
import pandas as pd
data = pd.DataFrame({'score': [1, 2, 3, 1, 5, 6], 'name': ['Tom', 'John', 'june', 'Tom', 'John', 'june']})
# Unique values of the name column
data.name.unique()
# Alternative with NumPy:
# import numpy as np
# np.unique(data.score)
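The two calls above differ in how they order their output: Series.unique() returns values in order of first appearance, while np.unique() returns them sorted. The sketch below shows both on the same data, plus Series.nunique() for the count. The variable names are illustrative and not from the original post.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    'score': [1, 2, 3, 1, 5, 6],
    'name': ['Tom', 'John', 'june', 'Tom', 'John', 'june'],
})

# pandas: unique values in order of first appearance
names = list(data['name'].unique())
print(names)  # ['Tom', 'John', 'june']

# NumPy: unique values in sorted order
scores_sorted = np.unique(data['score']).tolist()
print(scores_sorted)  # [1, 2, 3, 5, 6]

# Number of distinct names
name_count = data['name'].nunique()
print(name_count)  # 3
```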
2. Deduplication with frame.drop_duplicates()
import pandas as pd
data = pd.DataFrame({'score': [1, 2, 3, 1, 5, 6], 'name': ['Tom', 'John', 'june', 'Tom', 'John', 'june']})
data.drop_duplicates(['name'])
data.drop_duplicates(['score'])
data.drop_duplicates(['name', 'score'])
The three results are as follows:
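One way to inspect the three results is to look at which row labels survive each call. The sketch below assumes the default keep='first', which keeps the first row of each duplicate group; the variable names (by_name, by_score, by_both, last_by_name) are illustrative and not from the original post.

```python
import pandas as pd

data = pd.DataFrame({
    'score': [1, 2, 3, 1, 5, 6],
    'name': ['Tom', 'John', 'june', 'Tom', 'John', 'june'],
})

# Rows are duplicates when they match on the listed columns
by_name = data.drop_duplicates(['name'])
by_score = data.drop_duplicates(['score'])
by_both = data.drop_duplicates(['name', 'score'])

print(by_name.index.tolist())   # [0, 1, 2]
print(by_score.index.tolist())  # [0, 1, 2, 4, 5]
print(by_both.index.tolist())   # [0, 1, 2, 4, 5]

# keep='last' retains the last row of each duplicate group instead
last_by_name = data.drop_duplicates(['name'], keep='last')
print(last_by_name.index.tolist())  # [3, 4, 5]
```

drop_duplicates() returns a new DataFrame and leaves data unchanged, so the result has to be assigned to a variable to be kept.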