Pandas: Vectorizing categorical variables with get_dummies
阿新 • Published: 2019-02-18
import numpy as np
import pandas as pd
from pandas import Series,DataFrame
1. Vectorization
df = DataFrame({'key': ['b', 'b', 'a', 'c', 'a', 'b'],
                'data1': range(6)})
print(df)
   data1 key
0      0   b
1      1   b
2      2   a
3      3   c
4      4   a
5      5   b
print(pd.get_dummies(df['key']))
   a  b  c
0  0  1  0
1  0  1  0
2  1  0  0
3  0  0  1
4  1  0  0
5  0  1  0
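To make clearer what get_dummies computes here, the following is a minimal sketch that builds the same indicator columns by hand, comparing the column against each of its sorted distinct values. The helper name manual_dummies is hypothetical and not part of pandas. dtype=int is passed to get_dummies as an assumption about the environment: recent pandas versions return True/False columns by default rather than the 0/1 shown above.

```python
import pandas as pd

df = pd.DataFrame({'key': ['b', 'b', 'a', 'c', 'a', 'b'],
                   'data1': range(6)})

def manual_dummies(series):
    """Hypothetical helper: one 0/1 indicator column per distinct value."""
    categories = sorted(series.unique())
    # Each column is 1 where the row equals that category, 0 elsewhere.
    return pd.DataFrame(
        {cat: (series == cat).astype(int) for cat in categories},
        index=series.index,
    )

manual = manual_dummies(df['key'])
builtin = pd.get_dummies(df['key'], dtype=int)  # force 0/1 instead of booleans

print(manual)
# Compare the two results cell by cell.
print((manual.values == builtin.values).all())
```

Each row has exactly one 1, in the column that matches its original category.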
2. Merging with the original data
dummies = pd.get_dummies(df['key'], prefix='key')
df_with_dummy = df[['data1']].join(dummies)
print(df_with_dummy)
   data1  key_a  key_b  key_c
0      0      0      1      0
1      1      0      1      0
2      2      1      0      0
3      3      0      0      1
4      4      1      0      0
5      5      0      1      0
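As a possible shortcut for the join above, get_dummies can also take the whole DataFrame together with a columns argument. The sketch below assumes that form: the listed column is replaced by its indicator columns, the column name serves as the prefix, and the remaining columns are kept. As before, dtype=int is an assumption to get 0/1 values on recent pandas versions.

```python
import pandas as pd

df = pd.DataFrame({'key': ['b', 'b', 'a', 'c', 'a', 'b'],
                   'data1': range(6)})

# Encode only the 'key' column; 'data1' passes through unchanged.
# The column name is used as the prefix, giving key_a, key_b, key_c.
encoded = pd.get_dummies(df, columns=['key'], dtype=int)
print(encoded)
```

The explicit join is still handy when the indicator columns need to be inspected or filtered before they are attached to the original data.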