
Machine Learning with sklearn (18): Feature Engineering (9), Feature Encoding (3), Categorical Feature Encoding (1): Label Encoding with LabelEncoder

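LabelEncoder is a utility class for normalizing labels: it restricts the encoded values to the range [0, n_classes-1]. This is useful when writing efficient Cython routines. LabelEncoder can be used as follows: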

>>> from sklearn import preprocessing
>>> le = preprocessing.LabelEncoder()
>>> le.fit([1, 2, 2, 6])
LabelEncoder()
>>> le.classes_
array([1, 2, 6])
>>> le.transform([1, 1, 2, 6])
array([0, 0, 1, 2])
>>> le.inverse_transform([0, 0, 1, 2])
array([1, 1, 2, 6])
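Two details of the fit/transform cycle are worth a quick illustration. The sketch below reuses the same integer labels; the variable names `codes` and `unseen_error` are only for illustration. It shows `fit_transform` doing both steps in one call, and what happens when `transform` receives a label that was not seen during `fit`.

```python
from sklearn.preprocessing import LabelEncoder

# fit_transform learns the classes and encodes the labels in one call
codes = LabelEncoder().fit_transform([1, 2, 2, 6])
print(codes.tolist())

# An encoder fitted on {1, 2, 6} has no code for the label 3
le = LabelEncoder()
le.fit([1, 2, 2, 6])
try:
    le.transform([1, 3])
    unseen_error = None
except ValueError as exc:
    # scikit-learn rejects labels that were not present during fit
    unseen_error = exc
print(type(unseen_error).__name__)
```

Because unseen labels are rejected rather than assigned a new code, the encoder should be fitted on the full set of labels expected at prediction time.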

It can also be used to transform non-numerical labels (as long as they are hashable and comparable) to numerical labels:

>>> le = preprocessing.LabelEncoder()
>>> le.fit(["paris", "paris", "tokyo", "amsterdam"])
LabelEncoder()
>>> list(le.classes_)
['amsterdam', 'paris', 'tokyo']
>>> le.transform(["tokyo", "tokyo", "paris"])
array([2, 2, 1])
>>> list(le.inverse_transform([2, 2, 1]))
['tokyo', 'tokyo', 'paris']

class sklearn.preprocessing.LabelEncoder

Encode target labels with value between 0 and n_classes-1.

This transformer should be used to encode target values, i.e. y, and not the input X.
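One practical consequence of this is that LabelEncoder works on a 1-D sequence of labels. The following is a minimal sketch with made-up data (`y`, `X`, and `error` are hypothetical names) showing that a 1-D target encodes as expected, while a 2-D feature matrix is rejected.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# A 1-D target vector is the intended input
y = np.array(["cat", "dog", "cat"])
y_encoded = LabelEncoder().fit_transform(y)
print(y_encoded.tolist())

# A 2-D feature matrix (hypothetical data) is not accepted
X = np.array([["red", "small"], ["blue", "large"], ["red", "large"]])
try:
    LabelEncoder().fit(X)
    error = None
except ValueError as exc:
    # the encoder expects a 1-D array of labels
    error = exc
print(type(error).__name__)
```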

Read more in the User Guide.

New in version 0.12.

Attributes
classes_ : ndarray of shape (n_classes,)

Holds the label for each class.
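Since the position of a label in `classes_` is the integer code assigned to it, the attribute can be turned into an explicit lookup table. The sketch below assumes the city labels used elsewhere in this article; the `mapping` dictionary is an illustrative helper, not part of the scikit-learn API.

```python
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
le.fit(["paris", "paris", "tokyo", "amsterdam"])

# classes_ is sorted, and the index of each label is its encoded value
mapping = {str(label): code for code, label in enumerate(le.classes_)}
print(mapping)

# Indexing classes_ with a code gives the same result as inverse_transform
decoded = [str(le.classes_[code]) for code in [2, 2, 1]]
print(decoded)
```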

Examples

LabelEncoder can be used to normalize labels.

>>> from sklearn import preprocessing
>>> le = preprocessing.LabelEncoder()
>>> le.fit([1, 2, 2, 6])
LabelEncoder()
>>> le.classes_
array([1, 2, 6])
>>> le.transform([1, 1, 2, 6])
array([0, 0, 1, 2]...)
>>> le.inverse_transform([0, 0, 1, 2])
array([1, 1, 2, 6])

It can also be used to transform non-numerical labels (as long as they are hashable and comparable) to numerical labels.

>>> le = preprocessing.LabelEncoder()
>>> le.fit(["paris", "paris", "tokyo", "amsterdam"])
LabelEncoder()
>>> list(le.classes_)
['amsterdam', 'paris', 'tokyo']
>>> le.transform(["tokyo", "tokyo", "paris"])
array([2, 2, 1]...)
>>> list(le.inverse_transform([2, 2, 1]))
['tokyo', 'tokyo', 'paris']