Undersampling Method

阿新 • Published: 2018-11-30

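.loc[]: inside the brackets, rows come first and columns second, separated by a comma. Both are specified by label (row labels and column labels).

.iloc[]: as with .loc, rows come first and columns second, separated by a comma. The difference is that .iloc indexes by integer position (row number and column number) rather than by label.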

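.ix accepted both styles, but it was deprecated in pandas 0.20 and removed in pandas 1.0, so use .loc or .iloc instead.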
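The difference between label-based and position-based indexing is easiest to see when the row labels are not 0, 1, 2, ... The following is a minimal sketch on a toy DataFrame; the column names and index values are made up for illustration and are not from the original dataset.

```python
import pandas as pd

# Toy frame with non-default integer labels, so labels and positions differ.
df = pd.DataFrame(
    {"Amount": [10.0, 20.0, 30.0], "Class": [0, 1, 0]},
    index=[100, 200, 300],
)

by_label = df.loc[200, "Amount"]   # row label 200, column label "Amount"
by_position = df.iloc[1, 0]        # second row, first column
print(by_label, by_position)

# Label slices include the end point; positional slices exclude it.
label_slice = df.loc[100:200]
position_slice = df.iloc[0:1]
print(len(label_slice), len(position_slice))
```

Here df.iloc[200] would raise an IndexError, because there is no row at position 200, while df.loc[200] works because 200 is a label.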

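import numpy as np

# data is assumed to be a pandas DataFrame with a binary 'Class' column (1 = fraud, 0 = normal)
X=data.loc[:,data.columns != 'Class'] # all rows, every column except 'Class' (the features)
y=data.loc[:,data.columns == 'Class'] # all rows, only the 'Class' column (the labels)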

#number of data points in the minority class

number_records_fraud=len(data[data.Class==1]) # number of rows with Class == 1
fraud_indices=np.array(data[data.Class==1].index) # index labels of the fraud rows

normal_indices=np.array(data[data.Class==0].index) # index labels of the rows with Class == 0

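random_normal_indices=np.random.choice(normal_indices,number_records_fraud,replace=False) # random sampling without replacement: each normal index is picked at most once
random_normal_indices=np.array(random_normal_indices) # np.random.choice already returns an ndarray, so this conversion is a no-op

under_sample_indices=np.concatenate([fraud_indices,random_normal_indices]) # combine the Class == 1 indices with the randomly chosen Class == 0 indices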

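under_sample_data = data.loc[under_sample_indices,:] # fetch the actual rows; the indices came from .index, so they are labels and need .loc (.iloc gives the same result only when data has the default 0..n-1 index)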

X_undersample=under_sample_data.loc[:,under_sample_data.columns!='Class']
y_undersample=under_sample_data.loc[:,under_sample_data.columns=='Class']
print(X_undersample)
print(y_undersample)

print("Percentage of normal transactions: ", len(under_sample_data[under_sample_data.Class == 0])/len(under_sample_data))
print("Percentage of fraud transactions: ", len(under_sample_data[under_sample_data.Class == 1])/len(under_sample_data))
print("Total number of transactions in resampled data: ", len(under_sample_data))
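The steps above can be collected into a single helper. This is a sketch rather than the author's code: the function name undersample, its parameters, and the toy data are hypothetical, and it uses np.random.default_rng with a seed so the draw is reproducible. The toy index starts at 1000 to show that selecting by label still works when labels and positions differ.

```python
import numpy as np
import pandas as pd


def undersample(data, label_col="Class", seed=0):
    """Keep all minority (label 1) rows plus an equal-sized random draw of majority (label 0) rows."""
    rng = np.random.default_rng(seed)
    fraud_idx = data[data[label_col] == 1].index.to_numpy()
    normal_idx = data[data[label_col] == 0].index.to_numpy()
    # Sample without replacement so no majority row is picked twice.
    picked = rng.choice(normal_idx, size=len(fraud_idx), replace=False)
    keep = np.concatenate([fraud_idx, picked])
    # The indices are labels, so select with .loc.
    return data.loc[keep]


# Hypothetical toy data: 3 fraud rows, 17 normal rows, non-default index.
toy = pd.DataFrame(
    {"Amount": np.arange(20, dtype=float), "Class": [1] * 3 + [0] * 17},
    index=np.arange(1000, 1020),
)
balanced = undersample(toy)
print(balanced["Class"].value_counts())
```

The result contains every fraud row and the same number of normal rows, so the two classes are balanced 50/50.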

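Idea: from the large (majority) class A, randomly draw a subset a whose size equals that of the small (minority) class B (A → a).

Then combine a and B, and split the result into train and test sets.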
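The split step is not shown in the original, so here is a minimal pandas-only sketch of one way to do it. The frame balanced is a hypothetical stand-in for under_sample_data, and the 30% test fraction and the random_state value are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for under_sample_data: 5 fraud rows and 5 sampled normal rows.
balanced = pd.DataFrame(
    {"Amount": np.arange(10, dtype=float), "Class": [1] * 5 + [0] * 5}
)

test_fraction = 0.3  # assumed value, not from the original post

# Draw the test rows at random, then use everything else for training.
test = balanced.sample(frac=test_fraction, random_state=0)
train = balanced.drop(test.index)

X_train, y_train = train.drop(columns="Class"), train["Class"]
X_test, y_test = test.drop(columns="Class"), test["Class"]
print(len(train), len(test))
```

In practice, scikit-learn's train_test_split (optionally with its stratify argument to keep the class ratio equal in both parts) is a common alternative to this hand-rolled split.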
