Feature Engineering

阿新 • Published: 2017-09-22


1. remove skew

Why:

Many models are built on the hypothesis that the input data follow a normal (Gaussian) distribution. So the closer the input data are to a normal distribution, the better the results tend to be.

Methods:

remove skewness: apply a log function to the skewed feature.
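
As a rough illustration of the log approach, the sketch below measures skewness before and after a log transform. It uses only the Python standard library. The `prices` data, the function names, and the choice of log(1 + x) (so that zeros are handled) are assumptions made for the example, not something the original notes specify.

```python
import math


def skewness(values):
    """Skewness as the third standardized moment (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


def log_transform(values):
    """Apply log(1 + x) to each value; assumes every value is >= 0."""
    return [math.log1p(v) for v in values]


# Hypothetical right-skewed feature: a few very large values in the tail.
prices = [1, 2, 2, 3, 3, 4, 5, 8, 20, 100]
transformed = log_transform(prices)

print("skew before:", skewness(prices))
print("skew after: ", skewness(transformed))
```

The transform compresses the long right tail, so the skewness should drop. It does not guarantee a normal distribution, and it only applies to non-negative values.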

2. standardization

Why:

Different features have different scales. Standardization avoids giving too much weight to features with a large scale.

Methods:

min-max = (data - min) / (max - min)
z-score = (data - mean) / sd, where sd is the standard deviation
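
The two formulas above can be sketched in plain Python as follows. The function names are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the notes do not say whether the sample or population form is meant. Constant columns are mapped to 0.0 to avoid division by zero, which is also a choice made for this example.

```python
import statistics


def min_max_scale(values):
    """Rescale values to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Rescale values to mean 0 and sd 1 using (x - mean) / sd."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population sd (an assumption)
    if sd == 0:  # constant column: avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


# Hypothetical feature values.
ages = [2, 4, 4, 4, 5, 5, 7, 9]
scaled = min_max_scale(ages)
standardized = z_score(ages)
print(scaled)
print(standardized)
```

Min-max keeps the result inside a fixed range but is sensitive to extreme values. The z-score has no fixed range but is less dominated by a single minimum or maximum.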

3. manual removal

Why:

Sometimes we know that some columns are meaningless, so we just remove them manually.

Method:

columns like "ID", "timestamp"
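
A minimal sketch of manual removal, assuming the data set is held as a list of dictionaries (one per row). The `drop_columns` helper and the sample rows are hypothetical.

```python
def drop_columns(rows, columns_to_drop):
    """Return new rows without the named columns; missing names are ignored."""
    drop = set(columns_to_drop)
    return [{k: v for k, v in row.items() if k not in drop} for row in rows]


# Hypothetical data set: "ID" and "timestamp" carry no predictive signal here.
rows = [
    {"ID": 1, "timestamp": "2017-09-01", "price": 10.0, "area": 50},
    {"ID": 2, "timestamp": "2017-09-02", "price": 12.5, "area": 65},
]
cleaned = drop_columns(rows, ["ID", "timestamp"])
print(cleaned)
```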

4. remove columns with too many nulls

Why:

If a feature has too many nulls, it's not reliable.

Method:

Count the percentage of nulls in each column and remove the columns where it is too high.
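
One way to sketch this, again assuming rows stored as dictionaries with `None` marking a null. The 0.5 threshold and both function names are assumptions for illustration; the notes do not give a cutoff.

```python
def null_fraction(rows):
    """Fraction of None values per column; assumes all rows share the same keys."""
    n = len(rows)
    return {
        col: sum(1 for row in rows if row.get(col) is None) / n
        for col in rows[0]
    }


def drop_sparse_columns(rows, threshold=0.5):
    """Keep only columns whose null fraction is at most `threshold`."""
    fractions = null_fraction(rows)
    keep = [col for col, frac in fractions.items() if frac <= threshold]
    return [{col: row.get(col) for col in keep} for row in rows]


# Hypothetical data: "age" is null in 3 of 4 rows, "income" in 1 of 4.
rows = [
    {"age": None, "income": 100},
    {"age": None, "income": None},
    {"age": 30, "income": 120},
    {"age": None, "income": 90},
]
fractions = null_fraction(rows)
filtered = drop_sparse_columns(rows, threshold=0.5)
print(fractions)
print(filtered)
```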

5. drop outliers

Why:

Outliers are special cases in a set of data. They don't represent the common experience, so they will not contribute to a model. On the contrary, they can harm it.

Methods:

Remove data that are >= an upper extreme value or <= a lower extreme value.
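
The rule above could be sketched as a simple filter. The `drop_outliers` name, the sample readings, and the cutoffs 0 and 100 are hypothetical. In practice the extreme values have to be chosen per feature, for example from domain knowledge or from percentiles of the data.

```python
def drop_outliers(values, lower, upper):
    """Keep values strictly between the cutoffs.

    Anything <= lower or >= upper is treated as an outlier and removed.
    """
    return [v for v in values if lower < v < upper]


# Hypothetical feature with two extreme readings (250 and -90).
readings = [3, 5, 4, 6, 5, 250, 4, -90, 5]
kept = drop_outliers(readings, lower=0, upper=100)
print(kept)
```

When working with whole rows rather than a single column, the same test would be applied to the row's value for that feature and the entire row dropped.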

6. to be continued
