Generative vs. Discriminative Models: A Comparison

Reference: On Discriminative vs. Generative classifiers: A comparison of logistic regression and naive Bayes

Generative model: models the joint distribution p(x,y) = p(x|y) * p(y), then predicts with Bayes' rule: p(y|x) = p(x,y) / p(x). Representative model: Naive Bayes

Discriminative model: models p(y|x) directly. Representative model: Logistic Regression
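To make the two definitions concrete, here is a minimal sketch in Python. The probability tables and the weights `w`, `b` are hypothetical numbers chosen only for illustration; they do not come from the reference. The generative route builds p(x,y) and normalizes by p(x), while the discriminative route evaluates a logistic function for p(y=1|x) without ever modeling x.

```python
import math

# Hypothetical toy numbers, chosen only for illustration.
prior = {0: 0.6, 1: 0.4}                       # p(y)
likelihood = {0: {"a": 0.9, "b": 0.1},         # p(x | y)
              1: {"a": 0.25, "b": 0.75}}

def generative_posterior(x):
    """p(y | x) obtained from the joint via Bayes' rule."""
    joint = {y: likelihood[y][x] * prior[y] for y in prior}   # p(x, y)
    evidence = sum(joint.values())                            # p(x)
    return {y: joint[y] / evidence for y in joint}

def discriminative_posterior(x_value, w, b):
    """Logistic regression form: p(y=1 | x) modeled directly."""
    return 1.0 / (1.0 + math.exp(-(w * x_value + b)))

post = generative_posterior("b")
print(post)
print(discriminative_posterior(0.0, w=2.0, b=0.0))
```

The generative function needs both tables, which is what lets it also answer questions about x itself; the discriminative function only needs the decision parameters.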

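Conclusions from the reference:

The discriminative model has a lower theoretical asymptotic error [the generative model does indeed have a higher asymptotic error - as the number of training examples becomes large - than the discriminative model],

The generative model, in theory, approaches its asymptotic error faster (provided the samples satisfy conditional independence and follow a specific distribution, e.g. Gaussian) [but the generative model may also approach its asymptotic error much faster than the discriminative model - possibly with a number of training examples that is only logarithmic, rather than linear, in the number of parameters]

In practice, samples rarely satisfy these conditions strictly, so the discriminative model often performs better.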
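One way to explore these claims is a small simulation. The sketch below is an assumption-laden toy, not the experiment from the reference: the data generator, the feature count, the training sizes, and the plain gradient-descent settings are all hypothetical choices. It draws data that does satisfy the naive Bayes assumptions (independent Gaussian features given the class), fits both models at several training sizes, and prints test error. Results depend on the seed and settings, so a single run should be read as an illustration only.

```python
import math
import random

def sample(n, rng, d=5, shift=0.5):
    """Balanced labels; given y, features are independent unit-variance Gaussians."""
    data = []
    for i in range(n):
        y = i % 2
        mean = shift if y == 1 else -shift
        data.append(([rng.gauss(mean, 1.0) for _ in range(d)], y))
    return data

def fit_gnb(data):
    """Gaussian naive Bayes: log prior plus per-feature mean and variance per class."""
    params = {}
    for c in (0, 1):
        rows = [x for x, y in data if y == c]
        d = len(rows[0])
        mu = [sum(r[j] for r in rows) / len(rows) for j in range(d)]
        # Variance floor avoids division by zero on tiny training sets.
        var = [max(sum((r[j] - mu[j]) ** 2 for r in rows) / len(rows), 1e-2)
               for j in range(d)]
        params[c] = (math.log(len(rows) / len(data)), mu, var)
    return params

def predict_gnb(params, x):
    def score(c):
        log_prior, mu, var = params[c]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, mu, var))
    return 1 if score(1) > score(0) else 0

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logreg(data, lr=0.1, epochs=200):
    """Logistic regression fitted by batch gradient descent on the log loss."""
    d = len(data[0][0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for x, y in data:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            gb += err
            for j in range(d):
                gw[j] += err * x[j]
        w = [wi - lr * g / len(data) for wi, g in zip(w, gw)]
        b -= lr * gb / len(data)
    return w, b

def predict_logreg(model, x):
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def error_rate(predict, model, data):
    return sum(predict(model, x) != y for x, y in data) / len(data)

rng = random.Random(0)
test_set = sample(2000, rng)
for n in (10, 50, 500):
    train = sample(n, rng)
    e_nb = error_rate(predict_gnb, fit_gnb(train), test_set)
    e_lr = error_rate(predict_logreg, fit_logreg(train), test_set)
    print(f"n={n:4d}  naive Bayes error={e_nb:.3f}  logistic regression error={e_lr:.3f}")
```

To probe the practical caveat, one could change `sample` so that features are correlated within a class, which violates the conditional-independence assumption, and compare the curves again.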

Views from other sources:

- Easy to fit?

G: easy; fitting reduces to simple counting and averaging (NB, LDA)

D: much slower; requires solving a convex optimization problem (logistic regression)

- Fit classes separately?

G: no need to retrain when more classes are added

D: must be retrained (all parameters interact)

- Handle missing features easily?

G: simple; marginalize them out (NB)

D: no principled solution; the model assumes that x is always given

- Can handle feature preprocessing?

G: hard to define model on preprocessed data

D: allows preprocessing of the input, e.g. replacing x with kernel(x)

- Can handle unlabeled training data (like semi-supervised learning)?

G: easy

D: much harder
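Three of the generative-side points in the list (fitting by counting and averaging, adding a class without retraining, and marginalizing out missing features) can be sketched together with a Gaussian naive Bayes model. This is a minimal illustration under assumptions: the toy data, the class labels, the `None` convention for a missing feature, and the variance floor are all hypothetical choices.

```python
import math

def fit_class(rows):
    """Fit one class by averaging: per-feature mean and variance."""
    n, d = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(d)]
    var = [max(sum((r[j] - mu[j]) ** 2 for r in rows) / n, 1e-3) for j in range(d)]
    return {"count": n, "mu": mu, "var": var}

def log_joint(model, x):
    """log p(x_observed, y) for each class; None marks a missing feature."""
    total = sum(m["count"] for m in model.values())
    scores = {}
    for label, m in model.items():
        s = math.log(m["count"] / total)   # log p(y) from class counts
        for xi, mu, var in zip(x, m["mu"], m["var"]):
            if xi is None:
                continue  # marginalized out: the feature's density integrates to 1
            s += -0.5 * math.log(2 * math.pi * var) - (xi - mu) ** 2 / (2 * var)
        scores[label] = s
    return scores

def predict(model, x):
    scores = log_joint(model, x)
    return max(scores, key=scores.get)

# Hypothetical toy data: each class is fitted independently of the others.
model = {
    "a": fit_class([[0.0, 0.0], [0.2, -0.1], [-0.1, 0.3]]),
    "b": fit_class([[3.0, 3.0], [2.8, 3.3], [3.1, 2.9]]),
}
print(predict(model, [2.9, None]))    # second feature is missing

# Adding a class only fits the new class; "a" and "b" parameters are untouched.
model["c"] = fit_class([[-3.0, 3.0], [-2.7, 3.2], [-3.2, 2.8]])
print(predict(model, [-2.9, None]))
```

Skipping a missing feature is exact here only because naive Bayes factorizes over features; a logistic regression model has no analogous step, since it never defines a distribution over x.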
