
Model Prediction Accuracy Versus Interpretation in Machine Learning

In their book Applied Predictive Modeling, Kuhn and Johnson comment early on the trade-off of model prediction accuracy versus model interpretation.

For a given problem, it is critical to have a clear idea of which is the priority, accuracy or explainability, so that this trade-off can be made explicitly rather than implicitly.

In this post you will discover and consider this important trade-off.


Model Accuracy vs Explainability
Photo by Donald Hobern, some rights reserved

Accuracy and Explainability

Model performance is estimated in terms of its accuracy in predicting the occurrence of an event on unseen data. A more accurate model is seen as a more valuable model.

Model interpretability provides insight into the relationship between the inputs and the output. An interpretable model can answer questions as to why the independent features predict the dependent attribute.

The issue arises because as model accuracy increases so does model complexity, at the cost of interpretability.

Model Complexity

A model with higher accuracy can mean more opportunities, benefits, time or money to a company. As such, prediction accuracy is optimized.

The optimization of accuracy leads to further increases in the complexity of models in the form of additional model parameters (and resources required to tune those parameters).

Unfortunately, the predictive models that are most powerful are usually the least interpretable.

A model with fewer parameters is easier to interpret. This is intuitive. A linear regression model has a coefficient per input feature and an intercept term. You can look at each term and understand how it contributes to the output. Moving to logistic regression gives more power in terms of the underlying relationships that can be modeled, at the expense of a function transform applied to the output, which must now be understood along with the coefficients.
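As a rough illustration of the point above, the sketch below contrasts how a coefficient is read in a linear model with how it is read once a logistic transform is applied. The coefficient values and the feature names `x1` and `x2` are hypothetical, chosen only for illustration and not fitted to any data.

```python
import math

# Hypothetical coefficients, chosen only for illustration (not fitted to data).
INTERCEPT = 0.5
COEFFICIENTS = {"x1": 2.0, "x2": -1.0}


def linear_predict(features):
    """Linear regression: the intercept plus a weighted sum of the inputs."""
    return INTERCEPT + sum(COEFFICIENTS[name] * value for name, value in features.items())


def logistic_predict(features):
    """Logistic regression: the same weighted sum, passed through a sigmoid."""
    return 1.0 / (1.0 + math.exp(-linear_predict(features)))


def odds(probability):
    """Convert a probability into odds."""
    return probability / (1.0 - probability)


base = {"x1": 1.0, "x2": 1.0}
bumped = {"x1": 2.0, "x2": 1.0}  # x1 increased by one unit

# Linear: a one-unit change in x1 moves the output by exactly its coefficient.
linear_change = linear_predict(bumped) - linear_predict(base)

# Logistic: the change in probability depends on the starting point,
# but the odds are multiplied by exp(coefficient).
probability_change = logistic_predict(bumped) - logistic_predict(base)
odds_ratio = odds(logistic_predict(bumped)) / odds(logistic_predict(base))

print(linear_change, probability_change, odds_ratio)
```

In the linear case the coefficient is the answer. In the logistic case the same coefficient has to be read as a multiplicative effect on the odds, which is the extra step of interpretation the transform introduces.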

A decision tree (of modest size) may be understandable, but a bagged decision tree requires a different perspective to interpret why an event is predicted to occur. Pushing further, the optimized blend of multiple models into a single prediction may be beyond meaningful or timely interpretation.
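A minimal sketch of why the ensemble is harder to explain follows. The three trees and the features `age` and `income` are hypothetical and hand-written. In real bagging each tree would be fit on a bootstrap sample of the training data, so this shows only the voting step.

```python
from collections import Counter


# Hypothetical hand-written trees over made-up features ("age", "income").
def tree_a(row):
    if row["age"] < 30:
        return "no"
    return "yes" if row["income"] > 50 else "no"


def tree_b(row):
    return "yes" if row["income"] > 40 else "no"


def tree_c(row):
    if row["age"] < 25:
        return "no"
    return "yes"


ENSEMBLE = [tree_a, tree_b, tree_c]


def bagged_predict(row):
    """Majority vote over the trees; returns the label and the vote counts."""
    votes = Counter(tree(row) for tree in ENSEMBLE)
    return votes.most_common(1)[0][0], dict(votes)


row = {"age": 28, "income": 45}

# A single tree is explained by one readable path: age < 30, so "no".
single = tree_a(row)

# The ensemble is explained only by a tally over several different paths.
label, votes = bagged_predict(row)

print(single, label, votes)
```

With three small trees the tally is still readable. With hundreds of deeper trees, no single path explains the prediction, which is where the different perspective becomes necessary.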

Accuracy Trumps Explainability

In their book, Kuhn and Johnson are concerned with model accuracy at the expense of interpretation.

They comment:

As long as complex models are properly validated, it may be improper to use a model that is built for interpretation rather than predictive performance.

Interpretation is secondary to model accuracy, and they cite discriminating email into spam and non-spam and the valuation of a house as examples of problems where this is the case. Medical examples are touched on twice, and in both cases are used to defend the absolute need and desirability for accuracy over explainability, as long as the models are appropriately validated.

I’m sure that “but I validated my model” would be no defense at an inquest when a model makes predictions that result in loss of life. Nevertheless, there is no doubt that this is an important issue that requires careful consideration.

Summary

Whenever you are modeling a problem, you are making a decision on the trade-off between model accuracy and model interpretation.

You can use knowledge of this trade-off when selecting the methods you use to model your problem, and be clear about your objectives when presenting results.
