Machine Learning Algorithm Recipes in scikit-learn

阿新 • • Published: 2019-01-12

You have to get your hands dirty.

You can read all of the blog posts and watch all the videos in the world, but you won’t really start to get machine learning until you start practicing.

The scikit-learn Python library is very easy to get up and running. Nevertheless, I see a lot of hesitation from beginners looking to get started. In this blog post I want to give a few very simple examples of using scikit-learn for some supervised classification algorithms.

Scikit-Learn Recipes

You don’t need to know about and use all of the algorithms in scikit-learn, at least initially. Pick one or two (or a handful) and practice with only those.

In this post you will see 5 recipes of supervised classification algorithms applied to small standard datasets that are provided with the scikit-learn library.

The recipes are principled. Each example is:

Standalone: Each code example is a self-contained, complete and executable recipe.
Just Code: The focus of each recipe is on the code with minimal exposition on machine learning theory.
Simple: Recipes present the common use case, which is probably what you are looking to do.
Consistent: All code examples are presented consistently and follow the same code pattern and style conventions.

The recipes do not explore the parameters of a given algorithm. They provide a skeleton that you can copy and paste into your file, project or Python REPL and start to play with immediately.
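
As a rough sketch of that shared skeleton, the fit, predict and summarize steps can be wrapped in one small helper. The name run_recipe is hypothetical and not part of scikit-learn. Like the recipes, it scores the model on the same data it was fit on, so the numbers are optimistic.

```python
# Shared recipe pattern (run_recipe is a hypothetical helper, not a scikit-learn API)
from sklearn import datasets
from sklearn import metrics
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def run_recipe(model, dataset):
    """Fit the model, predict on the same data, and return the confusion matrix."""
    model.fit(dataset.data, dataset.target)
    expected = dataset.target
    predicted = model.predict(dataset.data)
    return metrics.confusion_matrix(expected, predicted)


# load the iris dataset once and reuse it for every model
dataset = datasets.load_iris()
results = {}
for model in (GaussianNB(), KNeighborsClassifier(), DecisionTreeClassifier()):
    name = type(model).__name__
    results[name] = run_recipe(model, dataset)
    print(name)
    print(results[name])
```

Any other scikit-learn classifier can be dropped into the tuple in the same way.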

These recipes show you that you can get started practicing with scikit-learn right now. Stop putting it off.


Logistic Regression

Logistic regression fits a logistic model to data and makes predictions about the probability of an event (between 0 and 1).

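This recipe shows the fitting of a logistic regression model to the iris dataset. Because this is a multi-class classification problem and a single logistic model predicts one probability between 0 and 1, the model has to be extended to cover three classes. Older scikit-learn releases use a one-vs-rest scheme by default (one model per class), while recent releases default to a multinomial formulation.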

Logistic Regression Python

# Logistic Regression
from sklearn import datasets
from sklearn import metrics
from sklearn.linear_model import LogisticRegression
# load the iris dataset
dataset = datasets.load_iris()
# fit a logistic regression model to the data
model = LogisticRegression()
model.fit(dataset.data, dataset.target)
print(model)
# make predictions
expected = dataset.target
predicted = model.predict(dataset.data)
# summarize the fit of the model
print(metrics.classification_report(expected, predicted))
print(metrics.confusion_matrix(expected, predicted))
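
Since the model predicts probabilities, it can help to look at them directly rather than only at the predicted labels. The sketch below is an addition to the recipe: it calls predict_proba and checks by hand which class has the highest probability. Setting max_iter=1000 is an assumption made here to avoid possible convergence warnings and is not part of the recipe above.

```python
# Class probabilities from logistic regression
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

# load the iris dataset
dataset = datasets.load_iris()
# max_iter is raised above the default as a precaution (assumption, not in the recipe)
model = LogisticRegression(max_iter=1000)
model.fit(dataset.data, dataset.target)
# one row per flower, one column per class; each value lies between 0 and 1
probabilities = model.predict_proba(dataset.data)
print(probabilities.shape)
print(probabilities[:3].round(3))
# pick the class with the highest probability for each row
predicted = model.classes_[probabilities.argmax(axis=1)]
print(predicted[:10])
```

Each row of probabilities sums to 1, and the class picked by hand should match what model.predict returns.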


Naive Bayes

Naive Bayes uses Bayes’ theorem to model the conditional relationship of each attribute to the class variable.

This recipe shows the fitting of a Gaussian Naive Bayes model to the iris dataset.

Gaussian Naive Bayes Python

# Gaussian Naive Bayes
from sklearn import datasets
from sklearn import metrics
from sklearn.naive_bayes import GaussianNB
# load the iris dataset
dataset = datasets.load_iris()
# fit a Naive Bayes model to the data
model = GaussianNB()
model.fit(dataset.data, dataset.target)
print(model)
# make predictions
expected = dataset.target
predicted = model.predict(dataset.data)
# summarize the fit of the model
print(metrics.classification_report(expected, predicted))
print(metrics.confusion_matrix(expected, predicted))
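
To see what the fitted model has learned about each attribute per class, you can inspect its attributes. This is a small sketch added for illustration: class_prior_ holds the estimated class probabilities and theta_ holds the per-class mean of each attribute.

```python
# Inspect a fitted Gaussian Naive Bayes model
from sklearn import datasets
from sklearn.naive_bayes import GaussianNB

# load the iris dataset
dataset = datasets.load_iris()
model = GaussianNB()
model.fit(dataset.data, dataset.target)
# prior probability of each class, estimated from the class frequencies
print(model.class_prior_)
# per-class mean of each attribute: one row per class, one column per attribute
for name, means in zip(dataset.target_names, model.theta_):
    print(name, means.round(2))
```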


k-Nearest Neighbor

The k-Nearest Neighbor (kNN) method makes predictions by locating the cases most similar to a given data instance (using a similarity function) and returning the average (for regression) or the majority class (for classification) of those instances. The kNN algorithm can be used for classification or regression.

This recipe shows the use of the kNN model to make predictions for the iris dataset.

k-Nearest Neighbor Python

# k-Nearest Neighbor
from sklearn import datasets
from sklearn import metrics
from sklearn.neighbors import KNeighborsClassifier
# load the iris dataset
dataset = datasets.load_iris()
# fit a k-nearest neighbor model to the data
model = KNeighborsClassifier()
model.fit(dataset.data, dataset.target)
print(model)
# make predictions
expected = dataset.target
predicted = model.predict(dataset.data)
# summarize the fit of the model
print(metrics.classification_report(expected, predicted))
print(metrics.confusion_matrix(expected, predicted))
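
The neighbor lookup and the majority vote can also be done by hand. The sketch below is an illustration rather than the library's exact internals: it uses the kneighbors method to fetch the most similar training instances and then votes with NumPy. The variable names are chosen here for readability.

```python
# Look up neighbors and take a majority vote by hand
import numpy as np
from sklearn import datasets
from sklearn.neighbors import KNeighborsClassifier

# load the iris dataset
dataset = datasets.load_iris()
model = KNeighborsClassifier()  # the default is n_neighbors=5
model.fit(dataset.data, dataset.target)
# distances to, and row indices of, the 5 most similar training instances
distances, indices = model.kneighbors(dataset.data)
neighbor_labels = dataset.target[indices]
# majority vote over the neighbor labels of each instance
votes = np.array([np.bincount(row, minlength=3).argmax() for row in neighbor_labels])
# fraction of instances where the hand-made vote matches model.predict
agreement = (votes == model.predict(dataset.data)).mean()
print(indices[0], neighbor_labels[0], votes[0])
print(agreement)
```

Because the query points are the training data itself, the nearest neighbor of each instance is the instance itself (or an exact duplicate).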


Classification and Regression Trees

Classification and Regression Trees (CART) are constructed from a dataset by making splits that best separate the data for the classes or predictions being made. The CART algorithm can be used for classification or regression.

This recipe shows use of the CART model to make predictions for the iris dataset.

Decision Tree Classifier Python

# Decision Tree Classifier
from sklearn import datasets
from sklearn import metrics
from sklearn.tree import DecisionTreeClassifier
# load the iris dataset
dataset = datasets.load_iris()
# fit a CART model to the data
model = DecisionTreeClassifier()
model.fit(dataset.data, dataset.target)
print(model)
# make predictions
expected = dataset.target
predicted = model.predict(dataset.data)
# summarize the fit of the model
print(metrics.classification_report(expected, predicted))
print(metrics.confusion_matrix(expected, predicted))
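
To look at the splits the tree has chosen, you can print it as text. This sketch is an addition to the recipe and uses export_text from sklearn.tree. The settings max_depth=2 and random_state=0 are assumptions made here to keep the output short and repeatable.

```python
# Print the splits of a small decision tree
from sklearn import datasets
from sklearn.tree import DecisionTreeClassifier, export_text

# load the iris dataset
dataset = datasets.load_iris()
# max_depth=2 keeps the printed tree short; random_state makes the fit repeatable
model = DecisionTreeClassifier(max_depth=2, random_state=0)
model.fit(dataset.data, dataset.target)
# each indented line is a split on one attribute; leaves show the predicted class
rules = export_text(model, feature_names=list(dataset.feature_names))
print(rules)
```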


For details on configuring the algorithm parameters, see the API reference for CART. Also see the Decision Tree section of the user guide.

Support Vector Machines

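Support Vector Machines (SVM) find the boundary in a transformed problem space that best separates the classes into two groups, using the points closest to that boundary (the support vectors). Classification for multiple classes is supported by combining binary classifiers, for example with a one-vs-rest or one-vs-one scheme; scikit-learn’s SVC uses one-vs-one internally. SVM also supports regression by modeling the function with a minimum amount of allowable error.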

This recipe shows use of the SVM model to make predictions for the iris dataset.

Support Vector Machine Python

# Support Vector Machine
from sklearn import datasets
from sklearn import metrics
from sklearn.svm import SVC
# load the iris dataset
dataset = datasets.load_iris()
# fit an SVM model to the data
model = SVC()
model.fit(dataset.data, dataset.target)
print(model)
# make predictions
expected = dataset.target
predicted = model.predict(dataset.data)
# summarize the fit of the model
print(metrics.classification_report(expected, predicted))
print(metrics.confusion_matrix(expected, predicted))
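
The points that define the fitted boundaries are available on the model after fitting. The sketch below is added for illustration and prints how many support vectors were kept per class and where they sit in the training data.

```python
# Inspect the support vectors of a fitted SVM
from sklearn import datasets
from sklearn.svm import SVC

# load the iris dataset
dataset = datasets.load_iris()
model = SVC()
model.fit(dataset.data, dataset.target)
# number of support vectors kept for each class
print(model.n_support_)
# the support vectors themselves: a subset of the training rows
print(model.support_vectors_.shape)
# row indices of the first few support vectors in the training data
print(model.support_[:10])
```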


For details on configuring the algorithm parameters, see the API reference for SVM. Also see the SVM section of the user guide.

Summary

In this post you have seen 5 self-contained recipes demonstrating some of the most popular and powerful supervised classification algorithms.

Each example is fewer than 20 lines that you can copy, paste and run to start using scikit-learn right now. Stop reading and start practicing. Pick one recipe and run it, then start to play with the parameters and see what effect that has on the results.
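
As one possible way to play with a parameter, the sketch below varies n_neighbors for the kNN recipe. Unlike the recipes above, it holds out part of the data with train_test_split so that the effect of the parameter is measured on instances the model has not seen. The split size, the random_state and the values of k are arbitrary choices made for this illustration.

```python
# Try a few values of one parameter and compare held-out accuracy
from sklearn import datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# load the iris dataset
dataset = datasets.load_iris()
# hold out 30% of the rows for evaluation (an arbitrary choice for this sketch)
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.3, random_state=0,
    stratify=dataset.target)
scores = {}
for k in (1, 5, 15):
    model = KNeighborsClassifier(n_neighbors=k)
    model.fit(X_train, y_train)
    # accuracy on the held-out rows for this value of k
    scores[k] = accuracy_score(y_test, model.predict(X_test))
    print(k, scores[k])
```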
