
How to Develop a Reusable Framework to Spot-Check Algorithms in Python

Spot-checking algorithms is a technique in applied machine learning designed to quickly and objectively provide a first set of results on a new predictive modeling problem.

Unlike grid searching and other types of algorithm tuning that seek the optimal algorithm or optimal configuration for an algorithm, spot-checking is intended to evaluate a diverse set of algorithms rapidly and provide a rough first-cut result. This first-cut result may be used to get an idea of whether a problem or problem representation is indeed predictable and, if so, which types of algorithms may be worth investigating further for the problem.

Spot-checking is an approach to help overcome the “hard problem” of applied machine learning and encourage you to clearly think about the higher-order search problem being performed in any machine learning project.

In this tutorial, you will discover the usefulness of spot-checking algorithms on a new predictive modeling problem and how to develop a standard framework for spot-checking algorithms in Python for classification and regression problems.

After completing this tutorial, you will know:

  • Spot-checking provides a way to quickly discover the types of algorithms that perform well on your predictive modeling problem.
  • How to develop a generic framework for loading data, defining models, evaluating models, and summarizing results.
  • How to apply the framework for classification and regression problems.

Let’s get started.

How to Develop a Reusable Framework for Spot-Check Algorithms in Python
Photo by Jeff Turner, some rights reserved.

Tutorial Overview

This tutorial is divided into five parts; they are:

  1. Spot-Check Algorithms
  2. Spot-Checking Framework in Python
  3. Spot-Checking for Classification
  4. Spot-Checking for Regression
  5. Framework Extension

1. Spot-Check Algorithms

We cannot know beforehand what algorithms will perform well on a given predictive modeling problem.

This is the hard part of applied machine learning that can only be resolved via systematic experimentation.

Spot-checking is an approach to this problem.

It involves rapidly testing a large suite of diverse machine learning algorithms on a problem in order to quickly discover what algorithms might work and where to focus attention.

  • It is fast; it by-passes the days or weeks of preparation and analysis and playing with algorithms that may not ever lead to a result.
  • It is objective, allowing you to discover what might work well for a problem rather than going with what you used last time.
  • It gets results; you will actually fit models, make predictions and know if your problem can be predicted and what baseline skill may look like.

Spot-checking may require that you work with a small sample of your dataset in order to turn around results quickly.
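
For example, if the full dataset is too large to evaluate many models quickly, you might draw a random subset of rows first. The snippet below is a minimal sketch of this idea, assuming X and y are NumPy arrays; the function name sample_dataset() and the sample_size value are illustrative choices, not part of the framework described in this tutorial.

from numpy.random import RandomState

# draw a random sample of rows for faster spot-checking (illustrative only)
def sample_dataset(X, y, sample_size=1000, seed=1):
    rng = RandomState(seed)
    ix = rng.choice(len(X), size=min(sample_size, len(X)), replace=False)
    return X[ix], y[ix]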

Finally, the results from spot checking are a jumping-off point. A starting point. They suggest where to focus attention on the problem, not what the best algorithm might be. The process is designed to shake you out of typical thinking and analysis and instead focus on results.

You can learn more about spot-checking in the post:

Now that we know what spot-checking is, let’s look at how we can systematically perform spot-checking in Python.

2. Spot-Checking Framework in Python

In this section we will build a framework for a script that can be used for spot-checking machine learning algorithms on a classification or regression problem.

There are four parts to the framework that we need to develop; they are:

  • Load Dataset
  • Define Models
  • Evaluate Models
  • Summarize Results

Let’s take a look at each in turn.

Load Dataset

The first step of the framework is to load the data.

The function must be implemented and specialized for the given problem. It will likely involve loading data from one or more CSV files.

We will call this function load_dataset(); it will take no arguments and return the inputs (X) and outputs (y) for the prediction problem.

# load the dataset, returns X and y elements
def load_dataset():
    X, y = None, None
    return X, y

Define Models

The next step is to define the models to evaluate on the predictive modeling problem.

The models defined will be specific to the type of predictive modeling problem, e.g. classification or regression.

The defined models should be diverse, including a mixture of:

  • Linear Models.
  • Nonlinear Models.
  • Ensemble Models.

Each model should be given a good chance to perform well on the problem. This might mean providing a few variations of the model with different common or well-known configurations that perform well on average.

We will call this function define_models(). It will return a dictionary of model names mapped to scikit-learn model objects. The name should be short, like ‘svm‘, and may include a configuration detail, e.g. ‘knn-7’.

The function will also take a dictionary as an optional argument; if not provided, a new dictionary is created and populated. If a dictionary is provided, models are added to it.

This is to add flexibility if you would like to have multiple functions for defining models, or add a large number of models of a specific type with different configurations.

# create a dict of standard models to evaluate {name:object}
def define_models(models=dict()):
    # ...
    return models

The idea is not to grid search model parameters; that can come later.

Instead, each model should be given an opportunity to perform well (i.e. not optimally). This might mean trying many combinations of parameters in some cases, e.g. in the case of gradient boosting.
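
As an illustration, a separate helper could add several gradient boosting configurations to the shared dictionary. This is a sketch only; the function name define_gbm_models() and the specific parameter values are hypothetical examples rather than part of the framework itself.

from sklearn.ensemble import GradientBoostingClassifier

# add several gradient boosting variations to the dict (illustrative configurations)
def define_gbm_models(models=dict()):
    for n in [50, 100, 200]:
        for rate in [0.01, 0.1]:
            name = 'gbm-%d-%.2f' % (n, rate)
            models[name] = GradientBoostingClassifier(n_estimators=n, learning_rate=rate)
    return models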

Evaluate Models

The next step is the evaluation of the defined models on the loaded dataset.

The scikit-learn library provides the ability to pipeline models during evaluation. This allows the data to be transformed prior to being used to fit a model, and this is done in a correct way such that the transforms are prepared on the training data and applied to the test data.

We can define a function that prepares a given model prior to evaluation, allowing specific transforms to be used during the spot-checking process. These transforms will be applied in a blanket way to all models. This can be useful for operations such as standardization, normalization, and feature selection.

We will define a function named make_pipeline() that takes a defined model and returns a pipeline. Below is an example of preparing a pipeline that will first standardize the input data, then normalize it prior to fitting the model.

# create a feature preparation pipeline for a model
def make_pipeline(model):
    steps = list()
    # standardization
    steps.append(('standardize', StandardScaler()))
    # normalization
    steps.append(('normalize', MinMaxScaler()))
    # the model
    steps.append(('model', model))
    # create pipeline
    pipeline = Pipeline(steps=steps)
    return pipeline

This function can be expanded to add other transforms, or simplified to return the provided model with no transforms.
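
For instance, a minimal variation that applies no transforms at all might look like the following sketch, reusing the scikit-learn Pipeline class from the block above.

# a simplified variation: no data transforms, just wrap the model itself
def make_pipeline(model):
    return Pipeline(steps=[('model', model)])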

Now we need to evaluate a prepared model.

We will evaluate models using the standard approach of k-fold cross-validation. The evaluation of each defined model will result in a list of scores, because k different versions of the model will have been fit and evaluated, one per fold.

We will define a function named evaluate_model() that will take the data, a defined model, a number of folds, and a performance metric used to evaluate the results. It will return the list of scores.

The function calls make_pipeline() for the defined model to prepare any data transforms required, then calls the cross_val_score() scikit-learn function. Importantly, the n_jobs argument is set to -1 to allow the model evaluations to occur in parallel, harnessing as many cores as you have available on your hardware.

# evaluate a single model
def evaluate_model(X, y, model, folds, metric):
    # create the pipeline
    pipeline = make_pipeline(model)
    # evaluate model
    scores = cross_val_score(pipeline, X, y, scoring=metric, cv=folds, n_jobs=-1)
    return scores

It is possible for the evaluation of a model to fail with an exception. I have seen this especially in the case of some models from the statsmodels library.

It is also possible for the evaluation of a model to result in a lot of warning messages. I have seen this especially in the case of using XGBoost models.

We do not care about exceptions or warnings when spot checking. We only want to know what does work and what works well. Therefore, we can trap exceptions and ignore all warnings when evaluating each model.

The function named robust_evaluate_model() implements this behavior. It calls evaluate_model() in a way that traps exceptions and ignores warnings. If an exception occurs and no result is possible for a given model, a None result is returned.

# evaluate a model and try to trap errors and hide warnings
def robust_evaluate_model(X, y, model, folds, metric):
    scores = None
    try:
        with warnings.catch_warnings():
            warnings.filterwarnings("ignore")
            scores = evaluate_model(X, y, model, folds, metric)
    except:
        scores = None
    return scores

Finally, we can define the top-level function for evaluating the list of defined models.

We will define a function named evaluate_models() that takes the dictionary of models as an argument and returns a dictionary of model names to lists of results.

The number of folds in the cross-validation process can be specified by an optional argument that defaults to 10. The metric calculated on the predictions from the model can also be specified by an optional argument and defaults to classification accuracy.

For a full list of supported metrics, see this list:

Any None results are skipped and not added to the dictionary of results.

Importantly, we provide some verbose output, summarizing the mean and standard deviation of each model after it was evaluated. This is helpful if the spot checking process on your dataset takes minutes to hours.

# evaluate a dict of models {name:object}, returns {name:score}
def evaluate_models(X, y, models, folds=10, metric='accuracy'):
    results = dict()
    for name, model in models.items():
        # evaluate the model
        scores = robust_evaluate_model(X, y, model, folds, metric)
        # show process
        if scores is not None:
            # store a result
            results[name] = scores
            mean_score, std_score = mean(scores), std(scores)
            print('>%s: %.3f (+/-%.3f)' % (name, mean_score, std_score))
        else:
            print('>%s: error' % name)
    return results

Note that if for some reason you want to see warnings and errors, you can update the evaluate_models() to call the evaluate_model() function directly, by-passing the robust error handling. I find this useful when testing out new methods or method configurations that fail silently.
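
One way to do this, sketched below, is a small variation of evaluate_models() that calls evaluate_model() directly. The function name evaluate_models_verbose() is my own illustrative choice, not part of the framework above; it assumes the functions and imports already defined in this tutorial.

# variation of evaluate_models() without error trapping, so warnings and exceptions surface
def evaluate_models_verbose(X, y, models, folds=10, metric='accuracy'):
    results = dict()
    for name, model in models.items():
        # call evaluate_model() directly, by-passing robust_evaluate_model()
        scores = evaluate_model(X, y, model, folds, metric)
        results[name] = scores
        print('>%s: %.3f (+/-%.3f)' % (name, mean(scores), std(scores)))
    return results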

Summarize Results

Finally, we can evaluate the results.

Really, we only want to know what algorithms performed well.

Two useful ways to summarize the results are:

  1. Line summaries of the mean and standard deviation of the top 10 performing algorithms.
  2. Box and whisker plots of the top 10 performing algorithms.

The line summaries are quick and precise, although they assume the scores follow a well-behaved Gaussian distribution, which may not be reasonable.

The box and whisker plots assume no distribution and provide a visual way to directly compare the distribution of scores across models in terms of median performance and spread of scores.

We will define a function named summarize_results() that takes the dictionary of results, prints the summary of results, and creates a boxplot image that is saved to file. The function takes an argument to specify if the evaluation score is maximizing, which by default is True. The number of results to summarize can also be provided as an optional parameter, which defaults to 10.

The function first orders the scores before printing the summary and creating the box and whisker plot.

# print and plot the top n results
def summarize_results(results, maximize=True, top_n=10):
    # check for no results
    if len(results) == 0:
        print('no results')
        return
    # determine how many results to summarize
    n = min(top_n, len(results))
    # create a list of (name, mean(scores)) tuples
    mean_scores = [(k, mean(v)) for k, v in results.items()]
    # sort tuples by mean score
    mean_scores = sorted(mean_scores, key=lambda x: x[1])
    # reverse for descending order (e.g. for accuracy)
    if maximize:
        mean_scores = list(reversed(mean_scores))
    # retrieve the top n for summarization
    names = [x[0] for x in mean_scores[:n]]
    scores = [results[x[0]] for x in mean_scores[:n]]
    # print the top n
    print()
    for i in range(n):
        name = names[i]
        mean_score, std_score = mean(results[name]), std(results[name])
        print('Rank=%d, Name=%s, Score=%.3f (+/- %.3f)' % (i+1, name, mean_score, std_score))
    # boxplot for the top n
    pyplot.boxplot(scores, labels=names)
    _, labels = pyplot.xticks()
    pyplot.setp(labels, rotation=90)
    pyplot.savefig('spotcheck.png')
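
To show how the pieces fit together, a top-level driver might look like the following sketch. It assumes the functions defined above plus the usual imports (e.g. mean and std from NumPy, pyplot from Matplotlib, and the scikit-learn classes used by your models).

# load dataset
X, y = load_dataset()
# get the list of models to evaluate
models = define_models()
# evaluate models
results = evaluate_models(X, y, models, folds=10, metric='accuracy')
# summarize results
summarize_results(results)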

Now that we have specialized a framework for spot-checking algorithms in Python, let’s look at how we can apply it to a classification problem.

3. Spot-Checking for Classification

We will generate a binary classification problem using the make_classification() function.

The function will generate 1,000 samples with 20 input variables, some of which are redundant, and two classes.

# load the dataset, returns X and y elements
def load_dataset():
    return make_classification(n_samples=1000, n_classes=2, random_state=1)
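
As a quick, optional sanity check, the shape of the generated data can be confirmed before spot-checking; with these arguments and the default of 20 features, the expected shapes are (1000, 20) for X and (1000,) for y.

# confirm the size of the generated dataset
X, y = load_dataset()
print(X.shape, y.shape)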

As a classification problem, we will try a suite of classification algorithms, specifically:

Linear Algorithms

  • Logistic Regression
  • Ridge Regression
  • Stochastic Gradient Descent Classifier
  • Passive Aggressive Classifier

I tried LDA and QDA, but they sadly crashed somewhere down in the C code.

Nonlinear Algorithms

  • k-Nearest Neighbors
  • Classification and Regression Trees
  • Extra Tree
  • Support Vector Machine
  • Naive Bayes

Ensemble Algorithms

  • AdaBoost
  • Bagged Decision Trees
  • Random Forest
  • Extra Trees
  • Gradient Boosting Machine

Further, I added multiple configurations for a few of the algorithms like Ridge, kNN, and SVM in order to give them a good chance on the problem.

The full define_models() function is listed below.

# create a dict of standard models to evaluate {name:object}
def define_models(models=dict()):
    # linear