Clever Application Of A Predictive Model

阿新 • Published: 2019-01-12

What if you could use a predictive model to find new combinations of attributes that do not exist in the data but could be valuable?

In Chapter 10 of Applied Predictive Modeling, Kuhn and Johnson provide a case study that does just this. It’s a fascinating and creative example of how to use a predictive model.

In this post we will discover this less obvious use of a predictive model and the type of experimental design to which it belongs.

Wet Concrete
Photo by Official U.S. Navy Page, some rights reserved

Compressive Strength of Concrete Mixtures

The problem modeled in the case study is the compressive strength of different concrete mixtures. Each record in the data is described by the amounts of ingredients of a concrete mixture, such as:

Cement
Fly ash
Blast furnace slag
Water
Superplasticizer
Coarse aggregate
Fine aggregate

The property of interest from the resulting mixture is the compressive strength of the concrete. Strong concrete with fewer or cheaper ingredients is desirable.

Refer to Chapter 10 of Applied Predictive Modeling for deeper insight into the problem.

Predictive Model

Many complex machine learning methods are spot checked on this regression problem, such as:

Linear Regression
Radial basis function Support Vector Machines (SVM)
Neural Networks
MARS
Regression Trees (CART and conditional inference trees)
Bagged and Boosted decision trees

Model accuracy was considered in terms of the RMSE and the R^2 of the predictions. Some of the better performing methods were Neural Networks, Boosted Decision Trees, Cubist and Random Forest.
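As a minimal sketch of the two metrics, the snippet below computes RMSE and R^2 by hand. The observed and predicted values are made up for illustration, and R^2 is computed here as 1 - SS_res / SS_tot; some tools instead report the squared correlation between observed and predicted values, so the figure reported in the book may be defined differently.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error: the typical size of a prediction error."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def r_squared(y_true, y_pred):
    """Coefficient of determination, computed as 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up compressive strengths (MPa) and model predictions, for illustration only.
observed = [30.0, 40.0, 50.0, 60.0]
predicted = [32.0, 38.0, 52.0, 58.0]

print(f"RMSE = {rmse(observed, predicted):.2f}")       # RMSE = 2.00
print(f"R^2  = {r_squared(observed, predicted):.3f}")  # R^2  = 0.968
```

A lower RMSE and a higher R^2 both indicate a better fit, which is how the candidate models can be ranked against each other.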

Optimizing Compressive Strength

This is the clever part of the case study.

After accurate models were created and selected (Neural Networks and Cubist models), the models were used to locate new mixture quantities that resulted in improved predicted compressive strength.

This involved using a direct search method (also called pattern search), the Nelder-Mead algorithm, to search the space of mixture quantities for a combination that, when passed to the predictive model, produced a predicted compressive strength greater than any in the dataset.

A number of new mixtures were discovered and plotted in a projected domain relative to the provided data. These new mixtures represent the basis for actual commercial experiments that could be performed in order to find an improved concrete mixture.
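The search described above can be sketched as follows. This is not the book's code: `predict_strength` is a hypothetical stand-in for a fitted model (an invented formula over just two ingredients), `known_mixes` is made-up data, and `nelder_mead` is a basic hand-written variant of the algorithm. In practice a library routine such as `scipy.optimize.minimize` with `method="Nelder-Mead"` would normally be used instead.

```python
def predict_strength(mix):
    """Hypothetical stand-in for a fitted model's prediction.

    mix = (cement, water); the formula and its peak at (350, 160)
    are invented for illustration, not taken from the case study.
    """
    cement, water = mix
    return 80.0 - 0.002 * (cement - 350.0) ** 2 - 0.01 * (water - 160.0) ** 2

def nelder_mead(f, x0, step=10.0, max_iter=500, tol=1e-10):
    """Minimise f from x0 with a basic Nelder-Mead simplex search.

    f is re-evaluated freely to keep the sketch short.
    """
    n = len(x0)
    # Initial simplex: x0 plus one point offset along each axis.
    simplex = [list(x0)]
    for i in range(n):
        point = list(x0)
        point[i] += step
        simplex.append(point)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        # Centroid of every point except the worst.
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]

        def along(t):
            # Point at centroid + t * (centroid - worst).
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if f(reflected) < f(best):
            expanded = along(2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(-0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every other point halfway toward the best one.
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, p)]
                    for p in simplex[1:]
                ]
    return min(simplex, key=f)

# Made-up mixtures standing in for the training data.
known_mixes = [(250.0, 200.0), (300.0, 180.0), (400.0, 150.0)]
best_known = max(known_mixes, key=predict_strength)

# Maximise predicted strength by minimising its negative,
# starting the search from the best mixture already in the data.
best_mix = nelder_mead(lambda m: -predict_strength(m), best_known)

print("best known mix:", best_known, predict_strength(best_known))
print("suggested mix :", [round(v, 1) for v in best_mix],
      round(predict_strength(best_mix), 2))
```

The suggested mixture is only a prediction. As in the case study, it is a candidate for a real experiment, not a confirmed result.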

Response Surface Methodology

The approach is related to a specific type of experimental design called Response Surface Methodology (RSM).

RSM is used when you want to develop, improve or optimize a process for a new or existing product. It’s commonly used in industrial settings, for problems where the relationship between the inputs and the output is not well understood and needs to be estimated.

Designed experiments are performed in order to collect examples of the inputs and the response variable or variables. The input variables may be quantities or timings in a process, and the output or response variable is some desirable property of the result, like strength or quality.

A statistical model is constructed to approximate the relationship between the independent variables and the dependent variable, and finally an optimization process explores new combinations of inputs to maximize the output variable.

A critical step prior to performing the designed experiments is to reduce the number of variables to only those factors known to influence the response variable. This is a form of feature selection with which we are very familiar in machine learning.

Simple models are used to model the functional relationship, such as first- or second-order polynomials. The method is called response surface because the response is continuous for many problems and can be plotted as a surface over two input variables.
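To make the second-order polynomial concrete, here is a minimal sketch that fits a quadratic response surface in two coded factors by ordinary least squares using NumPy. The 3 x 3 design, the coefficients in `true_coefs` and the helper `quadratic_features` are all invented for illustration. The response is generated without noise so that the fit can be checked against the known coefficients.

```python
import numpy as np

def quadratic_features(x1, x2):
    """Columns of the second-order model: 1, x1, x2, x1^2, x2^2, x1*x2."""
    return np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])

# Hypothetical designed experiment: a 3 x 3 grid over two coded factors.
levels = np.array([-1.0, 0.0, 1.0])
x1, x2 = [a.ravel() for a in np.meshgrid(levels, levels)]

# Made-up response generated from a known quadratic (no noise).
true_coefs = np.array([50.0, 3.0, -2.0, -4.0, -1.5, 0.5])
y = quadratic_features(x1, x2) @ true_coefs

# Ordinary least squares fit of the response surface.
coefs, *_ = np.linalg.lstsq(quadratic_features(x1, x2), y, rcond=None)
print("fitted coefficients:", np.round(coefs, 3))

# Stationary point of the fitted surface: set the gradient to zero.
# gradient = [b1, b2] + hessian @ [x1, x2]
hessian = np.array([[2.0 * coefs[3], coefs[5]],
                    [coefs[5], 2.0 * coefs[4]]])
stationary = np.linalg.solve(hessian, -coefs[1:3])
print("stationary point:", np.round(stationary, 3))
```

Because both squared terms here have negative coefficients and the Hessian is negative definite, the stationary point is a maximum of the fitted surface. With real, noisy data that would need to be checked before treating it as an optimum.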

Surrogate Model

Surrogate modeling is when the model constructed in RSM is used in place of a simulation of the problem. For example, in aviation, you can design and build aircraft wings, or design them in software and test them in simulators. You can then model the results of those experiments or simulations and use the model to estimate which new designs to test.

The models may be more elaborate in order to capture complex non-linear relationships between the inputs and the response variable. For example, Support Vector Machines and Neural Networks may be used. Additionally, more powerful search methods that use stochastic processes may be applied, such as simulated annealing or evolutionary algorithms.
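As a rough illustration of a stochastic search, the sketch below applies a basic simulated annealing loop to a one-dimensional surrogate with several local peaks. The `surrogate` function, the bounds, the cooling schedule and the step size are all assumptions chosen for the example, not settings from any particular study.

```python
import math
import random

def surrogate(x):
    """Hypothetical surrogate with several local peaks (invented)."""
    return math.sin(3.0 * x) + 0.3 * x

def simulated_annealing(f, x0, lower, upper, n_steps=2000, t0=1.0, seed=0):
    """Maximise f on [lower, upper] with a basic simulated annealing loop."""
    rng = random.Random(seed)
    current, current_val = x0, f(x0)
    best, best_val = current, current_val
    for k in range(n_steps):
        # Linear cooling schedule; the small constant avoids division by zero.
        temperature = t0 * (1.0 - k / n_steps) + 1e-9
        # Propose a nearby point and clamp it to the allowed range.
        candidate = min(upper, max(lower, current + rng.gauss(0.0, 0.5)))
        candidate_val = f(candidate)
        delta = candidate_val - current_val
        # Always accept improvements; accept worse moves with
        # probability exp(delta / temperature).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_val = candidate, candidate_val
            if current_val > best_val:
                best, best_val = current, current_val
    return best, best_val

best_x, best_val = simulated_annealing(surrogate, x0=0.0, lower=0.0, upper=5.0)
print(f"best input found: {best_x:.3f}, predicted response: {best_val:.3f}")
```

Accepting some worse moves early on lets the search escape a local peak, which a purely greedy local search started at the same point could not do.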

The overall process may look something like this:

1. Reduce the number of variables involved
2. Design experiments and execute them sequentially to collect source data to model
3. Construct a surrogate model from the experimental data
4. Apply a search method to the variables using the surrogate model
5. Sequentially perform experiments based on the optimized predictions of the surrogate model
6. Iterate Steps 3 to 5 until a stopping condition is met
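The loop over the modeling, search and experiment steps can be sketched in a few lines. Everything here is an assumption made for illustration: `run_experiment` is a hypothetical stand-in for a costly lab test or simulation, there is a single input variable, the surrogate is a quadratic fitted with `np.polyfit`, and the stopping rule is simply "the suggested input is already very close to one that has been tried".

```python
import numpy as np

def run_experiment(x):
    """Hypothetical stand-in for a costly experiment or simulation run."""
    return 50.0 + 10.0 * x * np.exp(-x / 3.0)  # invented response

lower, upper = 0.0, 10.0

# Initial designed experiments spread over the input range.
xs = [1.0, 5.0, 9.0]
ys = [run_experiment(x) for x in xs]

for iteration in range(10):
    # Construct the surrogate: a quadratic fitted to all data so far.
    a, b, c = np.polyfit(xs, ys, deg=2)

    # Search the surrogate: compare the two bounds and, if the fitted
    # curve is concave, its vertex (clipped to the allowed range).
    options = [lower, upper]
    if a < 0:
        options.append(float(np.clip(-b / (2.0 * a), lower, upper)))
    candidate = max(options, key=lambda x: np.polyval([a, b, c], x))

    # Stopping condition: the suggestion is essentially an input already tried.
    if min(abs(candidate - x) for x in xs) < 1e-3:
        break

    # Perform the suggested experiment and add it to the data.
    xs.append(candidate)
    ys.append(run_experiment(candidate))

best_y, best_x = max(zip(ys, xs))
print(f"experiments run: {len(xs)}, best input: {best_x:.3f}, "
      f"best response: {best_y:.3f}")
```

Each pass spends one real experiment where the surrogate predicts the best outcome, so the data gradually concentrates around the promising region.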

Summary

In this post you discovered a clever way to use a predictive model.

In the case study you saw an example of using machine learning algorithms to model the results of concrete mixture experiments and then search the input space for mixtures with high predicted compressive strength, which may be taken as the basis for further experiments.

You learned that this type of experimental design is called Response Surface Methodology and is used in industrial problem domains for processes like the concrete mixture example. You also learned that the predictive model in this case study is called a surrogate model.

This is a powerful method that you could use in other domains that have a large computational overhead for performing simulations.

Resources

Below are some books you may want to look at to learn more about this approach to experimental design and optimization.
