Clever Application Of A Predictive Model

What if you could use a predictive model to find new combinations of attributes that do not exist in the data but could be valuable?

In Chapter 10 of Applied Predictive Modeling, Kuhn and Johnson provide a case study that does just this. It’s a fascinating and creative example of how to use a predictive model.

In this post we will discover this less obvious use of a predictive model and the types of experimental design to which it belongs.

Wet Concrete
Photo by Official U.S. Navy Page, some rights reserved

Compressive Strength of Concrete Mixtures

The problem modeled in the case study is the compressive strength of different concrete mixtures. Each record in the data is described by the amounts of ingredients of a concrete mixture, such as:

  • Cement
  • Fly ash
  • Blast furnace slag
  • Water
  • Superplasticizer
  • Coarse aggregate
  • Fine aggregate

The property of interest from the resulting mixture is the compressive strength of the concrete. Strong concrete made with fewer or cheaper ingredients is desirable.

Refer to Chapter 10 of Applied Predictive Modeling for deeper insight into the problem.

Predictive Model

Many complex machine learning methods are spot checked on this regression problem, such as:

  • Linear Regression
  • Radial basis function Support Vector Machines (SVM)
  • Neural Networks
  • MARS
  • Regression Trees (CART and conditional inference trees)
  • Bagged and Boosted decision trees

Model accuracy was considered in terms of the RMSE and the R^2 of the predictions. Some of the better performing methods were Neural Networks, Boosted Decision Trees, Cubist and Random Forest.
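As a quick reminder of what these two measures capture, here is a minimal sketch in plain Python. The function names and the sample numbers are made up for illustration, and the R^2 shown is one common definition (1 minus the ratio of residual to total sum of squares); the book may compute it differently, for example as a squared correlation.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error: typical size of a prediction error, in the units of y."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination, computed as 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed strengths and model predictions, for illustration only.
observed = [30.0, 40.0, 50.0, 60.0]
predicted = [32.0, 38.0, 52.0, 58.0]
print(rmse(observed, predicted), r_squared(observed, predicted))
```

A lower RMSE and an R^2 closer to 1 both indicate predictions that track the observed strengths more closely.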

Optimizing Compressive Strength

This is the clever part of the case study.

After accurate models were created and selected (Neural Networks and Cubist models), the models were used to locate new mixture quantities that resulted in improved concrete compressive strength.

This involved using a direct search method (also called pattern search), the Nelder-Mead algorithm, to search the parameter space for a combination of mixture quantities that, when passed to the predictive model, predicted a compressive strength greater than any in the dataset.

A number of new mixtures were discovered and plotted in a projected domain relative to the provided data. These new mixtures represent the basis for actual commercial experiments that could be performed in order to find an improved concrete mixture.
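The search idea can be sketched in a few lines. The code below is a simplified, plain-Python version of the Nelder-Mead simplex method, not the implementation used in the book. The function `predicted_strength` is a hypothetical stand-in for a trained model with only two ingredients, and its numbers are invented. The sketch also ignores any constraints a realistic mixture would have to satisfy.

```python
def nelder_mead(f, x0, step=1.0, max_iter=1000, tol=1e-10):
    """Minimize f with a basic Nelder-Mead simplex search (no gradients needed)."""
    n = len(x0)
    # Initial simplex: the start point plus one point offset along each axis.
    simplex = [list(x0)]
    for i in range(n):
        p = list(x0)
        p[i] += step
        simplex.append(p)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        # Centroid of all vertices except the worst one.
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]

        def along(t):
            # Point on the line from the worst vertex through the centroid.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if f(reflected) < f(best):
            expanded = along(2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(-0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway toward the best one.
                simplex = [best] + [[(b + x) / 2 for b, x in zip(best, p)]
                                    for p in simplex[1:]]
    return min(simplex, key=f)

def predicted_strength(mix):
    """Hypothetical stand-in for a trained model: mix = [cement, water]."""
    cement, water = mix
    return 80.0 - 0.001 * (cement - 400.0) ** 2 - 0.004 * (water - 160.0) ** 2

# Maximizing predicted strength is the same as minimizing its negative.
best_mix = nelder_mead(lambda mix: -predicted_strength(mix), [300.0, 200.0], step=10.0)
print(best_mix, predicted_strength(best_mix))
```

In practice the starting points would be mixtures taken from the data, and the best candidates found would only be suggestions to verify with real experiments.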

Response Surface Methodology

The approach is related to a specific type of experimental design called Response Surface Methodology (RSM).

RSM is used when you want to develop, improve or optimize a process for a new or existing product. It’s commonly used in industrial settings, for problems where the relationship between the inputs and the output is not well understood and needs to be estimated.

Designed experiments are performed in order to collect examples of the inputs and the response variable or variables. The input variables may be quantities or timings in a process, and the output or response variable is something desirable in the result, like strength or quality.

A statistical model is then constructed to approximate the relationship between the independent variables and the dependent variable, and finally an optimization process explores new combinations of inputs to maximize the output variable.

A critical step prior to performing the designed experiments is to reduce the number of variables to only those factors known to influence the response variable. This is a form of feature selection with which we are very familiar in machine learning.

Simple models are used to model the functional relationship, such as first or second order polynomials. The method is called response surface because the response is continuous for many problems and can be plotted as a surface over two input dimensions.
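For two inputs, a second-order response surface is commonly written as the polynomial below. The symbols are notation chosen here for illustration and do not come from the case study: x_1 and x_2 are the inputs, \hat{y} is the predicted response, and the \beta terms are coefficients estimated from the designed experiments.

```latex
\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \beta_{12} x_1 x_2
```

Dropping the squared and interaction terms leaves the first-order model.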

Surrogate Model

Surrogate modeling is when the model constructed in RSM is used in place of a simulation of the problem. For example, in aviation, you can design and build aircraft wings, or design them in software and test them in simulators. With a surrogate, you instead model the results of those experiments or simulations and use the model to estimate which new designs are worth testing.

The models may be more elaborate to capture the complex non-linear relationships between the inputs and response variable. For example, Support Vector Machines and Neural Networks may be used. Additionally, more powerful direct search methods that use stochastic processes may be used, such as simulated annealing or evolutionary algorithms.

The overall process may look something like this:

  1. Reduce the number of variables involved
  2. Design experiments and execute them sequentially to collect source data to model
  3. Construct a surrogate model from the experimental data
  4. Apply a search method to the variables using the surrogate model
  5. Sequentially perform experiments based on the optimized predictions of the surrogate model
  6. Iterate Steps 3 to 5 until a stopping condition is met
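The steps above can be sketched as a small loop. This is a toy illustration under stated assumptions, not a reference implementation: `run_experiment` is a hypothetical stand-in for a costly experiment or simulation, there is only one input (so Step 1 is skipped), the surrogate is a least-squares quadratic, and the search is a plain grid search over the surrogate.

```python
def run_experiment(x):
    """Hypothetical stand-in for a costly real experiment or simulation."""
    return 50.0 - (x - 3.0) ** 2 - 0.05 * (x - 3.0) ** 4

def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - tail) / m[r][r]
    return coef

def fit_quadratic(xs, ys):
    """Step 3: least-squares fit of y = b0 + b1*x + b2*x^2 via the normal equations."""
    powers = [sum(x ** k for x in xs) for k in range(5)]
    a = [[powers[i + j] for j in range(3)] for i in range(3)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(3)]
    b0, b1, b2 = solve(a, b)
    return lambda x: b0 + b1 * x + b2 * x * x

def optimize(initial_xs, lo=0.0, hi=6.0, n_grid=600, max_rounds=10):
    xs = list(initial_xs)
    ys = [run_experiment(x) for x in xs]              # Step 2: initial experiments
    grid = [lo + i * (hi - lo) / n_grid for i in range(n_grid + 1)]
    for _ in range(max_rounds):                       # Step 6: iterate
        surrogate = fit_quadratic(xs, ys)             # Step 3: build the surrogate
        x_next = max(grid, key=surrogate)             # Step 4: search the surrogate
        if any(abs(x_next - x) < 1e-9 for x in xs):   # Stop: nothing new to try
            break
        xs.append(x_next)
        ys.append(run_experiment(x_next))             # Step 5: run one new experiment
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best], len(xs)

best_x, best_y, n_experiments = optimize([0.0, 1.0, 4.0])
print(best_x, best_y, n_experiments)
```

The point of the design is that the expensive function is called only once per round, while the cheap surrogate absorbs all of the search effort.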

Summary

In this post you discovered a clever way to use a predictive model.

In the case study you learned how machine learning algorithms can be used to model the results of concrete mixture experiments and then to search the parameter space for mixtures with optimal compressive strength, which may be taken as the basis for further experiments.

You learned that this type of experimental design is called Response Surface Methodology and is used in industrial problem domains for processes like the concrete mixture example. You also learned that the predictive model in this case study is called a surrogate model.

This is a powerful method that you could use in other domains where performing simulations carries a large computational overhead.

Resources

Below are some books you may want to look at to learn more about this approach to experimental design and optimization.
