Solving the Kaggle Telco Customer Churn challenge in minutes with AuDaS

阿新 • Published: 2018-12-28


AuDaS is the automated Data Scientist developed by Mind Foundry. It aims to allow anyone, with or without a background in Data Science, to easily build and deploy quality-controlled Machine Learning pipelines. AuDaS empowers Business Analysts and Data Scientists by allowing them to easily insert their domain expertise into the model building process and extract actionable insights.

In this tutorial we are going to see how we can build a classification pipeline in minutes using AuDaS. The goal is to predict Telco customer churn using data from Kaggle. In this case, a customer churns when they decide to cancel their subscription or not renew it. This is costly for Telcos because it is more expensive to acquire new customers than to retain existing ones. With a predictive model, a Telco can anticipate which of its customers are most likely to churn and take appropriate action to retain them.

In this tutorial we will follow the standard Data Science process:

Data Preparation
Pipeline Construction and tuning
Interpretation and deployment

Data Preparation

First we are going to load the data into AuDaS, which in this case is a simple CSV file with 21 columns and 3333 rows.

Each row represents a customer and each column an attribute, such as the number of voice mails, total minutes (day/night), total calls (day/night), etc.

AuDaS automatically scans the data, detects the type of each column and provides data preparation advice, highlighted by the light bulbs. This is where the Business Analyst or Data Scientist can introduce their domain knowledge by acting on the relevant advice with the appropriate answers.
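To give a feel for what column type detection involves, here is a minimal sketch in plain Python. It is not how AuDaS does it internally; the `infer_type` helper, the type labels and the sample column names are all assumptions made for illustration.

```python
def infer_type(values):
    """Guess a column type from string cells, ignoring empty (missing) cells."""
    present = [v for v in values if v != ""]
    if not present:
        return "empty"
    # Treat common yes/no spellings as booleans.
    if all(v.lower() in ("true", "false", "yes", "no") for v in present):
        return "boolean"
    # Try the strictest numeric type first, then fall back to float.
    for caster, name in ((int, "integer"), (float, "float")):
        try:
            for v in present:
                caster(v)
            return name
        except ValueError:
            continue
    return "text"


# Hypothetical cells from a few columns of a churn data set.
columns = {
    "number_vmail_messages": ["25", "", "0"],
    "total_day_minutes": ["265.1", "161.6", "243.4"],
    "international_plan": ["no", "yes", "no"],
    "phone_number": ["382-4657", "371-7191", "358-1921"],
}
detected = {name: infer_type(cells) for name, cells in columns.items()}
print(detected)
```

A column that is detected as text but holds a unique value per row, such as the phone number here, is a typical candidate for the kind of advice described above.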

In this case, we know that the missing values in the “number voice mail messages” column should actually be filled in with 0, which takes a single click.

Once we have done this, and dropped the customer ID and phone number columns, which have no predictive power, we get a full audit trail of all the data preparation steps. We can also easily revert to earlier versions of the data if we wish to do so.
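For readers who want to see the equivalent steps in code, the two preparation actions above can be sketched in plain Python. This is only an illustration under stated assumptions: the inline sample, the column names and the `prepare` helper are hypothetical, and the audit trail is just a list of strings rather than anything AuDaS produces.

```python
import csv
import io

# Tiny inline sample standing in for the Kaggle CSV; column names are assumed.
SAMPLE_CSV = """customer_id,phone_number,number_vmail_messages,total_day_minutes,churn
1,382-4657,25,265.1,False
2,371-7191,,161.6,False
3,358-1921,,243.4,True
"""


def prepare(rows, fill_zero_cols, drop_cols):
    """Return cleaned copies of the rows plus a list describing each step."""
    cleaned = []
    for row in rows:
        row = dict(row)  # copy, so the original rows stay available for reverting
        for col in fill_zero_cols:
            if row[col] == "":
                row[col] = "0"
        for col in drop_cols:
            row.pop(col, None)
        cleaned.append(row)
    audit = [
        f"filled missing values with 0 in {fill_zero_cols}",
        f"dropped columns {drop_cols}",
    ]
    return cleaned, audit


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
cleaned, audit = prepare(
    rows,
    fill_zero_cols=["number_vmail_messages"],
    drop_cols=["customer_id", "phone_number"],
)
for step in audit:
    print(step)
```

Because `prepare` works on copies, the untouched `rows` list plays the role of the pre-prepared version of the data.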

AuDaS also automatically generates nice visualisations of the data for you.

You can also enrich the data set by joining other data sets (if applicable). AuDaS supports all the other data preparation steps you would normally do. Now that we are happy with the data we are going to build our Classification pipeline.

Processing the data

AuDaS allows you to quickly set up your classification process: you only need to select the target column and specify the model validation framework and scoring metrics.

AuDaS then launches and searches the solution space of possible pipelines (feature engineering and machine learning models) and their associated hyper-parameters using OPTaaS, our Bayesian Optimizer. AuDaS also keeps an audit trail of all the pipelines it has evaluated which you can query if required.
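The idea of searching over pipelines and hyper-parameters while keeping an audit trail can be sketched as follows. Note that this uses plain random search as a much simpler stand-in for Bayesian Optimization, and it does not call OPTaaS. The search space, the `toy_score` placeholder and all names are assumptions for illustration; a real run would replace `toy_score` with a cross-validated model score.

```python
import random

# Hypothetical search space: model families and one hyper-parameter range each.
SEARCH_SPACE = {
    "logistic_regression": {"C": (0.01, 10.0)},
    "gradient_boosting": {"learning_rate": (0.01, 0.3)},
}


def toy_score(model, params):
    """Placeholder for cross-validated scoring; NOT a real model evaluation."""
    return 1.0 / (1.0 + sum(params.values()))


def random_search(n_trials, seed=0):
    """Sample configurations at random and record every trial."""
    rng = random.Random(seed)
    audit_trail = []
    for _ in range(n_trials):
        model = rng.choice(sorted(SEARCH_SPACE))
        params = {
            name: rng.uniform(low, high)
            for name, (low, high) in SEARCH_SPACE[model].items()
        }
        audit_trail.append(
            {"model": model, "params": params, "score": toy_score(model, params)}
        )
    best = max(audit_trail, key=lambda trial: trial["score"])
    return best, audit_trail


best, audit_trail = random_search(n_trials=20)
print(best["model"], best["params"])
```

A Bayesian optimizer differs in that it uses the scores of earlier trials to decide which configuration to try next, instead of sampling blindly.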

I have previously written about the intuitions and advantages of using Bayesian Optimization. OPTaaS is also available as an API and you can contact me for a key.

Deploying the solution

Once AuDaS has found the best pipeline, it runs final quality checks on 10% of the data, which was held out right from the start and never used during any of the model training. The performance metrics on this 10% hold-out set are presented at the end, and AuDaS provides full transparency of the final pipeline it has chosen (feature engineering, models, hyper-parameter values).
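As a rough illustration of the hold-out idea, the sketch below sets aside 10% of the row indices before any training happens. The `split_holdout` helper and the fixed seed are assumptions; AuDaS may well split the data differently (for example with stratification).

```python
import random


def split_holdout(n_rows, holdout_fraction=0.1, seed=0):
    """Shuffle row indices once and reserve a fraction for the final check."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_holdout = int(n_rows * holdout_fraction)
    # The first n_holdout shuffled indices are never shown to the model search.
    return indices[n_holdout:], indices[:n_holdout]


train_idx, holdout_idx = split_holdout(3333)
print(len(train_idx), len(holdout_idx))
```

All pipeline search and tuning would then use `train_idx` only, with `holdout_idx` touched once at the very end.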

The model can then be integrated into your website, products or business process via an automatically generated RESTful API. Feature relevance is provided by LIME; for our final model, the main feature that predicts churn is the total day charge.

A more complete tutorial can be found here. If you are interested in trying AuDaS, please don’t hesitate to reach out!

Team and Resources

Mind Foundry is an Oxford University spin-out founded by Professors Stephen Roberts and Michael Osborne, who have 35 person-years of experience in data analytics. The Mind Foundry team is composed of over 30 world-class Machine Learning researchers and elite software engineers, many of them former post-docs from the University of Oxford. Moreover, Mind Foundry has privileged access to over 30 Oxford University Machine Learning PhDs through its spin-out status. Mind Foundry is a portfolio company of the University of Oxford and its investors include Oxford Sciences Innovation, the Oxford Technology and Innovations Fund, the University of Oxford Innovation Fund and Parkwalk Advisors.
