ANZ Chengdu Data Science Competition: BASELINE for ANZ bank deposit big-data modeling and prediction

# -*- coding: utf-8 -*-
"""
Created on Fri Nov  9 09:58:21 2018

@author: Lenovo
"""

import lightgbm as lgb
import pandas as pd
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import roc_auc_score

bat = pd.read_csv('bank-additional-train.csv')

columns = list(bat.columns)

# Numeric columns ("shuzhitype" is pinyin for "numeric type"); every other column is treated as categorical.
shuzhitype = ['age', 'duration', 'campaign', 'emp.var.rate', 'pdays', 'cons.conf.idx', 'cons.price.idx', 'euribor3m', 'nr.employed']

for i in columns:
    if i not in shuzhitype:
        bat[i] = bat[i].astype('category')
# Replace emp.var.rate with its square.
bat['emp.var.rate'] = bat['emp.var.rate']*bat['emp.var.rate']
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=2018)
# LightGBM expects numeric labels, so encode the target as integer category codes.
y = bat.y.astype('category').cat.codes
x_tr = bat.drop(['y'],axis=1)
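params = {
    'task': 'train',
    'boosting_type': 'gbdt',  # gradient-boosted decision trees
    'objective': 'binary',  # binary target: does the client subscribe to a term deposit?
    'metric': 'auc',  # evaluation metric
    #'max_bin': 255,  # larger values can be more accurate but train more slowly
    'learning_rate': 0.1,  # learning rate
    'num_leaves': 32,  # larger values can be more accurate but may overfit
    'max_depth': -1,  # limiting depth helps prevent overfitting on small datasets; a value below 0 means no limit
    'feature_fraction': 0.8,  # guards against overfitting
    'bagging_freq': 5,  # guards against overfitting
    'bagging_fraction': 0.8,  # guards against overfitting
#    'min_data_in_leaf': 21,  # guards against overfitting
#    'min_sum_hessian_in_leaf': 3.0,  # guards against overfitting
    'header': True  # whether the dataset has a header row
}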

for k,(train_idx,val_idx) in enumerate(skf.split(x_tr,y)):
    x_tr_1,y_1,x_val_tr_2,y_val_2 = x_tr.iloc[train_idx],y.iloc[train_idx],x_tr.iloc[val_idx],y.iloc[val_idx]
    train1 = lgb.Dataset(x_tr_1,y_1)

    val1 = lgb.Dataset(x_val_tr_2,y_val_2)
    # Columns with pandas 'category' dtype are treated as categorical features.
    # Early stopping uses the callback form; the early_stopping_rounds argument
    # of lgb.train was removed in LightGBM 4.0.
    model = lgb.train(params,train1,
        valid_sets=[val1],
        num_boost_round=2000000,
        callbacks=[lgb.early_stopping(stopping_rounds=300)])

    # Validation AUC for this fold at the best iteration.
    bm2 = roc_auc_score(y_val_2,model.predict(x_val_tr_2,num_iteration=model.best_iteration))
    print(k, bm2)
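
The script computes a validation AUC for each fold (bm2) but never aggregates the values. As a minimal sketch, the per-fold scores could be collected in a list and summarized as below. The helper name `summarize_fold_scores` and the example AUC values are hypothetical and are not taken from the original post.

```python
from statistics import mean, stdev


def summarize_fold_scores(scores):
    """Return (mean, sample standard deviation) of per-fold validation scores."""
    if not scores:
        raise ValueError("scores must contain at least one fold score")
    # The sample standard deviation needs at least two values.
    spread = stdev(scores) if len(scores) > 1 else 0.0
    return mean(scores), spread


# Hypothetical AUC values, one per fold. In the script these would be the
# bm2 values appended to a list inside the cross-validation loop.
fold_aucs = [0.94, 0.95, 0.93, 0.95, 0.94]
avg, spread = summarize_fold_scores(fold_aucs)
print(f"mean AUC = {avg:.4f}, std = {spread:.4f}")
```

Reporting the spread alongside the mean gives a rough sense of how stable the model is across folds.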

Competition background

ANZ is a leading bank in Australia and New Zealand, and in Asia. From 7 November 2018 we'll officially open the ANZ Chengdu Data Science Competition.

This is the first data science competition we've organised of this type. We hope you're excited too because the prize is a good one: in addition to the cash prizes we're offering internships/employment opportunities with ANZ to the winners and/or highly regarded entrants!

The challenge: predict (using provided datasets) whether the client will subscribe to a term deposit.  

Participants are expected to undertake a thorough analysis of the dataset and build a prediction model to solve this business problem.

We are looking for participants who can design a customer response model with high differentiation power and high precision by analyzing customers' multi-dimensional banking information and behavior characteristics, using data analysis and advanced machine learning algorithms.

Our expert judging panel look forward to receiving your submission by 18 November 2018.  

Up to 10 groups or individuals will be selected as finalists to deliver a presentation of their findings. Winners will be announced on 23 November 2018.  

Please explore this site for instructions about the competition and how to enter, as well as information about ANZ, why we're focused on data and digital, what it's like to work with us and our proud history, dating back more than 180 years, as a market-leading bank in Australia, New Zealand and beyond.


Awards

1. Cash Prize

First Prize: ¥10,000 for 1 Team/Person
Second Prize: ¥6,000 for 1 Team/Person
Third Prize: ¥4,000 for 1 Team/Person

2. Career Opportunities

The teams/persons who place and any highly regarded entrants will be given the opportunity to partake in an interview for internship or job opportunities with ANZ. Interviews are conducted and assessed at the individual level.

Note: All awards and opportunities are offered in accordance with the competition terms and conditions. 


Time schedule

Competition Opens – 7th November 2018 2pm UTC 

Competition Closes – 18th November 2018 5pm UTC 

Judging of Entries – 19th to 21st November 2018

Finalists announced – 21st November 2018. After reviewing submitted source code and solution reports from the online entries, the judging panel will select up to 10 finalists (teams or individuals).

Finalists Presentations – 22nd to 23rd November 2018. All finalists will be invited to give a final presentation, either by attending the ANZ Chengdu office or by remote dial-in.

Winners announced – 23rd November 2018

Career opportunities – winners and/or highly regarded entrants who are invited to interview for internship or job opportunities will be notified within 3 weeks after the competition closes. Interview dates will be set at the discretion of ANZ, in consultation with the individual.


Who can enter

The team captain is responsible for the team members, all of whom must agree to the DC competition's cheating management regulations and the other pertinent rules.

Entries are open to all worldwide university students whose degree (can be Bachelor/Master/PhD of any major) was/will be obtained between June 2018 and June 2020. China domestic students must be full-time university students. You can enter as an individual or as a team. Team entries are to be a maximum of three (3) persons and all team members must satisfy the entrant's requirements. You can only enter the competition once – either as an individual or in a team. Finalists will be asked to provide the following documents to confirm they meet entry requirements:  

University Student ID or Graduation Certificate. 

Contact DC if you have questions:

A team must have between 1 and 3 members. New teams cannot be created in the last 3 days of the first stage, although participants can still join existing teams. In the last 3 days of the last stage, new teams cannot be created and existing teams cannot be joined. After the competition enters the historical stage, new teams can be created, but participating teams cannot add new members or be dissolved. Note: only members who are active during the competition period count as members of the defense (presentation) team.


Scoring criteria

Online submission: 

Entrants are required to submit a notebook file (for example, R Markdown, Jupyter Notebook or Word) to reflect their analysis process and outcome.

The minimum requirements of submission are:

1. The submitted document covers at least preliminary analysis, modelling/algorithm development, model validation and selected model performance metrics.
2. The code delivered covers at least the model development and validation parts.
3. The model outcome in the submitted document is reproducible from the delivered code.
4. The methodology selected is sound and fit for purpose.
5. Entrants need to choose their own model performance metrics (e.g. accuracy, precision, recall, AUC, RMSE).
6. Each team is allowed to submit the result file (i.e. the notebook file) multiple times. Only the last submission will be delivered to the judging panel.
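
Requirement 5 leaves the choice of metric to entrants. As a rough illustration of how several of the listed metrics differ, the sketch below computes accuracy, precision, recall and AUC for a binary target in plain Python. The function names, the 0.5 threshold and the toy labels and scores are all assumptions made for this example; they do not come from the competition data. In practice a library such as scikit-learn would normally be used instead.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Report 0.0 when a ratio is undefined (no predicted or no actual positives).
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is quadratic in the number of
    samples, which is fine for a small illustration only.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 1 = client subscribed, 0 = client did not subscribe.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.8, 0.1, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(classification_metrics(y_true, y_pred))
print("AUC:", auc(y_true, scores))
```

Accuracy, precision and recall depend on the chosen threshold, while AUC is computed from the ranking of the scores alone, which is one reason it is a common choice for imbalanced response models.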

All full and complete entries will be judged by a judging panel in accordance with the following criteria:

A maximum of 10 entries with the highest scores will be selected as finalists. You must score a minimum of 60 points on your submission for the chance to progress as a finalist. A minimum of 3 finalists are required.

Presentation: The finalists will be invited to present their entry in English to the judging panel at the ANZ Chengdu office (or remote dial in). If the finalist is a team, it is desirable for all members to attend the presentation but attendance of at least one team member will be accepted.  

You may prepare a PowerPoint to accompany your presentation but this is not mandatory. The judging panel will then determine the top 3 winners. 

Things to note for the presentation:

1. Each finalist will have up to 6 minutes to present their entry.
2. There will then be 4 minutes for Q&A.
3. It's suggested finalists cover the following in their presentation:
   a. Data analysis outcome and business insights
   b. Methodology used and the reason for choosing this methodology
   c. Model performance and metrics
4. Presentation and related materials (if any) are to be in English.

The presentation will be assessed on the following criteria:

Final Scoring

Your final score will be weighted as follows:

1. Online submissions: 80%
2. Presentation: 20%
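
The 80/20 weighting can be worked through with a small sketch. It assumes that both the online submission and the presentation are scored on a 0 to 100 scale; the announcement only states a 60-point minimum for the online submission, so the scale, the function name `final_score` and the example scores are assumptions.

```python
ONLINE_WEIGHT = 80        # percent, from the announcement
PRESENTATION_WEIGHT = 20  # percent, from the announcement


def final_score(online, presentation):
    """Combine the two component scores using the 80/20 weighting.

    Both inputs are assumed to lie on a 0-100 scale.
    """
    for name, value in (("online", online), ("presentation", presentation)):
        if not 0 <= value <= 100:
            raise ValueError(f"{name} score must be between 0 and 100")
    return (ONLINE_WEIGHT * online + PRESENTATION_WEIGHT * presentation) / 100


# Hypothetical finalist: 85 for the online submission, 70 for the presentation.
print(final_score(85, 70))
```

Because the online submission carries four times the weight of the presentation, a one-point gain there moves the final score as much as a four-point gain in the presentation.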

The top 3 highest scores will be awarded the cash prizes as well as an invitation to interview for intern/job opportunities with ANZ.  

Entrants who do not place but are highly regarded by the judging panel may also be invited to interview for intern/job opportunities with ANZ.

Note: All entries and presentations will be judged by a panel made up of ANZ Bank Reporting and Analytics leaders and data scientists from China and Australia, and a Data Castle representative.

  

 


 

About ANZ

 

Focused on data and digital

ANZ is focused on building digital ecosystems to help our customers get on top of their money and their business. We're always excited to collaborate with our colleagues – and others in the industry – to create innovative products and services.

We're building better and more convenient banking solutions and data is a big part of how we're going about it.   

Data is a key focus area for ANZ - we have analysts and scientists leveraging the value of data every day to help us make better decisions for our customers and people. Our business leaders look to our data people to ethically explore opportunities, challenge and transform ideas. We invest in great platforms and tools to drive value creation.   

Life as a data specialist in ANZ means bringing human and artificial intelligence together to develop innovative and creative solutions that help people and communities thrive.  

The ANZ Chengdu Data Science Competition is an exciting opportunity to identify intelligent innovators to join ANZ in this data and digital journey.      

ANZ and China

Today ANZ boasts more than 800 people working in the ANZ Chengdu Service Centre. The team includes experts who manage a range of complex functions including technologies used to support ANZ's business globally.

The Service Centre in Chengdu first opened in 2011 and ANZ remains the only foreign bank to establish an operations centre in Chengdu.   

In fact, ANZ has had a presence in China since 1986. In 2010, when Australia and New Zealand Bank (China) Company Limited (ANZ China) was established, ANZ became the first Australian bank to be locally incorporated in China. Today, it has branches in Shanghai, Beijing, Guangzhou, Chongqing, Hangzhou, Chengdu and Qingdao.

ANZ (Australia and New Zealand Banking Group Limited) has a proud heritage of more than 180 years. Today it is among the top 4 banks in Australia, is the largest banking group in New Zealand and is among the top 50 banks in the world.

We operate in 34 markets across Australia, New Zealand, Asia, Pacific, America and the Middle East. Our 46,000 employees serve millions of retail, commercial and institutional customers.

A great place to work

ANZ offers a range of exciting opportunities for intelligent and innovative technologists, especially software engineers and data scientists.

ANZ people enjoy collaborating with colleagues. We value diversity and know we provide the best for customers in Australia and New Zealand and across the globe when we work together and challenge each other. We're curious and like to learn. ANZ is a place where employees can thrive.  

Up for the challenge? We'd love to hear from you!   

Visit anz.com.au/careers/