Loan Prediction III--A practice

阿新 • Published: 2018-12-13

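Project link: https://datahack.analyticsvidhya.com/contest/practice-problem-loan-prediction-iii/
Reference: Python for Data Analysis
Reference link: https://pandas.pydata.org/pandas-docs/stable/index.html
Tool: Jupyter Notebook

Today I did the data cleaning and a first pass of data integration. I will post the prediction model here as it gets built. Time frame: 10 days.

Reflections:
A. I rushed through understanding the data structure. For the predictions to hold up in a real application, the data needs further exploration.
B. When filling missing data, I filled numeric columns with the mean and text columns with the ffill method, which modified the original data in place. This needs to be used with care in the future; otherwise, back up the original data first.

I. Understanding the data structure

[image]

Preliminary analysis: looks at average loan amount, credit history, self-employment, and income.

II. Preliminary analysis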

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

# Load the training set, using Loan_ID as the row index
data = pd.read_csv("C:\\Users\\lx\\Desktop\\prediction\\train_u6lujuX_CVtuZ9i.csv",
                   index_col="Loan_ID")

# (1) Inspect the data, e.g. list the women who have not graduated and whose loan was approved
data.loc[(data['Gender']=='Female')
         &(data['Education']=='Not Graduate')
         &(data['Loan_Status']=='Y'),
        ['Gender','Education','Loan_Status']]

Output: [image]
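The boolean-mask filter in step (1) can also be written with `DataFrame.query`, which some readers find easier to scan when several conditions are combined. The sketch below uses a small made-up frame (`toy`, with invented Loan_ID values) rather than the contest data, and checks that both forms select the same rows.

```python
import pandas as pd

# Toy stand-in for the loan data; the rows are invented for illustration.
toy = pd.DataFrame(
    {
        "Gender": ["Female", "Male", "Female", "Female"],
        "Education": ["Not Graduate", "Not Graduate", "Graduate", "Not Graduate"],
        "Loan_Status": ["Y", "Y", "Y", "N"],
    },
    index=pd.Index(["ID1", "ID2", "ID3", "ID4"], name="Loan_ID"),
)
cols = ["Gender", "Education", "Loan_Status"]

# Boolean-mask filter, in the same style as step (1)
mask = (
    (toy["Gender"] == "Female")
    & (toy["Education"] == "Not Graduate")
    & (toy["Loan_Status"] == "Y")
)
by_mask = toy.loc[mask, cols]

# The same filter expressed as a query string
by_query = toy.query(
    "Gender == 'Female' and Education == 'Not Graduate' and Loan_Status == 'Y'"
)[cols]

print(by_mask)
print(by_mask.equals(by_query))
```

Each comparison in the mask form needs its own parentheses because `&` binds more tightly than `==` in Python.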

# (2) Find missing values
def missing(x):
    # Count the missing entries in one column (axis=0) or one row (axis=1)
    return x.isnull().sum()

print('Missing values from every column:')
print(data.apply(missing, axis=0))
print('\nMissing values from every row:')
print(data.apply(missing, axis=1).head())

Output: [image]
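The per-column and per-row counts from step (2) can also be obtained without a helper function, by calling `isnull().sum()` on the whole frame. This is a minimal sketch on an invented three-row frame (`toy`), not the contest data.

```python
import numpy as np
import pandas as pd

# Toy frame with a few missing entries; the values are invented.
toy = pd.DataFrame(
    {
        "Gender": ["Male", None, "Female"],
        "LoanAmount": [120.0, np.nan, np.nan],
        "Loan_Status": ["Y", "N", "Y"],
    }
)

missing_per_column = toy.isnull().sum()        # one count per column
missing_per_row = toy.isnull().sum(axis=1)     # one count per row

print(missing_per_column)
print(missing_per_row)
```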

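# (3) Fill missing values: replace missing numeric values with the column mean
data.fillna(data.mean(numeric_only=True), inplace=True)
print(data)
# Fill the remaining (text) columns by forward fill; this modifies data in place
data.ffill(inplace=True)
print(data)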

# (4) Check that the missing values have been filled
print(data.apply(missing, axis=0))

Output: [image]

As shown above, all missing values have been filled.
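Reflection B above warns that filling in place overwrites the original data. One way to avoid that, sketched here under the assumption that keeping a second copy in memory is acceptable, is to fill a copy and leave the raw frame untouched. The frame `raw` and its values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Toy raw data with one missing text value and one missing number.
raw = pd.DataFrame(
    {
        "Gender": ["Male", None, "Female"],
        "LoanAmount": [100.0, np.nan, 200.0],
    }
)

filled = raw.copy()                                     # work on a copy
filled = filled.fillna(filled.mean(numeric_only=True))  # numeric columns: column mean
filled = filled.ffill()                                 # text columns: forward fill

print(raw.isnull().sum().sum())     # missing entries left in the raw frame
print(filled.isnull().sum().sum())  # missing entries left in the filled copy
print(filled)
```

Forward fill copies the previous row's value, so it cannot fill a value that is missing in the very first row; the check in step (4) would reveal that case.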

# (5) After filling the missing values, group by 'Gender', 'Married' and 'Self_Employed' and look at the mean 'LoanAmount' of each group
# Build a pivot table
Graphics = data.pivot_table(values=['LoanAmount'],
                            index=['Gender', 'Married', 'Self_Employed'],
                            aggfunc='mean')
print(Graphics)

Output: [image]

As shown above, self-employed women stand out clearly: their mean loan amount is the highest of all the groups.
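A group mean is easier to judge when the group size is shown next to it, since a small group can produce an extreme average. The sketch below computes the same kind of grouped mean with `groupby` and adds a count column. The frame `toy` and its numbers are invented, so the result says nothing about the contest data.

```python
import pandas as pd

# Toy data: two groups of two applicants each; values are invented.
toy = pd.DataFrame(
    {
        "Gender": ["Female", "Female", "Male", "Male"],
        "Married": ["No", "No", "Yes", "Yes"],
        "Self_Employed": ["Yes", "Yes", "No", "No"],
        "LoanAmount": [100.0, 300.0, 120.0, 180.0],
    }
)
keys = ["Gender", "Married", "Self_Employed"]

# Mean loan amount per group, as in the pivot table
by_pivot = toy.pivot_table(values=["LoanAmount"], index=keys, aggfunc="mean")

# The same mean via groupby, with the number of rows behind each mean
by_group = toy.groupby(keys)["LoanAmount"].agg(["mean", "count"])

print(by_pivot)
print(by_group)
```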

# (5) Consider the effect of credit history
# Build a crosstab
pd.crosstab(data['Credit_History'],data['Loan_Status'],margins=True)

Output: [image]

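# Convert the counts to row-wise proportions
def percConvert(ser):
    # Divide each count by the row total (the last entry, 'All')
    return ser / float(ser.iloc[-1])

pd.crosstab(data["Credit_History"], data["Loan_Status"], margins=True).apply(percConvert, axis=1)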

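Output: [image]

As shown above, applicants with a credit history were approved 79.58% of the time, whereas applicants without one were approved only 7.87% of the time.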
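The row-wise proportions computed with `percConvert` can also be requested directly from `pd.crosstab` through its `normalize` argument. The sketch below uses an invented six-row frame (`toy`), so the rates it prints are not those of the contest data.

```python
import pandas as pd

# Toy data: four applicants with credit history 1.0, two with 0.0.
toy = pd.DataFrame(
    {
        "Credit_History": [1.0, 1.0, 1.0, 1.0, 0.0, 0.0],
        "Loan_Status": ["Y", "Y", "Y", "N", "N", "Y"],
    }
)

# normalize="index" divides every row by its own total
rates = pd.crosstab(toy["Credit_History"], toy["Loan_Status"], normalize="index")
print(rates)
```

Each row of `rates` sums to 1, so the `Y` column reads directly as the approval rate for that credit-history value.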

# (6) Consider whether the applicant is self-employed
pd.crosstab(data['Self_Employed'],data['Loan_Status'],margins=True)

Output: [image]

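# Convert the counts to row-wise proportions
def percConvert(ser):
    # Divide each count by the row total (the last entry, 'All')
    return ser / float(ser.iloc[-1])

pd.crosstab(data["Self_Employed"], data["Loan_Status"], margins=True).apply(percConvert, axis=1)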

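Output: [image]

As shown above, self-employed applicants were approved 68.29% of the time and non-self-employed applicants 68.60% of the time, so loan status does not appear to depend on self-employment (Y/N).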

# (7) Consider the effect of income
# Build a DataFrame of rates per property area to merge with the data
prop_rates = pd.DataFrame([1000, 5000, 12000], index=['Rural','Semiurban','Urban'],columns=['rates'])
prop_rates

Output: [image]

data_merged = data.merge(right=prop_rates, how='inner',left_on='Property_Area',right_index=True, sort=False)
data_merged.pivot_table(values='Credit_History',index=['Property_Area','rates'], aggfunc=len)

Output: [image]
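An inner merge silently drops any row whose `Property_Area` has no match in the lookup table, so it can be worth comparing row counts before and after. This is a minimal sketch on an invented four-row frame (`toy`); the `prop_rates` values are copied from the step above.

```python
import pandas as pd

# Toy stand-in for the loan data; the rows are invented.
toy = pd.DataFrame(
    {"Property_Area": ["Urban", "Rural", "Urban", "Semiurban"]},
    index=pd.Index(["ID1", "ID2", "ID3", "ID4"], name="Loan_ID"),
)
prop_rates = pd.DataFrame(
    [1000, 5000, 12000],
    index=["Rural", "Semiurban", "Urban"],
    columns=["rates"],
)

# Join the Property_Area column against the index of the lookup table
merged = toy.merge(prop_rates, how="inner", left_on="Property_Area", right_index=True)

# If no rows were dropped, the merged frame has as many rows as the input
print(len(merged) == len(toy))

# Number of rows for each (area, rate) pair
counts = merged.groupby(["Property_Area", "rates"]).size()
print(counts)
```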

# Sort the DataFrame
data_sorted = data.sort_values(['ApplicantIncome','CoapplicantIncome'], ascending=False)
data_sorted[['ApplicantIncome','CoapplicantIncome']].head(20)

Output: [image]

# Box plot & histogram
data.boxplot(column="ApplicantIncome",by="Loan_Status")
data.hist(column="ApplicantIncome",by="Loan_Status",bins=30)

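Output: [image]

As shown above, the income distributions are roughly the same for both loan statuses, so loan status does not appear to depend on income level.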
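A visual comparison of two distributions can be backed up with a few group-wise numbers. The sketch below summarizes income per loan status with `groupby`. The frame `toy` is invented and deliberately includes one very high income, so its numbers do not describe the contest data.

```python
import pandas as pd

# Toy data: three approved and three rejected applicants; values are invented.
toy = pd.DataFrame(
    {
        "ApplicantIncome": [2000, 4000, 6000, 3000, 5000, 40000],
        "Loan_Status": ["Y", "Y", "Y", "N", "N", "N"],
    }
)

# Count, median and mean of income for each loan status
summary = toy.groupby("Loan_Status")["ApplicantIncome"].agg(["count", "median", "mean"])
print(summary)
```

The median is less sensitive than the mean to a few very high incomes, which is why both are shown side by side.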
