Python - Head First Data Analysis - Histograms
阿新 • Published: 2020-08-21
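Before reading this post, it helps to read "Python - Head First Data Analysis - Summary" first. If you run into problems later, such as code that will not run, it is worth reading again. >-_-<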
The Distribution of Numbers
The book starts by taking a dig at Excel's histogram feature. In fairness, Excel 2016 has improved on this quite a lot.
Histograms and Box Plots in Python
It only takes a few lines of code. Isn't that simpler and better looking than Excel?
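import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Jupyter/IPython magic: render plots inside the notebook
%matplotlib inline

# Skip the first row of the file and supply our own column names
df = pd.read_csv('./hfda_ch09_employees.csv', skiprows=1,
                 names=['staff_num', 'received', 'negotiated', 'gender', 'year'])

fig = plt.figure(figsize=(12, 6))
# Box plot in the left third of the figure (slot 1 of a 1x3 grid)
ax1 = fig.add_subplot(1, 3, 1)
l = ax1.boxplot(df['received'].values)
# Histogram in the right half of the figure (slot 2 of a 1x2 grid)
ax2 = fig.add_subplot(1, 2, 2)
l = ax2.hist(df['received'], bins=50)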
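To make explicit what the two plots draw, here is a minimal sketch that computes the same quantities by hand. It uses synthetic data rather than the book's CSV (the `received` array below is made up for illustration), and it assumes matplotlib's default whisker rule of 1.5 times the interquartile range.

```python
import numpy as np

# Synthetic stand-in for df['received']; NOT the book's data
rng = np.random.default_rng(42)
received = rng.normal(loc=10.0, scale=3.0, size=500)

# What hist(bins=50) counts: 50 equal-width bins between min and max
counts, edges = np.histogram(received, bins=50)

# What boxplot draws: the box spans Q1..Q3 with a line at the median
q1, median, q3 = np.percentile(received, [25, 50, 75])
iqr = q3 - q1

# Whiskers reach the furthest data points still within 1.5 * IQR of the box
# (assumed default rule); anything beyond them is drawn as an outlier
lower_whisker = received[received >= q1 - 1.5 * iqr].min()
upper_whisker = received[received <= q3 + 1.5 * iqr].max()
n_outliers = int(((received < lower_whisker) | (received > upper_whisker)).sum())

print("bin width:", edges[1] - edges[0])
print("box:", q1, median, q3)
print("whiskers:", lower_whisker, upper_whisker, "outliers:", n_outliers)
```

The box plot compresses the data into five summary values, while the histogram keeps the shape, which is why the two are worth looking at side by side.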
Comparing the Different Cases
Analyze how the numbers are distributed along different dimensions.
fig, ((ax1, ax2), (ax3, ax4), (ax5, ax6)) = plt.subplots(nrows=3, ncols=2, figsize=(16, 16))

# Row 1: split by year
ax1.hist(df['received'][df['year']==2007], bins=50)
ax1.set_title('year=2007')
ax2.hist(df['received'][df['year']==2008], bins=50)
ax2.set_title('year=2008')

# Row 2: split by gender
ax3.hist(df['received'][df['gender']=='M'], bins=50)
ax3.set_title('gender=M')
ax4.hist(df['received'][df['gender']=='F'], bins=50)
ax4.set_title('gender=F')

# Row 3: split by whether the raise was negotiated (boolean column)
ax5.hist(df['received'][df['negotiated']], bins=50)
ax5.set_title('negotiated=TRUE')
ax6.hist(df['received'][~df['negotiated']], bins=50)
ax6.set_title('negotiated=FALSE')
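One thing to watch in the grid above: each panel passes `bins=50`, so the bin edges are picked from that subset's own range, and bars in different panels may not cover the same intervals. A possible way to make the panels directly comparable is to compute one set of edges from the full column and reuse it. The sketch below does that on a synthetic DataFrame; the data and the `subsets` dictionary are illustrative assumptions, and only the column names come from the post.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the employees table; NOT the book's data
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "received": rng.normal(10.0, 3.0, n),
    "negotiated": rng.random(n) < 0.5,
    "gender": rng.choice(["M", "F"], n),
    "year": rng.choice([2007, 2008], n),
})

# One set of 50 bins (51 edges) computed from the whole column
shared_edges = np.histogram_bin_edges(df["received"], bins=50)

# Hypothetical helper mapping: panel title -> boolean row mask
subsets = {
    "year=2007": df["year"] == 2007,
    "year=2008": df["year"] == 2008,
    "gender=M": df["gender"] == "M",
    "gender=F": df["gender"] == "F",
    "negotiated=TRUE": df["negotiated"],
    "negotiated=FALSE": ~df["negotiated"],
}

# Count each subset against the same edges so bins line up across subsets
counts = {
    title: np.histogram(df.loc[mask, "received"], bins=shared_edges)[0]
    for title, mask in subsets.items()
}

for title, c in counts.items():
    print(title, "rows:", int(c.sum()), "fullest bin:", int(c.argmax()))
```

Passing `bins=shared_edges` to `ax.hist` in place of `bins=50` would apply the same edges to each panel.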