Introductory Python Programming Study Notes (9)
阿新 • Published: 2018-12-25
## Python Lesson 4

### A new data format: CSV

- Plain text in some character set, such as ASCII, Unicode, EBCDIC, or GB2312 (common in Simplified Chinese environments);
- Made up of records (typically one record per line);
- Each record is split into fields by a delimiter (typically a comma, semicolon, or tab; sometimes the delimiter may include optional spaces);
- Every record has the same sequence of fields.

#### pandas

```python
import pandas as pd
import numpy as np
```

```python
f = open('K:/Code/jupyter-notebook/Python Study/成績表.csv')
df = pd.read_csv(f)
```

```python
# head() returns the first 5 rows by default
df.head()
```

<div>
<style scoped>
    .dataframe tbody tr th:only-of-type { vertical-align: middle; }
    .dataframe tbody tr th { vertical-align: top; }
    .dataframe thead th { text-align: right; }
</style>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;"><th></th><th>學號</th><th>姓名</th><th>性別</th><th>年齡</th><th>班級</th><th>計算機</th><th>英語</th><th>數學</th><th>語文</th><th>物理</th><th>化學</th></tr>
</thead>
<tbody>
<tr><th>0</th><td>1</td><td>張小文</td><td>男</td><td>20</td><td>1002</td><td>56</td><td>62</td><td>86</td><td>85</td><td>86</td><td>75</td></tr>
<tr><th>1</th><td>2</td><td>李清</td><td>女</td><td>19</td><td>1001</td><td>94</td><td>65</td><td>85</td><td>90</td><td>84</td><td>75</td></tr>
<tr><th>2</th><td>3</td><td>孫明</td><td>男</td><td>19</td><td>1003</td><td>74</td><td>85</td><td>80</td><td>84</td><td>86</td><td>91</td></tr>
<tr><th>3</th><td>4</td><td>陳平</td><td>男</td><td>8</td><td>1003</td><td>85</td><td>75</td><td>78</td><td>73</td><td>86</td><td>81</td></tr>
<tr><th>4</th><td>5</td><td>劉東</td><td>男</td><td>20</td><td>1001</td><td>88</td><td>74</td><td>77</td><td>65</td><td>85</td><td>71</td></tr>
</tbody>
</table>
</div>

The column names are kept in Chinese because the code refers to them: 學號 (student ID), 姓名 (name), 性別 (sex), 年齡 (age), 班級 (class), 計算機 (computer science), 英語 (English), 數學 (math), 語文 (Chinese), 物理 (physics), 化學 (chemistry).

```python
type(df)
```

    pandas.core.frame.DataFrame

### DataFrame

```python
# Column names
print(df.columns)
# Index
print(df.index)
```

    Index(['學號', '姓名', '性別', '年齡', '班級', '計算機', '英語', '數學', '語文', '物理', '化學'], dtype='object')
    RangeIndex(start=0, stop=8, step=1)

```python
df.loc[0]
```

    學號        1
    姓名      張小文
    性別        男
    年齡       20
    班級     1002
    計算機      56
    英語       62
    數學       86
    語文       85
    物理       86
    化學       75
    Name: 0, dtype: object
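The file `成績表.csv` itself is not part of these notes. As a minimal sketch for following along, the snippet below rebuilds the first three records from the `df.head()` output above as comma-delimited text and reads them from memory. The name `sample` and the use of `io.StringIO` are illustrative assumptions, not part of the original notebook.

```python
import io
import pandas as pd

# Three records rebuilt from the table output above; the real file has 8 rows.
# One record per line, fields separated by commas, same field order in every record.
csv_text = """學號,姓名,性別,年齡,班級,計算機,英語,數學,語文,物理,化學
1,張小文,男,20,1002,56,62,86,85,86,75
2,李清,女,19,1001,94,65,85,90,84,75
3,孫明,男,19,1003,74,85,80,84,86,91
"""

# read_csv accepts a file-like object, so StringIO stands in for the file on disk.
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.shape)               # (number of rows, number of columns)
print(list(sample.columns)[:3])   # first three column names
print(sample.loc[0, '姓名'])       # value at row label 0, column 姓名
```

Reading from a string keeps the example independent of any local path such as the `K:/` directory used in the notebook.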
```python
# Select rows where the math score is greater than 80
df[df.數學 > 80]
```

<div>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;"><th></th><th>學號</th><th>姓名</th><th>性別</th><th>年齡</th><th>班級</th><th>計算機</th><th>英語</th><th>數學</th><th>語文</th><th>物理</th><th>化學</th></tr>
</thead>
<tbody>
<tr><th>0</th><td>1</td><td>張小文</td><td>男</td><td>20</td><td>1002</td><td>56</td><td>62</td><td>86</td><td>85</td><td>86</td><td>75</td></tr>
<tr><th>1</th><td>2</td><td>李清</td><td>女</td><td>19</td><td>1001</td><td>94</td><td>65</td><td>85</td><td>90</td><td>84</td><td>75</td></tr>
</tbody>
</table>
</div>

```python
df[df.數學 < 70]
```

<div>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;"><th></th><th>學號</th><th>姓名</th><th>性別</th><th>年齡</th><th>班級</th><th>計算機</th><th>英語</th><th>數學</th><th>語文</th><th>物理</th><th>化學</th></tr>
</thead>
<tbody>
<tr><th>7</th><td>8</td><td>黃佳</td><td>女</td><td>20</td><td>1002</td><td>81</td><td>78</td><td>58</td><td>84</td><td>90</td><td>82</td></tr>
</tbody>
</table>
</div>

```python
# Compound filter: combine conditions with &, each wrapped in parentheses
df[(df.語文 >= 80) & (df.數學 >= 80) & (df.英語 >= 80)]
```

<div>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;"><th></th><th>學號</th><th>姓名</th><th>性別</th><th>年齡</th><th>班級</th><th>計算機</th><th>英語</th><th>數學</th><th>語文</th><th>物理</th><th>化學</th></tr>
</thead>
<tbody>
<tr><th>2</th><td>3</td><td>孫明</td><td>男</td><td>19</td><td>1003</td><td>74</td><td>85</td><td>80</td><td>84</td><td>86</td><td>91</td></tr>
</tbody>
</table>
</div>

### Sorting

```python
df.sort_values(['數學','語文']).head()
```

<div>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;"><th></th><th>學號</th><th>姓名</th><th>性別</th><th>年齡</th><th>班級</th><th>計算機</th><th>英語</th><th>數學</th><th>語文</th><th>物理</th><th>化學</th></tr>
</thead>
<tbody>
<tr><th>7</th><td>8</td><td>黃佳</td><td>女</td><td>20</td><td>1002</td><td>81</td><td>78</td><td>58</td><td>84</td><td>90</td><td>82</td></tr>
<tr><th>6</th><td>7</td><td>王大力</td><td>男</td><td>18</td><td>1003</td><td>85</td><td>85</td><td>75</td><td>78</td><td>84</td><td>69</td></tr>
<tr><th>4</th><td>5</td><td>劉東</td><td>男</td><td>20</td><td>1001</td><td>88</td><td>74</td><td>77</td><td>65</td><td>85</td><td>71</td></tr>
<tr><th>5</th><td>6</td><td>嚴雲峰</td><td>男</td><td>19</td><td>1001</td><td>84</td><td>87</td><td>77</td><td>80</td><td>70</td><td>81</td></tr>
<tr><th>3</th><td>4</td><td>陳平</td><td>男</td><td>8</td><td>1003</td><td>85</td><td>75</td><td>78</td><td>73</td><td>86</td><td>81</td></tr>
</tbody>
</table>
</div>

### Access

```python
# Locate a row by its index label
df.loc[1]
```

    學號        2
    姓名       李清
    性別        女
    年齡       19
    班級     1001
    計算機      94
    英語       65
    數學       85
    語文       90
    物理       84
    化學       75
    Name: 1, dtype: object

### Index

```python
scores = {
    '英語': [90,70,89],
    '數學': [64,78,48],
    '姓名': ['wang','li','sun']
}
df = pd.DataFrame(scores, index = ['one','two','three'])
df
```

<div>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;"><th></th><th>英語</th><th>數學</th><th>姓名</th></tr>
</thead>
<tbody>
<tr><th>one</th><td>90</td><td>64</td><td>wang</td></tr>
<tr><th>two</th><td>70</td><td>78</td><td>li</td></tr>
<tr><th>three</th><td>89</td><td>48</td><td>sun</td></tr>
</tbody>
</table>
</div>

```python
df.index
```

    Index(['one', 'two', 'three'], dtype='object')

```python
df.loc['one']
```

    英語      90
    數學      64
    姓名    wang
    Name: one, dtype: object

```python
# iloc selects by integer position (the actual n-th row); useful when the index is not numeric
df.iloc[0]
```

    英語      90
    數學      64
    姓名    wang
    Name: one, dtype: object

```python
# ix combined the behavior of loc and iloc (deprecated, and removed in pandas 1.0)
df.ix[0]
```

    c:\python\python36\lib\site-packages\ipykernel_launcher.py:1: DeprecationWarning: 
    .ix is deprecated. Please use
    .loc for label based indexing or
    .iloc for positional indexing
    
    See the documentation here:
    http://pandas.pydata.org/pandas-docs/stable/indexing.html#ix-indexer-is-deprecated
      """Entry point for launching an IPython kernel.

    英語      90
    數學      64
    姓名    wang
    Name: one, dtype: object

The outputs of the following cells show the grade table loaded from the CSV file, not the three-row `scores` frame, so they were evidently run with `df` reloaded from the file.

```python
df.loc[:2]
```

<div>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;"><th></th><th>學號</th><th>姓名</th><th>性別</th><th>年齡</th><th>班級</th><th>計算機</th><th>英語</th><th>數學</th><th>語文</th><th>物理</th><th>化學</th></tr>
</thead>
<tbody>
<tr><th>0</th><td>1</td><td>張小文</td><td>男</td><td>20</td><td>1002</td><td>56</td><td>62</td><td>86</td><td>85</td><td>86</td><td>75</td></tr>
<tr><th>1</th><td>2</td><td>李清</td><td>女</td><td>19</td><td>1001</td><td>94</td><td>65</td><td>85</td><td>90</td><td>84</td><td>75</td></tr>
<tr><th>2</th><td>3</td><td>孫明</td><td>男</td><td>19</td><td>1003</td><td>74</td><td>85</td><td>80</td><td>84</td><td>86</td><td>91</td></tr>
</tbody>
</table>
</div>

```python
df.iloc[:3]
```

<div>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;"><th></th><th>學號</th><th>姓名</th><th>性別</th><th>年齡</th><th>班級</th><th>計算機</th><th>英語</th><th>數學</th><th>語文</th><th>物理</th><th>化學</th></tr>
</thead>
<tbody>
<tr><th>0</th><td>1</td><td>張小文</td><td>男</td><td>20</td><td>1002</td><td>56</td><td>62</td><td>86</td><td>85</td><td>86</td><td>75</td></tr>
<tr><th>1</th><td>2</td><td>李清</td><td>女</td><td>19</td><td>1001</td><td>94</td><td>65</td><td>85</td><td>90</td><td>84</td><td>75</td></tr>
<tr><th>2</th><td>3</td><td>孫明</td><td>男</td><td>19</td><td>1003</td><td>74</td><td>85</td><td>80</td><td>84</td><td>86</td><td>91</td></tr>
</tbody>
</table>
</div>

```python
# Accessing a single row this way is an error (df[0] looks for a column labeled 0)
# df[0]
# Slicing does work for selecting multiple rows
df[:2]
```

<div>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;"><th></th><th>學號</th><th>姓名</th><th>性別</th><th>年齡</th><th>班級</th><th>計算機</th><th>英語</th><th>數學</th><th>語文</th><th>物理</th><th>化學</th></tr>
</thead>
<tbody>
<tr><th>0</th><td>1</td><td>張小文</td><td>男</td><td>20</td><td>1002</td><td>56</td><td>62</td><td>86</td><td>85</td><td>86</td><td>75</td></tr>
<tr><th>1</th><td>2</td><td>李清</td><td>女</td><td>19</td><td>1001</td><td>94</td><td>65</td><td>85</td><td>90</td><td>84</td><td>75</td></tr>
</tbody>
</table>
</div>

```python
# The underlying array of the DataFrame
df.values
```

    array([[1, '張小文', '男', 20, 1002, 56, 62, 86, 85, 86, 75],
           [2, '李清', '女', 19, 1001, 94, 65, 85, 90, 84, 75],
           [3, '孫明', '男', 19, 1003, 74, 85, 80, 84, 86, 91],
           [4, '陳平', '男', 8, 1003, 85, 75, 78, 73, 86, 81],
           [5, '劉東', '男', 20, 1001, 88, 74, 77, 65, 85, 71],
           [6, '嚴雲峰', '男', 19, 1001, 84, 87, 77, 80, 70, 81],
           [7, '王大力', '男', 18, 1003, 85, 85, 75, 78, 84, 69],
           [8, '黃佳', '女', 20, 1002, 81, 78, 58, 84, 90, 82]], dtype=object)

```python
df.數學.values
```

    array([86, 85, 80, 78, 77, 77, 75, 58], dtype=int64)

```python
# Simple statistics: how often each value occurs
df.數學.value_counts()
```

    77    2
    78    1
    75    1
    58    1
    86    1
    85    1
    80    1
    Name: 數學, dtype: int64

```python
new = df[['數學','語文']].head()
new
```

<div>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;"><th></th><th>數學</th><th>語文</th></tr>
</thead>
<tbody>
<tr><th>0</th><td>86</td><td>85</td></tr>
<tr><th>1</th><td>85</td><td>90</td></tr>
<tr><th>2</th><td>80</td><td>84</td></tr>
<tr><th>3</th><td>78</td><td>73</td></tr>
<tr><th>4</th><td>77</td><td>65</td></tr>
</tbody>
</table>
</div>

```python
new * 2
```

<div>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;"><th></th><th>數學</th><th>語文</th></tr>
</thead>
<tbody>
<tr><th>0</th><td>172</td><td>170</td></tr>
<tr><th>1</th><td>170</td><td>180</td></tr>
<tr><th>2</th><td>160</td><td>168</td></tr>
<tr><th>3</th><td>156</td><td>146</td></tr>
<tr><th>4</th><td>154</td><td>130</td></tr>
</tbody>
</table>
</div>

### Key points

```python
def func(score):
    if score >= 80:
        return '優秀'    # excellent
    elif score >= 70:
        return '良'      # good
    elif score >= 60:
        return '及格'    # pass
    else:
        return '不及格'  # fail

# Series.map applies func to each math score
df['數學分類'] = df.數學.map(func)
```

```python
df.head()
```

<div>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;"><th></th><th>學號</th><th>姓名</th><th>性別</th><th>年齡</th><th>班級</th><th>計算機</th><th>英語</th><th>數學</th><th>語文</th><th>物理</th><th>化學</th><th>數學分類</th></tr>
</thead>
<tbody>
<tr><th>0</th><td>1</td><td>張小文</td><td>男</td><td>20</td><td>1002</td><td>56</td><td>62</td><td>86</td><td>85</td><td>86</td><td>75</td><td>優秀</td></tr>
<tr><th>1</th><td>2</td><td>李清</td><td>女</td><td>19</td><td>1001</td><td>94</td><td>65</td><td>85</td><td>90</td><td>84</td><td>75</td><td>優秀</td></tr>
<tr><th>2</th><td>3</td><td>孫明</td><td>男</td><td>19</td><td>1003</td><td>74</td><td>85</td><td>80</td><td>84</td><td>86</td><td>91</td><td>優秀</td></tr>
<tr><th>3</th><td>4</td><td>陳平</td><td>男</td><td>8</td><td>1003</td><td>85</td><td>75</td><td>78</td><td>73</td><td>86</td><td>81</td><td>良</td></tr>
<tr><th>4</th><td>5</td><td>劉東</td><td>男</td><td>20</td><td>1001</td><td>88</td><td>74</td><td>77</td><td>65</td><td>85</td><td>71</td><td>良</td></tr>
</tbody>
</table>
</div>

```python
# applymap applies a function to every element of the DataFrame; very important
# (deprecated since pandas 2.1 in favor of DataFrame.map)
def func(number):
    return number + 10

# Equivalent lambda form
func = lambda number: number + 10

df.applymap(lambda x: str(x) + ' -').head(2)
```

<div>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;"><th></th><th>學號</th><th>姓名</th><th>性別</th><th>年齡</th><th>班級</th><th>計算機</th><th>英語</th><th>數學</th><th>語文</th><th>物理</th><th>化學</th><th>數學分類</th></tr>
</thead>
<tbody>
<tr><th>0</th><td>1 -</td><td>張小文 -</td><td>男 -</td><td>20 -</td><td>1002 -</td><td>56 -</td><td>62 -</td><td>86 -</td><td>85 -</td><td>86 -</td><td>75 -</td><td>優秀 -</td></tr>
<tr><th>1</th><td>2 -</td><td>李清 -</td><td>女 -</td><td>19 -</td><td>1001 -</td><td>94 -</td><td>65 -</td><td>85 -</td><td>90 -</td><td>84 -</td><td>75 -</td><td>優秀 -</td></tr>
</tbody>
</table>
</div>
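In the indexing cells above, `df.loc[:2]` and `df.iloc[:3]` return the same three rows, which can look surprising. A minimal sketch of why, using a small frame modeled on the `scores` example (the variable names `by_label`, `by_position`, and `numbered` are illustrative, not from the original notebook):

```python
import pandas as pd

scores = pd.DataFrame(
    {'英語': [90, 70, 89], '數學': [64, 78, 48]},
    index=['one', 'two', 'three'],
)

# Label-based slicing with loc includes both endpoints.
by_label = scores.loc['one':'two']

# Position-based slicing with iloc excludes the stop position, like Python lists.
by_position = scores.iloc[0:2]

print(by_label.index.tolist())     # ['one', 'two']
print(by_position.index.tolist())  # ['one', 'two']

# With the default integer index, the same bounds give different row counts:
# loc[:2] keeps labels 0, 1, 2, while iloc[:2] keeps positions 0 and 1.
numbered = scores.reset_index(drop=True)
print(len(numbered.loc[:2]), len(numbered.iloc[:2]))  # 3 2
```

This is why the notes use `loc[:2]` but `iloc[:3]` to get the same three rows from the grade table.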
### Anonymous functions

```python
[i + 100 for i in range(10)]
```

    [100, 101, 102, 103, 104, 105, 106, 107, 108, 109]

```python
def func(x):
    return x + 100
```

```python
list(map(func, range(10)))
# When a function is trivial, rarely used, or not worth naming, use an anonymous lambda
list(map(lambda x: x + 100, range(10)))
```

    [100, 101, 102, 103, 104, 105, 106, 107, 108, 109]

```python
# To build a new column from several columns, use apply with axis=1 (row by row)
df['new_score'] = df.apply(lambda x: x.數學 + x.語文, axis = 1)
```

```python
# First few rows
df.head(2)
# Last few rows (only this last expression is displayed)
df.tail(2)
```

<div>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;"><th></th><th>學號</th><th>姓名</th><th>性別</th><th>年齡</th><th>班級</th><th>計算機</th><th>英語</th><th>數學</th><th>語文</th><th>物理</th><th>化學</th><th>數學分類</th><th>new_score</th></tr>
</thead>
<tbody>
<tr><th>6</th><td>7</td><td>王大力</td><td>男</td><td>18</td><td>1003</td><td>85</td><td>85</td><td>75</td><td>78</td><td>84</td><td>69</td><td>良</td><td>153</td></tr>
<tr><th>7</th><td>8</td><td>黃佳</td><td>女</td><td>20</td><td>1002</td><td>81</td><td>78</td><td>58</td><td>84</td><td>90</td><td>82</td><td>不及格</td><td>142</td></tr>
</tbody>
</table>
</div>

### Many pandas DataFrame operations closely resemble operations on 2-D NumPy arrays

<h1 style="text-align:center">Plotting with matplotlib</h1>

```python
df = df.drop(['new_score'],axis = 1)
```

```python
df.head(2)
```

<div>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;"><th></th><th>學號</th><th>姓名</th><th>性別</th><th>年齡</th><th>班級</th><th>計算機</th><th>英語</th><th>數學</th><th>語文</th><th>物理</th><th>化學</th><th>數學分類</th></tr>
</thead>
<tbody>
<tr><th>0</th><td>1</td><td>張小文</td><td>男</td><td>20</td><td>1002</td><td>56</td><td>62</td><td>86</td><td>85</td><td>86</td><td>75</td><td>優秀</td></tr>
<tr><th>1</th><td>2</td><td>李清</td><td>女</td><td>19</td><td>1001</td><td>94</td><td>65</td><td>85</td><td>90</td><td>84</td><td>75</td><td>優秀</td></tr>
</tbody>
</table>
</div>

### Plotting

```python
import numpy as np
import matplotlib.pyplot as plt
# Needed in the Jupyter notebook so that figures render inline
%matplotlib inline
```

```python
x = np.linspace(0, 10, 100)
y = np.sin(x)
plt.plot(x, y)
plt.plot(x, np.cos(x))
```

    [<matplotlib.lines.Line2D at 0x1b3061cc7f0>]

![png](output_48_1.png)

```python
plt.plot(x, y, '--')
```

    [<matplotlib.lines.Line2D at 0x1b3082c71d0>]

![png](output_49_1.png)

```python
fig = plt.figure()
plt.plot(x, y, '--')
```

    [<matplotlib.lines.Line2D at 0x1b30832ca58>]

![png](output_50_1.png)

```python
fig.savefig('K:/Code/jupyter-notebook/Python Study/first_figure.png')
```

```python
# Dashed line style in the top panel, default solid line in the bottom panel
plt.subplot(2,1,1)
plt.plot(x, np.sin(x),'--')
plt.subplot(2,1,2)
plt.plot(x, np.cos(x))
```

    [<matplotlib.lines.Line2D at 0x1b308395198>]

![png](output_52_1.png)

```python
# Dot marker style
x = np.linspace(0,10,20)
plt.plot(x, np.sin(x),'o')
```

    [<matplotlib.lines.Line2D at 0x1b3084f4940>]

![png](output_53_1.png)

```python
# color sets the color
x = np.linspace(0,10,20)
plt.plot(x, np.sin(x),'o',color= 'red')
```

    [<matplotlib.lines.Line2D at 0x1b30855bef0>]

![png](output_54_1.png)

```python
# Add labels
x = np.linspace(0, 10, 100)
y = np.sin(x)
plt.plot(x, y,'--',label='sin(x)')
plt.plot(x, np.cos(x),'o',label='cos(x)')
# legend controls how the labels are displayed; loc sets the legend position (1 = upper right)
plt.legend(loc= 1 )
```

    <matplotlib.legend.Legend at 0x1b309907198>

![png](output_55_1.png)

```python
plt.legend?
# When you meet an unfamiliar function, append ? to view its documentation
```

```python
# plot accepts a large number of optional parameters
x = np.linspace(0, 10, 20)
y = np.sin(x)
plt.plot(x,y,'-p',color = 'green',
        markersize = 10,linewidth = 4,
        markeredgecolor = 'orange',
        markeredgewidth=2)
plt.ylim(-0.5,0.8)
```

    (-0.5, 0.8)

![png](output_57_1.png)

```python
# See the documentation for the full parameter list
plt.plot?
```

```python
# ylim and xlim restrict the displayed axis ranges
plt.plot(x,y,'-p',color = 'green',
        markersize = 10,linewidth = 4,
        markeredgecolor = 'orange',
        markeredgewidth=2)
plt.ylim(-0.5,1.2)
plt.xlim(2,8)
```

    (2, 8)

![png](output_59_1.png)

```python
# Scatter plot
plt.scatter(x,y,s=100,c='red')
```

    <matplotlib.collections.PathCollection at 0x1b309da0c88>

![png](output_60_1.png)

```python
plt.style.use('classic')
x = np.random.randn(100)
y = np.random.randn(100)
colors = np.random.randn(100)
# randn can return negative values; negative marker sizes cause the
# RuntimeWarning shown below (np.random.rand would avoid it)
sizes = 1000 * np.random.randn(100)
plt.scatter(x,y,c=colors,s=sizes,alpha=0.4)
plt.colorbar()
```

    c:\python\python36\lib\site-packages\matplotlib\collections.py:902: RuntimeWarning: invalid value encountered in sqrt
      scale = np.sqrt(self._sizes) * dpi / 72.0 * self._factor

    <matplotlib.colorbar.Colorbar at 0x1b309fe4f98>

![png](output_61_2.png)

### pandas has built-in plotting

### Line plots

```python
import pandas as pd
df = pd.DataFrame(np.random.randn(100,4).cumsum(0),columns=['A','B','C','D'])
df.plot()
```

    <matplotlib.axes._subplots.AxesSubplot at 0x1b30c0c88d0>

![png](output_64_1.png)

### Bar charts

```python
df = pd.DataFrame(np.random.randint(10,50,(3,4)),columns=['A','B','C','D'],index = ['one','two','three'])
df.plot.bar()
```

    <matplotlib.axes._subplots.AxesSubplot at 0x1b30c284898>

![png](output_66_1.png)

```python
df.B.plot.bar()
```

    <matplotlib.axes._subplots.AxesSubplot at 0x1b30c16c9b0>

![png](output_67_1.png)

```python
# Equivalent to df.plot.bar()
df.plot(kind = 'bar')
```

    <matplotlib.axes._subplots.AxesSubplot at 0x1b30c190898>

![png](output_68_1.png)

```python
# Stack the bars on top of each other
df.plot(kind = 'bar',stacked = True)
```

    <matplotlib.axes._subplots.AxesSubplot at 0x1b30c223978>

![png](output_69_1.png)

### Histograms

```python
df = pd.DataFrame(np.random.randn(100,4),columns=['A','B','C','D'])
df.hist(column='A',grid=True,figsize=(10,5))
```

    array([[<matplotlib.axes._subplots.AxesSubplot object at 0x000001B30DE24DD8>]],
          dtype=object)

![png](output_71_1.png)

### Density plots

```python
# Equivalent to df.plot(kind = 'kde')
# Note: SciPy must be installed first (pip install scipy); otherwise you get
# ModuleNotFoundError: No module named 'scipy'
df.plot.kde()
```

    <matplotlib.axes._subplots.AxesSubplot at 0x1b30e082d30>

![png](output_73_1.png)

### 3-D plots with matplotlib

```python
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
from matplotlib.ticker import LinearLocator, FormatStrFormatter
import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure()
# fig.gca(projection='3d') was deprecated in matplotlib 3.4; add_subplot works across versions
ax = fig.add_subplot(111, projection='3d')
# x-coordinate range; values must not repeat
X = np.arange(-5, 5, 0.25)
# y-coordinate range; values must not repeat
Y = np.arange(-5, 5, 0.25)
# Build the grid
X, Y = np.meshgrid(X, Y)
R = np.sqrt(X**2 + Y**2)
Z = np.sin(R)
# Plot the surface
surf = ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=cm.jet, linewidth=0, antialiased=False)
# Customize the z axis
ax.set_zlim(-1.01, 1.01)
ax.zaxis.set_major_locator(LinearLocator(10))
ax.zaxis.set_major_formatter(FormatStrFormatter('%.02f'))
# Add a color bar which maps values to colors
fig.colorbar(surf, shrink=0.5, aspect=5)
plt.show()
```

![png](output_75_0.png)
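The 3-D example relies on `np.meshgrid` to turn two 1-D coordinate ranges into a grid on which `Z` is evaluated. A minimal sketch with a tiny grid, so the arrays are small enough to inspect (the 3-by-2 sizes are chosen only for illustration):

```python
import numpy as np

x = np.arange(0, 3)   # 3 distinct x coordinates: 0, 1, 2
y = np.arange(0, 2)   # 2 distinct y coordinates: 0, 1

# Each row of X is a copy of x and each column of Y is a copy of y,
# so (X[i, j], Y[i, j]) enumerates every point of the grid.
X, Y = np.meshgrid(x, y)
print(X)
print(Y)
print(X.shape, Y.shape)

# Elementwise functions are then evaluated at every grid point at once,
# which yields the 2-D height array that plot_surface expects.
Z = np.sin(np.sqrt(X**2 + Y**2))
print(Z.shape)
```

With `np.arange(-5, 5, 0.25)` on both axes, as in the surface plot, the same call produces 40-by-40 arrays.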