A Concise Tutorial on numpy, scipy, matplotlib, pandas, and More

阿新 • Published: 2019-01-16

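The Basics
The main object in numpy is the homogeneous multidimensional array: a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers. In numpy, dimensions are called axes, and the number of axes is called the rank.
For example, the coordinates of a point in 3D space, [1, 2, 1], form an array of rank 1, because it has only one axis, and that axis has length 3. In the example below, the array has rank 2 (two dimensions): the first dimension (axis) has length 2 and the second has length 3.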

  [[1., 0., 0.],
   [0., 1., 2.]]
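
As a quick check of the two examples above, here is a minimal sketch that builds both arrays and inspects the number of axes and their lengths. The variable names `point` and `matrix` are illustrative and do not come from the original text.

```python
import numpy as np

# The 3D point: one axis of length 3
point = np.array([1, 2, 1])
# The 2 x 3 example: two axes, of lengths 2 and 3
matrix = np.array([[1., 0., 0.],
                   [0., 1., 2.]])

print(point.ndim, point.shape)    # 1 (3,)
print(matrix.ndim, matrix.shape)  # 2 (2, 3)
```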

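numpy's array class is called ndarray, which we also refer to as array. Note that numpy.array is not the same as array.array from the Python standard library; the latter only handles one-dimensional arrays and offers far less functionality. The more important attributes of an ndarray object are: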

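  ndarray.ndim

  The number of axes (dimensions) of the array. In the Python world, the number of dimensions is referred to as the rank.

  ndarray.shape

  The dimensions of the array: a tuple of integers giving the size of the array along each dimension. For a matrix with n rows and m columns, shape is (n, m). The length of the shape tuple is therefore the rank, that is, the number of dimensions, ndim.

  ndarray.size

  The total number of elements of the array, equal to the product of the elements of shape.

  ndarray.dtype

  An object describing the type of the elements in the array. A dtype can be created or specified using standard Python types. numpy also provides types of its own, for example numpy.int32, numpy.int16, and numpy.float64.

  ndarray.itemsize

  The size in bytes of each element of the array. For example, an array of elements of type float64 has itemsize 8 (=64/8), while one of type int32 has itemsize 4 (=32/8). This attribute is equivalent to ndarray.dtype.itemsize.

  ndarray.data

  The buffer containing the actual elements of the array. Normally we do not need this attribute, because we access the elements of an array through indexing.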
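
The attributes listed above can be inspected directly. The following is a small sketch; the array `a` and its 3 x 5 shape are chosen only for illustration.

```python
import numpy as np

a = np.arange(15, dtype=np.int32).reshape(3, 5)

print(a.ndim)      # 2
print(a.shape)     # (3, 5)
print(a.size)      # 15, i.e. 3 * 5
print(a.dtype)     # int32
print(a.itemsize)  # 4 bytes per int32 element

b = a.astype(np.float64)
print(b.itemsize, b.dtype.itemsize)  # 8 8

# The raw buffer is exposed as a memoryview; indexing is the usual way in
print(type(a.data))
```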

……

Pandas

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import warnings

warnings.filterwarnings('ignore')
plt.rcParams['font.sans-serif'] = ['SimHei']  # font that renders Chinese labels correctly
plt.rcParams['axes.unicode_minus'] = False


def test1():
    # every Series carries its own `index`; when Series are merged into a `DataFrame`,
    # values that share an index label end up in the same row
    # print(pd.Index([3]*4))
    # print(pd.Index(range(4)))
    # print(pd.date_range('20180201', periods=4))  # DatetimeIndex, default 'D' (calendar daily), `stride` as daily
    # print(pd.period_range('20180101', '2018-01-04'))  # PeriodIndex
    # print(pd.Index(data=[i for i in 'ABCDEF']))
    # print(list(pd.RangeIndex(10)))
    # s = pd.Series(10)  # scalar
    # s = pd.Series(data=[1, 2, 3], index=[10, 20, 20]) # array-like, and non-unique index values are allowed
    s = pd.Series({'a': 10, 10: 'AA'}, index=['aa', 10])  # dict
    print(s)  # print(s[:])

    # df = pd.DataFrame(data=np.random.randn(4, 3), index=pd.RangeIndex(1, 5), columns=['A', 'B', 'C'])  # ndarray
    # df = pd.DataFrame(data={'A': np.array(range(1, 4))**2, 'B': pd.Timestamp('20180206'),
    #                         'C': pd.Series(data=['MLee', 'python', 'Pearson']), 'D': 126,
    #                         'E': pd.Categorical(values=['Upper', 'Middle', 'Lower'], categories=['Middle', 'Lower']),
    #                         'F': 'Laplace'}, index=pd.RangeIndex(3), columns=['A', 'B'])  # dict
    df = pd.DataFrame(data={'A': np.array(range(1, 4))**2, 'B': pd.Timestamp('20180206'),
                            'C': pd.Series(data=['MLee', 'python', 'Pearson']), 'D': [126, 10, 66],
                            'E': pd.Categorical(values=['Upper', 'Middle', 'Lower'], categories=['Middle', 'Lower']),
                            'F': 'Laplace'}, index=pd.RangeIndex(3), columns=pd.Index([i for i in 'FEDCBA']))  # dict
    print(df)
    # print(df.dtypes)
    # print(df.index)
    # print(df.columns)
    # print(df.values)  # numpy.ndarray
    # print('*'*126)
    # print(df.info())
    # print('*'*126)
    # print(df.describe())
    # print(df.transpose())
    # print(df.sort_index(axis=0, ascending=False))
    # print(df.sort_index(axis=1))
    # print(df.sort_values(by='D'))

    print('*'*126)
    df = pd.DataFrame(data=np.arange(24).reshape(6, 4), index=pd.date_range('20180201', periods=6),
                      columns=pd.Index([i for i in 'ABCD']))
    # df = pd.DataFrame(data=np.arange(24).reshape(6, 4))
    # print(df[0])
    # print(df[0:1])  # select rows require to use the slice
    print(df[:])
    print(df['A'])  # print(df.A)
    print(df[0:2][['A', 'B']])
    print(df['20180201':'20180202'][['A', 'B']])
    # selecting rows with df[...] requires a slice, while selecting several columns requires a list
    # print(df[['A', 'B']])
    # print(df[0:2])  # positional slice: excludes the 3rd row
    # print(df['20180201':'20180203'])  # label slice: includes the row labelled `20180203`
    # print(df[0:1])
    # print(df.loc['20180201'])  # a single label returns that one row as a Series
    # For rows and columns, df.loc takes labels (index and column names), while df.iloc takes integer positions.
    # df.ix, which accepted a mix of both, was deprecated in pandas 0.20 and removed in pandas 1.0.
    print(df.loc['20180201':'20180202'][['A', 'B']])
    print(df.loc['20180201':'20180202', ['A', 'B']])  # print(df['20180201':'20180202', ['A', 'B']])  # error
    # a 1x1 DataFrame holding the same value as the scalar df.iloc[0, 0]
    print(df.iloc[0:1, 0:1])  # integer positions for both rows and columns (0, 1, 2, ...)
    print(df.iloc[[0, 2, 4], 0:2])
    print(df.iloc[0:2, 0:2])  # was df.ix[0:2, 0:2]
    # print(df.loc['20180201':'20180202', df.columns[0:2]])  # was df.ix['20180201':'20180202', 0:2]
    # print(df.loc[df.index[0:2], ['A', 'B']])  # was df.ix[0:2, ['A', 'B']]
    # print(df.loc['20180201':'20180202', ['A', 'B']])  # was df.ix['20180201':'20180202', ['A', 'B']]
    print('**')
    print(df['B'][df.A>4])  # print(df.B[df.A>4])
    df['B'] = df['B'].mask(df.A > 4)  # set B to NaN where A > 4, without chained assignment
    print(df)
    df['E'] = 0
    print(df)
    df['F'] = pd.Series(data=range(6), index=pd.date_range('20180201', periods=6))
    print(df)

    print('*'*12)
    df = pd.DataFrame(data=np.arange(24).reshape(6, 4), index=pd.date_range('20180201', periods=6), columns=pd.Index([i for i in 'ABCD']))
    # print(df)
    # df.dropna()
    # df.fillna()


def test2():
    dataset_training = pd.read_csv('C:/users/myPC/Desktop/ml/Titanic/train.csv')
    # print(dataset_training)
    print(dataset_training.Survived.value_counts())
    deceased = dataset_training.Pclass[dataset_training.Survived == 0].value_counts(sort=True)
    survived = dataset_training.Pclass[dataset_training.Survived == 1].value_counts(sort=True)
    # print(deceased, survived, sep='\n')
    df = pd.DataFrame({'Survived': survived, 'Deceased': deceased})
    print(df)
    df.plot(kind='bar', stacked=True)
    plt.title('Distribution of SES')
    plt.xlabel('Class')
    plt.ylabel('Numbers')
    plt.show()


def test3():
    ids = ['1001', '1008', '1102', '1001', '1003', '1101', '1126', '1007']  # renamed from `id`, which shadows the builtin
    name = ['Shannon', 'Gauss', 'Newton', 'Leibniz', 'Taylor', 'Lagrange', 'Laplace', 'Fourier']
    country = ['America', 'Germany', 'Britain', 'Germany', 'Britain', 'France', 'France', 'France']
    iq = [168, 180, 172, 228, 182, 172, 160, 186]
    sq = [180, 194, 160, 274, 150, 200, 158, 180]
    eq = [144, 152, 134, 166, 118, 144, 156, 128]
    dataset = list(zip(ids, name, country, iq, sq, eq))
    df = pd.DataFrame(data=dataset, columns=['Id', 'Name', 'Country', 'IQ', 'SQ', 'EQ'])
    df.to_csv('persons.csv', index=True, header=True)
    df = pd.read_csv('persons.csv', usecols=range(1, 7))  # skip column 0, the index written by to_csv
    print(df)
    # print(df.info())
    # print(df[df.IQ == df.IQ.max()])
    print(df.sort_values(by='IQ', axis=0, ascending=False))  # df.head(1)
    plt.subplot2grid((1, 3), (0, 0))
    df.IQ.plot()
    df.SQ.plot()
    df.EQ.plot()
    for i in range(df.shape[0]):
        # `.ix` was removed in pandas 1.0; the text is passed positionally because matplotlib renamed `s` to `text`
        plt.annotate(df.loc[i, 'Name'], xy=(i, df.loc[i, 'IQ']), xytext=(1, 1), xycoords='data', textcoords='offset points')
    ax = plt.subplot2grid((1, 3), (0, 1), colspan=2)
    df[['IQ', 'SQ', 'EQ']].plot(kind='bar', ax=ax)  # pass `ax` so the bars land in this subplot, not in a new figure
    # df['IQ'].plot(kind='bar')
    # df['SQ'].plot(kind='bar')
    # df['EQ'].plot(kind='bar')
    for i in range(df.shape[0]):
        ax.annotate(df.loc[i, 'Name'], xy=(i, df.loc[i, 'IQ']), xytext=(1, 1), xycoords='data', textcoords='offset points')
    plt.show()


def test4():
    data_train = pd.read_csv(r"C:\Users\myPC\Desktop\ml\Titanic\train.csv")
    # plt.subplot2grid((2, 3), (0, 0))  # lay out several small plots inside one large figure

    survived = data_train.Pclass[data_train.Survived == 1].value_counts()
    deceased = data_train.Pclass[data_train.Survived == 0].value_counts()
    pd.DataFrame({'Survived': survived, 'deceased': deceased}).plot(kind='bar', stacked=True)

    # print(data_train.Sex[data_train.Survived == 1].value_counts())

    print(data_train.groupby(by='Survived').count())

    # data_train.Survived.value_counts().plot(kind='bar')  # bar chart
    # plt.title("Survival (1 = survived)")
    # plt.ylabel("Count")

    # plt.subplot2grid((2, 3), (0, 1))
    # data_train.Pclass.value_counts().plot(kind="bar")
    # plt.ylabel("Count")
    # plt.title("Passenger class distribution")
    #
    # plt.subplot2grid((2, 3), (0, 2))
    # plt.scatter(data_train.Survived, data_train.Age)
    # plt.ylabel("Age")  # set the y-axis label
    # plt.grid(True, which='major', axis='y')
    # plt.title("Survival by age (1 = survived)")
    #
    # plt.subplot2grid((2, 3), (1, 0), colspan=2)
    # data_train.Age[data_train.Pclass == 1].plot(kind='kde')
    # data_train.Age[data_train.Pclass == 2].plot(kind='kde')
    # data_train.Age[data_train.Pclass == 3].plot(kind='kde')
    # plt.xlabel("Age")  # set the x-axis label
    # plt.ylabel("Density")
    # plt.title("Age distribution by passenger class")
    # plt.legend(('1st class', '2nd class', '3rd class'), loc='best')  # set the legend for the graph
    #
    # plt.subplot2grid((2, 3), (1, 2))
    # data_train.Embarked.value_counts().plot(kind='bar')
    # plt.title("Passengers boarding at each port")
    # plt.ylabel("Count")
    plt.show()


def test5():
    url = r'http://s3.amazonaws.com/assets.datacamp.com/course/dasi/present.txt'
    present = pd.read_table(url, sep=' ')
    # print(present)
    # present.set_index(keys=['year'], inplace=True)
    # print(present)
    print(present.columns)
    print(present.index)
    print(present.dtypes)
    # present.boys.plot(kind='kde')
    # present.girls.plot(kind='kde')
    present.set_index(keys=['year'], inplace=True)
    kinds = ['line', 'bar', 'barh', 'hist', 'box', 'kde', 'density', 'area', 'pie', 'scatter', 'hexbin']
    # plt.figure()
    # for i in range(len(kinds)):
    #     plt.subplot2grid(shape=(2, 3), loc=(i//3, i % 3))
    #     present[:10].plot(kind=kinds[i], subplots=True)
    present[:].plot(x='boys', y='girls', kind=kinds[-1])
    plt.legend(loc='upper right')
    plt.show()


def test6():
    s = pd.Series(data=np.random.randn(1000), index=pd.date_range('20180101', periods=1000))
    print(s)
    s = np.exp(s.cumsum())
    s.plot(style='m*', logy=True)
    plt.show()


def test7():
    df = pd.read_csv('http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data',
                     names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'name'])
    # print(df)
    # df.boxplot(by='name')
    # df.plot(kind='kde')
    # df.iloc[:, :-1].plot(kind='hist')
    setosa = df[df.name == 'Iris-setosa']
    versicolor = df[df.name == 'Iris-versicolor']
    virginica = df[df.name == 'Iris-virginica']
    # plt.subplot2grid(shape=(1, 3), loc=(0, 0))
    ax = plt.subplot(131)
    setosa.plot(ax=ax, title='setosa')  # draw on the subplot; without `ax`, DataFrame.plot opens a new figure
    # setosa.plot(title='setosa', subplots=True)
    # plt.subplot2grid(shape=(1, 3), loc=(0, 1))
    # versicolor.plot(title='versicolor', subplots=True)
    # pd.DataFrame.plot(versicolor)
    # plt.subplot2grid(shape=(1, 3), loc=(0, 2))
    # virginica.plot(title='virginica', subplots=True)
    # pd.DataFrame.plot(data=virginica)
    plt.show()
    # df.sepal_length.plot(kind='hist')
    # plt.show()


def test8():
    import numpy as np
    import pandas as pd
    from sklearn.ensemble import RandomForestRegressor
    from sklearn import preprocessing
    from sklearn import linear_model

    dataset_training = pd.read_csv('C:/users/myPC/Desktop/ml/Titanic/train.csv')
    dataset_test = pd.read_csv('C:/users/myPC/Desktop/ml/Titanic/test.csv')
    passenger_id = dataset_test['PassengerId']
    # for the feature `Fare` in the `test_data`, only one missed
    dataset_test.loc[dataset_test.Fare.isnull(), 'Fare'] = 0.0
    # drop the irrelevant features
    dataset_training.drop(labels=['PassengerId', 'Name', 'Ticket'], axis=1, inplace=True)
    dataset_test.drop(columns=['PassengerId', 'Name', 'Ticket'], inplace=True)
    # predict `age` which is missed by others' features
    dataset_training_age = dataset_training[['Pclass', 'SibSp', 'Parch', 'Fare', 'Age']]
    dataset_test_age = dataset_test[['Pclass', 'SibSp', 'Parch', 'Fare', 'Age']]
    # `as_matrix()` was removed in pandas 1.0; `to_numpy()` returns the same `ndarray`
    age_known0 = dataset_training_age[dataset_training_age.Age.notnull()].to_numpy()
    age_unknown0 = np.array(dataset_training_age[dataset_training_age.Age.isnull()])
    age_unknown1 = dataset_test_age[dataset_test_age.Age.isnull()].to_numpy()
    training_data_age = age_known0[:, :-1]
    training_target_age = age_known0[:, -1]
    rfr = RandomForestRegressor(n_estimators=1000, n_jobs=-1, random_state=0)  # enable to fit them by the 1000 trees
    rfr.fit(training_data_age, training_target_age)
    predicts = rfr.predict(age_unknown0[:, :-1])
    dataset_training.loc[dataset_training.Age.isnull(), 'Age'] = predicts  # fill in the missing `Age` values
    # reuse the regressor fitted on the training data to fill the missing ages in the test set
    dataset_test.loc[dataset_test.Age.isnull(), 'Age'] = rfr.predict(age_unknown1[:, :-1])
    dataset_training.loc[dataset_training.Cabin.notnull(), 'Cabin'] = 'Yes'  # `Cabin` becomes `Yes` where a value exists
    dataset_training.loc[dataset_training.Cabin.isnull(), 'Cabin'] = 'No'  # otherwise `No`
    dataset_test.loc[dataset_test.Cabin.notnull(), 'Cabin'] = 'Yes'
    dataset_test.loc[dataset_test.Cabin.isnull(), 'Cabin'] = 'No'
    # one-hot encode the categorical fields (dtype `object` / `category`) so that no ordering between categories is implied
    dataset_training_dummies = pd.get_dummies(dataset_training, columns=['Pclass', 'Sex', 'Cabin', 'Embarked'], dtype=float)
    dataset_test_dummies = pd.get_dummies(dataset_test, columns=['Pclass', 'Sex', 'Cabin', 'Embarked'], dtype=float)
    for col in ['Age', 'Fare']:
        ss = preprocessing.StandardScaler()  # standardize features whose scales differ widely
        # Series.reshape no longer exists, so select a one-column DataFrame and flatten the result
        dataset_training_dummies[col] = ss.fit_transform(dataset_training_dummies[[col]]).ravel()
        # apply the training-set mean and variance to the test set instead of re-fitting on it
        dataset_test_dummies[col] = ss.transform(dataset_test_dummies[[col]]).ravel()
    # get all processed samples
    print(dataset_training_dummies)
    dataset_training_dummies = dataset_training_dummies.filter(regex='Age|SibSp|Parch|Fare|Pclass_*|Sex_*|Cabin_*|Embarked_*|Survived').to_numpy(dtype=float)
    # print(dataset_training_dummies.shape)
    training_data = dataset_training_dummies[:, 1:]
    training_target = dataset_training_dummies[:, 0]  # 1-D target vector (`Survived` is the first column)
    # the l1 penalty needs a solver that supports it, such as liblinear
    lr = linear_model.LogisticRegression(C=1.0, penalty='l1', solver='liblinear', tol=1e-5)
    from sklearn import model_selection

    print(model_selection.cross_val_score(lr, training_data, training_target, cv=4))
    lr.fit(training_data, training_target)
    predicts = lr.predict(dataset_test_dummies.to_numpy(dtype=float))
    ans = pd.DataFrame({'PassengerId': passenger_id, 'Survived': predicts.astype(np.int32)})
    # print(ans)
    # ans.to_csv('C:/users/myPC/Desktop/ml/Titanic/submission.csv', index=False)  # ignore label-index
    # print(pd.DataFrame({'features': list(dataset_test_dummies[1:]), 'coef': list(lr.coef_.T)}))


def test9():
    import numpy as np
    import numpy.linalg as nla
    import scipy.linalg as sla

    a = np.random.randint(20, size=(3, 4))
    print(a)
    print(np.diag(a))
    # U and V are unitary matrices; Sigma is the diagonal matrix of singular values (svd returns them as a 1-D vector)
    U, Sigma, V_H = nla.svd(a)
    Sigma = np.concatenate((np.diag(Sigma), np.zeros((U.shape[0], V_H.shape[1]-U.shape[1]))), axis=1)
    print(U)
    print(Sigma)
    print(V_H)
    print(U.dot(Sigma.dot(V_H)))


def google():
    import tensorflow as tf

    a = tf.constant((1, 1))
    b = tf.constant((2, 2))
    ans = a + b
    # TensorFlow 2.x executes eagerly; TensorFlow 1.x needed `sess = tf.Session()` and `sess.run(ans)` here
    # print(type(ans.numpy()))
    print(ans.numpy())


if __name__ == '__main__':
    # test9()
    # google()
    test8()
    # pd.concat()
    # df.drop()
    # df = pd.DataFrame({'A': [1, 2, 3], 'B': [np.nan, 1, np.nan], 'C': [10, 111, 1111], 'D': ['good', 'common', 'bad']})
    # df.loc[df.B.notnull(), 'B'] = 'Yes'  # handle `notnull()` first; filling the nulls first would turn every value into `Yes`
    # df.loc[df.B.isnull(), 'B'] = 'No'
    # print(df.loc[:, 'B'])
    # df.loc[df.B.isnull(), 'B'] = [0, 0]
    # print(pd.get_dummies(df, columns=['D', 'B']))
    # print(df)
    # print(df.filter(regex='A|D|B'))
    # df = pd.get_dummies(df, prefix=['M', 'L'])  # loss original data(variable)
    # print(df)
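
pandas 1.0 removed the `.ix` indexer, so row and column selection now goes through `.loc` (labels) and `.iloc` (integer positions). The following is a minimal sketch that rebuilds the 6 x 4 date-indexed frame from `test1` and selects the same block three ways; the names `by_label`, `by_position`, and `mixed` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(24).reshape(6, 4),
                  index=pd.date_range('20180201', periods=6),
                  columns=list('ABCD'))

# Label-based: a label slice includes both end points
by_label = df.loc['20180201':'20180202', ['A', 'B']]
# Position-based: an integer slice excludes the stop position
by_position = df.iloc[0:2, 0:2]
# Mixed selection: translate row positions into labels first, then use .loc
mixed = df.loc[df.index[0:2], ['A', 'B']]

print(by_label)
print(by_label.equals(by_position), by_label.equals(mixed))
```

Translating positions into labels with `df.index[...]` (or labels into positions with `df.columns.get_indexer(...)`) keeps each call unambiguous about whether it means a label or a position.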
