Python Reference in Data Analysis / Mining Tools
阿新 • Published: 2019-04-26
If you already know how to import Python modules and packages, the tables below should be easy to use.
Each table maps a task to the Python module or function that handles it. Some of these modules are not part of the standard library; install them with pip install <package>.
Machine Learning
Connector & IO
Database
Category | Python |
---|---|
MySQL | mysql-connector-python (official) |
Oracle | cx_Oracle |
Redis | redis |
MongoDB | pymongo |
neo4j | py2neo |
Cassandra | cassandra-driver |
ODBC | pyodbc |
JDBC | Unknown (Jython only) |
IO
Category | Python |
---|---|
Excel | XlsxWriter, pandas.read_excel, pandas.DataFrame.to_excel, openpyxl |
CSV | csv.writer |
JSON | json |
Image | PIL (Pillow) |
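
As a minimal sketch of the CSV and JSON rows above, the following round-trips a small table through csv.writer and json using only the standard library. The sample rows and the in-memory buffer are assumptions for illustration; a real script would open a file instead.

```python
import csv
import io
import json

# Hypothetical sample data: a header plus two records.
header = ["name", "value"]
rows = [("a", 1), ("b", 2)]

# Write the table as CSV into an in-memory buffer.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(rows)

# Read it back; the csv module always returns strings.
reader = csv.reader(io.StringIO(buffer.getvalue()))
read_header = next(reader)
records = [dict(zip(read_header, row)) for row in reader]

# Serialize the records to JSON and parse them again.
json_text = json.dumps(records)
restored = json.loads(json_text)
print(restored)
```

Note that the numeric column comes back as text after the CSV round trip, so any type conversion has to be done explicitly.
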
Statistics
Category | Python |
---|---|
Descriptive statistics summary | scipy.stats.describe |
Mean | scipy.stats.gmean (geometric mean), scipy.stats.hmean (harmonic mean), numpy.mean, numpy.nanmean, pandas.Series.mean |
Median | numpy.median, numpy.nanmedian, pandas.Series.median |
Mode | scipy.stats.mode, pandas.Series.mode |
Quantiles | numpy.percentile, numpy.nanpercentile, pandas.Series.quantile |
Empirical cumulative distribution function (ECDF) | statsmodels.distributions.empirical_distribution.ECDF |
Standard deviation | numpy.std, numpy.nanstd, pandas.Series.std (scipy.stats.std and scipy.stats.nanstd no longer exist in current SciPy) |
Variance | numpy.var, pandas.Series.var |
Coefficient of variation | scipy.stats.variation |
Covariance | numpy.cov, pandas.Series.cov |
Pearson correlation coefficient | scipy.stats.pearsonr, numpy.corrcoef, pandas.Series.corr |
Kurtosis | scipy.stats.kurtosis, pandas.Series.kurt |
Skewness | scipy.stats.skew, pandas.Series.skew |
Histogram | numpy.histogram, numpy.histogram2d, numpy.histogramdd |
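
A few of the functions in this table can be tried on a small array, as in the sketch below. The sample values are made up for illustration. One detail worth knowing: numpy.std defaults to the population formula (ddof=0), while pandas.Series.std and the variance reported by scipy.stats.describe use the sample formula (ddof=1).

```python
import numpy as np
from scipy import stats

# Hypothetical sample data.
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mean = np.mean(data)
median = np.median(data)
std_population = np.std(data)        # ddof=0 by default
std_sample = np.std(data, ddof=1)    # matches the pandas default
q1, q3 = np.percentile(data, [25, 75])

# describe() returns nobs, minmax, mean, variance (ddof=1), skewness, kurtosis.
summary = stats.describe(data)

print(mean, median, std_population, std_sample, q1, q3)
print(summary)
```
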
Regression (including statistics and machine learning)
Category | Python |
---|---|
Ordinary least squares (OLS) regression | statsmodels.api.OLS, sklearn.linear_model.LinearRegression |
Generalized least squares (GLS) regression | statsmodels.api.GLS |
Quantile regression | statsmodels.api.QuantReg |
Ridge regression | sklearn.linear_model.Ridge |
LASSO | sklearn.linear_model.Lasso |
Least-angle regression (LARS) | sklearn.linear_model.LassoLars |
Robust regression | statsmodels.api.RLM |
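
To show how the scikit-learn estimators in this table are typically called, here is a small sketch that fits ordinary least squares and ridge regression to the same data. The noise-free toy data and the choice alpha=1.0 are assumptions made for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Hypothetical toy data: y = 2x + 1 with no noise.
X = np.arange(10, dtype=float).reshape(-1, 1)  # shape (n_samples, n_features)
y = 2.0 * X.ravel() + 1.0

# Ordinary least squares recovers the slope and intercept.
ols = LinearRegression().fit(X, y)

# Ridge adds an L2 penalty (strength alpha), which shrinks the slope toward zero.
ridge = Ridge(alpha=1.0).fit(X, y)

print("OLS:  ", ols.coef_[0], ols.intercept_)
print("Ridge:", ridge.coef_[0], ridge.intercept_)
```

The other scikit-learn estimators listed here (Lasso, LassoLars) follow the same fit and predict pattern.
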
Hypothesis Testing
Category | Python |
---|---|
t-test | statsmodels.stats.weightstats.ttest_ind, statsmodels.stats.weightstats.ttost_ind, statsmodels.stats.weightstats.ttost_paired; scipy.stats.ttest_1samp, scipy.stats.ttest_ind, scipy.stats.ttest_ind_from_stats, scipy.stats.ttest_rel |
Kolmogorov-Smirnov test (distribution test) | scipy.stats.kstest, scipy.stats.ks_2samp |
Wilcoxon (nonparametric difference tests) | scipy.stats.wilcoxon, scipy.stats.mannwhitneyu |
Shapiro-Wilk normality test | scipy.stats.shapiro |
Pearson correlation coefficient test | scipy.stats.pearsonr |
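
The SciPy tests in this table share a common calling pattern: pass the samples and read the statistic and p-value from the result. The sketch below applies a two-sample t-test and a two-sample Kolmogorov-Smirnov test to two small made-up samples; the data are purely illustrative.

```python
import numpy as np
from scipy import stats

# Hypothetical samples whose means differ by about 1.0.
group_a = np.array([5.1, 4.9, 5.0, 5.2, 4.8])
group_b = np.array([6.1, 5.9, 6.0, 6.2, 5.8])

# Independent two-sample t-test (assumes equal variances by default).
t_result = stats.ttest_ind(group_a, group_b)

# Two-sample Kolmogorov-Smirnov test: do both samples share a distribution?
ks_result = stats.ks_2samp(group_a, group_b)

print("t-test:", t_result.statistic, t_result.pvalue)
print("KS:    ", ks_result.statistic, ks_result.pvalue)
```

A small p-value suggests rejecting the null hypothesis (equal means for the t-test, identical distributions for the KS test) at the chosen significance level.
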
Time series
Category | Python |
---|---|
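AR | statsmodels.tsa.ar_model.AutoReg (replaces the older statsmodels.tsa.ar_model.AR) |
ARIMA | statsmodels.tsa.arima.model.ARIMA (replaces the older statsmodels.tsa.arima_model.ARIMA) |
VAR | statsmodels.tsa.vector_ar.var_model.VAR |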