Python Reference in Data Analysis / Mining Tools

If you are already familiar with how Python loads modules and packages, the tables below should be easy to navigate.

In the tables below, the Python column lists each tool by its module path. Some of these modules are third-party rather than built in; install them with pip install <package>.
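
As a rough illustration of how to read the Python column: each entry is a dotted path whose leading part is an importable module and whose tail is an attribute of that module. The sketch below resolves such a path with the standard library's importlib. The helper name resolve is hypothetical, and json.dumps stands in for a table entry so that the example runs without third-party packages.

```python
import importlib


def resolve(dotted_path):
    """Return the object named by a dotted path such as 'sklearn.svm.SVC'.

    Hypothetical helper: tries the longest importable module prefix first,
    then walks the remaining names as attributes.
    """
    parts = dotted_path.split(".")
    for i in range(len(parts), 0, -1):
        try:
            obj = importlib.import_module(".".join(parts[:i]))
        except ImportError:
            continue  # this prefix is not a module; try a shorter one
        for name in parts[i:]:
            obj = getattr(obj, name)
        return obj
    raise ImportError(f"cannot resolve {dotted_path!r}")


# 'json.dumps' is a standard-library stand-in for a table entry.
dumps = resolve("json.dumps")
print(dumps({"a": 1}))
```

Once scikit-learn is installed, resolve("sklearn.svm.SVC") would return the same class as the usual from sklearn.svm import SVC.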

Machine Learning

Category Subcategory Python
LDA sklearn.discriminant_analysis.LinearDiscriminantAnalysis
QDA sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis
SVM (Support Vector Machine) Support Vector Classifier (SVC) sklearn.svm.SVC
Nu-Support Vector Classifier (NuSVC) sklearn.svm.NuSVC
Linear Support Vector Classifier (LinearSVC) sklearn.svm.LinearSVC
Nearest Neighbors K-Nearest Neighbors Classifier sklearn.neighbors.KNeighborsClassifier
Radius Neighbors Classifier sklearn.neighbors.RadiusNeighborsClassifier
Nearest Centroid Classifier sklearn.neighbors.NearestCentroid
Bayes Gaussian Naive Bayes sklearn.naive_bayes.GaussianNB
Multinomial Naive Bayes sklearn.naive_bayes.MultinomialNB
Bernoulli Naive Bayes sklearn.naive_bayes.BernoulliNB
Decision Tree Decision Tree Classifier sklearn.tree.DecisionTreeClassifier
Decision Tree Regressor sklearn.tree.DecisionTreeRegressor
Ensemble Methods Bagging Random Forest Classifier sklearn.ensemble.RandomForestClassifier
Bagging Random Forest Regressor sklearn.ensemble.RandomForestRegressor
Boosting Gradient Boosting xgboost Module
Boosting AdaBoost sklearn.ensemble.AdaBoostClassifier
Clustering K-Means scipy.cluster.vq.kmeans
Hierarchical Clustering scipy.cluster.hierarchy.fcluster
DBSCAN sklearn.cluster.DBSCAN
Birch sklearn.cluster.Birch
K-Medoids Clustering pyclust.KMedoids (reliability unknown)
Association Rules Apriori Algorithm apriori (reliability unknown, does not support Python 3), PyFIM (reliability unknown, cannot be installed with pip)
FP-Growth Algorithm fp-growth (reliability unknown, does not support Python 3), PyFIM (reliability unknown, cannot be installed with pip)
Neural Network Neural Network neurolab.net, keras.*
Deep Learning keras.*
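
The scikit-learn classifiers in this table share the same fit / predict / score interface, so they can be swapped with little code change. The following is a minimal sketch, assuming scikit-learn is installed. The iris dataset, the 70/30 split, and the choice of three classifiers are illustrative assumptions, not recommendations from the table.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Small built-in dataset, used here only for illustration.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Three entries from the table; all expose fit() and score().
models = {
    "SVC": SVC(),
    "KNeighborsClassifier": KNeighborsClassifier(n_neighbors=5),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # score() returns mean accuracy on the held-out data.
    scores[name] = model.score(X_test, y_test)

for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```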


Connector & IO

Database

Category Python
MySQL mysql-connector-python (official)
Oracle cx_Oracle
Redis redis
MongoDB pymongo
neo4j py2neo
Cassandra cassandra-driver
ODBC pyodbc
JDBC Unknown (Jython only)
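
The relational connectors above (mysql-connector-python, cx_Oracle, pyodbc) follow Python's DB-API 2.0 pattern of connect, cursor, execute, and fetch. The sketch below shows that pattern with the standard library's sqlite3 as a stand-in, because it needs no server. sqlite3 is not in the table, and the table name points is hypothetical. The parameter placeholder style ("?" here) varies by driver, and the non-relational clients (redis, pymongo, py2neo, cassandra-driver) have their own APIs.

```python
import sqlite3

# sqlite3 is a stand-in for the drivers in the table; it follows
# the same DB-API 2.0 connect / cursor / execute / fetch pattern.
conn = sqlite3.connect(":memory:")
try:
    cur = conn.cursor()
    cur.execute("CREATE TABLE points (x REAL, y REAL)")
    cur.executemany(
        "INSERT INTO points VALUES (?, ?)",
        [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)],
    )
    conn.commit()

    # Parameters are passed separately from the SQL text.
    cur.execute("SELECT x, y FROM points WHERE x >= ? ORDER BY x", (2.0,))
    rows = cur.fetchall()
finally:
    conn.close()

print(rows)
```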

IO

Category Python
Excel XlsxWriter, pandas.read_excel / pandas.DataFrame.to_excel, openpyxl
CSV csv.writer
JSON json
Image PIL (installed as the Pillow package)
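
The csv and json entries are standard-library modules. The following is a minimal round-trip sketch, using an in-memory io.StringIO buffer in place of a real file; the sample rows are made up for illustration.

```python
import csv
import io
import json

rows = [["name", "score"], ["alice", 90], ["bob", 85]]

# csv.writer accepts any object with a write() method;
# StringIO stands in for a file opened with newline="".
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()

# Reading back: every CSV field is returned as a string.
parsed = list(csv.reader(io.StringIO(csv_text)))

# A JSON round trip keeps numbers as numbers.
json_text = json.dumps({"rows": rows})
restored = json.loads(json_text)["rows"]

print(parsed)
print(restored)
```

Note the difference between the two formats: the CSV reader hands back "90" as text, while the JSON round trip returns the integer 90.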


Statistics

Category Python
Descriptive statistics summary scipy.stats.describe
Mean scipy.stats.gmean (geometric mean), scipy.stats.hmean (harmonic mean), numpy.mean, numpy.nanmean, pandas.Series.mean
Median numpy.median, numpy.nanmedian, pandas.Series.median
Mode scipy.stats.mode, pandas.Series.mode
Quantile numpy.percentile, numpy.nanpercentile, pandas.Series.quantile
Empirical cumulative distribution function (ECDF) statsmodels.distributions.empirical_distribution.ECDF
Standard deviation numpy.std, numpy.nanstd, pandas.Series.std (older SciPy releases also had scipy.stats.std and scipy.stats.nanstd; both have since been removed)
Variance numpy.var, pandas.Series.var
Coefficient of variation scipy.stats.variation
Covariance numpy.cov, pandas.Series.cov
(Pearson) correlation coefficient scipy.stats.pearsonr, numpy.corrcoef, pandas.Series.corr
Kurtosis scipy.stats.kurtosis, pandas.Series.kurt
Skewness scipy.stats.skew, pandas.Series.skew
Histogram numpy.histogram, numpy.histogram2d, numpy.histogramdd
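
Several of these functions compute the same quantity with different defaults. A small sketch, assuming NumPy and SciPy are installed and using made-up sample data: numpy.std divides by n (ddof=0) by default, whereas scipy.stats.describe and pandas.Series.std use the sample form (ddof=1).

```python
import numpy as np
from scipy import stats

# Made-up sample, chosen so the mean and population variance are round numbers.
data = np.array([2, 4, 4, 4, 5, 5, 7, 9], dtype=float)

mean = np.mean(data)
median = np.median(data)
q1, q3 = np.percentile(data, [25, 75])

pop_std = np.std(data)             # population form, ddof=0 (NumPy default)
sample_std = np.std(data, ddof=1)  # sample form, the default of pandas.Series.std

# describe() returns nobs, minmax, mean, variance (ddof=1), skewness, kurtosis.
summary = stats.describe(data)

print(mean, median, q1, q3)
print(pop_std, sample_std)
print(summary)
```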

Regression (including statistics and machine learning)

Category Python
Ordinary least squares regression (OLS) statsmodels.api.OLS, sklearn.linear_model.LinearRegression
Generalized least squares regression (GLS) statsmodels.api.GLS
Quantile regression statsmodels.api.QuantReg
Ridge regression sklearn.linear_model.Ridge
LASSO sklearn.linear_model.Lasso
Least-angle regression (LARS) sklearn.linear_model.LassoLars
Robust regression statsmodels.api.RLM
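
To make the difference between the ordinary and the penalized estimators concrete, here is a minimal sketch, assuming scikit-learn is installed. The data are synthetic and noise-free (y = 2x + 1), and alpha=1.0 is an arbitrary illustrative value.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Synthetic, noise-free data on the line y = 2x + 1.
X = np.arange(5, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0

# Ordinary least squares recovers the slope and intercept.
ols = LinearRegression().fit(X, y)

# Ridge adds an L2 penalty on the coefficients, which shrinks the slope.
ridge = Ridge(alpha=1.0).fit(X, y)

print("OLS:  ", ols.coef_[0], ols.intercept_)
print("Ridge:", ridge.coef_[0], ridge.intercept_)
```

The statsmodels estimators in the table take the response first and expect an explicit constant column, so the two libraries are not call-compatible.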

Hypothesis Testing

Category Python
t-test statsmodels.stats.weightstats.ttest_ind, statsmodels.stats.weightstats.ttost_ind, statsmodels.stats.weightstats.ttost_paired; scipy.stats.ttest_1samp, scipy.stats.ttest_ind, scipy.stats.ttest_ind_from_stats, scipy.stats.ttest_rel
KS test (distribution test) scipy.stats.kstest, scipy.stats.ks_2samp
Wilcoxon (nonparametric test of differences) scipy.stats.wilcoxon, scipy.stats.mannwhitneyu
Shapiro-Wilk normality test scipy.stats.shapiro
Pearson correlation coefficient test scipy.stats.pearsonr
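
The SciPy tests listed here each return a test statistic and a p-value. A brief sketch, assuming NumPy and SciPy are installed. The two samples are simulated with a fixed seed, and their means are deliberately one standard deviation apart, so the two-sample tests should reject equality.

```python
import numpy as np
from scipy import stats

# Simulated samples: same spread, means one standard deviation apart.
rng = np.random.default_rng(0)
a = rng.normal(loc=0.0, scale=1.0, size=200)
b = rng.normal(loc=1.0, scale=1.0, size=200)

# Two-sample t-test (independent samples, equal variances assumed by default).
t_stat, t_p = stats.ttest_ind(a, b)

# Two-sample Kolmogorov-Smirnov test: do a and b share a distribution?
ks_stat, ks_p = stats.ks_2samp(a, b)

# Mann-Whitney U: nonparametric test for two independent samples.
u_stat, u_p = stats.mannwhitneyu(a, b, alternative="two-sided")

# Shapiro-Wilk: is sample a consistent with a normal distribution?
sw_stat, sw_p = stats.shapiro(a)

print(f"t-test p={t_p:.3g}, KS p={ks_p:.3g}, U p={u_p:.3g}, Shapiro p={sw_p:.3g}")
```

scipy.stats.wilcoxon, by contrast, is for paired samples, which is why the independent-samples case above uses mannwhitneyu.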

Time series

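Category Python
AR statsmodels.tsa.ar_model.AutoReg (statsmodels.tsa.ar_model.AR in older releases)
ARIMA statsmodels.tsa.arima.model.ARIMA (statsmodels.tsa.arima_model.ARIMA in older releases)
VAR statsmodels.tsa.vector_ar.var_model.VAR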
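
To show what the AR row estimates, the sketch below simulates an AR(1) series and recovers its coefficient by least squares. It uses NumPy only and is not the statsmodels API. The coefficient 0.6, the series length, and the seed are arbitrary assumptions for illustration.

```python
import numpy as np

# Simulate x[t] = phi * x[t-1] + noise[t], an AR(1) process with zero mean.
rng = np.random.default_rng(42)
phi_true = 0.6
n = 2000

x = np.zeros(n)
noise = rng.normal(size=n)
for t in range(1, n):
    x[t] = phi_true * x[t - 1] + noise[t]

# Least-squares estimate: regress x[t] on x[t-1] without an intercept.
lagged, current = x[:-1], x[1:]
phi_hat = float(lagged @ current / (lagged @ lagged))

print(f"true phi={phi_true}, estimated phi={phi_hat:.3f}")
```

The statsmodels classes in the table do the same kind of fitting for higher orders and also handle intercepts, trends, and forecasting.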
