Statistical summary in R

Summarize the statistics of the data
Visualize the statistics (boxplot and histogram)

View the data:

library("AzureML")
ws <- workspace()
dat <- download.datasets(ws, "Automobile price data (Raw)")
head(dat)

Process the data frame:

cols = c('price','bore','stroke','horsepower','peak.rpm')
## convert ? to an NA
dat[,cols] = lapply(dat[,cols], function(x) ifelse(x == '?', NA, x))
## remove rows with NAs
dat = dat[complete.cases(dat),]
## Convert character columns to numeric
dat[,cols] = lapply(dat[,cols], as.numeric)
str(dat)

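complete.cases(dataframe) returns a logical vector that is TRUE for every row with no NA, so indexing the data frame with it keeps only the complete rows. lapply(dataframe, function) applies a function to each column of the data frame, which is how the selected columns are cleaned and converted in one step. str(dat) prints the structure of the processed data frame: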
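The same cleaning steps can be tried on a small data frame without an Azure ML workspace. The following is a minimal sketch; the data frame `toy` and its values are made up for illustration and are not taken from the automobile dataset.

```r
## Hypothetical toy data with '?' marking missing entries
toy <- data.frame(
  price      = c("13495", "?", "16500", "13950"),
  horsepower = c("111", "154", "?", "102"),
  stringsAsFactors = FALSE
)
cols <- c("price", "horsepower")

## Replace '?' with NA in each listed column
toy[, cols] <- lapply(toy[, cols], function(x) ifelse(x == "?", NA, x))

## complete.cases() yields one TRUE/FALSE per row
keep <- complete.cases(toy)
print(keep)

## Keep the complete rows, then convert the character columns to numeric
toy <- toy[keep, ]
toy[, cols] <- lapply(toy[, cols], as.numeric)
str(toy)
```

Rows 2 and 3 each contain an NA after the replacement, so only rows 1 and 4 survive.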

View the summary of a column of the data frame:

describe = function(df, col){
  tmp = df[, col]
  sumry = summary(tmp)
  nms = names(sumry)
  nms = c(nms, 'std')
  out = c(sumry, sd(tmp))
  names(out) = nms
  out
}
describe(dat, 'horsepower')

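The result contains the minimum, the first quartile Q1 (25%), the median Q2 (50%), the mean, the third quartile Q3 (75%), the maximum, and the standard deviation (std).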
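To see where these numbers come from, the quartiles and the standard deviation can be computed directly. This is a small sketch on a made-up vector `x`, using R's default quantile method:

```r
## Hypothetical sample values
x <- c(2, 4, 4, 5, 7, 9, 11)

## summary() reports Min., 1st Qu., Median, Mean, 3rd Qu., Max.
s <- summary(x)
print(names(s))

## The quartiles Q1, Q2 (median) and Q3, computed explicitly
q <- quantile(x, probs = c(0.25, 0.5, 0.75))
print(q)

## sd() is the sample standard deviation (divides by n - 1)
manual.sd <- sqrt(sum((x - mean(x))^2) / (length(x) - 1))
print(c(sd(x), manual.sd))
```

The median reported by summary() matches the 50% quantile, and sd() agrees with the formula written out by hand.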

Visualize the statistics:

options(repos = c(CRAN = "http://cran.rstudio.com"))
install.packages('gridExtra')
plotstats = function(df, col, bins = 30){
  require(ggplot2)
  require(gridExtra)
  dat = as.factor('')
  ## Compute bin width
  bin.width = (max(df[, col]) - min(df[, col]))/ bins
  ## Plot a histogram
  p1 = ggplot(df, aes_string(col)) +
    geom_histogram(binwidth = bin.width)
  ## A simple boxplot
  p2 = ggplot(df, aes_string(dat, col)) +
    geom_boxplot() + coord_flip() + ylab('')
  ## Now stack the plots
  grid.arrange(p2, p1, nrow = 2)
}

plotstats(dat, 'price')
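The numbers behind the two plots can also be inspected without drawing anything. The sketch below uses base R only and a made-up vector `x`. The bin boundaries chosen here are an assumption for illustration, since ggplot2 may position its bins differently for the same bin width.

```r
## Hypothetical sample values with one large observation
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 5, 10)
bins <- 3

## Same bin-width formula as in plotstats()
bin.width <- (max(x) - min(x)) / bins

## Count the observations per bin (boundaries chosen for this sketch)
breaks <- seq(min(x), max(x), length.out = bins + 1)
counts <- table(cut(x, breaks = breaks, include.lowest = TRUE))
print(counts)

## Boxplot statistics: lower whisker, lower hinge, median,
## upper hinge, upper whisker; points beyond the whiskers are in $out
bp <- boxplot.stats(x)
print(bp$stats)
print(bp$out)
```

Here the value 10 lies more than 1.5 times the interquartile range above the upper hinge, so it is reported as an outlier rather than as the end of the whisker.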
