Using the WeightedCluster R package

阿新 • Published: 2018-11-08


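1. Main purpose of the package

Clustering weighted data (mainly state sequences with case weights) and evaluating the quality of the resulting clusters.

2. Installing the package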

install.packages("WeightedCluster")
library(WeightedCluster)

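3. Loading the data and running the analysis

Load the mvad data set, which follows 712 individuals during the 1990s through their transition from training to work.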

# Load the data
data(mvad)
aggMvad <- wcAggregateCases(mvad[, 17:86]) # find identical state sequences and count how often each occurs
uniqueMvad <- mvad[aggMvad$aggIndex, 17:86] # keep one row per distinct sequence
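To make the aggregation step concrete, here is a minimal base R sketch of the idea on a made-up four-row data frame. It is not the package's implementation; the names toy, agg_index, agg_weights and disagg_index are hypothetical and only mirror the roles of aggIndex and aggWeights used above (WeightedCluster also documents a disaggIndex element for mapping results back to the original rows).

```r
# Toy data (invented for illustration): rows 1, 2 and 4 are identical
toy <- data.frame(t1 = c("A", "A", "B", "A"),
                  t2 = c("B", "B", "B", "B"))

# Build one key per row so that identical rows share the same key
key <- do.call(paste, c(toy, sep = "\r"))
first_row <- !duplicated(key)

# Row number of the first occurrence of each distinct case
agg_index <- which(first_row)

# How many original rows each distinct case stands for (its weight)
agg_weights <- as.vector(table(factor(key, levels = key[first_row])))

# For every original row, the position of its distinct case
disagg_index <- match(key, key[first_row])

print(toy[agg_index, ])  # the distinct cases
print(agg_weights)       # their weights
print(disagg_index)      # mapping back to the original rows
```

Clustering the distinct cases with their weights gives the same grouping as clustering every row, and an index like disagg_index lets a cluster label computed on the distinct cases be copied back to each original individual.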

# Create a state sequence object and calculate the Hamming distance
mvad.seq <- seqdef(uniqueMvad, weights=aggMvad$aggWeights) # build a weighted state sequence object with seqdef()
mvaddist <- seqdist(mvad.seq, method="HAM") # pairwise Hamming distances between the sequences
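As a reminder of what the Hamming distance measures, the sketch below counts the positions at which two equal-length sequences differ. It assumes a unit cost for every mismatch; the function name hamming and the two short sequences are made up for illustration and are not part of TraMineR.

```r
# Hamming distance with unit mismatch cost: number of positions that differ
hamming <- function(a, b) {
  stopifnot(length(a) == length(b))  # only defined for equal lengths
  sum(a != b)
}

# Two invented six-month trajectories
s1 <- c("school", "school", "training", "training", "employment", "employment")
s2 <- c("school", "training", "training", "employment", "employment", "employment")

print(hamming(s1, s2))  # positions 2 and 4 differ
```

Because the comparison is position by position, the measure is sensitive to the timing of states: two trajectories with the same states shifted by a month count as different at every shifted position.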

# Hierarchical clustering
averageClust <- hclust(as.dist(mvaddist), method="average", members=aggMvad$aggWeights) # note the members argument, which passes the case weights to hclust

# Display the hierarchical clustering result
clust4 <- cutree(averageClust, k=4)
seqdplot(mvad.seq, group=clust4, border=NA)

# Cluster with PAM (partitioning around medoids)
pamclust4 <- wcKMedoids(mvaddist, k=4, weights=aggMvad$aggWeights)
# Print the medoid sequences
print(mvad.seq[unique(pamclust4$clustering), ], format="SPS")

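# Compute and display the quality of the hierarchical clustering
avgClustQual <- as.clustrange(averageClust, mvaddist, weights=aggMvad$aggWeights, ncluster=10) # computes several cluster quality measures for solutions with up to 10 clusters (used here for the hierarchical clustering)
plot(avgClustQual) # plot the quality measures
plot(avgClustQual, norm="zscore") # same plot using standardized scores
summary(avgClustQual, max.rank=2) # alternatively, retrieve the two best solutions according to each quality measure
plot(avgClustQual, stat=c("ASWw", "HG", "PBC", "HC")) # plot only the selected measures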

# Measure the quality of a single partition
clustqual4 <- wcClusterQuality(mvaddist, clust4, weights=aggMvad$aggWeights)
clustqual4$stats
sil <- wcSilhouetteObs(mvaddist, clust4, weights=aggMvad$aggWeights, measure="ASWw")
seqIplot(mvad.seq, group=clust4, sortv=sil)

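Supplementary notes
Notes on the hclust function:
1. The "centroid" method is meant to be used with squared Euclidean distances, for example: hclust.centroid <- hclust(dist(cent)^2, method = "cen")

2. The "ward.D2" method is meant to be used with (unsquared) Euclidean distances.

3. The "average" (= UPGMA) method accepts any dissimilarity; in ecology it is often paired with "bray" (= Bray-Curtis) distances.
Bray-Curtis dissimilarity is a measure used in ecology to quantify how much the species composition differs between two sites.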
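For reference, the Bray-Curtis dissimilarity between two sites is the sum of the absolute differences in species counts divided by the sum of all counts at both sites. The sketch below computes it by hand in base R; the function name bray_curtis and the abundance vectors are invented for illustration.

```r
# Bray-Curtis dissimilarity between two vectors of non-negative abundances
bray_curtis <- function(x, y) {
  stopifnot(length(x) == length(y), all(x >= 0), all(y >= 0))
  sum(abs(x - y)) / sum(x + y)
}

# Invented counts of three species at two sites
site_a <- c(10, 0, 5)
site_b <- c(6, 4, 5)

print(bray_curtis(site_a, site_b))  # (4 + 4 + 0) / 30
```

The value is 0 when both sites have identical counts and 1 when they share no species.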

4. About the members argument (quoted from the hclust documentation):
If members != NULL, then d is taken to be a dissimilarity matrix between clusters instead of dissimilarities between singletons and members gives the number of observations per cluster. This way the hierarchical cluster algorithm can be ‘started in the middle of the dendrogram’, e.g., in order to reconstruct the part of the tree above a cut (see examples). Dissimilarities between clusters can be efficiently computed (i.e., without hclust itself) only for a limited number of distance/linkage combinations, the simplest one being squared Euclidean distance and centroid linkage.
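In other words, members lets us start the clustering from an existing partition instead of from single observations, so we can rebuild or redraw just the part of the dendrogram we need, for example the part above a cut.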
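The following sketch, adapted from the example in the hclust help page, shows this "start in the middle" use of members with the built-in USArrests data: the full tree is cut into 10 clusters, the cluster centroids are computed, and the top of the tree is rebuilt from those 10 centroids only.

```r
# Full tree: centroid linkage on squared Euclidean distances
hc <- hclust(dist(USArrests)^2, method = "cen")

# Cut the tree into 10 clusters and compute each cluster's centroid
memb <- cutree(hc, k = 10)
cent <- NULL
for (k in 1:10) {
  cent <- rbind(cent, colMeans(USArrests[memb == k, , drop = FALSE]))
}

# Rebuild only the upper part of the tree, starting from the 10 centroids;
# members tells hclust how many observations each centroid represents
hc1 <- hclust(dist(cent)^2, method = "cen", members = table(memb))

# Compare the full dendrogram with the reconstructed upper part
opar <- par(mfrow = c(1, 2))
plot(hc, labels = FALSE, hang = -1, main = "Original tree")
plot(hc1, labels = FALSE, hang = -1, main = "Restarted from 10 clusters")
par(opar)
```

This works cleanly here because centroid linkage with squared Euclidean distances is one of the combinations for which between-cluster dissimilarities can be computed directly from the centroids.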

Reference documents:
https://cran.r-project.org/web/packages/WeightedCluster/vignettes/WeightedCluster.pdf
https://cran.r-project.org/web/packages/WeightedCluster/vignettes/WeightedClusterPreview.pdf
