R:建立一個數據框以及因子的使用
> patientID <- c(1,2,3,4)
> patientID
[1] 1 2 3 4
> age <- c(25,34,28,52)
> diabetes <- c("Type1", "Type2", "Type1", "Type1")
> status <- c("Poor", "Improved", "Excellent", "Poor")
> patientdata <- data.frame(patientID, age, diabetes, status)
> patientdata
patientID age diabetes status
1 1 25 Type1 Poor
2 2 34 Type2 Improved
3 3 28 Type1 Excellent
4 4 52 Type1 Poor
>
讀取資料框中的元素:
> patientdata$age
[1] 25 34 28 52
> patientdata[c("age")]
age
1 25
2 34
3 28
4 52
> table(patientdata$diabetes, patientdata$status)
Excellent Improved Poor
Type1 1 0 2
Type2 0 1 0
>
因子的使用
> diabetes <- factor(diabetes)
> diabetes
[1] Type1 Type2 Type1 Type1
Levels: Type1 Type2
> status <- factor(status, ordered=TRUE)
> status
[1] Poor Improved Excellent Poor
Levels: Excellent < Improved < Poor
> str(patientdata)
'data.frame': 4 obs. of 4 variables:
$ patientID: num 1 2 3 4
$ age : num 25 34 28 52
$ diabetes : Factor w/ 2 levels "Type1","Type2": 1 2 1 1
$ status : Factor w/ 3 levels "Excellent","Improved",..: 3 2 1 3
> summary(patientdata)
patientID age diabetes status
Min. :1.00 Min. :25.00 Type1:3 Excellent:1
1st Qu.:1.75 1st Qu.:27.25 Type2:1 Improved :1
Median :2.50 Median :31.00 Poor :2
Mean :2.50 Mean :34.75
3rd Qu.:3.25 3rd Qu.:38.50
Max. :4.00 Max. :52.00
>