Removing Duplicate Data in R
阿新 • Published: 2018-12-27
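This summary was compiled from several online sources.
In R, duplicate data is removed with the duplicated function. It returns a logical vector marking each element or row that repeats an earlier one, so negating it with ! keeps only the first occurrence.
To drop rows whose values are identical in every column: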
> test <- data.frame(
x1 = c(1,2,3,4,5,1,3,5),
x2 = c("a","b","c","d","e","a","b","e"),
x3 = c("a","b","c","d","e","a","c","e"))
> test
  x1 x2 x3
1  1  a  a
2  2  b  b
3  3  c  c
4  4  d  d
5  5  e  e
6  1  a  a
7  3  b  c
8  5  e  e
> test[!duplicated(test), ]  # drop rows that are duplicated across all columns
  x1 x2 x3
1  1  a  a
2  2  b  b
3  3  c  c
4  4  d  d
5  5  e  e
7  3  b  c
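To see why the negated index works, it can help to look at the logical vector that duplicated() returns. The sketch below rebuilds the same test data frame. The names dup_flags and deduped are illustrative and do not come from the original post. Base R's unique() is included as an alternative that should return the same rows for a data frame.

```r
# Rebuild the example data frame used above
test <- data.frame(
  x1 = c(1, 2, 3, 4, 5, 1, 3, 5),
  x2 = c("a", "b", "c", "d", "e", "a", "b", "e"),
  x3 = c("a", "b", "c", "d", "e", "a", "c", "e"))

# duplicated() flags every row that repeats an earlier row
dup_flags <- duplicated(test)
print(dup_flags)
# FALSE FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE

# Keep only the rows that were not flagged (rows 6 and 8 are dropped)
deduped <- test[!dup_flags, ]

# unique() on a data frame drops the same rows
same_as_unique <- identical(deduped, unique(test))
print(same_as_unique)
```

Storing the flags in a variable first also makes it easy to count duplicates with sum(dup_flags) before deciding whether to drop them.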
To remove duplicates based on selected columns only (here columns 2 and 3, x2 and x3):
> test[!duplicated(test[,c(2,3)]),]
  x1 x2 x3
1  1  a  a
2  2  b  b
3  3  c  c
4  4  d  d
5  5  e  e
7  3  b  c
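The column-based filter can also be written with column names instead of positions, which tends to survive column reordering better. The sketch below additionally uses the fromLast argument of duplicated() to keep the last occurrence of each key rather than the first. The names key_cols, first_kept and last_kept are illustrative assumptions, not part of the original post.

```r
# Rebuild the example data frame used above
test <- data.frame(
  x1 = c(1, 2, 3, 4, 5, 1, 3, 5),
  x2 = c("a", "b", "c", "d", "e", "a", "b", "e"),
  x3 = c("a", "b", "c", "d", "e", "a", "c", "e"))

# Select the key columns by name rather than by position
key_cols <- c("x2", "x3")

# Default behaviour: the first row of each (x2, x3) combination is kept
first_kept <- test[!duplicated(test[, key_cols]), ]
print(rownames(first_kept))
# "1" "2" "3" "4" "5" "7"

# fromLast = TRUE scans from the end, so the last row of each combination is kept
last_kept <- test[!duplicated(test[, key_cols], fromLast = TRUE), ]
print(rownames(last_kept))
# "2" "3" "4" "6" "7" "8"
```

Which occurrence to keep matters when the non-key columns differ between duplicates, so it is worth checking those columns before choosing either form.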