
Removing Duplicate Data in R

This summary is compiled from several sources found online.

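In R, the function used for removing duplicate data is duplicated. It returns a logical vector that marks every row (or element) repeating an earlier one, so indexing with !duplicated(...) keeps only the first occurrence of each.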

Remove rows that are identical in every column:

> test <- data.frame(
  x1 = c(1,2,3,4,5,1,3,5),
  x2 = c("a","b","c","d","e","a","b","e"),
  x3 = c("a","b","c","d","e","a","c","e"))
> test
  x1 x2 x3
1  1  a  a
2  2  b  b
3  3  c  c
4  4  d  d
5  5  e  e
6  1  a  a
7  3  b  c
8  5  e  e
> test[!duplicated(test),]  # drop rows that are duplicated across all columns
  x1 x2 x3
1  1  a  a
2  2  b  b
3  3  c  c
4  4  d  d
5  5  e  e
7  3  b  c
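
To make the mechanism visible, the sketch below prints the logical vector that duplicated returns for the same data and compares the result with base R's unique, which is a shorter way to get the same rows. The variable names dup_flags, deduped and same_as_unique are illustrative only and do not come from the original post.

```r
# Rebuild the example data frame used above
test <- data.frame(
  x1 = c(1, 2, 3, 4, 5, 1, 3, 5),
  x2 = c("a", "b", "c", "d", "e", "a", "b", "e"),
  x3 = c("a", "b", "c", "d", "e", "a", "c", "e"))

# duplicated() flags each row that repeats an earlier row
dup_flags <- duplicated(test)
print(dup_flags)

# Negating the flags keeps the first occurrence of every distinct row
deduped <- test[!dup_flags, ]

# unique() on a data frame is expected to return the same rows
same_as_unique <- identical(deduped, unique(test))
print(same_as_unique)
```

Rows 6 and 8 are the ones flagged here, because they repeat rows 1 and 5.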

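Remove duplicates based on selected columns only. Here two rows count as duplicates when columns 2 and 3 (x2 and x3) match, regardless of x1: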

> test[!duplicated(test[,c(2,3)]),]
  x1 x2 x3
1  1  a  a
2  2  b  b
3  3  c  c
4  4  d  d
5  5  e  e
7  3  b  c
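
As a small variation on the call above, the columns can be selected by name instead of position, and the fromLast argument of duplicated can be set to TRUE to keep the last occurrence of each combination rather than the first. This is only a sketch; by_name and keep_last are hypothetical names chosen for illustration.

```r
# Rebuild the example data frame used above
test <- data.frame(
  x1 = c(1, 2, 3, 4, 5, 1, 3, 5),
  x2 = c("a", "b", "c", "d", "e", "a", "b", "e"),
  x3 = c("a", "b", "c", "d", "e", "a", "c", "e"))

# Same selection as test[, c(2, 3)], but written with column names
by_name <- test[!duplicated(test[, c("x2", "x3")]), ]
print(by_name)

# fromLast = TRUE scans from the bottom, so the last occurrence is kept
keep_last <- test[!duplicated(test[, c("x2", "x3")], fromLast = TRUE), ]
print(keep_last)
```

Selecting by name tends to be more robust than numeric positions if the column order of the data frame changes later.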