Spark Basics: Learning Scala (Part 5, Collections)
阿新 • Published: 2018-12-13
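Collections
- The Scala collection hierarchy
- List
- LinkedList
- Set
- Functional programming with collections
- Comprehensive functional-programming example: counting the total number of words across several texts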
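The Scala collection hierarchy
- The Scala collection hierarchy is built mainly around Iterable, Seq, Set, and Map. Iterable is the root trait of the collection traits (Scala versions before 2.13 also place Traversable above it). The structure closely resembles Java's collection framework.
- Scala collections come in two kinds, mutable and immutable. The elements of a mutable collection can be changed in place, while an immutable collection cannot be modified once it has been initialized. The two kinds live in the packages scala.collection.mutable and scala.collection.immutable respectively.
- Seq has implementations such as Range, ArrayBuffer, and List. A Range represents a sequence of numbers and is usually created with syntax like "1 to 10". ArrayBuffer is similar to Java's ArrayList.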
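The points above can be sketched in a few lines. This is only an illustrative sketch; the object name CollectionBasics and its member names are made up for this example and are not part of the original notes.

```scala
import scala.collection.mutable.ArrayBuffer

object CollectionBasics {
  // An immutable List: "adding" an element builds a new list and leaves the original untouched.
  val original: List[Int] = List(1, 2, 3)
  val extended: List[Int] = original :+ 4

  // A mutable ArrayBuffer: += changes the buffer in place, much like ArrayList.add in Java.
  def filledBuffer(): ArrayBuffer[Int] = {
    val buffer = ArrayBuffer(1, 2, 3)
    buffer += 4
    buffer
  }

  // "1 to 10" produces an inclusive Range; toList materializes its elements.
  val oneToTen: List[Int] = (1 to 10).toList

  def main(args: Array[String]): Unit = {
    println(original)       // still List(1, 2, 3)
    println(extended)       // List(1, 2, 3, 4)
    println(filledBuffer()) // ArrayBuffer(1, 2, 3, 4)
    println(oneToTen.sum)   // 55
  }
}
```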
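List
- List represents an immutable list
- Creating a List: val list = List(1,2,3,4)
- A List has a head and a tail. head is the first element of the List and tail is the list of all elements after the first: list.head, list.tail
- List has a special :: operator that joins a head and a tail into a new List: 0::list
- Example: use a recursive function to prepend a given prefix to every element of a List and print the result
- If a List has only one element, its head is that element and its tail is Nil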
scala> def decorator(l:List[Int],prefix:String){
     |   if(l != Nil){
     |     println(prefix+l.head)
     |     decorator(l.tail,prefix)
     |   }
     | }
decorator: (l: List[Int], prefix: String)Unit
scala> val list = List(1,2,3,5)
list: List[Int] = List(1, 2, 3, 5)
scala> decorator(list,"hello ")
hello 1
hello 2
hello 3
hello 5
scala> list.head
res1: Int = 1
scala> list.tail
res2: List[Int] = List(2, 3, 5)
scala> 8::list
res3: List[Int] = List(8, 1, 2, 3, 5)
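The same recursion can also be written with pattern matching on Nil and ::, which makes the head/tail split explicit. This is a sketch rather than the author's version: the object name ListRecursion and the function decorate are hypothetical, and the function returns the prefixed strings instead of printing inside the recursion.

```scala
object ListRecursion {
  // Recursively prefix every element; Nil ends the recursion.
  def decorate(l: List[Int], prefix: String): List[String] = l match {
    case Nil          => Nil
    case head :: tail => (prefix + head) :: decorate(tail, prefix)
  }

  def main(args: Array[String]): Unit = {
    // Prints "hello 1", "hello 2", "hello 3", "hello 5", one per line.
    decorate(List(1, 2, 3, 5), "hello ").foreach(println)
  }
}
```

The function is not tail-recursive, so it suits short lists such as the one in the example.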
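LinkedList
- LinkedList represents a mutable list. elem refers to the element at its head and next refers to the rest of the list. Note that scala.collection.mutable.LinkedList has been deprecated since Scala 2.11 and was removed in Scala 2.13.
- val l = scala.collection.mutable.LinkedList(1,2,3,4,5);l.elem;l.next
- Example: use a while loop to multiply every element of the list by 2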
scala> val list = scala.collection.mutable.LinkedList(1,2,3,5,6)
scala> var currentList = list
currentList: scala.collection.mutable.LinkedList[Int] = LinkedList(1, 2, 3, 5, 6)
scala> while(currentList != Nil){
     |   currentList.elem
     |   currentList.elem = currentList.elem * 2
     |   currentList = currentList.next
     | }
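- Example: use a while loop to multiply every other element of the list by 2
- In the transcript that follows, the last iteration steps past the final element onto the empty terminator node, which is why a stray 0 is printed after 18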
scala> :paste
// Entering paste mode (ctrl-D to finish)
val list = scala.collection.mutable.LinkedList(1,2,3,4,5,6,7,8,9,10)
var currentList = list
var first = true
while(currentList != Nil && currentList.next != Nil){
  if(first){currentList.elem = currentList.elem * 2;first = false}
  currentList = currentList.next.next
  currentList.elem = currentList.elem * 2
  println(currentList.elem)
}
// Exiting paste mode, now interpreting.
<pastie>:11: warning: object LinkedList in package mutable is deprecated (since 2.11.0): low-level linked lists are deprecated
val list = scala.collection.mutable.LinkedList(1,2,3,4,5,6,7,8,9,10)
^
6
10
14
18
0
list: scala.collection.mutable.LinkedList[Int] = LinkedList(2, 2, 6, 4, 10, 6, 14, 8, 18, 10)
currentList: scala.collection.mutable.LinkedList[Int] = LinkedList()
first: Boolean = false
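Because LinkedList is deprecated, the "every other element" exercise can be redone with a different collection. The sketch below swaps in ArrayBuffer and index stepping instead of elem/next; the object name EveryOtherDoubled and the function doubleEveryOther are hypothetical names chosen for this illustration.

```scala
import scala.collection.mutable.ArrayBuffer

object EveryOtherDoubled {
  // Doubles the elements at indices 0, 2, 4, ... in place and returns the same buffer.
  def doubleEveryOther(buffer: ArrayBuffer[Int]): ArrayBuffer[Int] = {
    var i = 0
    while (i < buffer.length) {
      buffer(i) = buffer(i) * 2
      i += 2
    }
    buffer
  }

  def main(args: Array[String]): Unit = {
    // Same final contents as the LinkedList transcript: 2, 2, 6, 4, 10, 6, 14, 8, 18, 10
    println(doubleEveryOther(ArrayBuffer(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)))
  }
}
```

Stepping by index stops cleanly at the end of the buffer, so there is no extra write past the last element.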
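Set
- Set represents a collection with no duplicate elements
- Adding an element that is already present has no effect, e.g. val s = Set(1,2,3);s+1;s+4
- A hash-based Set also does not preserve insertion order, so its elements come back in no particular order: val s = new scala.collection.mutable.HashSet[Int]();s+=1;s+=2;s+=5
- LinkedHashSet maintains insertion order with a linked list: val s = new scala.collection.mutable.LinkedHashSet[Int]();s+=1;s+=2;s+=5
- SortedSet automatically keeps its elements sorted: val s = scala.collection.mutable.SortedSet("orange","apple","banana")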
scala> val s = Set(1,2,3)
s: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
scala> s+1
res0: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
scala> s+4
res1: scala.collection.immutable.Set[Int] = Set(1, 2, 3, 4)
scala> val s = new scala.collection.mutable.HashSet[Int]();s+=1;s+=2;s+=5
s: scala.collection.mutable.HashSet[Int] = Set(1, 5, 2)
res2: s.type = Set(1, 5, 2)
scala> val s = new scala.collection.mutable.LinkedHashSet[Int]();s+=1;s+=2;s+=5
s: scala.collection.mutable.LinkedHashSet[Int] = Set(1, 2, 5)
res4: s.type = Set(1, 2, 5)
scala> val s = scala.collection.mutable.SortedSet("orange","apple","banana")
s: scala.collection.mutable.SortedSet[String] = TreeSet(apple, banana, orange)
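The three behaviours shown in the transcript (duplicates ignored, insertion order kept by LinkedHashSet, sorting by SortedSet) can be checked side by side. This is a small sketch; the object name SetBehaviour and its members are made up for the example.

```scala
import scala.collection.mutable

object SetBehaviour {
  // Adding an element that is already present leaves the Set unchanged.
  val base: Set[Int] = Set(1, 2, 3)
  val withDuplicate: Set[Int] = base + 1
  val withNew: Set[Int] = base + 4

  // LinkedHashSet remembers insertion order, so iteration yields 5, 1, 2 here.
  def insertionOrder(): List[Int] = {
    val s = mutable.LinkedHashSet[Int]()
    s += 5
    s += 1
    s += 2
    s += 5 // duplicate, ignored
    s.toList
  }

  // SortedSet keeps elements sorted by the implicit Ordering (alphabetical for String).
  def sortedFruit(): List[String] =
    mutable.SortedSet("orange", "apple", "banana").toList

  def main(args: Array[String]): Unit = {
    println(withDuplicate == base) // true
    println(withNew.size)          // 4
    println(insertionOrder())      // List(5, 1, 2)
    println(sortedFruit())         // List(apple, banana, orange)
  }
}
```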
Functional programming with collections
scala> List("Leo","Jen","Peter","Jack").map("name is " + _)
res7: List[String] = List(name is Leo, name is Jen, name is Peter, name is Jack)
scala> List("Hello World","You Me").flatMap(_.split(" "))
res8: List[String] = List(Hello, World, You, Me)
scala> List("I","have","a","beautiful","house").foreach(println(_))
I
have
a
beautiful
house
scala> List("Leo","Jen","Peter","Jack").zip(List(100,90,75,83))
res10: List[(String, Int)] = List((Leo,100), (Jen,90), (Peter,75), (Jack,83))
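Two details of the operations in this transcript are easy to miss: flatMap behaves like map followed by flatten, and zip stops at the end of the shorter list. The sketch below illustrates both; the object name FunctionalOps and its value names are hypothetical.

```scala
object FunctionalOps {
  val sentences: List[String] = List("Hello World", "You Me")

  // flatMap gives the same result as map followed by flatten.
  val viaFlatMap: List[String] = sentences.flatMap(_.split(" "))
  val viaMapFlatten: List[String] = sentences.map(_.split(" ").toList).flatten

  // zip pairs elements by position and drops the unmatched tail of the longer list.
  val paired: List[(String, Int)] = List("Leo", "Jen", "Peter").zip(List(100, 90))

  def main(args: Array[String]): Unit = {
    println(viaFlatMap)    // List(Hello, World, You, Me)
    println(viaMapFlatten) // List(Hello, World, You, Me)
    println(paired)        // List((Leo,100), (Jen,90))
  }
}
```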
Comprehensive example: counting the total number of words across several texts
scala> val lines1 = lines01.mkString
lines1: String = /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
scala> val lines2 = lines02.mkString
lines2: String = docker run -p 3307:3306 --name mysql3307 -v $PWD/conf:/etc/mysql/conf.d -v $PWD/logs:/logs -v $PWD/data:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=123456 -d mysql:5.7
scala> val lines = List(lines1,lines2)
lines: List[String] = List(/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)", docker run -p 3307:3306 --name mysql3307 -v $PWD/conf:/etc/mysql/conf.d -v $PWD/logs:/logs -v $PWD/data:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=123456 -d mysql:5.7)
scala> lines.flatMap(_.split(" ")).map((_,1)).map(_._2).reduceLeft(_ + _)
res11: Int = 21
scala> lines.flatMap(_.split(" ")).map((_,1)).map(_._2)
res12: List[Int] = List(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
scala> lines.flatMap(_.split(" ")).map((_,1))
res13: List[(String, Int)] = List((/usr/bin/ruby,1), (-e,1), ("$(curl,1), (-fsSL,1), (https://raw.githubusercontent.com/Homebrew/install/master/install)",1), (docker,1), (run,1), (-p,1), (3307:3306,1), (--name,1), (mysql3307,1), (-v,1), ($PWD/conf:/etc/mysql/conf.d,1), (-v,1), ($PWD/logs:/logs,1), (-v,1), ($PWD/data:/var/lib/mysql,1), (-e,1), (MYSQL_ROOT_PASSWORD=123456,1), (-d,1), (mysql:5.7,1))
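The transcript does not show how lines01 and lines02 were loaded, so the sketch below uses in-memory strings as stand-ins and wraps the same flatMap/map pipeline in a function. The object name WordCount, the function totalWords, and the sample texts are assumptions made for illustration. It also uses sum rather than reduceLeft, so that an empty list of texts yields 0 instead of throwing.

```scala
object WordCount {
  // Splits each text on single spaces, maps every word to (word, 1),
  // keeps the counts, and adds them up.
  def totalWords(texts: List[String]): Int =
    texts.flatMap(_.split(" ")).map((_, 1)).map(_._2).sum

  def main(args: Array[String]): Unit = {
    // Hypothetical sample texts standing in for the file contents.
    val texts = List("spark makes big data simple", "scala collections are fun")
    println(totalWords(texts)) // 9
  }
}
```

Since every word contributes exactly 1, the intermediate (word, 1) pairs are not strictly needed for a total; they mirror the transcript and are the usual starting point when counts per word are wanted.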