Spark Python API functions: pyspark API (4)
Table of contents
• 1 countByKey
• 2 join
• 3 leftOuterJoin
• 4 rightOuterJoin
• 5 partitionBy
• 6 combineByKey
• 7 aggregateByKey
• 8 foldByKey
• 9 groupByKey
• 10 flatMapValues
• 11 mapValues
• 12 groupWith
• 13 cogroup
• 14 sampleByKey
• 15 subtractByKey
• 16 subtract
• 17 keyBy
• 18 repartition
• 19 coalesce
• 20 zip
• 21 zipWithIndex
• 22 zipWithUniqueId
countByKey
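countByKey counts how many elements share each key in a pair RDD and returns the counts to the driver as a dictionary. In PySpark the call would be `rdd.countByKey()`. The plain-Python sketch below only models that result so it can run without Spark; `count_by_key` and the sample pairs are hypothetical.

```python
from collections import Counter

def count_by_key(pairs):
    """Model of countByKey: number of (key, value) pairs per key."""
    return dict(Counter(key for key, _ in pairs))

# Hypothetical sample data, not taken from the original article.
pairs = [("B", 1), ("B", 2), ("A", 3), ("A", 4), ("A", 5)]
counts = count_by_key(pairs)
print(counts)
```

Because the result is collected on the driver, this action is best suited to data with a modest number of distinct keys.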
join
... ('A', (1, 8)), ('A', (1, 6)), ('B', (3, 7))]
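join performs an inner join on keys: every value of a key on the left is paired with every value of the same key on the right, and keys present on only one side are dropped. In PySpark this would be `x.join(y)`. The sketch below models it in plain Python; `inner_join` is a hypothetical helper, and the input data is an assumption chosen so that the result contains the pairs in the fragment above.

```python
from collections import defaultdict

def inner_join(left, right):
    """Model of join: pair up values whose keys occur on both sides."""
    right_by_key = defaultdict(list)
    for key, value in right:
        right_by_key[key].append(value)
    return [(key, (v, w)) for key, v in left for w in right_by_key.get(key, [])]

# Assumed inputs (hypothetical), consistent with the output fragment above.
x = [("C", 4), ("B", 3), ("A", 2), ("A", 1)]
y = [("A", 8), ("B", 7), ("A", 6), ("D", 5)]
joined = inner_join(x, y)
print(sorted(joined))
```

Spark does not guarantee the order of the joined pairs, which is why the sketch sorts before printing.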
leftOuterJoin
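leftOuterJoin keeps every key of the left RDD. Where the right RDD has no matching key, the right-hand slot is filled with `None`. In PySpark this would be `x.leftOuterJoin(y)`. A minimal plain-Python model, with a hypothetical helper name and made-up data:

```python
from collections import defaultdict

def left_outer_join(left, right):
    """Model of leftOuterJoin: unmatched left keys are paired with None."""
    right_by_key = defaultdict(list)
    for key, value in right:
        right_by_key[key].append(value)
    return [(key, (v, w)) for key, v in left for w in right_by_key.get(key, [None])]

# Hypothetical sample data.
x = [("C", 4), ("B", 3), ("A", 2), ("A", 1)]
y = [("A", 8), ("B", 7), ("A", 6), ("D", 5)]
left_joined = left_outer_join(x, y)
print(left_joined)
```

Key "C" survives with `(4, None)`, while key "D", which exists only on the right, does not appear.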
rightOuterJoin
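rightOuterJoin is the mirror image: every key of the right RDD is kept, and the left-hand slot is `None` where the left RDD has no match. In PySpark this would be `x.rightOuterJoin(y)`. The sketch below is a plain-Python model under the same assumptions as before (hypothetical helper and data):

```python
from collections import defaultdict

def right_outer_join(left, right):
    """Model of rightOuterJoin: unmatched right keys are paired with None."""
    left_by_key = defaultdict(list)
    for key, value in left:
        left_by_key[key].append(value)
    return [(key, (v, w)) for key, w in right for v in left_by_key.get(key, [None])]

# Hypothetical sample data.
x = [("C", 4), ("B", 3), ("A", 2), ("A", 1)]
y = [("A", 8), ("B", 7), ("A", 6), ("D", 5)]
right_joined = right_outer_join(x, y)
print(right_joined)
```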
partitionBy
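partitionBy redistributes a pair RDD so that each element lands in the partition selected by a partition function applied to its key, taken modulo the number of partitions. In PySpark this would be `rdd.partitionBy(numPartitions, partitionFunc)`, and `rdd.glom().collect()` shows the resulting layout. The sketch models partitions as a list of lists; `partition_by` and the data are hypothetical.

```python
def partition_by(pairs, num_partitions, partition_func=hash):
    """Model of partitionBy: place each pair by partition_func(key) % num_partitions."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in pairs:
        partitions[partition_func(key) % num_partitions].append((key, value))
    return partitions

# Hypothetical sample data with integer keys and an identity partition function.
pairs = [(0, "a"), (1, "b"), (2, "c"), (3, "d"), (4, "e")]
parts = partition_by(pairs, 3, partition_func=lambda key: key)
print(parts)
```

All elements with the same key always end up in the same partition, which is what later key-based operations rely on.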
combineByKey
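combineByKey takes three functions: `createCombiner` turns the first value seen for a key in a partition into an accumulator, `mergeValue` folds further values of that key into the accumulator, and `mergeCombiners` merges accumulators from different partitions. A rough plain-Python model follows; the helper name, the explicit list-of-lists partition layout, and the data are assumptions for illustration.

```python
def combine_by_key(partitions, create_combiner, merge_value, merge_combiners):
    """Model of combineByKey over explicitly listed partitions."""
    merged = {}
    for part in partitions:
        local = {}  # one accumulator per key within this partition
        for key, value in part:
            if key in local:
                local[key] = merge_value(local[key], value)
            else:
                local[key] = create_combiner(value)
        for key, acc in local.items():
            merged[key] = merge_combiners(merged[key], acc) if key in merged else acc
    return merged

# Hypothetical data split into two partitions; collect the values of each key into a list.
partitions = [[("A", 3), ("B", 1), ("A", 5)], [("B", 7), ("A", 4)]]
combined = combine_by_key(
    partitions,
    create_combiner=lambda v: [v],
    merge_value=lambda acc, v: acc + [v],
    merge_combiners=lambda a, b: a + b,
)
print(combined)
```

The accumulator type may differ from the value type, which is what distinguishes this operation from a plain reduce by key.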
aggregateByKey
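aggregateByKey is similar, but starts each per-partition accumulator from a given `zeroValue`: `seqFunc` folds a value into the accumulator inside a partition, and `combFunc` merges accumulators across partitions. The sketch below models this in plain Python to compute a per-key (sum, count); `aggregate_by_key`, the partition layout, and the data are hypothetical.

```python
def aggregate_by_key(partitions, zero_value, seq_func, comb_func):
    """Model of aggregateByKey over explicitly listed partitions."""
    merged = {}
    for part in partitions:
        local = {}
        for key, value in part:
            # Each key starts from zero_value within each partition.
            local[key] = seq_func(local.get(key, zero_value), value)
        for key, acc in local.items():
            merged[key] = comb_func(merged[key], acc) if key in merged else acc
    return merged

# Hypothetical data split into two partitions.
partitions = [[("A", 3), ("B", 1), ("A", 5)], [("B", 7), ("A", 4)]]
sums_counts = aggregate_by_key(
    partitions,
    zero_value=(0, 0),
    seq_func=lambda acc, v: (acc[0] + v, acc[1] + 1),
    comb_func=lambda a, b: (a[0] + b[0], a[1] + b[1]),
)
means = {key: total / n for key, (total, n) in sums_counts.items()}
print(sums_counts, means)
```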
foldByKey
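foldByKey merges the values of each key with a single associative function and a `zeroValue`. The zero value is applied once per key in every partition where the key occurs, so it should be neutral for the function (0 for addition, 1 for multiplication). A plain-Python model, with a hypothetical helper name and data, makes the effect of a non-neutral zero visible:

```python
import operator

def fold_by_key(partitions, zero_value, func):
    """Model of foldByKey: the same func is used within and across partitions."""
    merged = {}
    for part in partitions:
        local = {}
        for key, value in part:
            local[key] = func(local.get(key, zero_value), value)
        for key, acc in local.items():
            merged[key] = func(merged[key], acc) if key in merged else acc
    return merged

# Hypothetical data: product of the values per key, with the neutral zero 1.
products = fold_by_key([[("A", 2), ("B", 3), ("A", 4)], [("A", 5)]], 1, operator.mul)

# A non-neutral zero (10) is added once per partition that contains the key.
shifted = fold_by_key([[("A", 1)], [("A", 2)]], 10, operator.add)
print(products, shifted)
```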
groupByKey
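groupByKey gathers all values of each key into one iterable. In PySpark the grouped values come back as an iterable object, so `rdd.groupByKey().mapValues(list).collect()` is a common way to inspect them. The sketch below models the grouping in plain Python; `group_by_key` and the data are hypothetical.

```python
from collections import defaultdict

def group_by_key(pairs):
    """Model of groupByKey: all values of a key collected into one list."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)

# Hypothetical sample data.
pairs = [("B", 5), ("B", 4), ("A", 3), ("A", 2), ("A", 1)]
grouped = group_by_key(pairs)
print(grouped)
```

When the goal is an aggregate such as a sum, the combining operations described earlier usually move less data, since they can merge values before the shuffle.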
flatMapValues
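flatMapValues applies a function to each value, expects an iterable back, and emits one (key, item) pair per returned item while leaving the key untouched. In PySpark this would be `rdd.flatMapValues(f)`. A minimal plain-Python model with hypothetical names and data:

```python
def flat_map_values(pairs, func):
    """Model of flatMapValues: one output pair per item returned by func(value)."""
    return [(key, item) for key, value in pairs for item in func(value)]

# Hypothetical sample data: split comma-separated values.
pairs = [("fruits", "apple,pear"), ("veg", "kale")]
flattened = flat_map_values(pairs, lambda value: value.split(","))
print(flattened)
```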
mapValues
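mapValues applies a function to every value and keeps the key as it is, producing exactly one output pair per input pair. In PySpark this would be `rdd.mapValues(f)`. The plain-Python model below uses a hypothetical helper and made-up data:

```python
def map_values(pairs, func):
    """Model of mapValues: transform the value, keep the key."""
    return [(key, func(value)) for key, value in pairs]

# Hypothetical sample data: sum each tuple of numbers.
pairs = [("A", (1, 2, 3)), ("B", (4, 5))]
summed = map_values(pairs, sum)
print(summed)
```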
groupWith
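groupWith groups several pair RDDs by key at once: for each key it yields a tuple with one collection of values per input RDD, empty where that RDD lacks the key. In PySpark this would be `x.groupWith(y, z)`. The sketch below models the result with lists; `group_with` and the three datasets are hypothetical.

```python
def group_with(*datasets):
    """Model of groupWith: per key, one list of values for each input dataset."""
    keys = []
    for dataset in datasets:
        for key, _ in dataset:
            if key not in keys:
                keys.append(key)
    return {
        key: tuple([v for k, v in dataset if k == key] for dataset in datasets)
        for key in keys
    }

# Hypothetical sample data.
x = [("A", 1), ("B", 2)]
y = [("A", 10), ("C", 30)]
z = [("B", 200), ("A", 100), ("A", 101)]
grouped_all = group_with(x, y, z)
print(grouped_all)
```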
cogroup
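cogroup is the two-RDD form of the same idea: for every key found in either RDD it returns a pair of value collections, one from each side. In PySpark this would be `x.cogroup(y)`. A plain-Python model with a hypothetical helper and data:

```python
from collections import defaultdict

def cogroup(left, right):
    """Model of cogroup: key -> (values from left, values from right)."""
    grouped = defaultdict(lambda: ([], []))
    for key, value in left:
        grouped[key][0].append(value)
    for key, value in right:
        grouped[key][1].append(value)
    return dict(grouped)

# Hypothetical sample data.
x = [("C", 4), ("B", 3), ("A", 2), ("A", 1)]
y = [("A", 8), ("B", 7), ("A", 6), ("D", 5)]
cogrouped = cogroup(x, y)
print(cogrouped)
```

The join variants can all be derived from this structure by pairing up the two lists of each key in different ways.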
sampleByKey
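sampleByKey draws a stratified sample: `fractions` maps each key to its own sampling rate, and an optional `seed` makes the draw reproducible. In PySpark this would be `rdd.sampleByKey(withReplacement, fractions, seed)`. The sketch below models only the case without replacement as an independent coin flip per element, which is an assumption about the mechanics made for illustration; the helper name and data are hypothetical.

```python
import random

def sample_by_key(pairs, fractions, seed=None):
    """Model of sampleByKey without replacement: keep each pair with probability fractions[key]."""
    rng = random.Random(seed)
    return [(key, value) for key, value in pairs if rng.random() < fractions[key]]

# Hypothetical sample data: five elements per key.
pairs = [("A", i) for i in range(5)] + [("B", i) for i in range(5)]
only_a = sample_by_key(pairs, {"A": 1.0, "B": 0.0}, seed=42)
half = sample_by_key(pairs, {"A": 0.5, "B": 0.5}, seed=1)
print(only_a, half)
```

The fractions are probabilities, so the size of the sample for a key is only approximately fraction times the number of elements.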
subtractByKey
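subtractByKey keeps the pairs of the left RDD whose key does not occur in the right RDD; the values on the right are ignored. In PySpark this would be `x.subtractByKey(y)`. A minimal plain-Python model with hypothetical names and data:

```python
def subtract_by_key(left, right):
    """Model of subtractByKey: drop left pairs whose key appears on the right."""
    right_keys = {key for key, _ in right}
    return [(key, value) for key, value in left if key not in right_keys]

# Hypothetical sample data.
x = [("C", 1), ("B", 2), ("A", 3), ("A", 4)]
y = [("A", 5), ("D", 6), ("A", 7), ("D", 8)]
remaining = subtract_by_key(x, y)
print(remaining)
```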
subtract
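subtract compares whole elements rather than keys: an element of the left RDD is removed only if an equal element exists in the right RDD. In PySpark this would be `x.subtract(y)`. The plain-Python model below assumes hashable elements; `subtract` as a free function and the data are hypothetical.

```python
def subtract(left, right):
    """Model of subtract: drop left elements that also occur, as a whole, on the right."""
    right_elements = set(right)
    return [element for element in left if element not in right_elements]

# Hypothetical sample data: only ("A", 2) matches as a complete element.
x = [("C", 4), ("B", 3), ("A", 2), ("A", 1)]
y = [("C", 8), ("A", 2), ("D", 1)]
difference = subtract(x, y)
print(difference)
```

Note that ("C", 4) stays even though key "C" occurs on the right, because the value differs.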
keyBy
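keyBy turns a plain RDD into a pair RDD by computing a key for each element and emitting (key, element). In PySpark this would be `rdd.keyBy(f)`. A minimal plain-Python model with a hypothetical helper and data:

```python
def key_by(elements, func):
    """Model of keyBy: pair every element with the key func(element)."""
    return [(func(element), element) for element in elements]

# Hypothetical sample data: key each word by its first letter.
words = ["apple", "avocado", "banana"]
keyed = key_by(words, lambda word: word[0])
print(keyed)
```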
repartition
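repartition changes the number of partitions, up or down, by shuffling the data. In PySpark this would be `rdd.repartition(numPartitions)`. The sketch below only illustrates the idea with lists of lists; the round-robin placement is an assumption made for the example, since Spark does not promise which element lands in which partition, and `repartition` as a free function is hypothetical.

```python
def repartition(partitions, num_partitions):
    """Illustrative model: redistribute all elements over num_partitions lists."""
    elements = [element for part in partitions for element in part]
    new_partitions = [[] for _ in range(num_partitions)]
    for index, element in enumerate(elements):
        # Round-robin placement, chosen only for illustration.
        new_partitions[index % num_partitions].append(element)
    return new_partitions

# Hypothetical layout: two partitions spread over three.
before = [[1, 2, 3], [4, 5]]
after = repartition(before, 3)
print(after)
```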
coalesce
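coalesce reduces the number of partitions and, by default (`shuffle=False`), does so by merging existing partitions rather than shuffling individual elements. In PySpark this would be `rdd.coalesce(numPartitions)`. The sketch below models this with lists of lists; which partitions get merged together is an assumption for illustration, and the helper name and data are hypothetical.

```python
def coalesce(partitions, num_partitions):
    """Illustrative model: merge whole partitions into fewer ones, without splitting any."""
    if num_partitions >= len(partitions):
        # Without a shuffle the partition count cannot grow.
        return [list(part) for part in partitions]
    merged = [[] for _ in range(num_partitions)]
    for index, part in enumerate(partitions):
        # Neighbouring partitions share a target; this grouping is an assumption.
        merged[index * num_partitions // len(partitions)].extend(part)
    return merged

# Hypothetical layout: four partitions merged into two.
before = [[1], [2, 3], [4], [5, 6]]
after = coalesce(before, 2)
print(after)
```

Because elements stay with their original partition, coalesce is typically cheaper than repartition when only shrinking the partition count.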
zip
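zip pairs the first element of one RDD with the first element of another, the second with the second, and so on. Spark requires both RDDs to have the same number of partitions and the same number of elements in each partition. In PySpark this would be `x.zip(y)`. The sketch below reduces that requirement to a length check on flat lists, which is a simplification; `zip_datasets` and the data are hypothetical.

```python
def zip_datasets(first, second):
    """Model of zip: pair elements by position, rejecting mismatched sizes."""
    if len(first) != len(second):
        raise ValueError("both datasets must contain the same number of elements")
    return list(zip(first, second))

# Hypothetical sample data.
letters = ["B", "A", "A"]
numbers = [1, 4, 9]
zipped = zip_datasets(letters, numbers)
print(zipped)
```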
zipWithIndex
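zipWithIndex pairs each element with a consecutive index starting at 0. The ordering goes by partition first and by position inside the partition second. In PySpark this would be `rdd.zipWithIndex()`. A plain-Python model over an explicit, hypothetical partition layout:

```python
def zip_with_index(partitions):
    """Model of zipWithIndex: consecutive indices, partition by partition."""
    elements = [element for part in partitions for element in part]
    return [(element, index) for index, element in enumerate(elements)]

# Hypothetical layout: three partitions.
partitions = [["a", "b"], ["c"], ["d", "e"]]
indexed = zip_with_index(partitions)
print(indexed)
```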
zipWithUniqueId
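zipWithUniqueId pairs each element with an id that is unique but not necessarily consecutive: with n partitions, the elements of partition k receive the ids k, n+k, 2n+k, and so on. In PySpark this would be `rdd.zipWithUniqueId()`. The sketch below models that numbering over an explicit partition layout; the helper name and data are hypothetical.

```python
def zip_with_unique_id(partitions):
    """Model of zipWithUniqueId: item i of partition k gets id i * n + k."""
    n = len(partitions)
    return [
        (element, position * n + k)
        for k, part in enumerate(partitions)
        for position, element in enumerate(part)
    ]

# Hypothetical layout: three partitions.
partitions = [["a", "b"], ["c"], ["d", "e"]]
with_ids = zip_with_unique_id(partitions)
print(with_ids)
```

Since each id depends only on the element's own partition and position, no partition needs to know the sizes of the others, at the cost of gaps in the id sequence.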