Spark: From Secondary Sort to Multi-Column Sort
阿新 • Published: 2019-01-28
Sample data:
1 5 6 9
1 5 6 7
1 5 6 8
2 4 7 5
3 6 3 3
1 5 3 3
1 5 2 4
2 4 3 7
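Requirement: sort by the first column; if the first column is equal, sort by the second column, and so on through the remaining columns.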
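To make the rule concrete before Spark is involved, here is a minimal sketch using plain Scala collections on a few of the sample rows. The names `LexicographicDemo`, `rowOrdering`, `parse`, and `sortLines` are hypothetical helpers introduced only for this illustration, and the sketch assumes every column is an integer.

```scala
object LexicographicDemo {
  // Hypothetical helper: compares two rows column by column.
  // The first column that differs decides the order; if one row is a
  // prefix of the other, the shorter row sorts first.
  val rowOrdering: Ordering[Seq[Int]] = new Ordering[Seq[Int]] {
    def compare(a: Seq[Int], b: Seq[Int]): Int =
      a.iterator.zip(b.iterator)
        .map { case (x, y) => Integer.compare(x, y) }
        .find(_ != 0)
        .getOrElse(Integer.compare(a.length, b.length))
  }

  // Assumption: columns are separated by whitespace and are all integers.
  def parse(line: String): Seq[Int] =
    line.trim.split("\\s+").map(_.toInt).toSeq

  // Sorts the original lines by their parsed columns.
  def sortLines(lines: Seq[String]): Seq[String] =
    lines.sortBy(parse)(rowOrdering)

  def main(args: Array[String]): Unit = {
    val sample = Seq("1 5 6 8", "2 4 7 5", "3 6 3 3", "1 5 3 3", "1 5 2 4", "2 4 3 7")
    sortLines(sample).foreach(println)
  }
}
```

For these six rows the sorted order is 1 5 2 4, 1 5 3 3, 1 5 6 8, 2 4 3 7, 2 4 7 5, 3 6 3 3: the three rows starting with 1 5 are separated only by the third column.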
Scala implementation:
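import org.apache.spark.{SparkConf, SparkContext}

// Sort key that wraps all columns of one row; rows are compared column by column.
class SeveralSortKey(val arr: Array[String]) extends Ordered[SeveralSortKey] with Serializable {
  // Override compare from Ordered: the first column that differs decides the order.
  override def compare(that: SeveralSortKey): Int =
    arr.iterator.zip(that.arr.iterator)
      .map { case (a, b) => Integer.compare(a.toInt, b.toInt) }
      .find(_ != 0)
      // All shared columns are equal: identical rows compare as 0.
      .getOrElse(Integer.compare(arr.length, that.arr.length))
}

object SortDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("soft").setMaster("local")
    val sc = new SparkContext(conf)
    val lines = sc.textFile("f://sort.txt")
    // Pair each line with its sort key: (key, original line).
    val pairs = lines.map(line => (new SeveralSortKey(line.split(" ")), line))
    val sortedPairs = pairs.sortByKey()
    val sortedLines = sortedPairs.map(_._2)
    // collect() returns the rows to the driver in sorted order before printing.
    sortedLines.collect().foreach(println)
    sc.stop()
  }
}

Note: compare must return 0 when two rows are identical. A compare that reads arr(i + 1) after finding an equal column runs past the last column on identical rows and throws an ArrayIndexOutOfBoundsException.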
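The failure on identical rows can be reproduced without Spark. The sketch below is an illustration under stated assumptions, not the post's code: `IdenticalRowsDemo`, `lookAheadCompare`, and `boundedCompare` are hypothetical names, rows are taken as already-parsed integer arrays of equal length, and `lookAheadCompare` only mimics the pattern of peeking at column i + 1 after an equal column.

```scala
import scala.util.Try

object IdenticalRowsDemo {
  // Mimics the look-ahead pattern: after an equal column, read column i + 1.
  // On identical rows the loop reaches the last column and reads past the end.
  def lookAheadCompare(a: Array[Int], b: Array[Int]): Int = {
    var result = -1
    var i = 0
    var done = false
    while (i < a.length && !done) {
      if (a(i) != b(i)) {
        result = Integer.compare(a(i), b(i))
        done = true
      } else {
        result = Integer.compare(a(i + 1), b(i + 1)) // overruns when i is the last index
      }
      i += 1
    }
    result
  }

  // Bounded variant: only inspects columns that exist and returns 0 when all are equal.
  // Assumption: both rows have the same number of columns.
  def boundedCompare(a: Array[Int], b: Array[Int]): Int =
    a.iterator.zip(b.iterator)
      .map { case (x, y) => Integer.compare(x, y) }
      .find(_ != 0)
      .getOrElse(0)

  def main(args: Array[String]): Unit = {
    val row = Array(1, 5, 6, 8)
    // Try captures the exception thrown by the look-ahead version on identical rows.
    println(Try(lookAheadCompare(row, row)).isFailure)
    println(boundedCompare(row, row))
  }
}
```

Returning 0 for equal keys also matters for the sort itself: a comparison that never reports equality is not a consistent ordering, so duplicates in the input should be treated as an ordinary case.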