1. 程式人生 > >Storm Trident示例ReducerAggregator

Storm Trident示例ReducerAggregator

bug thread 一個 fields pan part 分區合並 use core

ReducerAggregator首先在輸入流上運行全局重新分區操作(global)將同一批次的所有分區合並到一個分區中,然後在每個批次上運行的聚合功能,針對Batch操作。

省略部分代碼,省略部分可參考:https://blog.csdn.net/nickta/article/details/79666918

FixedBatchSpout spout = new FixedBatchSpout(new Fields("user", "score"), 3,    
                new Values("nickt1", 4),   
                new Values("nickt2", 7),    
                
new Values("nickt3", 8), new Values("nickt4", 9), new Values("nickt5", 7), new Values("nickt6", 11), new Values("nickt7", 5) ); spout.setCycle(false); TridentTopology topology = new TridentTopology(); topology.newStream(
"spout1", spout) .shuffle() .each(new Fields("user", "score"),new Debug("shuffle print:")) .parallelismHint(5) .aggregate(new Fields("score"), new ReducerAggregator<Integer>() { @Override
//每一個batch,初始調用 1次,空的batch也會調用1次 public Integer init() { return 0; } @Override //batch看的每個tuple調用1次 public Integer reduce(Integer curr, TridentTuple tuple) { return curr + tuple.getIntegerByField("score"); } }, new Fields("sum"))//對每個batch中的score求和 .each(new Fields("sum"),new Debug("sum print:")) .parallelismHint(5);

輸出:

[partition4-Thread-152-b-0-executor[37 37]]> DEBUG(shuffle print:): [nickt3, 8]
[partition0-Thread-68-b-0-executor[33 33]]> DEBUG(shuffle print:): [nickt1, 4]
[partition2-Thread-64-b-0-executor[35 35]]> DEBUG(shuffle print:): [nickt2, 7]
[partition1-Thread-80-b-1-executor[39 39]]> DEBUG(sum print:): [19]
[partition4-Thread-152-b-0-executor[37 37]]> DEBUG(shuffle print:): [nickt4, 9]
[partition2-Thread-64-b-0-executor[35 35]]> DEBUG(shuffle print:): [nickt5, 7]
[partition2-Thread-64-b-0-executor[35 35]]> DEBUG(shuffle print:): [nickt6, 11]
[partition2-Thread-66-b-1-executor[40 40]]> DEBUG(sum print:): [27]
[partition0-Thread-68-b-0-executor[33 33]]> DEBUG(shuffle print:): [nickt7, 5]
[partition3-Thread-56-b-1-executor[41 41]]> DEBUG(sum print:): [5]

Storm Trident示例ReducerAggregator