2018-08-06 issue: Installing MRUnit and unit testing MapReduce
mrunit-1.1.0-hadoop2.jar
Third-party dependencies
MRUnit\apache-mrunit-1.1.0-hadoop1-bin\lib
II. Configuring MRUnit unit testing in an existing project
1. Create a new user library
2. Add MRUnitLib to the MR project.
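3. Resolve the jar conflict
MRUnitLib contains mockito-core-1.9.5.jar, which conflicts with E:\depslib\hadoop-2.4.1\share\hadoop\common\lib\mockito-all-1.8.5.jar, so mockito-all-1.8.5.jar must be removed from the mrlib library.
At this point the MRUnit unit-testing environment for the project is set up.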
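If it is unclear which Mockito jar actually wins on the classpath after the cleanup, a small check along the following lines can help. This is only a sketch: the class name `JarLocator` and the helper `locate` are made up for illustration, and it assumes the class of interest is `org.mockito.Mockito`, which both Mockito jars provide.

```java
import java.security.CodeSource;

class JarLocator {
    /** Returns where a class was loaded from, or a short note when that is unknown. */
    static String locate(String className) {
        try {
            Class<?> cls = Class.forName(className);
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            // Classes that ship with the JDK usually have no code source.
            return source == null
                    ? "(no code source, e.g. a JDK class)"
                    : source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not loadable: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Run this with the project's classpath to see which jar supplies Mockito.
        System.out.println("org.mockito.Mockito -> " + locate("org.mockito.Mockito"));
    }
}
```

Run with the same classpath as the tests, the printed location should point at a single Mockito jar; if it points at the one that was meant to be removed, the library configuration still needs attention.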
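III. Running MRUnit unit tests
Here we use the WordCount MR program written earlier as the example and test whether the Mapper stage, the Reducer stage, and the whole Job behave correctly.
Test case code
1. Mapper stage test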
/**
 * MRUnit unit test for WordCountMapper
 * @throws Exception
 */
@Test
public void mapperTest() throws Exception {
    // Create the WordCountMapper instance under test
    WordCountMapper wordCountMapper = new WordCountMapper();
    // Create the map driver. MapDriver<K1, V1, K2, V2> corresponds to Mapper<K1, V1, K2, V2>; the constructor argument is the Mapper to run
    MapDriver<LongWritable, Text, Text, IntWritable> mapDriver = new MapDriver<>(wordCountMapper);
    // Specify the map input
    mapDriver.withInput(new LongWritable(1), new Text("Hello word"))
             .withInput(new LongWritable(3), new Text("Hello java java is a good language"));
    // Specify the map output, i.e. the records we expect, in the order they are emitted
    mapDriver.withOutput(new Text("Hello"), new IntWritable(1))
             .withOutput(new Text("word"), new IntWritable(1))
             .withOutput(new Text("Hello"), new IntWritable(1))
             .withOutput(new Text("java"), new IntWritable(1))
             .withOutput(new Text("java"), new IntWritable(1))
             .withOutput(new Text("is"), new IntWritable(1))
             .withOutput(new Text("a"), new IntWritable(1))
             .withOutput(new Text("good"), new IntWritable(1))
             .withOutput(new Text("language"), new IntWritable(1));
    // Run the test: the expected output is compared with what the Mapper actually emits; a mismatch fails the test, a match passes
    mapDriver.runTest();
}
Test result:
The test passes.
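To see where the nine expected records come from, the mapper's behaviour can be imitated in plain Java without any Hadoop types. The sketch below assumes that WordCountMapper splits each line on whitespace and emits (word, 1) per token; the post does not show the mapper's source, so that is an assumption, and `MapperExpectation` is a hypothetical name.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class MapperExpectation {
    /** Emits one (word, 1) pair per whitespace-separated token, in input order. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new SimpleEntry<String, Integer>(word, 1));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Same two input lines as the MapDriver test.
        List<Map.Entry<String, Integer>> all = new ArrayList<>();
        all.addAll(map("Hello word"));
        all.addAll(map("Hello java java is a good language"));
        for (Map.Entry<String, Integer> e : all) {
            System.out.println("(" + e.getKey() + ", " + e.getValue() + ")");
        }
    }
}
```

The mapper does no aggregation, so repeated words such as "Hello" and "java" appear once per occurrence, which is why the test lists them more than once.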
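Now suppose we change the Mapper's output slightly and rerun the test case to see whether it still passes.
--Part of the mapper program; here we change
context.write(new Text(word), new IntWritable(1));
to
context.write(new Text(word), new IntWritable(2));
Run the unit test again:
The test fails. Part of the output: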
Missing expected output (Hello, 1) at position 0, got (Hello, 2).
......
Missing expected output (language, 1) at position 8, got (language, 2).
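This output shows that the Mapper's actual output does not match the output the test case expects. The line
Missing expected output (Hello, 1) at position 0, got (Hello, 2).
means that we expected (Hello, 1) but the Mapper emitted (Hello, 2). With this information we can work out whether the problem lies in the test case or in the Mapper's business logic. Here it is clearly caused by our deliberate change to the Mapper.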
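The failure message is easier to read once it is clear that expected and actual records are compared position by position. The following sketch imitates that idea in plain Java. It is not MRUnit's implementation, the message wording is made up, and `PositionalDiff` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PositionalDiff {
    /** Compares two record lists position by position and describes each mismatch. */
    static List<String> diff(List<String> expected, List<String> actual) {
        List<String> problems = new ArrayList<>();
        int n = Math.max(expected.size(), actual.size());
        for (int i = 0; i < n; i++) {
            String want = i < expected.size() ? expected.get(i) : "(nothing)";
            String got = i < actual.size() ? actual.get(i) : "(nothing)";
            if (!want.equals(got)) {
                problems.add("position " + i + ": expected " + want + ", got " + got);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Expected records versus what a mapper writing IntWritable(2) would emit.
        List<String> expected = Arrays.asList("(Hello, 1)", "(word, 1)");
        List<String> actual = Arrays.asList("(Hello, 2)", "(word, 2)");
        for (String problem : diff(expected, actual)) {
            System.out.println(problem);
        }
    }
}
```

Because every emitted value changed from 1 to 2, every position is reported, which matches the pattern of one message per record in the log above.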
2. Reducer stage test
/**
 * MRUnit unit test for WordCountReducer
 * @throws Exception
 */
@Test
public void reducerTest() throws Exception {
    // Create the WordCountReducer instance under test
    WordCountReducer wordCountReducer = new WordCountReducer();
    // Create the reduce driver. ReduceDriver<K3, V3, K4, V4> corresponds to Reducer<Text, IntWritable, Text, IntWritable>; the constructor argument is the Reducer to run
    ReduceDriver<Text, IntWritable, Text, IntWritable> reduceDriver = new ReduceDriver<>(wordCountReducer);
    // Specify the reduce input
    // Build the V3 value lists, one list per key
    List<IntWritable> v3list1 = new ArrayList<IntWritable>();
    v3list1.add(new IntWritable(1));
    v3list1.add(new IntWritable(1));
    List<IntWritable> v3list2 = new ArrayList<IntWritable>();
    v3list2.add(new IntWritable(1));
    v3list2.add(new IntWritable(1));
    v3list2.add(new IntWritable(1));
    reduceDriver.withInput(new Text("Hello"), v3list1)
                .withInput(new Text("Java"), v3list2);
    // Specify the expected reduce output
    reduceDriver.withOutput(new Text("Hello"), new IntWritable(2));
    reduceDriver.withOutput(new Text("Java"), new IntWritable(3));
    // Run the Reduce unit test
    reduceDriver.runTest();
}
Run the unit test:
The test passes.
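The expected values 2 and 3 follow directly from summing each key's value list. A plain-Java sketch of that arithmetic is given below, assuming WordCountReducer simply adds up the counts it receives (the reducer source is not shown in this post); `ReducerExpectation` is a hypothetical name.

```java
import java.util.Arrays;
import java.util.List;

class ReducerExpectation {
    /** Sums the counts collected for one key, as a word-count reducer would. */
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Same value lists as v3list1 and v3list2 in the ReduceDriver test.
        System.out.println("Hello -> " + reduce(Arrays.asList(1, 1)));
        System.out.println("Java -> " + reduce(Arrays.asList(1, 1, 1)));
    }
}
```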
Next we change
reduceDriver.withOutput(new Text("Hello"), new IntWritable(2));
to
reduceDriver.withOutput(new Text("Hello"), new IntWritable(1));
to deliberately simulate a faulty test case.
Rerunning the test now fails with the following log:
2018-08-01 16:11:28,769 ERROR [main] mrunit.TestDriver (Errors.java:record(57)) - Missing expected output (Hello, 1) at position 0, got (Hello, 2).
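The log shows that the test case expected (Hello, 1) but the Reducer emitted (Hello, 2), so the test fails. We can use this behaviour to examine both the Reducer and the test case and locate the problem. Because we deliberately modified the test case here, the fault lies in the test case.
3. Job test: testing the Mapper and Reducer together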
/**
 * MRUnit unit test for the WordCount job as a whole
 * @throws Exception
 */
@Test
public void jobTest() throws Exception {
    // Create the WordCountMapper instance under test
    WordCountMapper wordCountMapper = new WordCountMapper();
    // Create the WordCountReducer instance under test
    WordCountReducer wordCountReducer = new WordCountReducer();
    // Create the driver. MapReduceDriver<K1, V1, K2, V2, K4, V4> corresponds to Mapper<K1, V1, K2, V2> and Reducer<K2, V2, K4, V4>; the constructor arguments are the mapper and reducer to run
    MapReduceDriver<LongWritable, Text, Text, IntWritable, Text, IntWritable> driver = new MapReduceDriver<>(wordCountMapper, wordCountReducer);
    // Specify the Mapper input
    driver.withInput(new LongWritable(1), new Text("Hello word"))
          .withInput(new LongWritable(1), new Text("Hello java java is a good language"));
    // Specify the Reducer output, i.e. the records we expect
    driver.withOutput(new Text("Hello"), new IntWritable(2))
          .withOutput(new Text("word"), new IntWritable(1))
          .withOutput(new Text("java"), new IntWritable(2))
          .withOutput(new Text("is"), new IntWritable(1))
          .withOutput(new Text("a"), new IntWritable(1))
          .withOutput(new Text("good"), new IntWritable(1))
          .withOutput(new Text("language"), new IntWritable(1));
    // Run the test: the expected output is compared with what the Reducer stage actually emits; a mismatch fails the test, a match passes
    driver.runTest();
}
Test result:
The test fails with the following log:
2018-08-01 16:24:33,724 ERROR [main] mrunit.TestDriver (Errors.java:record(57)) - Missing expected output (word, 1) at position 1, got (a, 1).
2018-08-01 16:24:33,725 ERROR [main] mrunit.TestDriver (Errors.java:record(57)) - Missing expected output (java, 2) at position 2, got (good, 1).
2018-08-01 16:24:33,725 ERROR [main] mrunit.TestDriver (Errors.java:record(57)) - Missing expected output (a, 1) at position 4, got (java, 2).
2018-08-01 16:24:33,726 ERROR [main] mrunit.TestDriver (Errors.java:record(57)) - Missing expected output (good, 1) at position 5, got (language, 1).
2018-08-01 16:24:33,726 ERROR [main] mrunit.TestDriver (Errors.java:record(57)) - Missing expected output (language, 1) at position 6, got (word, 1).
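Looking at the log, the line
Missing expected output (word, 1) at position 1, got (a, 1).
means that at the position where we expected (word, 1), the actual output was (a, 1), which does not match our expectation.
So we need to work out whether the problem lies in the test case, the Mapper, or the Reducer. A closer look shows that the output of a MapReduce job is sorted by key: MapReduce sorts the keys using the default ordering of the key type. For Text keys that ordering is byte-wise, so the uppercase "Hello" sorts before every lowercase word. The expected keys in our test case, however, were not listed in sorted order.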
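The key order that comes out of the job can be previewed in plain Java. The sketch below counts words into a `TreeMap`, whose natural `String` ordering agrees with the byte-wise ordering of Text for ASCII-only keys like the ones used here. That equivalence is an assumption that does not hold for arbitrary Unicode, and `SortedWordCount` is a hypothetical name.

```java
import java.util.Map;
import java.util.TreeMap;

class SortedWordCount {
    /** Counts words across all lines and keeps the keys in sorted order. */
    static Map<String, Integer> count(String... lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Same two input lines as the MapReduceDriver test.
        System.out.println(count("Hello word", "Hello java java is a good language"));
    }
}
```

The keys come out as Hello, a, good, is, java, language, word: the capital "H" sorts ahead of all lowercase letters, and the remaining words follow alphabetically.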
Next we arrange the expected output in MapReduce's default sort order, as follows:
// Specify the Reducer output, i.e. the records we expect
driver.withOutput(new Text("Hello"), new IntWritable(2))
.withOutput(new Text("a"), new IntWritable(1))
.withOutput(new Text("good"), new IntWritable(1))
.withOutput(new Text("is"), new IntWritable(1))
.withOutput(new Text("java"), new IntWritable(2))
.withOutput(new Text("language"), new IntWritable(1))
.withOutput(new Text("word"), new IntWritable(1));
The expected keys now follow MapReduce's default sort order,
and rerunning the unit test passes.