Topic: How to Write Tests for MapReduce
阿新 • Published: 2019-02-07
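Whether to write tests is a personal choice. For me, writing tests is not about looking impressive; it is about having more confidence in the code.
Testing MapReduce is admittedly less convenient, but there are ways to do it. The material below is adapted mainly from the MRUnit Tutorial. The Tutorial also covers testing Counters (that is, how to read a Counter) and passing parameters through the Configuration (how to obtain the conf object inside the mock).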
1. The Basics - JUnit
If you do not know JUnit there is not much more I can say; fortunately, very few people don't.
Maven coordinates: junit:junit
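import org.junit.*;
import static org.junit.Assert.*;

public class TestCases {
    @Test
    public void testXXX() {
        // assertEquals takes the expected value first, then the actual value
        assertEquals(1, 1);
    }
}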
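This part is the foundation for functional testing of the code. In general, anything that does not depend much on the environment can be tested at the function level with JUnit. Only once that is done is it worth moving on to the Mapper and Reducer tests below.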
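As an illustration of function-level testing, the sketch below pulls the word-splitting logic of a word-count job into a plain static method that can be checked without any Hadoop classes. The names `Tokenizer` and `tokenize` are hypothetical and are not taken from the original project. The check is written as a plain condition so that the snippet runs with the JDK alone; in a real project it would be a JUnit assertion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splitting logic that a word-count mapper might delegate to.
public class Tokenizer {

    // Splits a line on runs of whitespace and drops empty tokens.
    public static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Function-level check: no cluster, no driver, just input and output.
        List<String> actual = tokenize("a b a");
        if (!actual.equals(Arrays.asList("a", "b", "a"))) {
            throw new AssertionError("unexpected tokens: " + actual);
        }
        System.out.println(actual);
    }
}
```

Keeping logic like this outside the Mapper means most of it can be covered before any MapReduce-specific test is written.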
2. MapReduce Mock - MRUnit
Maven coordinates:
<dependency>
    <groupId>org.apache.mrunit</groupId>
    <artifactId>mrunit</artifactId>
    <version>1.1.0</version>
    <classifier>hadoop2</classifier>
    <scope>test</scope>
</dependency>
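Note: the classifier must be set explicitly to choose hadoop1 or hadoop2; the two differ in their APIs.
The sections below use WordCount as the example to show how to write tests for each part.
2.1 Testing the Mapper
- Initialize a MapDriver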
WordCount.Map mapper = new WordCount.Map();
mapDriver = MapDriver.newMapDriver(mapper);
- Given an input, check the output
@Test
public void testMapper() throws IOException {
    mapDriver.withInput(new LongWritable(), new Text("a b a"))
            .withAllOutput(Lists.newArrayList(
                    new Pair<Text, IntWritable>(new Text("a"), new IntWritable(1)),
                    new Pair<Text, IntWritable>(new Text("b"), new IntWritable(1)),
                    new Pair<Text, IntWritable>(new Text("a"), new IntWritable(1))
            ))
            .runTest();
}
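In most cases a test cannot be written this elegantly, for example when float/double values are involved. In that case you need to take the results out and check them yourself (which is clearly the more flexible approach).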
@Test
public void testMapper2() throws IOException {
    mapDriver.withInput(new LongWritable(), new Text("a b a"));
    List<Pair<Text, IntWritable>> actual = mapDriver.run();
    List<Pair<Text, IntWritable>> expected = Lists.newArrayList(
            new Pair<Text, IntWritable>(new Text("a"), new IntWritable(1)),
            new Pair<Text, IntWritable>(new Text("b"), new IntWritable(1)),
            new Pair<Text, IntWritable>(new Text("a"), new IntWritable(1))
    );
    // Apache Commons Collections: checks that the elements are equal, taking each element's frequency into account
    assertTrue(CollectionUtils.isEqualCollection(actual, expected));
    // expected value first, then the actual value
    assertEquals(1, actual.get(0).getSecond().get());
}
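For the float/double case mentioned above, one option is to take the values out of the result and compare them with a tolerance instead of exact equality. The sketch below uses only the JDK; `ToleranceCheck`, `closeTo`, `allCloseTo`, and the tolerance `1e-9` are illustrative assumptions and are not part of MRUnit. With JUnit on the classpath, `assertEquals(expected, actual, delta)` serves the same purpose for a single value.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for comparing floating-point results taken out of a driver run.
public class ToleranceCheck {

    // True when the two values differ by at most delta.
    public static boolean closeTo(double expected, double actual, double delta) {
        return Math.abs(expected - actual) <= delta;
    }

    // Element-wise, order-sensitive comparison of two lists of doubles.
    public static boolean allCloseTo(List<Double> expected, List<Double> actual, double delta) {
        if (expected.size() != actual.size()) {
            return false;
        }
        for (int i = 0; i < expected.size(); i++) {
            if (!closeTo(expected.get(i), actual.get(i), delta)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2; // not exactly 0.3 in binary floating point
        System.out.println("exact:    " + (sum == 0.3));
        System.out.println("tolerant: " + closeTo(0.3, sum, 1e-9));
        System.out.println("list:     "
                + allCloseTo(Arrays.asList(0.3, 1.0), Arrays.asList(sum, 1.0), 1e-9));
    }
}
```

The right tolerance depends on the computation under test, so the value used here should be treated as a placeholder.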
2.2 Testing the Reducer
- As with the Mapper, first initialize a ReduceDriver
WordCount.Reduce reducer = new WordCount.Reduce();
reduceDriver = ReduceDriver.newReduceDriver(reducer);
- Given an input, check the output
@Test
public void testReducer() throws IOException {
    List<IntWritable> values = Lists.newArrayList();
    values.add(new IntWritable(1));
    values.add(new IntWritable(1));
    reduceDriver.withInput(new Text("a"), values);
    reduceDriver.withOutput(new Text("a"), new IntWritable(2));
    reduceDriver.runTest();
}
2.3 Testing the Whole Pipeline
- Three parts need to be initialized: MapDriver, ReduceDriver, and MapReduceDriver
WordCount.Map mapper = new WordCount.Map();
WordCount.Reduce reducer = new WordCount.Reduce();
mapDriver = MapDriver.newMapDriver(mapper);
reduceDriver = ReduceDriver.newReduceDriver(reducer);
mapReduceDriver = MapReduceDriver.newMapReduceDriver(mapper, reducer);
- Set the Map input and check the Reduce output
@Test
public void testMapReduce() throws IOException {
    mapReduceDriver.withInput(new LongWritable(), new Text("a b a"))
            .withInput(new LongWritable(), new Text("a b b"))
            .withAllOutput(Lists.newArrayList(
                    new Pair<Text, IntWritable>(new Text("a"), new IntWritable(3)),
                    new Pair<Text, IntWritable>(new Text("b"), new IntWritable(3))))
            .runTest();
}
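To see why the expected output above is a=3 and b=3, the following JDK-only sketch mimics the three stages conceptually: map each line to (word, 1) pairs, group the pairs by key, and sum each group. It is not how MRUnit or Hadoop is implemented; `LocalWordCount` and its methods are hypothetical names, and a `TreeMap` merely stands in for the sorted grouping that happens between map and reduce.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory imitation of a word-count pipeline, for illustration only.
public class LocalWordCount {

    // Map stage: emit (word, 1) for every whitespace-separated token.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new SimpleEntry<String, Integer>(word, 1));
            }
        }
        return out;
    }

    // Reduce stage: sum all values collected for one key.
    public static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Runs map on every line, groups values by key in sorted order, then reduces each group.
    public static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                        .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Same two input lines as the pipeline test; prints {a=3, b=3}
        System.out.println(run(Arrays.asList("a b a", "a b b")));
    }
}
```

The first line contributes a=2, b=1 and the second a=1, b=2, which is where the two totals of 3 come from.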
3. Appendix
- Complete project
- pom dependencies
<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>2.6.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.mrunit</groupId>
        <artifactId>mrunit</artifactId>
        <version>1.1.0</version>
        <classifier>hadoop2</classifier>
    </dependency>
</dependencies>
- Code
package du00.tests;

import com.google.common.collect.Lists;
import org.apache.commons.collections.CollectionUtils;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.apache.hadoop.mrunit.mapreduce.MapReduceDriver;
import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
import org.apache.hadoop.mrunit.types.Pair;
import org.junit.*;
import static org.junit.Assert.*;
import java.io.IOException;
import java.util.List;

public class WordCountTest {
    MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;
    ReduceDriver<Text, IntWritable, Text, IntWritable> reduceDriver;
    MapReduceDriver<LongWritable, Text, Text, IntWritable, Text, IntWritable> mapReduceDriver;

    @Before
    public void setUp() {
        WordCount.Map mapper = new WordCount.Map();
        WordCount.Reduce reducer = new WordCount.Reduce();
        mapDriver = MapDriver.newMapDriver(mapper);
        reduceDriver = ReduceDriver.newReduceDriver(reducer);
        mapReduceDriver = MapReduceDriver.newMapReduceDriver(mapper, reducer);
    }

    @Test
    public void testMapper() throws IOException {
        mapDriver.withInput(new LongWritable(), new Text("a b a"))
                .withAllOutput(Lists.newArrayList(
                        new Pair<Text, IntWritable>(new Text("a"), new IntWritable(1)),
                        new Pair<Text, IntWritable>(new Text("b"), new IntWritable(1)),
                        new Pair<Text, IntWritable>(new Text("a"), new IntWritable(1))
                ))
                .runTest();
    }

    /**
     * Sometimes the result is more complex, and it is better to take it out and
     * compare only part of it, for example when a field of the object is a double.
     *
     * @throws IOException
     */
    @Test
    public void testMapper2() throws IOException {
        mapDriver.withInput(new LongWritable(), new Text("a b a"));
        List<Pair<Text, IntWritable>> actual = mapDriver.run();
        List<Pair<Text, IntWritable>> expected = Lists.newArrayList(
                new Pair<Text, IntWritable>(new Text("a"), new IntWritable(1)),
                new Pair<Text, IntWritable>(new Text("b"), new IntWritable(1)),
                new Pair<Text, IntWritable>(new Text("a"), new IntWritable(1))
        );
        // Apache Commons Collections: checks that the elements are equal, taking each element's frequency into account
        assertTrue(CollectionUtils.isEqualCollection(actual, expected));
        // expected value first, then the actual value
        assertEquals(1, actual.get(0).getSecond().get());
    }

    @Test
    public void testReducer() throws IOException {
        List<IntWritable> values = Lists.newArrayList();
        values.add(new IntWritable(1));
        values.add(new IntWritable(1));
        reduceDriver.withInput(new Text("a"), values);
        reduceDriver.withOutput(new Text("a"), new IntWritable(2));
        reduceDriver.runTest();
    }

    @Test
    public void testMapReduce() throws IOException {
        mapReduceDriver.withInput(new LongWritable(), new Text("a b a"))
                .withInput(new LongWritable(), new Text("a b b"))
                .withAllOutput(Lists.newArrayList(
                        new Pair<Text, IntWritable>(new Text("a"), new IntWritable(3)),
                        new Pair<Text, IntWritable>(new Text("b"), new IntWritable(3))))
                .runTest();
    }
}