This repository has been archived by the owner on Mar 11, 2024. It is now read-only.
Alexander Kolb edited this page Dec 4, 2015 · 38 revisions

Testing Apache Flink's DataSet and DataStream APIs is almost identical; the only difference is that data streams work with time characteristics.
There is therefore no separate documentation for testing the two APIs. The chapter on input describes how to define timed input for streaming data flows.
Examples are provided for testing both APIs.

##Introduction

The framework can be used directly through its test environments, or through the base classes for JUnit.

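// asList is assumed to be statically imported from java.util.Arrays
DataSetTestEnvironment env =
		DataSetTestEnvironment.createTestEnvironment(1);

// Build the data flow under test: increment every input value by one
DataSet<Integer> dataSet = env.createTestDataSet(asList(1, 2, 3))
		.map((MapFunction<Integer, Integer>) value -> value + 1);

// Describe the records the data flow is expected to produce
ExpectedOutput<Integer> expectedOutput =
		new ExpectedOutput<Integer>().expectAll(asList(2, 3, 4));

// Attach a verifying output format as the sink, then run the test
OutputFormat<Integer> outputFormat =
		env.createTestOutputFormat(new HamcrestVerifier(expectedOutput));
dataSet.output(outputFormat);
env.executeTest();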

Using one of the test bases for JUnit is easier and more readable:

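class Test extends TestBase {

    // The annotation is fully qualified because this class is itself named Test
    @org.junit.Test
    public void myTest() {
        // Build the data flow under test: increment every input value by one
        DataSet<Integer> dataSet = createTestDataSet(asList(1, 2, 3))
            .map((MapFunction<Integer, Integer>) value -> value + 1);

        // Describe the records the data flow is expected to produce
        ExpectedOutput<Integer> expectedOutput =
            new ExpectedOutput<Integer>().expectAll(asList(2, 3, 4));

        assertDataSet(dataSet, expectedOutput);
    }

}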
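
Both examples above pair a transformed data set with an expectation built by `expectAll`. The plain-Java sketch below illustrates one plausible reading of that check, without any Flink dependency. It assumes that `expectAll` passes when every listed record occurs in the output, regardless of order. The class `ExpectAllSketch` and the method `containsAll` are hypothetical names introduced for illustration only; they are not part of the framework, and the framework's actual matching rules may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ExpectAllSketch {

    // Hypothetical helper (not framework API): returns true if every expected
    // record occurs in the actual output. Order is ignored; an expected record
    // listed twice must also occur at least twice in the output.
    public static <T> boolean containsAll(List<T> actual, List<T> expected) {
        // Work on a copy so the caller's list is left untouched
        List<T> remaining = new ArrayList<>(actual);
        for (T record : expected) {
            // remove(Object) deletes one occurrence and reports whether it was found
            if (!remaining.remove((Object) record)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Same transformation as in the examples above: add one to each value
        List<Integer> output = Arrays.asList(1, 2, 3).stream()
                .map(value -> value + 1)
                .collect(Collectors.toList());

        System.out.println(containsAll(output, Arrays.asList(2, 3, 4))); // prints "true"
    }
}
```

Ignoring order in the sketch reflects that a data flow running in parallel gives no guarantee about the order in which records reach the sink.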

##Documentation

  • ###Defining Input

  • ###Using Matchers

  • ###Running Tests

  • ###Input Output Translation

  • ###Expanding the Framework