This repository contains the code for the open stream processing benchmark.
All documentation can be found in our wiki.
It includes:
- benchmark: benchmark pipeline implementations (docs).
- data-stream-generator: data stream generator to generate input streams locally or on a DC/OS cluster (docs).
- output-consumer: consumes the output of the processing job and the metrics-exporter from Kafka and stores it on S3 (docs).
- evaluator: computes performance metrics on the output of the output consumer (docs).
- result analysis: Jupyter notebooks to visualize the results (docs).
- deployment: deployment scripts to run the benchmark on a DC/OS setup on AWS (docs).
- kafka-cluster-tools: Kafka scripts to start a cluster and read from a topic for local development (docs).
- metrics-exporter: exports metrics from JMX and cAdvisor and writes them to Kafka (docs).
Currently, the benchmark includes Apache Spark (Spark Streaming and Structured Streaming), Apache Flink, and Kafka Streams.
- van Dongen, G., & Van den Poel, D. (2020). Evaluation of Stream Processing Frameworks. IEEE Transactions on Parallel and Distributed Systems, 31(8), 1845-1858. The Supplemental Material of this paper can be found here.
- Earlier work-in-progress publication: van Dongen, G., Steurtewagen, B., & Van den Poel, D. (2018, July). Latency measurement of fine-grained operations in benchmarking distributed stream processing frameworks. In 2018 IEEE International Congress on Big Data (BigData Congress) (pp. 247-250). IEEE. Talks related to this publication:
- Spark Summit Europe 2019: Stream Processing: Choosing the Right Tool for the Job - Giselle van Dongen
Are you having issues with the project, or do you wish to use or extend it? The fastest way to contact me is through:
LinkedIn: giselle-van-dongen
Email: [email protected]