Practice 8 ‐ Benchmarking

Oszkár Semeráth edited this page Nov 6, 2024 · 3 revisions

Caution

This page is under construction. Please come back later.

Introduction

The goal of this laboratory session is to gain practical experience with benchmarking software.

In the lab session, we will use the same running example as in the previous ones; the example DFD specification of the Document Similarity Estimation algorithm is visible below.

Data-flow Diagram of the Document Similarity Estimation

In this lab session, we will measure the performance of the Java implementation of this algorithm.

We will use two tools: the IntelliJ Profiler (Task 2) and JMH, the Java Microbenchmark Harness (Task 3).

Tasks

Task 0 - Preparations

Pull practice 8 materials. You don't need to push your changes in this lab since we are not focusing on CI/CD.

git clone https://github.com/ftsrg-edu/ase-labs.git
cd ase-labs
git switch practice-8

Task 1 - Simple benchmarking

Use System.currentTimeMillis() to measure the execution time of the computeScalarProduct, tokenize, calculateOccurrenceVector, collectShingles and computeOccurrences methods. Log the execution time to the console using a logger.

Example: Add the following code snippets to the beginning and end of the computeScalarProduct method:

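        long startTime = System.currentTimeMillis(); // at the beginning of the method
        // ... existing method body ...
        long endTime = System.currentTimeMillis(); // at the end, before the return statement
        logger.info("Scalar product computed in {} ms", endTime - startTime);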

Add similar code snippets to the other methods.
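If a method has several return statements, the "end" snippet has to run on every exit path. One way to do this is a try/finally block, sketched below. This is only an illustration: TimingSketch and the body of computeScalarProduct here are hypothetical stand-ins, not the lab's actual code, and the sketch uses java.util.logging with string concatenation so that it compiles without the project's logging dependency.

```java
import java.util.logging.Logger;

class TimingSketch {
    // Assumption: java.util.logging stands in for the project's own logger.
    private static final Logger logger = Logger.getLogger(TimingSketch.class.getName());

    // Hypothetical stand-in for the lab's computeScalarProduct; the real signature may differ.
    static long computeScalarProduct(int[] a, int[] b) {
        long startTime = System.currentTimeMillis();
        try {
            long sum = 0;
            int length = Math.min(a.length, b.length);
            for (int i = 0; i < length; i++) {
                sum += (long) a[i] * b[i];
            }
            return sum;
        } finally {
            // Runs on every exit path, including early returns and exceptions.
            long endTime = System.currentTimeMillis();
            logger.info("Scalar product computed in " + (endTime - startTime) + " ms");
        }
    }

    public static void main(String[] args) {
        // 1*4 + 2*5 + 3*6 = 32
        System.out.println(computeScalarProduct(new int[] {1, 2, 3}, new int[] {4, 5, 6}));
    }
}
```

For very short methods the measured time will often be 0 ms, because System.currentTimeMillis() has millisecond resolution at best.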

Run the SimilarityApp with different combinations of the input files found in the benchmark/src/jmh/resources/texts folder, and compare the execution times logged to the console.
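To cover the input combinations systematically, the pairs can be enumerated up front. The following is a minimal sketch, assuming that the order of the two inputs does not matter. InputPairs is a hypothetical helper that is not part of the lab materials, and the file list is only a sample of the names mentioned on this page.

```java
import java.util.ArrayList;
import java.util.List;

class InputPairs {
    // Returns every unordered pair (i < j) of the given file names.
    static List<String[]> pairs(List<String> files) {
        List<String[]> result = new ArrayList<>();
        for (int i = 0; i < files.size(); i++) {
            for (int j = i + 1; j < files.size(); j++) {
                result.add(new String[] {files.get(i), files.get(j)});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> files = List.of("Pride1.txt", "Sense1.txt", "Pride6.txt", "Sense6.txt");
        // Each printed line is one candidate pair of program arguments.
        for (String[] pair : pairs(files)) {
            System.out.println(pair[0] + " " + pair[1]);
        }
    }
}
```

Four files yield six pairs. Each pair can then be passed to the application as its two input files.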

Task 2 - IntelliJ Profiler

Create a Run Configuration for the SimilarityApp with Pride1.txt and Sense1.txt as input files, and start it with the IntelliJ Profiler.

Analyze the results of the profiler. What are the most time-consuming methods? What are the most memory-consuming objects?

Repeat the profiling with larger input files (e.g. Pride6.txt and Sense6.txt). Try it with word granularity as well.

Task 3 - Microbenchmarking with JMH

Add the following method to the ShinglerBenchmark class:

    
    @Benchmark
    @BenchmarkMode(Mode.SampleTime)
    public void testTokenize(Blackhole blackhole) {
        BaseTokenizer tokenizer = new BaseTokenizer();
        String document = "This is a test document";
        TokenizedDocument tokenizedDocument = tokenizer.tokenize(document, true);
        // Hand the result to the Blackhole so the JIT cannot eliminate the call as dead code.
        blackhole.consume(tokenizedDocument);
    }

Run the benchmark using the following command:

./gradlew jmh
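JMH runs warm-up iterations before it starts measuring, so that the JIT compiler has had a chance to optimize the code under test. The plain-Java sketch below illustrates that idea only and is not how JMH is implemented. WarmupSketch, measure and sink are hypothetical names, and the volatile field is a crude substitute for JMH's Blackhole.

```java
import java.util.function.Supplier;

class WarmupSketch {
    // Storing each result in a volatile field keeps it observable,
    // so the JIT cannot simply discard the measured work.
    static volatile Object sink;

    // Runs the task unmeasured first, then returns one nanosecond sample per measured run.
    static long[] measure(Supplier<?> task, int warmupIterations, int measuredIterations) {
        for (int i = 0; i < warmupIterations; i++) {
            sink = task.get();
        }
        long[] samples = new long[measuredIterations];
        for (int i = 0; i < measuredIterations; i++) {
            long start = System.nanoTime();
            sink = task.get();
            samples[i] = System.nanoTime() - start;
        }
        return samples;
    }

    public static void main(String[] args) {
        long[] samples = measure(() -> "This is a test document".split(" "), 10_000, 5);
        for (long sample : samples) {
            System.out.println(sample + " ns");
        }
    }
}
```

A hand-written loop like this remains sensitive to JIT and garbage-collection effects, which is why the task uses JMH rather than manual timing.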