Skip to content

A words counter with multiple implementations and benchmarks for each implementation

Notifications You must be signed in to change notification settings

germandiagogomez/words-counter-benchmarks-game

Repository files navigation

Words counter benchmarks

This is a series of a increasingly more performant “words counters” (not fancy, just separating “words” by space gaps) implemented making use of Modern C++20 features such as span, string_view, filesystem, ranges and others.

Where appropriate, parallel hash maps and other dependencies such as Abseil containers, Boost.Future (which allow continuations) or Asio with C++20 coroutines are used.

Feel free to contribute your own implementation if you wish, though at the moment there is no documentation: look at words counter program. You can author your own free function for your implementation.

(Unscientific) Performance comparisons are shown for each implementation below on the 2 machines I tried myself.

So far, there are:

  • A monothread version
  • A simple futures multithreaded version
  • A futures version with thread pools for more load balancing
  • A futures version with memory mapped files and thread pools balancing
  • A coroutine-based read file implementation (Linux Only, Asio + uring)

You can run the benchmarks and experiment freely with those.

The words counter program includes internal profiling for the different stages of BoostFutureMemMappedFileWordsCounter words counter implementation when enabled via meson options.

You can run all implementation run-words-counter-implementations.sh. The results will be stored in words-counter-results/ directory.

If you want to experiment further with the binary itself, just check the command lines inside the script. You have a --help option in the binary program itself to figure out how it works.

WARNING: the program has been tested under MacOS and Linux on my machines. If you have any trouble, please contact me.

Requirements

  • gcc12 in Linux or gcc13 in MacOS
  • conan 1.62.0 (but not 2.x)
  • python3
  • ninja build system
  • meson build system at least 1.3.0
  • CMake (for Conan subprojects compilation)
  • git lfs (optional, but needed to run the benchmarks or to have sample data to run implementations against, unless you have your own data set)

Linux

# This can take a while since Conan will download packages and probably build
meson setup -Dbuildtype=release build-release

# Please, note that for the first time the compilation stage could look stuck.
# This is because it needs to download the file resources needed to use the program and uncompress.
meson compile -C build-release

# Run benchmarks
meson test --benchmark -v -C build-release

# Run implementations
./run-words-counter-implementations.sh

MacOS

# This can take a while since Conan will download packages and probably build
meson setup -Denable_internal_profiling=true -Dconan_profile=gcc13_macos --native-file meson/native/compilers/gcc13_macos.ini -Dbuildtype=release build-release

# Please, note that for the first time the compilation stage could look stuck.
# This is because it needs to download the file resources needed to use the program and uncompress.
meson compile -C build-release

ulimit -n 10240
# Run benchmarks
meson test --benchmark -v -C build-release

# Run implementations
./run-words-counter-implementations.sh

Benchmark run results example

Benchmark run example

Results running words counter implementations

Results for each implementation (MacOS)

Macbook Pro late 2019.
2,4 GHz Intel Core i9 de 8 cores with multithreading.
32 GB 2667 MHz DDR4.
APPLE SSD AP0512N.

MonoThreadWordsCounter

MiB read: 2041.65
Execution time: 62.867s
Processing speed: 32.48 MiB/s
Number of files processed: 1024
Words/s: 5437369.1

AsyncWordsCounter

MiB read: 2041.65
Execution time: 22.136s
Processing speed: 92.23 MiB/s
Number of files processed: 1024
Words/s: 15442314.8

ExecutorBasedFutureWordsCounter

MiB read: 2037.65
Execution time: 17.919s
Processing speed: 113.71 MiB/s
Number of files processed: 1024
Words/s: 19076459.7

BoostFutureMemMappedFileWordsCounter

Count words threads: 5. Split words threads: 4. Merge results threads: 7

        Split words/nthreads: (2.92412675s (normalized: 0.264142732457078)
        Count words/nthreads: 3.9666942s (normalized: 0.3583201189242713)
        Merge results/nthreads: 4.179431571428571s (normalized: 0.37753714861865073)
        

MiB read: 2037.65
Execution time: 6.93s
Processing speed: 294.03 MiB/s
Number of files processed: 1024
Words/s: 49326271.6

Results for each implementation (Linux)

Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz. 32 GB RAM DDR4 2400 Mhz. Kingston SA400S3 SSD.

MonoThreadWordsCounter

MiB read: 2041.66
Execution time: 49.897s
Processing speed: 40.92 MiB/s
Number of files processed: 1024
Words/s: 6851865.5

AsyncWordsCounter

MiB read: 2041.66
Execution time: 35.707s
Processing speed: 57.18 MiB/s
Number of files processed: 1024
Words/s: 9574804.2

ExecutorBasedFutureWordsCounter

MiB read: 2037.66
Execution time: 24.673s
Processing speed: 82.59 MiB/s
Number of files processed: 1024
Words/s: 13856747.6

BoostFutureMemMappedFileWordsCounter

MiB read: 2037.66
Execution time: 16.694s
Processing speed: 122.06 MiB/s
Number of files processed: 1024
Words/s: 20479665.3

ThreadPoolWithCoroutine (only Linux)

MiB read: 2037.66
Execution time: 13.162s
Processing speed: 154.81 MiB/s
Number of files processed: 1024
Words/s: 25975348.2

Results for each implementation (Linux)

Odroid H3+ Intel(R) Pentium(R) Silver N6005 @ 2.00GHz 4 cores/threads 64 GB DDR4 MTS=2933 Samsung_SSD_970_EVO_Plus_1TB

MonoThreadWordsCounter

MiB read: 2041.56
Execution time: 60.504s
Processing speed: 33.74 MiB/s
Number of files processed: 1024
Words/s: 5652766.8

AsyncWordsCounter

MiB read: 2041.56
Execution time: 45.504s
Processing speed: 44.87 MiB/s
Number of files processed: 1024
Words/s: 7516152.4

ExecutorBasedFutureWordsCounter

MiB read: 2037.56
Execution time: 34.849s
Processing speed: 58.47 MiB/s
Number of files processed: 1024
Words/s: 9814198.4

BoostFutureMemMappedFileWordsCounter

MiB read: 2037.56
Execution time: 26.485s
Processing speed: 76.93 MiB/s
Number of files processed: 1024
Words/s: 12913536.0

ThreadPoolWithCoroutine

MiB read: 2037.56
Execution time: 18.253s
Processing speed: 111.63 MiB/s
Number of files processed: 1024
Words/s: 18737467.6

About

A words counter with multiple implementations and benchmarks for each implementation

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published