This is a series of a increasingly more performant “words counters” (not fancy, just separating “words”
by space gaps) implemented making use of Modern C++20 features such as
span
, string_view
, filesystem
, ranges
and others.
Where appropriate, parallel hash maps and other dependencies such as
Abseil containers, Boost.Future
(which allow continuations) or Asio with
C++20 coroutines are used.
Feel free to contribute your own implementation if you wish, though at the moment there is no documentation: look at words counter program. You can author your own free function for your implementation.
(Unscientific) Performance comparisons are shown for each implementation below on the 2 machines I tried myself.
So far, there are:
- A monothread version
- A simple futures multithreaded version
- A futures version with thread pools for more load balancing
- A futures version with memory mapped files and thread pools balancing
- A coroutine-based read file implementation (Linux Only, Asio + uring)
You can run the benchmarks and experiment freely with those.
The words counter program includes internal profiling for the different stages of
BoostFutureMemMappedFileWordsCounter
words counter implementation when
enabled via meson options.
You can run all implementation run-words-counter-implementations.sh
.
The results will be stored in words-counter-results/
directory.
If you want to experiment further with the binary itself, just check the command lines inside
the script. You have a --help
option in the binary program itself to figure out how it works.
WARNING: the program has been tested under MacOS and Linux on my machines. If you have any trouble, please contact me.
- gcc12 in Linux or gcc13 in MacOS
- conan 1.62.0 (but not 2.x)
- python3
- ninja build system
- meson build system at least 1.3.0
- CMake (for Conan subprojects compilation)
- git lfs (optional, but needed to run the benchmarks or to have sample data to run implementations against, unless you have your own data set)
# This can take a while since Conan will download packages and probably build
meson setup -Dbuildtype=release build-release
# Please, note that for the first time the compilation stage could look stuck.
# This is because it needs to download the file resources needed to use the program and uncompress.
meson compile -C build-release
# Run benchmarks
meson test --benchmark -v -C build-release
# Run implementations
./run-words-counter-implementations.sh
# This can take a while since Conan will download packages and probably build
meson setup -Denable_internal_profiling=true -Dconan_profile=gcc13_macos --native-file meson/native/compilers/gcc13_macos.ini -Dbuildtype=release build-release
# Please, note that for the first time the compilation stage could look stuck.
# This is because it needs to download the file resources needed to use the program and uncompress.
meson compile -C build-release
ulimit -n 10240
# Run benchmarks
meson test --benchmark -v -C build-release
# Run implementations
./run-words-counter-implementations.sh
Macbook Pro late 2019. 2,4 GHz Intel Core i9 de 8 cores with multithreading. 32 GB 2667 MHz DDR4. APPLE SSD AP0512N.
MiB read: 2041.65 Execution time: 62.867s Processing speed: 32.48 MiB/s Number of files processed: 1024 Words/s: 5437369.1
MiB read: 2041.65 Execution time: 22.136s Processing speed: 92.23 MiB/s Number of files processed: 1024 Words/s: 15442314.8
MiB read: 2037.65 Execution time: 17.919s Processing speed: 113.71 MiB/s Number of files processed: 1024 Words/s: 19076459.7
Count words threads: 5. Split words threads: 4. Merge results threads: 7 Split words/nthreads: (2.92412675s (normalized: 0.264142732457078) Count words/nthreads: 3.9666942s (normalized: 0.3583201189242713) Merge results/nthreads: 4.179431571428571s (normalized: 0.37753714861865073) MiB read: 2037.65 Execution time: 6.93s Processing speed: 294.03 MiB/s Number of files processed: 1024 Words/s: 49326271.6
Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz. 32 GB RAM DDR4 2400 Mhz. Kingston SA400S3 SSD.
MiB read: 2041.66 Execution time: 49.897s Processing speed: 40.92 MiB/s Number of files processed: 1024 Words/s: 6851865.5
MiB read: 2041.66 Execution time: 35.707s Processing speed: 57.18 MiB/s Number of files processed: 1024 Words/s: 9574804.2
MiB read: 2037.66 Execution time: 24.673s Processing speed: 82.59 MiB/s Number of files processed: 1024 Words/s: 13856747.6
MiB read: 2037.66 Execution time: 16.694s Processing speed: 122.06 MiB/s Number of files processed: 1024 Words/s: 20479665.3
MiB read: 2037.66 Execution time: 13.162s Processing speed: 154.81 MiB/s Number of files processed: 1024 Words/s: 25975348.2
Odroid H3+ Intel(R) Pentium(R) Silver N6005 @ 2.00GHz 4 cores/threads 64 GB DDR4 MTS=2933 Samsung_SSD_970_EVO_Plus_1TB
MiB read: 2041.56 Execution time: 60.504s Processing speed: 33.74 MiB/s Number of files processed: 1024 Words/s: 5652766.8
MiB read: 2041.56 Execution time: 45.504s Processing speed: 44.87 MiB/s Number of files processed: 1024 Words/s: 7516152.4
MiB read: 2037.56 Execution time: 34.849s Processing speed: 58.47 MiB/s Number of files processed: 1024 Words/s: 9814198.4
MiB read: 2037.56 Execution time: 26.485s Processing speed: 76.93 MiB/s Number of files processed: 1024 Words/s: 12913536.0
MiB read: 2037.56 Execution time: 18.253s Processing speed: 111.63 MiB/s Number of files processed: 1024 Words/s: 18737467.6