WIP: Benchmarks
turbolytics edited this page Jan 6, 2025 · 4 revisions
This section provides estimates of the performance to expect from sqlflow across a range of use cases.
Name | Throughput | Max RSS Memory | Peak Memory Usage |
---|---|---|---|
Simple Aggregation Memory | 45,000 msgs / sec | 230 MiB | 130 MiB |
Simple Aggregation Disk | 36,000 msgs / sec | 256 MiB | 102 MiB |
Enrichment | 13,000 msgs / sec | 368 MiB | 124 MiB |
CSV Disk Join | 11,500 msgs / sec | 312 MiB | 152 MiB |
CSV Memory Join | 33,200 msgs / sec | 300 MiB | 107 MiB |
In Memory Tumbling Window | 44,000 msgs / sec | 198 MiB | 96 MiB |
Each test loads 1 million (1MM) records into Kafka, then runs the sql-flow consumer until every message has been processed. Each test records the maximum resident memory observed during the benchmark and the average message-ingestion throughput.
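The two reported metrics can be approximated with a small wrapper. The following is a minimal sketch, not the harness used by the benchmark scripts: the function names (`throughput`, `max_rss_mib`, `measure`) are hypothetical, and it assumes the consumer runs as a child process that exits once all messages are processed. Note that `ru_maxrss` is reported in bytes on macOS and in KiB on Linux, and that wall-clock time here includes process startup.

```python
import resource
import subprocess
import sys
import time


def throughput(num_messages, elapsed_seconds):
    """Average throughput in messages per second."""
    return num_messages / elapsed_seconds


def max_rss_mib(ru_maxrss, platform=sys.platform):
    """Convert ru_maxrss to MiB (macOS reports bytes, Linux reports KiB)."""
    if platform == "darwin":
        return ru_maxrss / (1024 * 1024)
    return ru_maxrss / 1024


def measure(cmd, num_messages):
    """Run cmd to completion and report throughput and peak child RSS.

    Hypothetical helper: assumes cmd exits after consuming num_messages.
    """
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    elapsed = time.monotonic() - start
    # RUSAGE_CHILDREN covers child processes that have terminated and been waited for.
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "elapsed_sec": elapsed,
        "msgs_per_sec": throughput(num_messages, elapsed),
        "max_rss_mib": max_rss_mib(usage.ru_maxrss),
    }


if __name__ == "__main__":
    # At 45,000 msgs / sec, 1 million records take roughly 22 seconds.
    print(round(1_000_000 / 45_000, 1), "seconds")
```

As a rough check on the table, the same arithmetic implies that the slowest case (11,500 msgs / sec) needs about 87 seconds to drain 1 million records.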
System
Hardware:
Hardware Overview:
Model Name: MacBook Pro
Model Identifier: MacBookPro18,3
Model Number: Z15G001X2LL/A
Chip: Apple M1 Pro
Total Number of Cores: 10 (8 performance and 2 efficiency)
Memory: 32 GB
Activation Lock Status: Enabled
Performs a simple aggregate. Output is significantly smaller than input.
./benchmark/simple-agg-disk.sh
./benchmark/simple-agg-mem.sh
Performs an enrichment. Each input record produces exactly one output record, enhanced with additional information.
./benchmark/enrich.sh
./benchmark/csv.filesystem.join.yml
./benchmark/csv.mem.join.yml
Tumbling window that aggregates count of cities.
./benchmark/tumbling-window.sh
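For readers unfamiliar with the term, a tumbling window splits the stream into fixed, non-overlapping time intervals and aggregates each interval independently. The sketch below illustrates the idea in plain Python; it is not how sqlflow implements windows, and the event shape (`(timestamp, city)` tuples), the 60-second window size, and the per-city count are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count cities per fixed, non-overlapping window.

    events: iterable of (timestamp_seconds, city) tuples (hypothetical shape).
    Returns {window_start: {city: count}} ordered by window start.
    """
    windows = defaultdict(Counter)
    for timestamp, city in events:
        # Each event belongs to exactly one window: the one whose start is
        # the timestamp rounded down to a multiple of window_seconds.
        window_start = int(timestamp // window_seconds) * window_seconds
        windows[window_start][city] += 1
    return {start: dict(counts) for start, counts in sorted(windows.items())}


events = [(0, "Paris"), (30, "Paris"), (59, "Lima"), (60, "Lima"), (125, "Paris")]
result = tumbling_window_counts(events, window_seconds=60)
print(result)
```

Because each window only keeps one counter per distinct city, the output is much smaller than the input whenever many records share a city within a window.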