| Codebase | Description |
|---|---|
| attendance | Attendance service |
| dashboard | Dashboard service |
| tasks | Background/Cron jobs |
| grafana | Grafana dashboard |
| scripts | Seeding, Benchmarking |
Prerequisites: Docker, Docker Compose, Node.js, npm/yarn
- Build & start containers
$ docker compose up -d --build --remove-orphans
- Seed database
$ cd scripts/
$ yarn install && yarn seed
- Benchmark
$ cd scripts/
$ yarn benchmark
Requirements:
1000 schools, 10000 attendance records per school per day
=> 10 million records / day
- Server capacity: ~ 1500 rps
Assume all 1000 schools are in the same timezone, which results in peak traffic during two periods of the day (check-in/check-out).
Each period lasts approximately 1 hour, so the server has to handle 5 million requests/hour ~ 1400 rps.
- Write-heavy system.
- Attendance data is append-only.
- Redis is used as temporary storage for business logic purposes (e.g. checking whether a user has already checked in). It is flushed every day by the tasks service.
- Attendance records are produced to Kafka by attendance-service and then consumed by dashboard-service (see the sketch after this list).
- If the consuming process goes wrong, records are sent to a Dead Letter Queue (just a MongoDB collection) for further processing.
- There are two tasks: flushing Redis and processing the dead letter queue.
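A rough sketch of that produce/consume flow with the dead-letter fallback, assuming kafkajs and the official mongodb driver (topic, group, and collection names here are illustrative, not necessarily the ones used by the services):

```ts
import { Kafka } from "kafkajs";
import { MongoClient } from "mongodb";

const kafka = new Kafka({ clientId: "attendance", brokers: ["localhost:9092"] });
const mongo = new MongoClient("mongodb://localhost:27017");

// attendance-service side: publish an attendance record to Kafka.
// (A real service would keep one long-lived, connected producer instead of
// connecting per call; this keeps the sketch short.)
export async function publishAttendance(record: object): Promise<void> {
  const producer = kafka.producer();
  await producer.connect();
  await producer.send({
    topic: "attendance",
    messages: [{ value: JSON.stringify(record) }],
  });
  await producer.disconnect();
}

// dashboard-service side: consume records; on failure, push them to the DLQ collection.
export async function consumeAttendance(): Promise<void> {
  await mongo.connect();
  const consumer = kafka.consumer({ groupId: "dashboard" });
  await consumer.connect();
  await consumer.subscribe({ topics: ["attendance"] });
  await consumer.run({
    eachMessage: async ({ message }) => {
      const payload = message.value?.toString() ?? "";
      try {
        const record = JSON.parse(payload);
        // ...update dashboard data with `record` here...
      } catch (err) {
        // Dead Letter Queue: a plain MongoDB collection processed later by the tasks job.
        await mongo
          .db("attendance")
          .collection("dead_letter_queue")
          .insertOne({ payload, error: String(err), at: new Date() });
      }
    },
  });
}
```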
- Redis memory
10 million records per day require quite a lot of Redis memory.
=> Use hashmap to abstract the key.
Using plain key-value:
checkin:12345678 true
checkin:12345612 true
checkin:12345645 true
Using hashmap:
checkin:123456
  78 true
  12 true
  45 true
Using hashmap with some appropriate configurations results in much lower storage cost. With 10 million records, the first strategy will eat up ~ 1.6 GB, while the second takes ~ 400 MB.
Implementation: redis-repo.ts
Reference: Redis Memory Optimization
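A minimal sketch of that key-splitting idea, assuming ioredis and the 8-digit user IDs from the example above (the actual logic lives in redis-repo.ts):

```ts
import Redis from "ioredis";

const redis = new Redis(); // assumes a local Redis instance

// Split an 8-digit user id: the first 6 digits become the hash key and the
// last 2 digits become the field, so ~100 users share one small hash.
function splitKey(userId: string): { key: string; field: string } {
  return { key: `checkin:${userId.slice(0, 6)}`, field: userId.slice(6) };
}

// Mark a user as checked in; returns false if they already checked in today.
export async function markCheckedIn(userId: string): Promise<boolean> {
  const { key, field } = splitKey(userId);
  return (await redis.hsetnx(key, field, "1")) === 1; // 1 = newly set, 0 = already there
}

export async function hasCheckedIn(userId: string): Promise<boolean> {
  const { key, field } = splitKey(userId);
  return (await redis.hexists(key, field)) === 1;
}
```

The saving comes from keeping each hash small enough for Redis's compact listpack/ziplist encoding (the hash-max-listpack-entries / hash-max-ziplist-entries settings covered in the memory optimization guide), which is what the "appropriate configurations" above refer to.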
- Bulk pattern
Using the bulk pattern to buffer and then bulk write/publish can reduce the round-trip cost.
Use cases: use Redis pipeline to perform writes, use Kafka batching to send messages.
Implementation: bulk.ts
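A minimal sketch of the buffering idea behind it, assuming ioredis (the class name and flush thresholds are illustrative; the actual implementation is bulk.ts):

```ts
import Redis from "ioredis";

const redis = new Redis();

// Buffer individual writes and flush them in one round trip using a Redis pipeline.
class BulkBuffer {
  private buffer: Array<{ key: string; field: string; value: string }> = [];

  constructor(
    private readonly maxSize = 500, // flush after this many buffered items
    flushIntervalMs = 1000,         // or at least this often
  ) {
    setInterval(() => void this.flush(), flushIntervalMs).unref();
  }

  async add(item: { key: string; field: string; value: string }): Promise<void> {
    this.buffer.push(item);
    if (this.buffer.length >= this.maxSize) {
      await this.flush();
    }
  }

  async flush(): Promise<void> {
    if (this.buffer.length === 0) return;
    const items = this.buffer;
    this.buffer = [];
    const pipeline = redis.pipeline(); // all queued commands go out in one round trip
    for (const { key, field, value } of items) {
      pipeline.hset(key, field, value);
    }
    await pipeline.exec();
  }
}
```

The same shape applies on the Kafka side: accumulate records and send them in a single producer.send call with a messages array instead of one call per record.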