Implementation of Online Round Task Backend | DEV Challenge XX
- Expr - simple, lightweight, and fast expression evaluator for Go.
- Bbolt - key/value database in a single file. Uses B+tree as underlying storage, which is fast and memory efficient.
- Golang - high performance language.
- Gin Web Framework - help build API fast: router, request validation, building response.
- Postman - Useful tool to make API functional tests and run it with newman-docker
- Final round. Webhook support (test covered with Postman and webhook-tester docker image).
- Final round. Support MAX, MIN, AVG, SUM functions. (covered by Postman tests)
- Final round. Support EXTERNAL_REF function. (covered by Postman tests)
- Final round. Recursive subscription and webhook between sheets and API instances. (covered by Postman tests)
- Set cell value
- Get cell value
- Get cell list
- Support multiple sheets
- Evaluate formula expression.
- Case-insensitive cell and sheet names
- Support
+
,-
,*
,/
,^
operators - Support parentheses (e.g.
((1+2)*3)
) - High performance. RPS >1800 with 6 CPU Apple M2 and >1300 with 4 CPU Intel i5.
- Check referencing cells on evaluation errors
- Prevent circular references
- Support digits and strings
- Support number and float as cell name (e.g.
1
,4.5
). Let's define that5=100
and5.5=250
. Enjoy! - Permanent storage on disk
docker compose up
curl -i http://127.0.0.1:8080/healthcheck
curl -i -X POST http://127.0.0.1:8080/api/v1/sheet1/cell1 --data '{"value": "1"}'
Run API functional tests with Postman:
docker compose run postman
Example results:
┌─────────────────────────┬───────────────────┬──────────────────┐
│ │ executed │ failed │
├─────────────────────────┼───────────────────┼──────────────────┤
│ iterations │ 2 │ 0 │
├─────────────────────────┼───────────────────┼──────────────────┤
│ requests │ 300 │ 0 │
├─────────────────────────┼───────────────────┼──────────────────┤
│ test-scripts │ 500 │ 0 │
├─────────────────────────┼───────────────────┼──────────────────┤
│ prerequest-scripts │ 398 │ 0 │
├─────────────────────────┼───────────────────┼──────────────────┤
│ assertions │ 622 │ 0 │
├─────────────────────────┴───────────────────┴──────────────────┤
│ total run duration: 22.5s │
├────────────────────────────────────────────────────────────────┤
│ total data received: 32.96kB (approx) │
├────────────────────────────────────────────────────────────────┤
│ average response time: 17ms [min: 6ms, max: 321ms, s.d.: 19ms] │
└────────────────────────────────────────────────────────────────┘
Application has >75% unit test coverage. Run unit tests:
docker compose run unit
Example results of unit tests and coverage:
ok devChallengeExcel 0.286s coverage: 100.0% of statements
devChallengeExcel/ApiController.go:29: NewApiController 100.0%
...
devChallengeExcel/SheetRepository.go:32: SetCell 100.0%
devChallengeExcel/SheetRepository.go:76: GetCell 100.0%
devChallengeExcel/SheetRepository.go:110: GetCellList 100.0%
coverage: 84.6% of statements
total: (statements) 78.4%
docker compose run siege
Note:
- Run with empty database is more stressful (because of preventing write same as stored value).
- you can see siege log at
siege/log/siege.log
- load testing inside docker on same machine is not so representative. Better to use two hosts: API-server and siege-client.
- it contains siege with Fibonacci sequences in five sheets. It requests update each element of each sequence (92 elements * 5 sheets).
Siege result on my machine:
- Docker resources: Apple M2, 6 CPUs, RAM 8 GB.
Transactions: 109647 hits
Availability: 100.00 %
Elapsed time: 60.08 secs
Data transferred: 8.42 MB
Response time: 0.05 secs
Transaction rate: 1825.02 trans/sec
Throughput: 0.14 MB/sec
Concurrency: 84.72
Successful transactions: 76134
Failed transactions: 0
Longest transaction: 0.92
Shortest transaction: 0.00
Application and tests cover corner cases listed below. This list not exhaustive.
-
Final round: recursive subscription and webhook between sheets and API instance. Example: cell A = EXTERNAL_REF(B) => B => EXTERNAL_REF(C); On change C, A and B will be updated and subscriptions for A will be called.
-
Max length for cell name and sheet name is 32768 (BBolt limit). With special chars in cell name it's less.
-
Support digit cell names (e.g.
1
,2.5
). -
In case with digit cell name, it's possible to use it as a digit in formula (e.g. set
10=50
and then formula=10+2.5
will be evaluated as50 + 2.5 => 52.5
). -
Restriction: cell with a digit name should have only a digit value or formula evaluated into a digit. You can't set
10=awesome
because it potentially leads to error in any formula with digit10
. This rule is not applied for string cell names. -
Long chain of referencing. Example: Fibonacci sequence.
-
Circular references is forbidden.
-
Max supported values of formula result is 64-bit integer range:
-9223372036854775808
to9223372036854775807
. So, it can calculate only first 92 elements of Fibonacci sequence. -
For decimals it's 64-bit float range:
-1.7976931348623157e+308
to1.7976931348623157e+308
. -
Detect syntax errors in parentheses (e.g.
((1+2)
) -
Division by zero is resulted to Infinity.
-
Supported operators:
+
,-
,*
,/
,^
can't be used with strings. -
Restrict usage supported operators in cell name (e.g.
cell1+cell2
is not allowed). -
Cell name escaper to remove chars reserved by Expr parser as operator. Example:
year.2021,month:April;
will be escaped without chars.,~:;
.
It's a box solution to execute formula expression. It parses expression into AST (Abstract syntax tree). Then transform this tree into the opcode stack. Running the stack is like executing Assembler program. According to benchmarks, this approach is very fast and memory efficient.
O(log n)
complexity with B-tree.- key-value for store cell
- buckets to isolate sheets in single file
- prefix scan in B-tree to get dependencies of cell (see below)
To ensure that changes in one cell do not break other cells, we need to store dependencies between cells. Dependencies represent as directional graph. Graph stored as prefixed key in B-tree.
To recursive get dependencies we need to do prefix scan for each cell in dependence chain. It's O(k * log n)
time, where k
is length of dependence chain.
Example:
cell10 = cell2 + cell3
cell20 = cell2 + cell3
cell30 = cell10 + cell20
It translates to keys:
cell2:cell10
cell2:cell20
cell3:cell10
cell3:cell20
cell10:cell30
cell20:cell30
cell2
is dependency of cell10
and cell20
and (recursively) cell30
.
To fetch all cells which depend on cell2
we need:
- prefix search in B-tree by
cell2:
-O(log n + 2)
- prefix search in B-tree by
cell10:
-O(log n + 1)
- prefix search in B-tree by
cell20:
-O(log n + 1)
- prefix search by
cell30:
-O(log n)
- get empty result and exit from recursion
In total O(log n) * 4
time.
Search all recursive dependencies has Log linear complexity O(n log n)
.
To prevent circular and repeat recursive there is a hashmap to store already visited cells.
Import collection postman/DevChallengeExcel.postman_collection.json
into Postman app.
This collection is useful to test any other API implementation.