
feat:(ast) Node support concurrently-read #661

Merged
AsterDY merged 5 commits into main from feat/ast_lock on Jul 29, 2024

Conversation

AsterDY
Collaborator

@AsterDY AsterDY commented Jun 25, 2024

Background

Since sonic uses a lazy-load design that parses JSON nodes on demand, it does not support concurrent reads by default. In practice, however, we found that many users use it in such scenarios, which causes panics.

Design

Thus we designed a kind of state-transition lock to achieve this goal at minimal cost. In brief:

  • Drop the Lazy state (which parsed nodes one at a time as they were traversed). Every node now has only two states: Parsed (bool, number, object slice, ...) or Raw (JSON text), and the only transition direction is Raw -> Parsed;
  • Every exported reading API (like Get()) checks the node's state with an atomic load:
    • If in Raw, transition to Parsed under the write lock;
    • If in Parsed, the state will never change again, so the value can be used safely without locking;
    • Except when encoding the node (Raw() and MarshalJSON()): in these cases we want to keep the Raw state for best performance, so a read lock is taken to hold the state in place.

Performance

Load()/LoadAll() will be much faster

  • After this PR, Load()/LoadAll() will simply add a lock to the Node instead of eagerly parsing all of its children. This benefits users who call them to enable concurrent reads.

Get() will be slower on small data but faster on large data

  • For large objects (children > 8): since commit, sonic builds a map index for the object, so Node.Get() is generally faster;
  • For small objects: since every Get() uses an atomic operation to check the state, it will be slightly slower.

Benchmark

goos: linux
goarch: amd64
pkg: github.com/bytedance/sonic/ast
cpu: Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz
                │   base.out   │             target.out              │
                │    sec/op    │   sec/op     vs base                │
Node_LoadAll-32   45.527µ ± 1%   2.200µ ± 2%  -95.17% (p=0.000 n=20)
Node_Get-32        101.5n ± 2%   103.4n ± 2%        ~ (p=0.213 n=20)
geomean            2.150µ        476.9n       -77.82%

                │   base.out   │               target.out                │
                │     B/s      │      B/s       vs base                  │
Node_LoadAll-32   272.8Mi ± 1%   5645.4Mi ± 2%  +1969.40% (p=0.000 n=20)
Node_Get-32       119.4Gi ± 2%    117.3Gi ± 2%          ~ (p=0.211 n=20)
geomean           5.641Gi         25.43Gi        +350.75%

                │    base.out     │               target.out               │
                │      B/op       │     B/op      vs base                  │
Node_LoadAll-32   25.133Ki ± 0%     1.102Ki ± 0%  -95.62% (p=0.000 n=20)
Node_Get-32          0.000 ± 0%       0.000 ± 0%        ~ (p=1.000 n=20) ¹

@AsterDY AsterDY changed the title feat:(ast) support concurrently-read feat:(ast) Node support concurrently-read Jun 27, 2024
@codecov-commenter

codecov-commenter commented Jun 27, 2024


Codecov Report

Attention: Patch coverage is 75.60976% with 60 lines in your changes missing coverage. Please review.

Project coverage is 51.81%. Comparing base (125ff79) to head (9947074).
Report is 2 commits behind head on main.

Files Patch % Lines
ast/node.go 77.57% 14 Missing and 10 partials ⚠️
ast/buffer.go 68.88% 11 Missing and 3 partials ⚠️
ast/parser.go 81.66% 9 Missing and 2 partials ⚠️
api.go 0.00% 4 Missing ⚠️
ast/iterator.go 55.55% 1 Missing and 3 partials ⚠️
ast/search.go 70.00% 2 Missing and 1 partial ⚠️


Additional details and impacted files
@@            Coverage Diff             @@
##             main     #661      +/-   ##
==========================================
- Coverage   52.85%   51.81%   -1.05%     
==========================================
  Files         174      128      -46     
  Lines       11470    10844     -626     
==========================================
- Hits         6063     5619     -444     
+ Misses       5071     4906     -165     
+ Partials      336      319      -17     


@liuq19
Collaborator

liuq19 commented Jul 24, 2024

A global question: do we need to maintain both the lazy and loadonce states at the same time? In principle a node should have only three states: 1. all loaded, 2. lazy, 3. load-once with lock?

If the parser carries both the noLazy and loadonce attributes at the same time, it adds unnecessary complexity.

@AsterDY
Collaborator Author

AsterDY commented Jul 24, 2024

A global question: do we need to maintain both the lazy and loadonce states at the same time? In principle a node should have only three states: 1. all loaded, 2. lazy, 3. load-once with lock?

If the parser carries both the noLazy and loadonce attributes at the same time, it adds unnecessary complexity.

Lazy loading still has value of its own; not every workload needs concurrent loading.

@AsterDY AsterDY merged commit df7126e into main Jul 29, 2024
45 checks passed
@AsterDY AsterDY deleted the feat/ast_lock branch July 29, 2024 05:58