Databroker performance improvements in Get service call #26
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #26      +/-   ##
==========================================
+ Coverage   49.40%   50.92%   +1.52%
==========================================
  Files          31       31
  Lines       11285    11878     +593
==========================================
+ Hits         5575     6049     +474
- Misses       5710     5829     +119
This may be faster, but it breaks a lot in the existing API as well. before
After
(I guess breaking AutoComplete is another side effect of this.) So that would mean a major version bump and losing features. If RegExp (I guess the extending/matching with stars) is the performance bottleneck here, would it not be better to use RegExp only if needed? As in
That would just cost a few ifs, and I guess the heavy part of the regexps is that you can't precompile them (because they come from the client). So, implementing it as above: a client that needs performance subscribes ONLY to leaf nodes and is fine. Everybody else can knock themselves out.
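The "only match when needed" idea from the comment above can be sketched roughly as follows. This is not the snippet from the original comment and not the databroker implementation: `Store`, `wildcard_match`, `match_segments` and the wildcard rules (`*` matches exactly one dot-separated segment, `**` matches zero or more) are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical signal store: full path -> current value.
struct Store {
    signals: HashMap<String, f64>,
}

impl Store {
    /// Returns the (path, value) pairs selected by `request`.
    fn get(&self, request: &str) -> Vec<(String, f64)> {
        if !request.contains('*') {
            // Fast path: a plain path costs one hash lookup, no pattern matching.
            return self
                .signals
                .get(request)
                .map(|value| vec![(request.to_string(), *value)])
                .unwrap_or_default();
        }
        // Slow path: only wildcard requests pay for scanning every signal.
        let mut hits: Vec<(String, f64)> = self
            .signals
            .iter()
            .filter(|(path, _)| wildcard_match(request, path))
            .map(|(path, value)| (path.clone(), *value))
            .collect();
        hits.sort_by(|a, b| a.0.cmp(&b.0)); // deterministic order for callers
        hits
    }
}

/// Simplified stand-in matcher (assumed rules, see lead-in).
fn wildcard_match(pattern: &str, path: &str) -> bool {
    let pat: Vec<&str> = pattern.split('.').collect();
    let segs: Vec<&str> = path.split('.').collect();
    match_segments(&pat, &segs)
}

fn match_segments(pat: &[&str], segs: &[&str]) -> bool {
    match pat.split_first() {
        // Pattern exhausted: match only if the path is exhausted too.
        None => segs.is_empty(),
        Some((head, rest)) => match *head {
            // `**` may swallow any number of leading segments, including none.
            "**" => (0..=segs.len()).any(|i| match_segments(rest, &segs[i..])),
            // `*` consumes exactly one segment.
            "*" => !segs.is_empty() && match_segments(rest, &segs[1..]),
            // A literal segment must be equal to the next path segment.
            literal => segs.first() == Some(&literal) && match_segments(rest, &segs[1..]),
        },
    }
}
```

With this shape, a request for a single leaf path never touches the matcher, so its cost does not grow with the number of stored signals; only requests containing `*` fall back to the linear scan.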
This was a hot fix to see if we could get some performance improvements. One more thing regarding auto-completion: yes, it is good to have, and we have considered implementing an extra Metadata service method that supports wildcards. The user would then not need wildcard support in Get, since the client could fetch all signals once in advance and store them locally.
Yes, in terms of an updated/new API I am sure there may be better ways to split/do this. However, for the current API we should not merge something into main that changes behavior significantly (I mean, this doesn't reinterpret some ambiguities, it would actually remove features and contradict what is documented).
"Engineering"-wise, I am not sure, but have you looked into whether "globbing" libraries are faster? For example https://github.com/devongovett/glob-match. However, I lack the background to know whether this is "fundamentally" faster than the regexp construction of a state machine and matching. It will also be a band-aid, because a non-wildcard hashmap lookup should be O(1) (with respect to the number of signals; the hashing is obviously O(l), with l being the length of the path), and I doubt that even the most clever glob matching, when backed by a "normal" map/list, can ever be faster than O(n)... But surprise me: if you can deliver O(1), we will probably take it :D
So tl;dr: I still think that, now and also in future APIs, some kind of matching is useful, but it should be implemented in a way that it is only "used when needed".
Off topic here, but since you mentioned it: I guess even when "discoverability" is in a metadata API, a user might still want to subscribe to "a branch" (which can be seen as a sort of "wildcard"). Otherwise something like the recorder from https://github.com/eclipse-kuksa/kuksa-csv-provider gets really hard: consider that you want to generate a trace of everything running through databroker (or a sub-branch) but are not able to say "subscribe vehicle", and instead need to list 1000 individual paths.
820bb38 to 010fa74
Trying out the library you posted; I will test it and check the performance results.
43fe4e4 to bd14dcd
Wildcard matching now uses the glob-matching library https://github.com/devongovett/glob-match
635f6a6 to f129492
f99747b to 6dff684
I think that stuff like
Rationale is: the first case I have never used, and I find it hard to believe anyone did; the second one I definitely use often, at least via CLI.
Ah, I remember, because "branches" are not really a thing in our internal data structure. In terms of keeping API compatibility, would something be feasible like
It should not really cost anything when querying direct leaves. I understand that for any kind of new API we would always use the "best" approach, but this slightly modifies existing API behavior, so I would prefer to keep behavior changes to a minimum.
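As a rough illustration of why a leaf-first lookup costs nothing extra for direct leaves, here is a minimal sketch. It does not reconstruct the snippet proposed in the comment above, and it is not the code that was merged: the function name `get_leaf_or_branch`, the flat `HashMap` store and the dot-separated example paths are assumptions.

```rust
use std::collections::HashMap;

/// Hypothetical lookup: treat `request` as a leaf first, and only if no
/// such leaf exists, fall back to treating it as a branch.
fn get_leaf_or_branch(signals: &HashMap<String, f64>, request: &str) -> Vec<(String, f64)> {
    // Direct leaf: a single hash lookup, independent of the number of signals.
    if let Some(value) = signals.get(request) {
        return vec![(request.to_string(), *value)];
    }
    // No leaf with that name: collect every signal below the branch.
    // The trailing dot keeps "Vehicle.Cab" from matching "Vehicle.Cabin.*".
    let prefix = format!("{request}.");
    let mut hits: Vec<(String, f64)> = signals
        .iter()
        .filter(|(path, _)| path.starts_with(&prefix))
        .map(|(path, value)| (path.clone(), *value))
        .collect();
    hits.sort_by(|a, b| a.0.cmp(&b.0)); // deterministic order for callers
    hits
}
```

The linear scan is only reached when the request is not a known leaf, so clients that query leaves directly stay on the constant-time path, while branch-style requests keep working.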
Implemented your suggestions. Now it is also compatible with branch use cases like
@rafaeling sounds good. It would be good if the conflicts were resolved, because they seem to prevent CI runs, so currently I cannot lazily test by just pulling the Docker image :D
Built manually and checked behavior. Looking good to me!
bb13826 to aea1aea
bd47ae4 to b57e74f
…b for wildcard handling
b57e74f to 29a968b
lgtm 🍭
This change makes the Get method about 8 times faster: 5000 requests now take 7.8 ms, compared with 60 ms for the main branch implementation.
See the wildcard exceptions that have been made obsolete by this change:
https://github.com/eclipse-kuksa/kuksa-databroker/blob/9ec27697af5d7594245e5c7cd5ebc2eb687abb80/doc/wildcard_matching.md