Early data parsing, stacking by value and support object data #6576
Can you update the PR description to explain the high-level idea / organization of this PR? (e.g. what parse in scale vs controller are each used for). And also why each of the changed tests are different?
Should this replace all usage of getRightValue (if not in this PR then eventually)? I think there's still about half a dozen left.
Are there any performance impacts (positive or negative) on large charts? e.g. I wonder how it would affect the uPlot benchmarks
Force-pushed from e68d290 to e3a3a71.
src/core/core.datasetController.js (outdated)
@@ -75,6 +75,74 @@ function unlistenArrayEvents(array, listener) {
	delete array._chartjs;
}

function storeCrossRef(scale, datasetIndex, scaleValue, datasetValue) {
Can you add a comment explaining what the cross-ref is?
I just took a cursory glance at this PR in the Chrome debugger. This PR seems about 10% slower overall on the uPlot benchmark, but it seems like that should be addressable.
This PR goes way too far now, but I wanted to see where it could go. There are some things that can be extracted to separate PRs. Speed: the first update is somewhat similar to current master (from chartjs.org, which does not seem to include all commits), so in reality it's slower. Subsequent updates are a lot faster. Try hiding / showing datasets in those two.
Force-pushed from 9205cd6 to 8066422.
Some benchmarks done on my laptop, using the uPlot benchmark plus a plugin to measure the time between events, comparing master (3cb308d) on the initial draw and on chart.update(). So, the first update is about the same and subsequent updates are faster (~30%). Same chart, draw + 50x update(): master takes 29..30 seconds, early parse takes 18..19 seconds. More than 30% faster. The speed improvement comes from caching the parsed data, and can surely be done without "early parsing" too.
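To illustrate where the speedup on subsequent updates could come from, here is a minimal sketch of caching parsed data. The class and its fields (`CachingController`, `parseCount`) are hypothetical names for illustration, not Chart.js code, and the parser is an assumed stand-in.

```javascript
// Hypothetical sketch of "parse once, reuse on later updates".
// Not the Chart.js implementation; all names here are made up.
class CachingController {
  constructor(parseFn) {
    this._parseFn = parseFn; // assumed per-item parser
    this._source = null;     // raw array that was parsed last
    this._parsed = null;     // cached parse result
    this.parseCount = 0;     // number of full parses, for demonstration
  }

  getParsed(data) {
    // Re-parse only when a different array object is supplied.
    if (data !== this._source) {
      this._parsed = data.map(this._parseFn);
      this._source = data;
      this.parseCount++;
    }
    return this._parsed;
  }
}

const controller = new CachingController((value, index) => ({x: index, y: +value}));
const raw = ['1', '2', '3'];
const firstResult = controller.getParsed(raw);  // parses the array
const secondResult = controller.getParsed(raw); // served from the cache
```

Comparing by array identity does not notice in-place mutation (`push`, `splice`), so a real implementation would also need some invalidation hook for that case.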
It seems like it should be possible to store the data internally as {x: 0, y: 1} instead of {index: 1, xScale0: 0, yScale0: 1}. Then in the base _parseObjectData case we wouldn't actually have to do any parsing at all, which should be a speedup (though maybe behind a skipParsing flag for 2.x, which says you've passed in only numbers).
In 3.0 I'd really like to see us get rid of parsing altogether. It really slows things down. We could perhaps provide some data transformer utilities to help users get their data into the format we expect if we want to support multiple formats. @etimberg was asking about making the next version 3.0 (#6555 (comment)). Perhaps we should think about how we'd approach this if we didn't have to worry about backwards compatibility.
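As a rough sketch of the suggestion above: when the caller promises the data is already in `{x, y}` number form, the parse step can return it untouched. `skipParsing` is only the flag name floated in this comment, not an existing option, and `parseObjectData` is a hypothetical free function.

```javascript
// Hypothetical helper; `skipParsing` is a proposed flag, not a real option.
function parseObjectData(data, options = {}) {
  if (options.skipParsing) {
    // Caller guarantees every point is already {x: number, y: number}.
    return data;
  }
  // Otherwise coerce each coordinate to a number.
  return data.map((point) => ({x: Number(point.x), y: Number(point.y)}));
}

const loose = [{x: '0', y: '1'}];
const strict = [{x: 0, y: 1}];

const parsedLoose = parseObjectData(loose);                        // new array, coerced
const parsedStrict = parseObjectData(strict, {skipParsing: true}); // same array, no work
```

The trade-off is that with the flag set, malformed input is no longer caught or normalized.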
Force-pushed from 1dc078e to bc3807b.
A couple thoughts about how we might approach this:
@benmccann from a performance point of view, you are correct. That approach however does not solve the issues I'm trying to solve here. Removing parsing would make this lib a bit harder to use in many cases, and IMO ease of use is one of the things making this lib so popular.
We might be able to make parsing optional though. If you want to pass in data in our ideal format then we would not need to do parsing, and if you want to be loose with the data then we'll massage it for you. I think the main thing you'd need to do is change it back so that it stores data as
Force-pushed from bc3807b to a564802.
It helps avoiding
Another thing I was thinking is that I don't think the scales really need to be aware of parsing at all and we might be able to keep it all in the controllers. E.g. the linear and logarithmic scales right now need to know about floating bar charts in order to
We still call
Force-pushed from a564802 to 8b50c33.
Force-pushed from 8b7acd0 to 9ea67ce.
I refactored linkScales a bit too (CC was complaining about it). The checks it did were more expensive than the assignment (and linkScales is not critical anyway), so I changed it to always assign. And extracted
Force-pushed from b926551 to 2e7d65e.
Great initiative! Thanks so much for driving this!
I really like it overall. I think there are a couple more things we might be able to do based off of this or a couple things I would tweak, but I'd rather do those in follow up PRs to be able to get this in and build off it
Benchmarks with latest version:
{
  "2 update/total": 620.79,
  "3 render/total": 177.54,
  "x from page load": 802.21
}
2nd update:
{
  "2 update/total": 284.52,
  "3 render/total": 124.38
}
50x update: ~17s

master:
{
  "2 update/total": 703.42,
  "3 render/total": 143.66,
  "x from page load": 850.76
}
2nd update:
{
  "2 update/total": 521.75,
  "3 render/total": 127.67
}
50x update: ~27s
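For reading the numbers above, the relative improvement of the "update/total" figures can be worked out as below. This is just arithmetic on the reported values; the object and function names are made up for the example, and the 50x figures are the approximate second counts quoted above.

```javascript
// Values copied from the benchmark output above; names are illustrative.
const master = {firstUpdate: 703.42, secondUpdate: 521.75, fiftyUpdates: 27};
const branch = {firstUpdate: 620.79, secondUpdate: 284.52, fiftyUpdates: 17};

// Share of time saved relative to the baseline, in percent.
function percentFaster(before, after) {
  return ((before - after) / before) * 100;
}

const gains = {
  firstUpdate: percentFaster(master.firstUpdate, branch.firstUpdate),
  secondUpdate: percentFaster(master.secondUpdate, branch.secondUpdate),
  fiftyUpdates: percentFaster(master.fiftyUpdates, branch.fiftyUpdates)
};

console.log(gains);
```

On these figures the first update comes out roughly 12% faster, the second update roughly 45% faster, and the 50x run roughly 37% faster than master.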
#6106 taken a bit further.
Leading thoughts
- data should be parsed only once, not on each animation cycle or interaction update
- core.datasetController implements common data parsers that can be overridden
- a dataset controller knows how to parse its (custom) data
- a dataset controller asks its assigned scales to parse their input
- a _custom entry is available to contain any custom data (not related to scales) a dataset needs.

dataset controllers
- _parse, _parsePlainData, _parseArrayData, _parseObjectData and _parseCustomObjectData
- doughnut (and descendants) override the whole _parse, because they are not using scales.
- bar overrides _parseArrayData, for floatBar support
- bubble overrides _parseCustomObjectData to parse r
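The override structure listed above could look roughly like the following sketch. Only the method names come from the description; the signatures, the shape-based dispatch, and the parsed output shape are assumptions, and a plain numeric coercion stands in for asking the scales to parse.

```javascript
// Illustrative only: method names follow the PR description,
// bodies and signatures are assumptions, not the actual implementation.
class BaseController {
  _parse(data) {
    const first = data[0];
    if (Array.isArray(first)) {
      return this._parseArrayData(data);
    }
    if (first !== null && typeof first === 'object') {
      return this._parseObjectData(data);
    }
    return this._parsePlainData(data);
  }

  // [1, 2, 3]: the index becomes x, the value becomes y
  _parsePlainData(data) {
    return data.map((value, index) => ({x: index, y: +value}));
  }

  // [[x, y], ...]
  _parseArrayData(data) {
    return data.map(([x, y]) => ({x: +x, y: +y}));
  }

  // [{x, y}, ...]
  _parseObjectData(data) {
    return data.map((point) => ({x: +point.x, y: +point.y}));
  }
}

// A bar-like controller overrides only the array case for [start, end] pairs.
class FloatBarController extends BaseController {
  _parseArrayData(data) {
    return data.map(([start, end], index) => ({
      x: index,
      y: +end,
      _custom: {start: +start, end: +end} // range kept outside the scale values
    }));
  }
}

const plainParsed = new BaseController()._parse([1, 2, 3]);
const floatParsed = new FloatBarController()._parse([[1, 5], [2, 8]]);
```

Because the dispatch lives in the base `_parse`, a controller that does not use scales at all (as described for doughnut) would replace `_parse` itself rather than one of the per-format methods.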
Data formats
- [1,2,3...]
- [{x: 1, y: 2}, ...]
  - x is parsed by x-scale and y by y-scale.
  - t for example is parsed by time scale. If both x and y axes are time scales, both get the same value.
  - r for bubble is an example of _custom data that is not related to any scale.
- floatBar, [[start,end],[start,end]]
- x and value is y
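For reference, the formats listed above written out as literals, plus a sketch of how a bubble point's `r` might end up in `_custom`. The sample values and the `parseBubblePoint` helper are hypothetical, and the exact shape of `_custom` is an assumption.

```javascript
// Sample inputs in each of the listed formats (values are made up).
const plainData = [1, 2, 3];
const objectData = [{x: 1, y: 2}, {x: 2, y: 4}];
const floatBarData = [[1, 5], [2, 8]]; // [start, end] pairs
const bubbleData = [{x: 1, y: 2, r: 10}];

// Hypothetical helper: x and y would go through their scales (a numeric
// coercion stands in for that here), while r is not tied to any scale
// and is therefore stored under _custom.
function parseBubblePoint(point) {
  return {x: +point.x, y: +point.y, _custom: {r: +point.r}};
}

const parsedBubbles = bubbleData.map(parseBubblePoint);
```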
Pen
Pen - far far away
Pen - master
Fixes: #6103
Fixes: #5657
Fixes: #5405
Fixes: #5072
Fixes: #6437
Fixes: #6455
Closes: #6136
Closes: #6461
(probably some more)