[SPARK-21236] Make the threshold of using HighlyCompressedStatus configurable. #18446
This change LGTM. Should we also add a new test case in MapStatusSuite?
Sure, thanks for the review :)
Test build #78761 has finished for PR 18446 at commit
Jenkins, retest this please.
It's probably OK, but what's the use case for configuring it? When would a caller know to set it higher or lower? Just trying to figure out if this is a meaningful knob.
I guess the sizes of blocks are not stored very accurately in
Yes, this was discussed in #16989. Only the average size of blocks is stored in
Test build #78768 has finished for PR 18446 at commit
Test build #78774 has finished for PR 18446 at commit
Jenkins, retest this please.
Test build #78790 has finished for PR 18446 at commit
Jenkins, retest this please.
Test build #78801 has finished for PR 18446 at commit
Is this still useful now that we have #18031? I think users can just set
True. I was just trying to make it more complete and replace the hardcoded value.
Every new config comes with a cost: users have to learn about it. In this case users already have a config( Let's close it for now. If we get a real use case that needs this config, we can reopen.
Sure :)
What changes were proposed in this pull request?
Currently the threshold for using HighlyCompressedMapStatus is hardcoded to 2000. We could make this configurable, so that users with enough memory on the driver can raise the threshold and have block sizes recorded more accurately in CompressedMapStatus.
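To make the trade-off concrete, here is a minimal Python sketch of the selection logic described above. It is not Spark's code: the class names, `make_status`, and the `threshold` parameter are hypothetical stand-ins, and the assumption that the compact form keeps only an average over non-empty blocks follows the discussion in this thread rather than a reading of the implementation.

```python
from dataclasses import dataclass
from typing import List, Union

# The hardcoded value mentioned in the description.
DEFAULT_THRESHOLD = 2000


@dataclass
class CompressedStatus:
    """Hypothetical stand-in: keeps one size per block."""
    sizes: List[int]


@dataclass
class HighlyCompressedStatus:
    """Hypothetical stand-in: keeps only a count and an average size."""
    num_non_empty: int
    avg_size: int


def make_status(
    block_sizes: List[int], threshold: int = DEFAULT_THRESHOLD
) -> Union[CompressedStatus, HighlyCompressedStatus]:
    # Above the threshold, fall back to the compact form, which is
    # cheaper to hold on the driver but loses per-block sizes.
    if len(block_sizes) > threshold:
        non_empty = [s for s in block_sizes if s > 0]
        avg = sum(non_empty) // len(non_empty) if non_empty else 0
        return HighlyCompressedStatus(len(non_empty), avg)
    # At or below the threshold, keep every block size.
    return CompressedStatus(list(block_sizes))
```

With the default, a map output with 2001 blocks takes the compact form; passing a larger `threshold` keeps per-block sizes at the cost of more driver memory, which is the knob this pull request proposed to expose.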