Use NIO's Files API to replace FileInputStream/FileOutputStream in some paths #65
Codecov Report
@@             Coverage Diff              @@
##             master      #65      +/-   ##
============================================
+ Coverage     55.15%   56.38%   +1.22%
- Complexity     1112     1172      +60
============================================
  Files           149      149
  Lines          7969     7969
  Branches        761      761
============================================
+ Hits           4395     4493      +98
+ Misses         3332     3233      -99
- Partials        242      243       +1
Could you add some performance results? Should we modify only the critical code paths?
Actually, I didn't run any performance tests. I just found this optimization while browsing the Spark codebase. Maybe we should ping @jerryshao (the author of spark-21745).
Modify the title and description.
@zuston According to apache/spark#20119, there is a performance issue with NIO's Files API.
The performance issue is caused by the default
After browsing the related Spark code, I think using NIO's FileChannel API might improve performance. This is only a guess, though: I have no hands-on experience with it, and it needs more testing.
Any progress? What do I need to do before this can be merged, @jerqi?
I think we need some performance tests to prove the effectiveness of the change.
I think we could directly follow the Spark change. Besides, a performance test looks hard to set up. @jerqi
I would prefer not to merge this PR without tests. Maybe we can revisit this when we encounter a similar problem; at that point we will have a concrete case to prove the effect of the PR.
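For reference, a minimal sketch of what reading through a FileChannel could look like. This is an illustration only, not code from this PR or from Spark: the class name `FileChannelReadExample` and the helper `readRange` are hypothetical, and whether this actually performs better is untested, as noted above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

class FileChannelReadExample {

    // Hypothetical helper: reads up to `length` bytes starting at `offset`.
    // Positional reads go straight to the offset, so no bytes are read just to be skipped.
    static byte[] readRange(Path path, long offset, int length) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(length);
            long position = offset;
            while (buffer.hasRemaining()) {
                int n = channel.read(buffer, position);
                if (n < 0) {
                    break; // reached end of file before `length` bytes were available
                }
                position += n;
            }
            buffer.flip();
            byte[] result = new byte[buffer.remaining()];
            buffer.get(result);
            return result;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("channel-example", ".data");
        try {
            Files.write(tmp, new byte[] {0, 1, 2, 3, 4, 5, 6, 7});
            // Read 4 bytes starting at offset 2.
            System.out.println(Arrays.toString(readRange(tmp, 2, 4))); // prints [2, 3, 4, 5]
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```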
What changes were proposed in this pull request?
Use NIO's Files API to replace FileInputStream/FileOutputStream in some paths.
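As a rough illustration of the kind of replacement described here (a sketch under assumptions, not the actual diff of this PR): the class `NioStreamExample` and its helper methods are hypothetical names, and only the standard `java.nio.file.Files` calls are real API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class NioStreamExample {

    // Before: new FileOutputStream(path.toFile())
    // After:  Files.newOutputStream(path)
    // With no options given, the file is created if missing and truncated if it exists.
    static void writeBytes(Path path, byte[] data) throws IOException {
        try (OutputStream out = Files.newOutputStream(path)) {
            out.write(data);
        }
    }

    // Before: new FileInputStream(path.toFile())
    // After:  Files.newInputStream(path)
    static byte[] readBytes(Path path) throws IOException {
        try (InputStream in = Files.newInputStream(path);
             ByteArrayOutputStream collected = new ByteArrayOutputStream()) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                collected.write(chunk, 0, n);
            }
            return collected.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("nio-example", ".data");
        try {
            writeBytes(tmp, "hello".getBytes(StandardCharsets.UTF_8));
            System.out.println(new String(readBytes(tmp), StandardCharsets.UTF_8));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Callers that only depend on `InputStream`/`OutputStream` do not need to change, which is why such a swap can be limited to the places where the streams are opened.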
Why are the changes needed?
Follow these Spark PRs: apache/spark#20144 and apache/spark#18684
The reason for this change, as given there:
Does this PR introduce any user-facing change?
No
How was this patch tested?
No need.