-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-3777: [C++] Add Slow input streams and slow filesystem #5439
Conversation
@@ -28,12 +28,10 @@ | |||
|
|||
# examine the output of clang-format and if changes are | |||
# present assemble a (unified)patch of the difference | |||
def _check_one_file(completed_processes, filename): | |||
def _check_one_file(filename, formatted): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unrelated to this PR, but I finally got annoyed enough by the slow format-check that I spent some time figuring out what happened. The bottom line is that each call to _check_one_file
in a child process was serializing and transmitting the entire results for all files. Instead, we just transmit the result for the single file being checked, it's much faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
bcfd933
to
b46487d
Compare
b46487d
to
2ca64c5
Compare
Codecov Report
@@ Coverage Diff @@
## master #5439 +/- ##
==========================================
- Coverage 89.69% 89.69% -0.01%
==========================================
Files 810 710 -100
Lines 114853 108385 -6468
Branches 1498 0 -1498
==========================================
- Hits 103021 97219 -5802
+ Misses 11821 11166 -655
+ Partials 11 0 -11
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this. Minor comments but can be merged whenever. I think we should opt for reducing the size of header files (and hiding more implementation details) when possible
@@ -28,12 +28,10 @@ | |||
|
|||
# examine the output of clang-format and if changes are | |||
# present assemble a (unified)patch of the difference | |||
def _check_one_file(completed_processes, filename): | |||
def _check_one_file(filename, formatted): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
@@ -265,5 +265,47 @@ class ARROW_EXPORT SubTreeFileSystem : public FileSystem { | |||
Status FixStats(FileStats* st) const; | |||
}; | |||
|
|||
/// \brief EXPERIMENTAL: a FileSystem implementation that delegates to another | |||
/// implementation but inserts latencies at various points. | |||
class ARROW_EXPORT SlowFileSystem : public FileSystem { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've brought this topic up before, but what do you think about putting the entire implementation of SlowFS in a .cc file and returning std::shared_ptr from the function that creates it? This can always be done later so refactor need not happen now. It's probably better to use factory methods for instantiating most FS classes anyway
Unless you anticipate adding additional methods to the class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comments hold regarding the slow stream classes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question. I don't think our conventions should vary too much. If some classes are exposed and other hidden it feels a bit weird. But strong opinion from me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My preference would be to hide the implementations in all cases except where we expose additional methods. If only that it offers more freedom to refactor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, we discuss this in a separate JIRA. Will merge.
No description provided.