Reduce the traversal if glob is a file path. #1385
Check if the glob is a file path; if so, do not traverse the folder for that glob.
Please review @Krinkle
		filteredGlobs.push( `${glob}/**/*.js` );
	} else if ( stat && stat.isFile() ) {
		files.push( glob );
		return false;
What is return false meant to do? I suspect it may be intended for jQuery.each, which uses it to break the for-loop; however, Array#forEach does not do that.
I agree. return false doesn't actually mean anything for forEach.
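To illustrate the point above, here is a minimal sketch (not taken from the patch; the function names are made up for this example) comparing return false inside Array#forEach with break inside a for...of loop:

```javascript
// Array#forEach ignores the callback's return value, so `return false`
// only skips the rest of the current iteration.
function collectWithForEach(items) {
  const seen = [];
  items.forEach((item) => {
    if (item === 'stop') {
      return false; // does NOT end the loop
    }
    seen.push(item);
  });
  return seen;
}

// A for...of loop can exit early with `break`.
function collectWithForOf(items) {
  const seen = [];
  for (const item of items) {
    if (item === 'stop') {
      break; // ends the loop
    }
    seen.push(item);
  }
  return seen;
}

console.log(collectWithForEach(['a', 'stop', 'b'])); // [ 'a', 'b' ]
console.log(collectWithForOf(['a', 'stop', 'b'])); // [ 'a' ]
```

If early exit is actually needed, a plain loop (or Array#some / Array#every) is the usual choice; if not, the return can simply be dropped.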
@SparshithNR Are you experiencing a performance issue without this change? That is, is there a problem with deep recursion (which seems like a bug we can fix in
Review the test failures as well, it appears the current patch version may be causing a regression. Let me know if you have any questions :)
// print process.argv
process.argv.forEach(function (val, index, array) {
  console.log(index + ': ' + val);
});
and if I run this I get, which will avoid traversing trees for all these files, since we get all of them readily available, whereas otherwise we match them against the entire tree structure.
@SparshithNR I'm sorry but I am not yet understanding the problem. If the patterns are plain file names like Is that your understanding as well, or did you find a bug? Perhaps it is traversing unrelated directories or doing redundant recursion. If so, then that is a bug I would like to fix instead. If not, can you quantify what amount of performance would be gained? I was not able to notice any significant difference both with a small list and with a large list of files. I generally prefer not to add optimisation complexity in code paths unless they are "hot" code paths. That way, the overall code remains simple and intuitive to debug and understand, allowing bigger optimisations to be thought of and implemented.
FWIW, from my perspective there is definitely a problem to be solved in this space. The current globbing logic is extremely slow in folders with many files even if you specify a single file.
Steps to repro (in a repo I happened to have worked in not too long ago that I noticed was slow):
The test is super simple (import a function, invoke it, add one or two assertions), but that command takes ~ 45s (on my machine). Nearly all of the time is taken up in Lines 56 to 83 in 0a0b7b8
Check if the argument is path to a simple known file, if so do not traverse the entire folder for that glob only to find self. Closes #1385. Co-authored-by: Robert Jackson <[email protected]>
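The idea in this commit message can be sketched roughly as follows. This is only an illustration under assumptions, not the code that landed: classifyArgs is a hypothetical helper name, and the only parts taken from the diff above are the `${glob}/**/*.js` expansion for directories and pushing known files straight into a files list.

```javascript
const fs = require('fs');

// Hypothetical helper: split CLI arguments into files that are already
// known to exist and patterns that still need a directory traversal.
function classifyArgs(args) {
  const files = [];
  const globs = [];

  for (const arg of args) {
    let stat = null;
    try {
      stat = fs.statSync(arg);
    } catch (e) {
      // Not an existing path; treat it as a glob pattern below.
    }

    if (stat && stat.isFile()) {
      // A plain, existing file: no traversal needed to find it.
      files.push(arg);
    } else if (stat && stat.isDirectory()) {
      // A directory: expand to all .js files beneath it.
      globs.push(`${arg}/**/*.js`);
    } else {
      // Anything else is passed on as a pattern.
      globs.push(arg);
    }
  }

  return { files, globs };
}
```

Only the entries in globs would then be handed to the traversal step, so passing a single existing file avoids walking the tree at all.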
Thanks, yeah ~40s vs a split second! I do note that if the file doesn't exist, the CLI remains slow for this case, so there's room for a future improvement to perhaps also discard the entry early on if there's no file/dir if the input is not glob-"ish".
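One possible shape for that future improvement, purely as a sketch: looksLikeGlob and dropMissingLiterals are hypothetical names, and the set of characters treated as glob-"ish" here is an assumption rather than anything the project defines.

```javascript
const fs = require('fs');

// Hypothetical check: does the input contain common glob syntax?
// The character set is an assumption made for this sketch.
function looksLikeGlob(input) {
  return /[*?[\]{}]/.test(input) || input.startsWith('!');
}

// Drop literal (non-glob) arguments that do not exist on disk, so that
// no traversal is started for them. `exists` is injectable for testing.
function dropMissingLiterals(args, exists = fs.existsSync) {
  return args.filter((arg) => looksLikeGlob(arg) || exists(arg));
}
```

Whether a missing literal path should be dropped silently or reported as an error is a separate design decision that this sketch does not address.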
As part of landing I have:
Check if the glob is a file path; if so, do not traverse the folder for that glob.