Parallelise refactor-nrepl.find.find-symbol/find-global-symbol #246

Closed


vemv
Member

@vemv vemv commented Mar 28, 2019

refactor-nrepl.find.find-symbol/find-symbol-in-file is an expensive op, particularly because it can trigger an AST analysis for a given ns.

That, combined with the large workload of analysing every file in a project, seems to justify parallelization.


  • The commits are consistent with our contribution guidelines
  • You've added tests (if possible) to cover your change(s)
    • The existing tests can probably be trusted to cover this implementation change
  • All tests are passing (run `lein do clean, test`)
  • Code inlining with mranderson works and tests pass with inlined code (run `./build.sh install`; takes a long time)
  • You've updated the changelog (if adding/changing user-visible functionality)
    • (Impl detail)
  • You've updated the readme (if adding/changing user-visible functionality)
    • (Impl detail)

@vemv vemv mentioned this pull request Mar 28, 2019
@bbatsov
Member

bbatsov commented Mar 28, 2019

Did you measure the speedup on your hardware?

@benedekfazekas
Member

The build is failing, I suppose because the branch is behind master and does not have the CircleCI config. Can you sync with master, please?

@vemv
Member Author

vemv commented Mar 28, 2019

> Did you measure the speedup on your hardware?

Thanks for the heads up. I could now measure things (at first it seemed hard, and I blindly trusted reducers).

There was no gain:

  • The wrapped partials were doing lazy work, so the actual work was not happening inside the parallel section.
  • Automatic parallelization is only triggered above a certain collection size.
    • e.g. `(reducers/foldcat (reducers/map (fn [x] (println (Thread/currentThread)) (str x)) (vec (repeat 3 :a))))` is not parallel, but if you change the 3 to 1000, it is.
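
The size threshold can be observed without relying on printed thread names. A minimal sketch, assuming the vector implementation of `clojure.core.reducers/fold`, which reduces any vector no larger than the partition size sequentially and otherwise splits it in half; `leaf-chunks` is a hypothetical helper for illustration, not part of refactor-nrepl:

```clojure
(require '[clojure.core.reducers :as r])

(defn leaf-chunks
  "Counts how many sequentially reduced chunks r/fold splits coll into
  for partition size n. Hypothetical helper, for illustration only."
  [n coll]
  (let [chunks (atom 0)]
    (r/fold n
            ;; combinef: its zero-arity call seeds each leaf chunk, so
            ;; counting those calls counts the chunks.
            (fn ([] (swap! chunks inc) 0)
                ([a b] (+ a b)))
            ;; reducef: counts elements; the folded result is ignored.
            (fn [acc _] (inc acc))
            coll)
    @chunks))

(leaf-chunks 512 (vec (repeat 3 :a)))    ; default partition size, tiny vector
(leaf-chunks 512 (vec (repeat 1000 :a))) ; default partition size, larger vector
```

With the default partition size of 512, the 3-element vector should stay in a single chunk (so nothing can run in parallel), while the 1000-element vector is split into two.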

Working on both!

> The build is failing, I suppose because the branch is behind master and does not have the CircleCI config. Can you sync with master, please?

Thanks. I didn't realise my clone was old. Fixing.

@expez
Member

expez commented Mar 28, 2019

This function is already called in its own thread while two other threads are working full tilt looking for local symbols and symbols that are macros.

Depending on the number of cores and the total amount of IO going on, there might not be much more speed to squeeze out.

> automatic parallelization is only triggered given a specific collection size.

Yeah, overriding the default probably makes sense here as each op is expensive and will quickly dominate the overhead of managing additional threads.
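
Overriding the default could look roughly like the sketch below, which passes an explicit partition size of 1 to `clojure.core.reducers/fold` so that each element may become its own fork/join task. `fold-map` is a hypothetical name and is not how refactor-nrepl does it:

```clojure
(require '[clojure.core.reducers :as r])

(defn fold-map
  "Like (mapv f coll), but folds with partition size 1 so that every
  element can run as a separate fork/join task. Hypothetical helper."
  [f coll]
  (into []
        ;; r/cat and r/append! are the same combine/reduce pair that
        ;; r/foldcat uses; only the partition size differs.
        (r/fold 1 r/cat r/append! (r/map f (vec coll)))))

(fold-map str [1 2 3])
```

A partition size this small only pays off when `f` is expensive, as it is here; for cheap functions the task-management overhead would dominate.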

Thanks for investigating!

@vemv
Member Author

vemv commented Mar 28, 2019

Thanks for the hints!

I have now had success using pmap strategically: a 1700 ms -> 350 ms speedup on my 6-core machine.
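
The conversation does not show where `pmap` was applied, so the following is only a sketch of the general shape, assuming one expensive, independent search per file. `search-file` and `search-files` are hypothetical names, and the sleep merely simulates analysis cost:

```clojure
(defn search-file
  "Hypothetical stand-in for an expensive per-file search. Returns a
  fully realized vector so that no lazy work escapes the worker thread."
  [file]
  (Thread/sleep 20) ; simulate the cost of analysing one file
  (mapv #(str file ":" %) (range 3)))

(defn search-files
  "Runs search-file over files with pmap and eagerly concatenates the
  per-file results, preserving input order."
  [files]
  (into [] cat (pmap search-file files)))

(search-files ["a.clj" "b.clj"])
```

Realizing each result inside the worker (here with `mapv`) avoids the lazy-work problem noted earlier in the thread, where the parallel step only built unevaluated sequences.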

Will close this PR and open a cleaner one promptly.
