Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor remote_directory provider #3837

Merged
merged 3 commits into from
Sep 3, 2015

Conversation

lamont-granquist
Copy link
Contributor

  • Huge speed and memory perf boost. In the prior code if you were
    copying 10 dotfiles to a home directory with a million files in it,
    would slurp all million directory entries into a ruby Set object even if you were
    not ultimately going to purge the unmanaged files. This code inverts
    the logic and tracks managed files and then iterates through the
    filesystem without slurping a list into memory.
  • Only do file purging logic if purging is set.
  • Fixes mutation of new_resource.overwrite.
  • Fixes mutation of new_resource.rights (subtle).
  • Adds helper delegators to the new_resource properties.
  • Deprecates (instead of removes) now-unused methods.
  • Renamed a method (with deprecated alias preserved) for consistency.
  • Adds YARD for everything.
  • Changes protected to private because protected is largely useless
    in ruby.
  • Removes whyrun_supported? because the superclass sets that.

@lamont-granquist
Copy link
Contributor Author

@Igorshp can you give this a test on your use case?

@lamont-granquist
Copy link
Contributor Author

There's no spec changes here because AFAIK this a pure refactor and is functionally identical, but has perf CPU + Mem fixes, so unsure right now of what specs to add.

There's a FIXME I added with a reference to an old CHEF ticket that turned up when I was researching stuff for documentation on why we search the way we search. Fixing that was out of scope with the timebox that I want around this PR since this is a squirrel project that is distracting me from other work right now.

@lamont-granquist
Copy link
Contributor Author

@chef/client-core needs review

@lamont-granquist
Copy link
Contributor Author

remote_directory_refactor

This is the before and after of memory pressure (live objects after a forced full GC) on a CCR on one of my servers with the big red spike being a single remote_directory provider before this refactor was done.

@lamont-granquist
Copy link
Contributor Author

Also removes a 4 second delay on my cookbooks as it no longer walks my home dir at all, because I do not have purge set:

https://gist.github.com/lamont-granquist/def55172542a8c0b3200

It should also not cause the memory explosion with purge set and should be slightly faster and use less memory than @Igorshp's fix since we stop mangling large arrays in memory and just walk the filesystem dirents once and during the process skip skippable files.

class Chef
module Deprecation
module Provider
module RemoteDirectory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for folks who've subclassed RemoteDirectory ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep

@danielsdeleo
Copy link
Contributor

A few nitpicky things that I don't care deeply about, 👍 though.

@smurawski
Copy link
Contributor

LGTM 👍

@lamont-granquist
Copy link
Contributor Author

Okay, converted to Set and switched to #overwrite? per @danielsdeleo

@Igorshp any chance you could test this code against your case?

@Igorshp
Copy link
Contributor

Igorshp commented Sep 2, 2015

@lamont-granquist sure thing, i'll report back with results shorlty

@Igorshp
Copy link
Contributor

Igorshp commented Sep 2, 2015

@lamont-granquist

Before: DEBUG: 109.663956 remote_directory[/var/lib/jenkins]
After: DEBUG: 0.442248 remote_directory[/var/lib/jenkins]

That's some impressive performance boost :)

👍

@danielsdeleo
Copy link
Contributor

:shipit:

@thommay
Copy link
Contributor

thommay commented Sep 2, 2015

👏 🎉 :shipit:

- Huge speed and memory perf boost.  In the prior code if you were
  copying 10 dotfiles to a home directory with a million files in it,
  would slurp all million files into a ruby Set object even if you were
  not ultimately going to purge the unmanaged files.  This code inverts
  the logic and tracks managed files and then iterates through the
  filesystem without slurping a list into memory.
- Only do file purging logic if purging is set.
- Fixes mutation of new_resource.overwrite.
- Fixes mutation of new_resource.rights (subtle).
- Adds helper delegators to the new_resource properties.
- Deprecates (instead of removes) now-unused methods.
- Renamed a method (with deprecated alias preserved) for consistency.
- Adds YARD for everything.
- Changes protected to private because protected is largely useless
  in ruby.
- Removes whyrun_supported? because the superclass sets that.
- convert to Set
- use #override?
@lamont-granquist lamont-granquist force-pushed the lcg/remote_directory_refactor branch from 2c1e8ac to 6109fb0 Compare September 3, 2015 00:51
lamont-granquist added a commit that referenced this pull request Sep 3, 2015
@lamont-granquist lamont-granquist merged commit 8143bf6 into master Sep 3, 2015
@Igorshp
Copy link
Contributor

Igorshp commented Sep 3, 2015

👏

@lamont-granquist lamont-granquist deleted the lcg/remote_directory_refactor branch January 27, 2016 17:58
@chef chef locked and limited conversation to collaborators Nov 16, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants