This repository has been archived by the owner on Dec 5, 2022. It is now read-only.

Intelligently handle conflict of having chef lockfile in NFS mount of file_cache_path #28

Closed
patcon opened this issue Jul 17, 2013 · 10 comments

Comments

@patcon
Collaborator

patcon commented Jul 17, 2013

chef-solo and chef-client both drop their lockfiles in file_cache_path, and it's not recommended that the lockfile be placed on NFS mounts:
http://docs.opscode.com/config_rb_solo.html

See the "Run Locking" section of this post:
http://www.opscode.com/blog/2013/02/06/chef-11-in-depth-client-improvements/

Note that the locking mechanism uses flock. This means that if Chef dies unexpectedly, the lock will automatically be released, so you don’t need to deal with stale lockfiles blocking Chef runs. By default, the lockfile is located at $file_cache_path/chef-client-running.pid. Some filesystems (most notably NFS) don’t support flock, so you’ll need to manually configure the lockfile location if you’re using such a filesystem.
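To make the flock behaviour in the quoted passage concrete, here is a minimal Ruby sketch. `acquire_run_lock` is a hypothetical helper written for illustration, not Chef's actual implementation, and it assumes a local filesystem where `flock` works.

```ruby
# Hypothetical helper, not Chef's own code: take an exclusive, non-blocking
# flock on the file at +path+ and record our PID in it.
# Returns the open File (keep it open to hold the lock), or nil if the lock
# is already held through another open handle.
def acquire_run_lock(path)
  file = File.open(path, File::RDWR | File::CREAT, 0o644)
  # File#flock returns false when LOCK_NB is set and the lock is taken.
  if file.flock(File::LOCK_EX | File::LOCK_NB)
    file.truncate(0)
    file.write(Process.pid.to_s)
    file.flush
    file
  else
    file.close
    nil
  end
end
```

The lock belongs to the open file handle, so the kernel drops it when the handle is closed or the process exits, even though the pidfile itself stays on disk.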

Hasn't been an issue yet, so not urgent, but might surface later. The plugin should perhaps override the lockfile setting and place it in /tmp, perhaps if it detects NFS shares are enabled for the cache dir, or perhaps regardless.
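A rough sketch of the "detect NFS, then move the lockfile to /tmp" idea, in Ruby. Everything here is an assumption for illustration: the helper names are hypothetical, the mount table is assumed to be in the Linux `/proc/mounts` format, and mount points containing escaped spaces are not handled. It is not something the plugin does today.

```ruby
# Hypothetical helper: return the filesystem type of the mount that contains
# +path+, given mount table text in /proc/mounts format
# ("device mount_point fstype options ...").
def fstype_for(path, mounts_text)
  best = nil
  mounts_text.each_line do |line|
    _device, mount_point, fstype = line.split(" ")
    next if mount_point.nil? || fstype.nil?
    prefix = mount_point.chomp("/") + "/"
    next unless path == mount_point || path.start_with?(prefix)
    # Prefer the longest (most specific) matching mount point.
    best = [mount_point, fstype] if best.nil? || mount_point.length > best[0].length
  end
  best && best[1]
end

# Hypothetical helper: keep the lockfile in the cache dir unless that dir
# sits on an NFS mount, in which case fall back to /tmp.
def lockfile_for(cache_path, mounts_text)
  on_nfs = fstype_for(cache_path, mounts_text).to_s.start_with?("nfs")
  File.join(on_nfs ? "/tmp" : cache_path, "chef-client-running.pid")
end
```

Inside a Linux guest this could be called as `lockfile_for(cache_path, File.read("/proc/mounts"))`. Always using /tmp "regardless" would be simpler and avoids the detection step entirely.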

This is blocked by the fact that Vagrant core templates don't allow lockfile to be configured in solo.rb and client.rb:
https://github.com/mitchellh/vagrant/tree/master/templates/provisioners

@fgrehm
Owner

fgrehm commented Dec 13, 2013

@patcon any updates on this?

@fgrehm fgrehm added this to the v1.0 milestone Feb 14, 2014
@fgrehm
Owner

fgrehm commented May 14, 2014

No one has reported this issue in 10 months, so let's not worry about it for now and wait until it becomes a problem :-)

@fgrehm fgrehm closed this as completed May 14, 2014
@devjj

devjj commented May 15, 2014

This appears to be a problem for me, but for some reason only when using the hostname cookbook. Before applying the cookbook, my runs completed successfully. Now, my first box (in a 5-instance setup) comes up fine, but when it moves on to provisioning the second box, I see "Chef client ... is still running". It appears to be a pidfile sitting in the shared Chef cache that isn't getting deleted after the first run.

@devjj

devjj commented May 15, 2014

To elaborate:

We're simulating a small cluster: 1 app server, 1 db server (redis, postgres), and 3 worker hosts. I've been building out the chef repo for our app, and everything was working pretty well until I added the hostname cookbook to my run list. Since then, Chef leaves behind a chef-client-running.pid file in ~/.vagrant.d/cache/precise64_vmware/chef. When I provision the next host, chef-client spins up, immediately reports that an existing run is in progress, and waits indefinitely for it to finish. If I manually delete this file from the cache before provisioning each subsequent host, it works fine. I have not ascertained why the hostname cookbook results in this behavior. I'm using the NFS mount with the default configuration. Disabling caching on all hosts lets everything proceed normally, but of course much more slowly.
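For anyone wanting to script the manual cleanup described above, a small Ruby sketch follows. `remove_stale_pidfile` is a hypothetical helper, and it assumes no Chef run is actually in progress against the cache when it is called.

```ruby
# Hypothetical helper: delete a leftover Chef pidfile from a shared cache dir.
# Only safe when no Chef run is really in progress against that cache.
# Returns true if a pidfile was removed, false if there was none.
def remove_stale_pidfile(cache_dir, name = "chef-client-running.pid")
  pidfile = File.join(File.expand_path(cache_dir), name)
  return false unless File.exist?(pidfile)

  File.delete(pidfile)
  true
end
```

Run on the host between machines, this would look like `remove_stale_pidfile("~/.vagrant.d/cache/precise64_vmware/chef")`.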

@fgrehm
Owner

fgrehm commented May 15, 2014

It's been a while since I used chef but is there anything we can do about that lockfile?

@devjj a workaround for you would be to use the :machine scope for your boxes. It will end up eating more disk space, but it makes sure each machine has its own isolated cache, and I believe the error would go away.
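As a sketch of that workaround, assuming vagrant-cachier's `config.cache.scope` setting, a Vagrantfile fragment could look like this. The box name is a placeholder.

```ruby
# Vagrantfile fragment (sketch); the box name is a placeholder.
Vagrant.configure("2") do |config|
  config.vm.box = "precise64"

  if Vagrant.has_plugin?("vagrant-cachier")
    # One cache bucket per machine instead of one per box, so machines
    # do not share a Chef file cache (and therefore not a pidfile either).
    config.cache.scope = :machine
  end
end
```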

@patcon
Collaborator Author

patcon commented May 15, 2014

There's a lockfile option in solo.rb and client.rb:
http://docs.opscode.com/config_rb_solo.html

And vagrant seems to allow custom config now:
https://github.com/mitchellh/vagrant/blob/master/templates/provisioners/chef_solo/solo.erb#L48

Not using chef now, but I guess I knew where to look, so just tossing it out there :)
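An untested sketch of how those two pieces might fit together, assuming Chef's `lockfile` setting and the Vagrant Chef provisioner's `custom_config_path` option. The file name `chef_custom.rb` is hypothetical.

```ruby
# chef_custom.rb (hypothetical file name), loaded after the config Vagrant generates.
# /tmp is local to the guest, so the lockfile stays off the NFS-mounted cache.
Chef::Config[:lockfile] = "/tmp/chef-client-running.pid"
```

```ruby
# Vagrantfile fragment (sketch)
Vagrant.configure("2") do |config|
  config.vm.provision "chef_solo" do |chef|
    # Path on the host to the custom Chef config above.
    chef.custom_config_path = "chef_custom.rb"
  end
end
```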

@fgrehm fgrehm added bug and removed blocked labels May 28, 2014
@fgrehm fgrehm reopened this May 28, 2014
@fgrehm
Owner

fgrehm commented May 28, 2014

Ok, since we now have a bug report, I've tagged it as a bug and just reopened so that it stays on our radar. If someone is able to put up a PR I'll be more than happy to review it :-)

@fgrehm fgrehm removed this from the v1.0 milestone Jul 20, 2014
@tknerr

tknerr commented Dec 5, 2015

Should be easy to add a lockfile chef config option to vagrant core (as a prerequisite before setting this value properly in vagrant-cachier).

Would have to be added in a few places only (e.g. compare with file_cache_path):
https://github.com/mitchellh/vagrant/search?utf8=%E2%9C%93&q=file_cache_path

Maybe I can whip something up before vagrant 1.8 comes out... (but on the other hand it doesn't seem strictly required, since a custom config file can already be provided; it's just a bit more tedious...)

@fgrehm
Owner

fgrehm commented Dec 7, 2015

Cool, LMK if that works out and I can give you the permissions for cutting a new release :)

@fgrehm fgrehm added the ignored label Nov 22, 2022
@fgrehm
Owner

fgrehm commented Nov 22, 2022

Hey, sorry for the silence here but this project is looking for maintainers 😅

As per #193, I've added the ignored label and will close this issue. Thanks for the interest in the project and LMK if you want to step up and take ownership of this project on that other issue 👋

@fgrehm fgrehm closed this as completed Nov 22, 2022