@vm_helper.ip is this global structure, which gets set to the current… #41

Merged
merged 3 commits into from
Jun 6, 2017

Conversation

jjlimepoint

@vm_helper.ip is a global structure that gets set to the IP of whichever machine is currently being built. This worked out okay by accident until wait_for_ipv4, but with the extra delay now in the system, machine_batch builds get the wrong IP around 30% of the time when provisioning new machines. (It still works when doing the chef run on machines that are already up, because the timing is different.)

Replace this with the actual eventual location of the bootstrap, machine_spec.location['ipaddress'].

This could almost certainly use testing by someone other than myself. In particular, I've only built Linux machines with it!

Signed-off-by: Jaymz Julian <[email protected]>
@jjasghar

jjasghar commented Jun 2, 2017

Oh wow, this is great. I'll give this a test tomorrow and if all goes to plan release it tomorrow too!

Thanks! 🤘

@juanxhos

juanxhos commented Jun 2, 2017

Why is the IP wrong around 30% of the time?

@jjlimepoint
Author

@jcalonsoh It's a thread race condition. If you're machine_batch'ing 10 machines, the flow goes like this:

parallel_foreach(machines)
set ip in driver global structure @vm_helper.ip
wait_for_ipv4 -> set structure in @vm_helper.ip again
ssh transport -> use ip in @vm_helper.ip

What was happening is that the time between setting the IP in the global structure the first time and using it was small enough that this would work. By accident, not design, but it would work. Now, with a large enough number of machines, you get a situation like this (using two machines as an example):

thread 1: set ip for machine 1
thread 1: wait_for_ipv4 set ip for machine 1
thread 2: set ip for machine 2
thread 1: transport ssh with the global ip - the wrong ip (machine #2)
thread 2: wait for ipv4
thread 2: transport ssh with the global ip - also to the IP for machine #2!

So machine #1 never gets bootstrapped, and machine #2 gets bootstrapped twice, once as machine 1 and once as machine 2; whichever chef client wins that race wins.
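
The interleaving above can be replayed deterministically in a few lines of Ruby. This is only a sketch of the idea, not the driver's code: `SharedHelper`, both method names, and the 10.0.0.x addresses are made up for illustration, and the per-machine hash merely stands in for machine_spec.location['ipaddress'].

```ruby
# Hypothetical stand-in for the driver's shared helper; not the real class.
SharedHelper = Struct.new(:ip)

# Replays the two-machine interleaving step by step, so the outcome is
# deterministic instead of depending on thread timing.
def bootstrap_with_shared_ip(ips)
  helper = SharedHelper.new
  targets = {}
  helper.ip = ips['machine1']      # thread 1: set ip
  helper.ip = ips['machine1']      # thread 1: wait_for_ipv4 sets it again
  helper.ip = ips['machine2']      # thread 2: set ip (overwrites thread 1's value)
  targets['machine1'] = helper.ip  # thread 1: ssh transport reads the shared ip
  helper.ip = ips['machine2']      # thread 2: wait_for_ipv4
  targets['machine2'] = helper.ip  # thread 2: ssh transport
  targets
end

# Each machine reads its own record, so no other thread can overwrite it.
# The 'ipaddress' key mirrors machine_spec.location['ipaddress'].
def bootstrap_with_per_machine_ip(ips)
  locations = ips.map { |name, ip| [name, { 'ipaddress' => ip }] }.to_h
  locations.map { |name, location| [name, location['ipaddress']] }.to_h
end

# Made-up addresses, for illustration only.
IPS = { 'machine1' => '10.0.0.1', 'machine2' => '10.0.0.2' }.freeze

shared = bootstrap_with_shared_ip(IPS)
per_machine = bootstrap_with_per_machine_ip(IPS)
puts "shared helper:    #{shared.inspect}"
puts "per-machine spec: #{per_machine.inspect}"
```

With the shared helper, both machines end up targeting machine 2's address; with per-machine records, each machine targets its own.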

@jjasghar

jjasghar commented Jun 2, 2017

OK, after testing, it seems this hasn't broken kitchen-vsphere, which is the main goal.

13:31:39 JJs-MacBook-Pro my_cookbook > (jjlimepoint-fix-machine-batch-ips) be kitchen verify -c 2
-----> Starting Kitchen (v1.16.0)
-----> Creating <default-ubuntu-1604>...
-----> Creating <default-windows>...
/Users/jjasghar/.gem/ruby/2.4.1/gems/cheffish-13.0.0/lib/cheffish/merged_config.rb:6: warning: toplevel constant Mash referenced by Chef::Mash
/Users/jjasghar/.gem/ruby/2.4.1/gems/cheffish-13.0.0/lib/cheffish/merged_config.rb:7: warning: toplevel constant Mash referenced by Chef::Mash
creating machine default-ubuntu-1604-27b5fc59 on vsphere://172.16.20.2/sdk?use_ssl=true&insecure=true
  use_linked_clone: true
  datacenter: "Datacenter"
  template_name: "ubuntu16-template"
creating machine default-windows-474c3670 on vsphere://172.16.20.2/sdk?use_ssl=true&insecure=true
  use_linked_clone: true
  template_folder: "Linux"
  resource_pool: "Cluster"
  num_cpus: 2
  memory_mb: 4096
  ssh: {:user=>"admini", :paranoid=>false, :password=>"*********", :port=>22}
  datacenter: "Datacenter"
establishing connection to 172.16.20.2
  template_name: "windows2012R2"
  template_folder: "Windows"
  resource_pool: "Cluster"
  num_cpus: 2
  memory_mb: 8096
  ssh: {:user=>"administrator", :paranoid=>false, :password=>"*********", :port=>5985}
establishing connection to 172.16.20.2
[2017-06-02T13:31:48-05:00] WARN: Using a VM Template, ignoring use_linked_clone.
[2017-06-02T13:31:48-05:00] WARN: Using a VM Template, ignoring use_linked_clone.
Machine - created - default-ubuntu-1604-27b5fc59 (500a660e-b65d-8392-b3b6-90cfe32e15d4 on vsphere://172.16.20.2/sdk?use_ssl=true&insecure=true)
/Users/jjasghar/.gem/ruby/2.4.1/gems/cheffish-13.0.0/lib/cheffish/merged_config.rb:6: warning: toplevel constant Mash referenced by Chef::Mash
/Users/jjasghar/.gem/ruby/2.4.1/gems/cheffish-13.0.0/lib/cheffish/merged_config.rb:7: warning: toplevel constant Mash referenced by Chef::Mash
Power on VM [vm/default-ubuntu-1604-27b5fc59]
waiting for default-ubuntu-1604-27b5fc59 (500a660e-b65d-8392-b3b6-90cfe32e15d4 on vsphere://172.16.20.2/sdk?use_ssl=true&insecure=true) to be ready ...
......................................default-ubuntu-1604-27b5fc59 is now ready
waiting up to 90 seconds for customization and find 172.16.20.104
rebooting...
/Users/jjasghar/.gem/ruby/2.4.1/gems/cheffish-13.0.0/lib/cheffish/merged_config.rb:6: warning: toplevel constant Mash referenced by Chef::Mash
/Users/jjasghar/.gem/ruby/2.4.1/gems/cheffish-13.0.0/lib/cheffish/merged_config.rb:7: warning: toplevel constant Mash referenced by Chef::Mash
Shutdown guest OS and power off VM [vm/default-ubuntu-1604-27b5fc59]
Power on VM [vm/default-ubuntu-1604-27b5fc59]
restart machine default-ubuntu-1604-27b5fc59 (vsphere://172.16.20.2/sdk?use_ssl=true&insecure=true)
waiting up to 90 seconds for customization and find 172.16.20.104
IP address obtained: 172.16.20.104
/Users/jjasghar/.gem/ruby/2.4.1/gems/cheffish-13.0.0/lib/cheffish/merged_config.rb:6: warning: toplevel constant Mash referenced by Chef::Mash
/Users/jjasghar/.gem/ruby/2.4.1/gems/cheffish-13.0.0/lib/cheffish/merged_config.rb:7: warning: toplevel constant Mash referenced by Chef::Mash
create node default-ubuntu-1604-27b5fc59 at chefzero://localhost:8889
  add normal.chef_provisioning = {"reference"=>{"driver_url"=>"vsphere://172.16.20.2/sdk?use_ssl=true&insecure=true", "driver_version"=>"2.0.4", "server_id"=>"500a660e-b65d-8392-b3b6-90cfe32e15d4", "is_windows"=>false, "allocated_at"=>"2017-06-02 18:36:49 UTC", "ipaddress"=>"172.16.20.104", "started_at"=>"2017-06-02 18:41:21 UTC"}}
  add normal.tags = nil
       Finished creating <default-ubuntu-1604> (12m59.15s).
-----> Converging <default-ubuntu-1604>...
       Preparing files for transfer
       Preparing dna.json
       Preparing current project directory as a cookbook
       Removing non-cookbook files before transfer
       Preparing nodes
       Preparing clients
       Preparing validation.pem
       Preparing client.rb
-----> Installing Chef Omnibus (install only if missing)
       Downloading https://omnitruck.chef.io/install.sh to file /tmp/install.sh
       Trying wget...
       Download complete.
       ubuntu 16.04 x86_64
       Getting information for chef stable  for ubuntu...
       downloading https://omnitruck.chef.io/stable/chef/metadata?v=&p=ubuntu&pv=16.04&m=x86_64
         to file /tmp/install.sh.1427/metadata.txt
       trying wget...
       sha1	0a9cb607bc5b9189c88a981ee010e1e15a8a9042
       sha256	d8b0a8c012945cda9a2ff1b6b93bd852b06b81c71b4604250dac7c90143fd14d
       url	https://packages.chef.io/files/stable/chef/13.1.31/ubuntu/16.04/chef_13.1.31-1_amd64.deb
       version	13.1.31
       downloaded metadata file looks valid...
       downloading https://packages.chef.io/files/stable/chef/13.1.31/ubuntu/16.04/chef_13.1.31-1_amd64.deb
         to file /tmp/install.sh.1427/chef_13.1.31-1_amd64.deb
       trying wget...
       Comparing checksum with sha256sum...

       WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING

       You are installing an omnibus package without a version pin.  If you are installing
       on production servers via an automated process this is DANGEROUS and you will
       be upgraded without warning on new releases, even to new major releases.
       Letting the version float is only appropriate in desktop, test, development or
       CI/CD environments.

       WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING

       Installing chef
       installing with dpkg...
       Selecting previously unselected package chef.
(Reading database ... 91218 files and directories currently installed.)
       Preparing to unpack .../chef_13.1.31-1_amd64.deb ...
       Unpacking chef (13.1.31-1) ...
Machine - created - default-windows-474c3670 (500aa95c-01bf-efb6-3856-2ee0041f6b5c on vsphere://172.16.20.2/sdk?use_ssl=true&insecure=true)
/Users/jjasghar/.gem/ruby/2.4.1/gems/cheffish-13.0.0/lib/cheffish/merged_config.rb:6: warning: toplevel constant Mash referenced by Chef::Mash
/Users/jjasghar/.gem/ruby/2.4.1/gems/cheffish-13.0.0/lib/cheffish/merged_config.rb:7: warning: toplevel constant Mash referenced by Chef::Mash
Power on VM [vm/default-windows-474c3670]
waiting for default-windows-474c3670 (500aa95c-01bf-efb6-3856-2ee0041f6b5c on vsphere://172.16.20.2/sdk?use_ssl=true&insecure=true) to be ready ...
..       Setting up chef (13.1.31-1) ...
.       Thank you for installing Chef!
       Transferring files to <default-ubuntu-1604>
       Starting Chef Client, version 13.1.31
.       Creating a new client identity for default-ubuntu-1604 using the validator key.
       resolving cookbooks for run list: ["my_cookbook::default"]
       Synchronizing Cookbooks:
         - my_cookbook (1.1.0)
       Installing Cookbook Gems:
       Compiling Cookbooks...
       Converging 1 resources
       Recipe: my_cookbook::default
.         * apt_package[vim] action install (up to date)

       Running handlers:
       Running handlers complete
       Chef Client finished, 0/1 resources updated in 08 seconds
       Finished converging <default-ubuntu-1604> (1m24.48s).
-----> Setting up <default-ubuntu-1604>...
       Finished setting up <default-ubuntu-1604> (0m0.00s).
-----> Verifying <default-ubuntu-1604>...
       Loaded tests from {:path=>"/Users/jjasghar/repo/chef-provisioning-vsphere/my_cookbook/test/integration/default"}

Profile: tests from {:path=>"/Users/jjasghar/repo/chef-provisioning-vsphere/my_cookbook/test/integration/default"}
Version: (not specified)
Target:  ssh://[email protected]:22


  Port 80
     ✔  should not be listening
  System Package
     ✔  vim should be installed

Test Summary: 2 successful, 0 failures, 0 skipped
       Finished verifying <default-ubuntu-1604> (0m1.77s).
...........default-windows-474c3670 is now ready
IP address obtained: 172.16.20.105
create node default-windows-474c3670 at chefzero://localhost:8889
  add normal.chef_provisioning = {"reference"=>{"driver_url"=>"vsphere://172.16.20.2/sdk?use_ssl=true&insecure=true", "driver_version"=>"2.0.4", "server_id"=>"500aa95c-01bf-efb6-3856-2ee0041f6b5c", "is_windows"=>true, "allocated_at"=>"2017-06-02 18:45:47 UTC", "ipaddress"=>"172.16.20.105"}}
  add normal.tags = nil
       Finished creating <default-windows> (15m59.81s).
-----> Converging <default-windows>...
       Preparing files for transfer
       Preparing dna.json
       Preparing current project directory as a cookbook
       Removing non-cookbook files before transfer
       Preparing nodes
       Preparing clients
       Preparing validation.pem
       Preparing client.rb
-----> Installing Chef Omnibus (install only if missing)
       Downloading package from https://packages.chef.io/files/stable/chef/13.1.31/windows/2012r2/chef-client-13.1.31-1-x64.msi
       Download complete.
       Successfully verified C:\Users\ADMINI~1\AppData\Local\Temp\chef-client-13.1.31-1-x64.msi
       Installing Chef Omnibus package C:\Users\ADMINI~1\AppData\Local\Temp\chef-client-13.1.31-1-x64.msi
       Installation complete
       Transferring files to <default-windows>
       Starting Chef Client, version 13.1.31
       Creating a new client identity for default-windows using the validator key.
       resolving cookbooks for run list: ["my_cookbook::default"]
       Synchronizing Cookbooks:
         - my_cookbook (1.1.0)
       Installing Cookbook Gems:
       Compiling Cookbooks...
       Converging 1 resources
       Recipe: my_cookbook::default
         * windows_package[npp] action install
         Recipe: <Dynamically Defined Resource>
           * remote_file[C:\Users\ADMINI~1\AppData\Local\Temp\kitchen\cache\package\npp.7.4.1.Installer.exe] action create
             - create new file C:\Users\ADMINI~1\AppData\Local\Temp\kitchen\cache\package\npp.7.4.1.Installer.exe
             - update content in file C:\Users\ADMINI~1\AppData\Local\Temp\kitchen\cache\package\npp.7.4.1.Installer.exe from none to abd12e
             (new content is binary, diff output suppressed)
           - install version latest of package npp

       Running handlers:
       Running handlers complete
       Chef Client finished, 2/2 resources updated in 27 seconds
       Finished converging <default-windows> (3m12.42s).
-----> Setting up <default-windows>...
       Finished setting up <default-windows> (0m0.00s).
-----> Verifying <default-windows>...
       Loaded tests from {:path=>"/Users/jjasghar/repo/chef-provisioning-vsphere/my_cookbook/test/integration/default"}

Profile: tests from {:path=>"/Users/jjasghar/repo/chef-provisioning-vsphere/my_cookbook/test/integration/default"}
Version: (not specified)
Target:  winrm://administrator@http://172.16.20.105:5985/wsman:3389


  User root
     ✔  should not exist
  File C:\Program
     ✔  Files (x86)\Google\Chrome\Application\chrome.exe should exist
  File C:\Program
     ✔  Files (x86)\Notepad++\notepad++.exe should exist

Test Summary: 3 successful, 0 failures, 0 skipped
       Finished verifying <default-windows> (0m9.20s).
-----> Kitchen is finished. (19m23.09s)

The problem is that provisioning is still hit and miss on the Windows side. I think it might be my environment, but I'll continue testing.

@jjasghar

jjasghar commented Jun 2, 2017

OK, so creating Windows machines is working fine now. It was an environmental issue on my side.

Though, it seems destroy isn't working:

14:30:51 JJs-MacBook-Pro my_cookbook > (jjlimepoint-fix-machine-batch-ips) be chef-client -z recipes/destory.rb
[2017-06-02T14:31:55-05:00] INFO: Started chef-zero at chefzero://localhost:8889 with repository at /Users/jjasghar/repo/chef-provisioning-vsphere/my_cookbook
  One version per cookbook

[2017-06-02T14:31:55-05:00] INFO: Forking chef instance to converge...
Starting Chef Client, version 13.0.118
[2017-06-02T14:31:55-05:00] INFO: *** Chef 13.0.118 ***
[2017-06-02T14:31:55-05:00] INFO: Platform: x86_64-darwin16
[2017-06-02T14:31:55-05:00] INFO: Chef-client pid: 25620
[2017-06-02T14:31:55-05:00] INFO: The plugin path /etc/chef/ohai/plugins does not exist. Skipping...
[2017-06-02T14:32:00-05:00] INFO: Run List is []
[2017-06-02T14:32:00-05:00] INFO: Run List expands to []
[2017-06-02T14:32:00-05:00] INFO: Starting Chef Run for jjasghar
[2017-06-02T14:32:00-05:00] INFO: Running start handlers
[2017-06-02T14:32:00-05:00] INFO: Start handlers complete.
resolving cookbooks for run list: []
[2017-06-02T14:32:00-05:00] INFO: Loading cookbooks []
Synchronizing Cookbooks:
Installing Cookbook Gems:
Compiling Cookbooks...
[2017-06-02T14:32:00-05:00] WARN: Node jjasghar has an empty run list.
Converging 1 resources
Recipe: @recipe_files::/Users/jjasghar/repo/chef-provisioning-vsphere/my_cookbook/recipes/destory.rb
  * machine[testing-windows] action destroy[2017-06-02T14:32:00-05:00] INFO: Processing machine[testing-windows] action destroy (@recipe_files::/Users/jjasghar/repo/chef-provisioning-vsphere/my_cookbook/recipes/destory.rb line 30)
[2017-06-02T14:32:00-05:00] INFO: Processing chef_node[testing-windows] action delete (basic_chef_client::block line 91)
 (up to date)
[2017-06-02T14:32:00-05:00] INFO: Chef Run complete in 0.270802 seconds

Running handlers:
[2017-06-02T14:32:00-05:00] INFO: Running report handlers
Running handlers complete
[2017-06-02T14:32:00-05:00] INFO: Report handlers complete
Chef Client finished, 0/1 resources updated in 04 seconds

Can you verify that the delete command works on your side too?

@jjlimepoint
Author

I just finished my daily integration test with this driver, which both created and destroyed around 150 machines, so it seems to do both for Linux. I don't have any Windows machines to test with though :(.

@jjasghar

jjasghar commented Jun 5, 2017

Aww man, I was just hoping for another validation. Maybe someone can step up and verify Windows too? As I said, create works, which is awesome, but delete didn't work for me (for Windows). If we don't hear anything by Wednesdayish, I'll go ahead and merge with a note about possible known issues.

@jjlimepoint
Author

I'll try and get a windows setup working and test this tomorrow... just need to work out how to do that :).

@jjasghar

jjasghar commented Jun 5, 2017

After some investigation, I think this is due to the nodes/MACHINE.json not being written out on creation. This isn't due to your change, which is both good and bad. Good, because this works as expected; bad because... well, this means there's another pretty big issue going on.

I'll go ahead and wait for you to confirm this, then I'll merge and open an issue describing what I think is going on. Thanks for the PR!
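
One quick way to check the nodes/MACHINE.json suspicion is to look for the node file directly in the chef-zero repository after a create run. The sketch below assumes the file lives at nodes/<machine name>.json under the repository root, as the comment above implies; the helper name and the testing-linux machine are made up, and the demo runs in a throwaway directory rather than a real repository.

```ruby
require 'tmpdir'
require 'fileutils'

# Returns the machine names that have no nodes/<name>.json under repo_path.
# The nodes/<name>.json layout is an assumption taken from the comment above.
def machines_missing_node_file(repo_path, machine_names)
  machine_names.reject do |name|
    File.exist?(File.join(repo_path, 'nodes', "#{name}.json"))
  end
end

# Self-contained demo in a temporary directory standing in for the repository.
Dir.mktmpdir do |repo|
  FileUtils.mkdir_p(File.join(repo, 'nodes'))
  File.write(File.join(repo, 'nodes', 'testing-linux.json'), '{}')
  missing = machines_missing_node_file(repo, %w[testing-linux testing-windows])
  puts "missing node files: #{missing.inspect}"
end
```

If a machine that was just created shows up in the missing list, a later destroy would have nothing to look up, which would be consistent with the "(up to date)" result in the destroy log.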

@juanxhos

juanxhos commented Jun 6, 2017

I suspect something is wrong too, but the changes were applied only to
@vm_helper.ip
and not to
@vm_helper.port

I think the problem is somewhere between chef-provisioning and cheffish. I'm in favor of going forward with the change, but I'll also investigate.

@jjasghar jjasghar merged commit 02393d0 into chef-boneyard:master Jun 6, 2017
@gaelcolas

gaelcolas commented Jun 8, 2017

Just seeing this thread. I might be able to test with a Windows machine... (but only tk+kitchen-Dsc+pester)
How many test cases should I run to be conclusive? (I had no issue with ~6 concurrent, on version 2.0.5)
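
As a rough guide to "how many test cases", here is a back-of-the-envelope calculation. It assumes the roughly 30% figure from the PR description applies to each build independently, which is a simplification (the race depends on timing and on how many machines run in parallel), and it only applies to runs that actually exercise the machine_batch path. Both method names are hypothetical.

```ruby
# Chance that `builds` independent builds all succeed when each one fails
# with probability `failure_rate`.
def chance_of_no_failures(failure_rate, builds)
  (1.0 - failure_rate)**builds
end

# Smallest number of clean builds needed before a bug with the given failure
# rate would have shown up at least once with probability `confidence`.
def builds_needed(failure_rate, confidence)
  (Math.log(1.0 - confidence) / Math.log(1.0 - failure_rate)).ceil
end

failure_rate = 0.30 # approximate figure quoted in the PR description

puts format('chance 6 builds all pass despite the bug: %.3f',
            chance_of_no_failures(failure_rate, 6))
puts "clean builds needed for 95% confidence: #{builds_needed(failure_rate, 0.95)}"
```

Under those assumptions, six clean builds still leave about a 12% chance that the bug is present but went unnoticed, and nine clean builds bring that below 5%.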

@jjasghar

jjasghar commented Jun 8, 2017

@gaelcolas awesome, thanks for that. I've opened up #44 to help give good examples. Ideally, I'll push some up today, and if you could add a 6-concurrent example, that would be amazing.
