r/virtual_machine: Updated customization and net waiter behaviour #158

Merged 4 commits on Sep 14, 2017

Conversation

Contributor

@vancluever vancluever commented Sep 14, 2017

This is a two-part change that ultimately changes/fixes customization
workflow, but has some other implications:

  • First off, we are now waiting for customization to complete by
    watching VM events. This is the correct way to wait for customization to
    complete, as the customization task itself returns very quickly, while
    the customization process can take a few minutes and fail out-of-band
    of the initial task.

  • Second off, we are no longer waiting for all interfaces to have IP
    addresses during Read; instead, we are now waiting for a routeable
    network. This fixes situations where a template may be configured for
    DHCP and gets an auto-configuration address before actually getting an
    IP. This seems to come up during Windows customization especially,
    which is what we are trying to fix with this work, so having the event
    watch fix without having this one would not fix the issue completely.

  • The wait_for_guest_net option has also been added, which allows
    someone to completely turn off the network waiter. This should help
    alleviate some edge cases where NICs have not been configured with IP
    addresses, and also allows someone to bypass this behaviour if they are
    not configuring a gateway on the VM through either static configuration
    or DHCP.


Finally, as part of the additional testing work, the network waiter has been moved to the Create and Update functions of the resource, rather than the Read function. This is required especially on Create to ensure that created VMs can be rolled back properly by terraform destroy, which will do an initial refresh prior to destroy to get an updated state.
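
To make the waiter's condition concrete, here is a minimal Go sketch of one way to decide that a guest network is "routeable". It assumes that means a non-link-local address whose subnet contains a configured gateway; that definition, and the names `isRoutable` and `guestNetReady`, are assumptions for illustration and not the provider's actual implementation. Only the `wait_for_guest_net` switch comes from this PR.

```go
package main

import (
	"fmt"
	"net"
)

// isRoutable reports whether cidr (for example "10.0.0.5/24") is a usable
// address whose subnet contains at least one of the given gateways.
// Link-local (auto-configuration) addresses never count.
func isRoutable(cidr string, gateways []string) bool {
	ip, ipNet, err := net.ParseCIDR(cidr)
	if err != nil {
		return false
	}
	if ip.IsLinkLocalUnicast() || ip.IsLoopback() || ip.IsUnspecified() {
		return false
	}
	for _, gw := range gateways {
		if gwIP := net.ParseIP(gw); gwIP != nil && ipNet.Contains(gwIP) {
			return true
		}
	}
	return false
}

// guestNetReady models the waiter's check. When waitForGuestNet is false
// (mirroring wait_for_guest_net = false) the check is skipped entirely.
func guestNetReady(waitForGuestNet bool, addrs, gateways []string) bool {
	if !waitForGuestNet {
		return true
	}
	for _, a := range addrs {
		if isRoutable(a, gateways) {
			return true
		}
	}
	return false
}

func main() {
	gateways := []string{"10.0.0.1"}

	// Only an auto-configuration address so far: keep waiting.
	fmt.Println(guestNetReady(true, []string{"169.254.12.34/16"}, gateways))

	// A DHCP lease has arrived on the gateway's subnet: ready.
	fmt.Println(guestNetReady(true, []string{"169.254.12.34/16", "10.0.0.5/24"}, gateways))

	// Waiter disabled: ready regardless of addresses.
	fmt.Println(guestNetReady(false, nil, nil))
}
```

In this model the check would run after Create and Update finish their work, so a refresh before destroy never blocks on it.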

Fixes #140.

* Properly wrap all options.
* Add documentation for wait_for_guest_net option.
Adding in the test for all of the customization work we have done, which
mainly crops up with a Windows template.

Also added a new test for DHCP only without waiting on guest networking.
Triggering this test to fail deliberately uncovered an issue where the
network waiter prevents destroy of powered-on virtual machines when we
don't have routeable guest network info, so I have finally just moved
the net waiter to Create and Update, which is a more suitable place for
a process that waits for a VM to be ready anyway.
Member

@mbfrahry mbfrahry left a comment


LGTM!

@vancluever
Contributor Author

Thanks @mbfrahry!

@rismoney

What I have found in my testing is that OS customization can be reported to VMware as completed, but that event can NOT be relied on.

The only way I have been able to conclusively determine whether customization is complete is a local script that I wrote (PowerCLI), which scans guestcust.log for "Deleted Folder C:\sysprep". Every other attempt has been built around either guessing at wait times or eventing in vSphere, which isn't accurate.

@vancluever
Contributor Author

Hey @rismoney - thanks for the info. Have you been encountering issues with the new customization behaviour related to this? If so, do you mind putting in a new bug report with the specific issue?

The customization timeout has been adjusted to 10 minutes in v0.4.0, which should cover most cases, especially since the timeout here applies to customization only. If the event never shows up in vSphere, you will get an error similar to what is being seen in #160.

If you do see that, let us know in the new issue as well. This will help us gauge if this behaviour requires any further adjustments.

Thanks for your patience on this!

@rismoney

I have recently upgraded to 0.4.0 from the original .1, and now Terraform bails out after around 4-5 minutes. The machine gets created and OS customization kicks off and completes, but TF gets hosed. This problem has existed in 0.2+. I wonder if there is a problem with an API change and vSphere. I opened another issue with the trace log.

I don't know if the issue I experience on customization is related, or if there is a vSphere bug that reports its completion before it is done. I believe some sort of scheduled task runs that could finish minutes after vSphere believes customization to be done. Regardless, the wait script I wrote handles this edge case flawlessly, so I moved forward.

@ghost ghost locked and limited conversation to collaborators Apr 19, 2020

Successfully merging this pull request may close these issues.

Customization does not work for windows 2008 R2 template