ADDomain fails after reboot #581
Comments
hi @lkt82, thanks for reporting this. This is a similar issue to #574, and we added an additional exception to the try/catch block around |
I have just tried to reprovision a new node. I waited until DSC had completed and then ran a manual reboot. The node reports the same error as described. Output from Get-Module on the node:
|
@lkt82 Can you add verbose messages too, to better see what is happening? See this #574 (comment), but the added code should be before line 113, as the first thing in the catch-block. |
@johlju and @X-Guardian -- I just ran into this when trying to provision a brand new domain/DC in Azure. I triple-checked that I was using the most current ActiveDirectoryDSC (6.0.0) and am still stuck. The DSC gets down to the VM fine, the domain build happens, but the "domain not found" exception is triggered after the first reboot, and kills my ARM template deployment. One thing I did find is that in MSFT_ADDomain.ps1, starting at the do loop on Line 96...whatever exception is being triggered doesn't fall under the 3 types specifically called out. Instead it triggers the generic catch, calls "New-InvalidOperationException" -- so it never falls into the retry behavior that I was used to in previous versions of this DSC module. (Azure DCs would always take a few extra minutes to come up after the first install, and the code was able to handle that by retrying several times.) It looks like whatever I'm hitting triggers a System.ArgumentException. I'm attaching the log showing the verbose output of the error -- it's the same one everyone else is getting ("Server instance not found on the given port") |
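The behavior described above (only the listed exception types are retried, and everything else falls through to a generic catch that fails immediately) can be sketched as follows. This is a minimal illustration, not the code in MSFT_ADDomain.ps1: the function name `Invoke-WithRetry` and its parameters are hypothetical, and `System.ArgumentException` is used as the retryable type only because that is the type reported in this comment.

```powershell
# Hypothetical helper, for illustration only; it is not part of ActiveDirectoryDsc.
function Invoke-WithRetry
{
    [CmdletBinding()]
    param
    (
        [Parameter(Mandatory = $true)]
        [scriptblock] $ScriptBlock,

        [int] $MaxRetries = 5,

        [int] $RetryIntervalSeconds = 30
    )

    $attempt = 0

    do
    {
        $attempt++

        try
        {
            # Success: return the result and leave the loop.
            return & $ScriptBlock
        }
        catch [System.ArgumentException]
        {
            # Listed exception type: treated as "not ready yet", so retry.
            if ($attempt -gt $MaxRetries)
            {
                throw
            }

            Write-Verbose -Message "Attempt $attempt failed: $($_.Exception.Message). Retrying."
            Start-Sleep -Seconds $RetryIntervalSeconds
        }
        catch
        {
            # Any other exception type is not retried and fails immediately.
            throw
        }
    } while ($true)
}
```

With this shape, an exception type that is missing from the typed catch never reaches the retry path, which matches the symptom reported here.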
Hi @erictorbenson. Can you add the following verbose messages at line 113 of
|
Done! (And I learned that File is attached. Interestingly I found a positional parameter error: "[ERROR] A positional parameter cannot be found that accepts argument '+'." (I replaced the actual domain name with "redactedfqdn") Any help would be appreciated...everything in the DSC resource looks OK so it might be something it's calling externally? |
@erictorbenson it was a bug in the snippet that @X-Guardian provided above that you hit instead of the actual error we are looking for. It didn't like the I verified that this snippet below will output what we are looking for.
|
No problem, I'm happy that someone is able to look at this because it's holding up a project for me! I do have another clue to contribute though...it took me 3 tries to reproduce this. It has to have something to do with timing because successful attempts worked flawlessly. Maybe it's a startup-order thing where DSC is starting before the basic AD services? It takes a good 4 or 5 minutes for new DCs to start in Azure (and on prem too) after the first reboot. Sorry about the vague issue reporting...I'm new to DSC and everything I've done so far with it has worked pretty well, so I haven't had time to dig into the internals. |
Thanks @erictorbenson, It looks like we need to add If you want to prove this yourself, you just need to add |
) - ADDomain - Added additional Get-ADDomain retry exceptions (issue #581).
This will soon be published as a preview release, as soon as the pipeline finishes running. |
Update for anyone who finds this later via search...Adding the exception to the list worked! It only needed the one retry. One thing I did notice is that everything is about timing after the first reboot. After I moved on from this, other DSC configuration resources I had built (creating a reverse lookup zone in DNS, enabling the AD Recycle Bin, etc.) would also fail with similar "can't find the domain" errors. If you actually watch the DC build in Azure, it'll take several minutes before the "configuring settings" spinning-dots screen goes away. I assume all of the AD cmdlets need something that initializes later. A "workaround" that let me move on is a simple delay that all the future resources depend on -- in my case it took a 5 minute delay before everything worked correctly. (Obviously this could be a lot more refined, but it does work.)
|
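For readers who want to picture the delay workaround described above: the original snippet is not preserved here, so the following is only a sketch of one way it might look, using the built-in `Script` resource. The configuration name, the resource names, and the `Log` placeholder for "future resources" are all hypothetical.

```powershell
# Sketch only: a fixed delay that later resources depend on.
# All names below are hypothetical and not taken from the original comment.
Configuration DelayAfterDomainBuild
{
    Import-DscResource -ModuleName PSDesiredStateConfiguration

    Node 'localhost'
    {
        Script 'DelayForAD'
        {
            GetScript  = { return @{ Result = 'Fixed delay' } }
            # Always reports "not in desired state", so the delay runs on every pass.
            TestScript = { return $false }
            # 5 minutes, the delay reported above as sufficient.
            SetScript  = { Start-Sleep -Seconds 300 }
        }

        # Placeholder for any AD-dependent resource (DNS zone, Recycle Bin, ...).
        Log 'AfterDelay'
        {
            Message   = 'AD-dependent resources would run from here.'
            DependsOn = '[Script]DelayForAD'
        }
    }
}
```

Because `TestScript` always returns `$false`, the sleep repeats on every consistency check, which is one reason a purpose-built wait resource is preferable to a fixed delay.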
@erictorbenson It doesn't work having the
|
Actually I forgot about WaitForADDomain. :-) The first thing that came into my head to fix the problem fast was a Script that sleeps. I just redid everything and WaitForADDomain works, BUT, I did have to increase the timeout to 300 seconds, otherwise DSC reboots the DC before it gets a chance to respond and you run the risk of never initializing the DC the whole way before the restarts expire. Here's what the DC build part of my DSC config looks like now...this looks like a solid way to ensure we don't time out on any of the other AD-dependent elements. (The next element in the config needs to depend on Thanks @johlju @X-Guardian for the fast help for a relative DSC newbie. Hopefully I can contribute at some point.
|
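The configuration attached to the comment above is not preserved, so here is a hedged sketch of the pattern it describes: `ADDomain`, then `WaitForADDomain` with a 300 second timeout, then everything else depending on the wait. The domain name, credential parameter names, and resource instance names are hypothetical. The `WaitTimeout` and `SafeModeAdministratorPassword` parameter names are assumed from the 6.x module and should be checked against the installed version.

```powershell
# Sketch only; names and values are illustrative assumptions.
Configuration NewDomainController
{
    param
    (
        [Parameter(Mandatory = $true)]
        [System.Management.Automation.PSCredential] $DomainCredential,

        [Parameter(Mandatory = $true)]
        [System.Management.Automation.PSCredential] $SafeModeCredential
    )

    Import-DscResource -ModuleName PSDesiredStateConfiguration
    Import-DscResource -ModuleName ActiveDirectoryDsc

    Node 'localhost'
    {
        ADDomain 'CreateDomain'
        {
            DomainName                    = 'contoso.com'
            Credential                    = $DomainCredential
            SafeModeAdministratorPassword = $SafeModeCredential
        }

        # Blocks later resources until the domain responds or the timeout expires.
        WaitForADDomain 'WaitForDomain'
        {
            DomainName  = 'contoso.com'
            WaitTimeout = 300
            DependsOn   = '[ADDomain]CreateDomain'
        }

        # Placeholder for AD-dependent resources; they depend on the wait,
        # not directly on ADDomain.
        Log 'DomainReady'
        {
            Message   = 'Domain is reachable; AD-dependent resources can run.'
            DependsOn = '[WaitForADDomain]WaitForDomain'
        }
    }
}
```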
Details of the scenario you tried and the problem that is occurring
After rebooting a node the state is reported as failed. The problem is that the ADDomain resource fails with "Server instance not found".
Eventually the node will reach the state of Compliant
Verbose logs showing the problem
Suggested solution to the issue
Detect that the node is starting and wait for dependencies to be ready
The DSC configuration that is used to reproduce the issue (as detailed as possible)
The operating system the target node is running
Version and build of PowerShell the target node is running
Version of the DSC module that was used