windows_exporter v0.14 don't auto start #637
I'm experiencing the same problem. I didn't use any version before 0.14 so I don't know whether this version introduced the problem.
followed by
Unfortunately, the Windows exporter does not log anything at that time, so there is not much to work with. Starting it manually afterwards has always succeeded. This happens consistently on Windows 2012R2 (scheduled restarts over night) while 2016 and 2019 seem to work fine. I will try this on a clean 2012R2. Might this be a missing dependency? |
@hieu15 Can you verify if you see the same error messages in the event viewer? Otherwise this might be a different problem. |
I can confirm that this happens on a fresh 2012R2
I've tried changing the start up type to "automatic delayed". The exporter starts fine in that case. |
I've tested on a fresh 2016 and it doesn't show any problems, so I've tested on 2012R2 a bit more. This is where things get a bit crazy. Version 0.13 shows the same problems. However, v0.12 works perfectly fine. I've rebooted multiple times, it's always consistent. I did a clean uninstall of the exporter before installing a new one. According to the release notes the only change in 0.13 is the project name. This seems crazy, but could the name have an effect on the start up order? |
@fischerman any relation to #544 or #599 maybe? They all seem to be about not starting exporters (after reboot). |
@fischerman That's very odd, but thanks for the exhaustive testing! |
@carlpett I didn't find an installer of master so I installed from 0.14 and swapped the .exe with https://ci.appveyor.com/project/prometheus-community/windows-exporter/builds/35870173/artifacts As suspected this is not the cause. @frittentheke I don't think they are related. But it gave me the idea to test 32 bit. Same problem. |
Thanks for checking! Your comment
... has been bouncing around in the back of my head during the day. And after managing a repro I now have a working theory. Could you test if running
One reason it could work on other OSes is that they have something earlier in the startup phase that starts whatever we need to be started. I'm not 100% confident that wmiApSrv is the right dependency, or if it just happens to initialize something as a side effect. It would have been a much neater explanation if it was also in the Auto start group (it's manual), and if it had stayed running (on my repro machine at least, it stops quite quickly). But it would be super interesting to see if this fixes it for you as well! (As a side note, renaming the service to e.g. xwindows_exporter also seems to fix it on my VM) |
I've added the dependency and did 3 restarts. The service came up fine every time. After adding the dependency, the WMI Performance adapter stays running in my case, possibly indefinitely (also in If the initialization of the event logger is the problem can we temporarily write the error message into a file? |
Thanks for the test! It's a bit unsatisfying to not really understand why, but since it does seem to fix the problem, I think we should probably add it to the next release. |
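For anyone who wants to try the two workarounds mentioned so far (a service dependency on wmiApSrv, or Automatic (Delayed Start)), here is a minimal sketch that only builds the `sc.exe config` argument lists. The helper name `sc_config_args` is made up for illustration, and this is not necessarily the exact command that was tested above. Be aware that `depend=` replaces the service's existing dependency list rather than appending to it.

```python
def sc_config_args(service, depends_on=None, delayed=False):
    """Build an argument list for `sc.exe config` (hypothetical helper).

    depends_on: list of service names the service should wait for.
    delayed:    if True, set the start type to Automatic (Delayed Start).
    """
    args = ["sc.exe", "config", service]
    if depends_on:
        # sc.exe takes the value as a separate token after "depend=";
        # multiple dependencies are joined with "/".
        args += ["depend=", "/".join(depends_on)]
    if delayed:
        args += ["start=", "delayed-auto"]
    return args
```

On a Windows host the resulting list could be passed to `subprocess.run(...)` from an elevated prompt, for example `sc_config_args("windows_exporter", depends_on=["wmiApSrv"])`.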
This appears to fix service startup issues on certain systems, eg #637 Signed-off-by: Calle Pettersson <[email protected]>
I did a little bit more testing. I manually wrote stuff in log files. In all the test cases, no log file was written during Windows start up, but were showing up fine when starting the service afterwards. The only exception was a hello world program (built with promu) with which I replaced the windows exporter. That one always produced logs, so filesystem is generally available. I believe the problem occurs before the main function, hence in one of init functions. I had my eye on the I also wrapped the exporter in nssm to get the stderr, but that one starts fine... |
@fischerman I had this same issue with windows_exporter v0.13 on Win2012R2. I found two culprits which contributed to the issue.
|
Delayed start is also my current workaround. I'm still hoping we can find the root cause. However, if my suspicion is correct and an init function of a dependency causes the crash it will be hard to discover. One idea I just thought of is to generate a core dump using |
I also have this issue. Thank you @fischerman for all the testing! I was going to suggest moving the windows service handler stuff to the beginning of main() as I believe it is the control messages from the service that makes Windows understand that the service is there. But, if it really crashes before main(), then it is no use.
Good idea! I think Go doesn't yet produce minidumps without patching: It could perhaps be possible to use GOTRACEBACK=crash (or =system) and in some way get stderr redirected to a file to see the tracebacks. I know there are wrappers like winsw that can do that, but it sounds like quite a hassle. |
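A rough sketch of the "redirect stderr to a file" idea, assuming a small wrapper process is acceptable. It sets `GOTRACEBACK` in the child's environment and appends whatever the child writes to stderr to a log file. The function name and the log path are placeholders, not something from the exporter. As noted earlier in the thread, the exporter started fine when wrapped in nssm, so a wrapper like this might also hide the problem.

```python
import os
import subprocess


def run_with_stderr_capture(cmd, stderr_path, traceback_level="crash"):
    """Run cmd with GOTRACEBACK set and stderr appended to stderr_path.

    Returns the child's exit code. traceback_level is passed through
    unchanged; "crash" and "system" are the values mentioned above.
    """
    # Copy the current environment and add/override GOTRACEBACK.
    env = dict(os.environ, GOTRACEBACK=traceback_level)
    # Append mode keeps output from earlier boots in the same file.
    with open(stderr_path, "ab") as log:
        completed = subprocess.run(cmd, env=env, stderr=log)
    return completed.returncode
```

Usage would be something like `run_with_stderr_capture([r"C:\path\to\windows_exporter.exe"], r"C:\temp\exporter-stderr.log")`, where both paths are hypothetical.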
I experience the issue too, but what's interesting is that the service crashed within a second while the log messages show that it tried for 30 seconds.
Update: I could only get it fixed by restarting the vm. |
@basroovers Which version do you use? How do you know that the service crashed after one second? |
@fischerman I have hundreds of instances running on v0.15.0. Most of these didn't autostart after the last Windows update cycle (reboots). Here is an export of the Windows event log of that VM:
|
The problem occurred yesterday on a 2016 server with 0.14. The system has now been configured to use the delayed start. |
I also have the same issue from time to time, and it doesn't matter whether "delayed start" is used or not. The exporter didn't start tonight after an automatic reboot caused by installing updates for Windows Server 2019 Std, but later, after a manual reboot, it started fine. Updates were installed: Hope it helps. |
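We've updated to v0.16.0, but we experienced the same thing again last weekend. Several Windows updates were installed after which the windows_exporter service was not running. Same messages:
The windows_exporter service failed to start due to the following error:
The service did not respond to the start or control request in a timely fashion.
A timeout was reached (30000 milliseconds) while waiting for the windows_exporter service to connect.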
Perhaps good to check if this is a VMware specific issue? We are running VMware. |
I updated to 0.16.0 right before the April update cycle in our test
environment. I also modified the service to Automatic (Delayed Start).
Results were mixed.
2008R2 and 2019 did not start windows_exporter after the automatic patch
and restart
2012R2 and 2016 servers started windows_exporter successfully after the
automatic patch and restart
Tim
…On Mon, Apr 19, 2021 at 8:31 AM Bas Roovers ***@***.***> wrote:
We've updated to v0.16.0, but we experienced the same thing again last
weekend. Several Windows updates were installed after which the
windows_exporter service was not running.
Same messages:
The windows_exporter service failed to start due to the following error:
The service did not respond to the start or control request in a timely fashion.
A timeout was reached (30000 milliseconds) while waiting for the windows_exporter service to connect.
Perhaps good to check if this is a VMware specific issue? We are running
VMware.
--
Tim Biles
Sys/Database Design/Admin 3 | ITSS | itss.d.umn.edu
Storage Champion | z.umn.edu/scn
University of Minnesota Duluth | www.d.umn.edu
***@***.*** | 218-726-6959
|
Just to confirm, I ran into this issue with v0.16.0.
Exporter failed to start on 2 out of 10 servers after cumulative updates.
- Server2012R2
- Vmware virtual machine
- Service set to Automatic (Delayed Start)
David |
My test results were slightly different than last month. 2008R2
consistently has failed, 2012R2 and 2016 consistently succeed, 2019 is
intermittent. All mine are VSphere VMs, all set to Automatic (Delayed
Start).
3 of 10 machines failed to start windows_exporter (1 2008R2, 2 2019).
7 machines started windows exporter (4 2012R2, 1 2016, 2 2019)
Tim
…On Wed, May 12, 2021 at 3:54 AM davideverall ***@***.***> wrote:
Just to confirm, I ran into this issue with v0.16.0
Exporter failed to start on 2 out of 10 servers after cumulative updates.
- Server2012R2
- Vmware virtual machine
- Service set to Automatic (Delayed Start)
David
|
I see this issue frequently, typically when a lot of Windows updates have been released and are available for install. After the updates are installed and the server is restarted, if the server is under heavy CPU load while the updates are installing, the windows_exporter service does not start up. Usually it is the Windows Modules Installer Worker (TiWorker.exe) service that is using a lot of CPU on the server to install the updates. I mainly use Windows Server 2016. Would it be possible to add some logic whereby the windows_exporter service is checked every x minutes to see whether it's up and running, and is started if it is not? |
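A rough sketch of the periodic check suggested above, assuming it runs outside the exporter, for example from a scheduled task every few minutes. It parses the STATE line of `sc.exe query` output and issues `sc.exe start` only when the service is stopped. The function names are hypothetical, and the parsing assumes the English-locale output layout; localized systems may print different labels.

```python
import re
import subprocess

# Matches a line such as "        STATE              : 1  STOPPED"
STATE_RE = re.compile(r"^\s*STATE\s*:\s*\d+\s+(\w+)", re.MULTILINE)


def service_state(sc_query_output):
    """Return the state name from `sc.exe query` output, or None if absent."""
    match = STATE_RE.search(sc_query_output)
    return match.group(1) if match else None


def check_and_start(service="windows_exporter", run=subprocess.run):
    """Start the service if it is stopped; return True if a start was issued.

    `run` is injectable so the logic can be exercised without sc.exe.
    """
    result = run(["sc.exe", "query", service], capture_output=True, text=True)
    if service_state(result.stdout) == "STOPPED":
        run(["sc.exe", "start", service], capture_output=True, text=True)
        return True
    # RUNNING, START_PENDING, unknown output, etc.: leave the service alone.
    return False
```

Only acting on STOPPED is a deliberate choice here: a service that is still in START_PENDING is left alone rather than started a second time.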
Just to report it, this still seems to happen with v0.16.0 on Server 2019...
Get-Package *exporter*
Name             Version Source ProviderName
----             ------- ------ ------------
windows_exporter 0.16.0
Get-ComputerInfo | select WindowsProductName, WindowsVersion, OsHardwareAbstractionLayer
WindowsProductName           WindowsVersion OsHardwareAbstractionLayer
------------------           -------------- --------------------------
Windows Server 2019 Standard 1809           10.0.17763.2686
TimeCreated         Id   LevelDisplayName Message
-----------         --   ---------------- -------
13/04/2022 09:02:53 7036 Information      The windows_exporter service entered the running state.
13/04/2022 03:35:04 7000 Error            The windows_exporter service failed to start due to the following error: ...
13/04/2022 03:35:04 7009 Error            A timeout was reached (30000 milliseconds) while waiting for the windows_exporter service to connect.
12/04/2022 13:23:43 7036 Information      The windows_exporter service entered the running state.
12/04/2022 06:11:32 7036 Information      The windows_exporter service entered the stopped state.
15/03/2022 11:37:37 7036 Information      The windows_exporter service entered the running state.
09/03/2022 03:32:47 7000 Error            The windows_exporter service failed to start due to the following error: ...
09/03/2022 03:32:47 7009 Error            A timeout was reached (30000 milliseconds) while waiting for the windows_exporter service to connect.
09/03/2022 03:30:42 7036 Information      The windows_exporter service entered the stopped state.
Event at
Also, it is not isolated: at least one other server shows this behaviour (that one is the same OS version):
TimeCreated         Id   LevelDisplayName Message
-----------         --   ---------------- -------
17/03/2022 10:28:10 7036 Information      The windows_exporter service entered the running state.
17/03/2022 00:32:36 7000 Error            The windows_exporter service failed to start due to the following error: ...
17/03/2022 00:32:36 7009 Error            A timeout was reached (30000 milliseconds) while waiting for the windows_exporter service to connect.
17/03/2022 00:31:40 7036 Information      The windows_exporter service entered the stopped state.
I realise we do not run the latest release, so if this issue is supposed to be fixed by a later one, can you confirm (or point me in the right direction) so we can consider upgrading? |
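To see how long the exporter stayed down after each failed start, event log excerpts like the ones above can be summarized with a short script. This is only a sketch: it assumes day-first timestamps (`dd/mm/yyyy`) and the column order TimeCreated, Id, LevelDisplayName, Message, and the function names are made up. It pairs each 7000 "failed to start" event with the next 7036 "running state" event.

```python
import re
from datetime import datetime, timedelta

# One event per line: "13/04/2022 03:35:04 7000 Error The windows_exporter ..."
EVENT_RE = re.compile(
    r"^\s*(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s+(\d+)\s+(\w+)\s+(.*)$"
)


def parse_events(text):
    """Return (timestamp, event_id, message) tuples for lines that parse."""
    events = []
    for line in text.splitlines():
        match = EVENT_RE.match(line)
        if not match:
            continue  # header rows, separators, prose
        when = datetime.strptime(match.group(1), "%d/%m/%Y %H:%M:%S")
        events.append((when, int(match.group(2)), match.group(4).strip()))
    return events


def downtime_after_failed_starts(text, min_gap=timedelta(0)):
    """Pair each failed start (7000) with the next 'running' event (7036).

    Returns a list of (failed_at, downtime) in chronological order,
    keeping only gaps of at least min_gap.
    """
    gaps = []
    failed_at = None
    for when, event_id, message in sorted(parse_events(text), key=lambda e: e[0]):
        if event_id == 7000 and failed_at is None:
            failed_at = when
        elif event_id == 7036 and "running state" in message and failed_at is not None:
            if when - failed_at >= min_gap:
                gaps.append((failed_at, when - failed_at))
            failed_at = None
    return gaps
```

Fed the first excerpt above, this reports two gaps: 6 days, 8:04:50 after the failed start on 09/03/2022, and 5:27:49 after the one on 13/04/2022.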
Hello, Same issue for me on Windows server 2019:
Exporter: 0.18.1
|
This issue has been marked as stale because it has been open for 90 days with no activity. This thread will be automatically closed in 30 days if no further activity occurs. |
This should be resolved as of #1047. Let me know if it's still an issue. |
windows_exporter v0.14 doesn't start its service automatically when the server restarts. I have to start the service manually.