Intermittent metaspace errors on Windows #19311

Closed
stuartwdouglas opened this issue Aug 10, 2021 · 3 comments
Labels
env/windows Impacts Windows machines kind/bug Something isn't working triage/out-of-date This issue/PR is no longer valid or relevant

Comments

@stuartwdouglas
Member

stuartwdouglas commented Aug 10, 2021

Describe the bug

CI sometimes fails on Windows with OutOfMemoryError: Metaspace (e.g. #19279).

The Windows failures are very particular in that they always happen at exactly the same test. This test runs at the point where the first metaspace collection would normally happen (i.e. when we hit the 1500 MB metaspace limit).

This implies that 100% of the ClassLoaders are leaking: if any were not leaking, they would be GCed and the failure would potentially happen later. I have also run locally on Windows and confirmed that under normal circumstances there are no leaks. In a normal passing Windows build, when the 1500 MB limit is hit, GC runs and the metaspace used drops to 300 MB.

This implies one of the following:

  • Sometimes we leak 100% of ClassLoaders, while most of the time there is no leak. This seems very improbable; I can't really think of a situation where this would occur.
  • Sometimes the JDK just fails to collect metaspace and OOMs instead. Possibly it needs more metaspace to actually do GC (e.g. to run finalizers), and sometimes that fails?
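
One way to tell the two cases apart might be to sample metaspace usage from inside the JVM before and after a GC request. The following is only a minimal sketch, assuming a HotSpot JVM that exposes a memory pool named "Metaspace"; the class and method names (MetaspaceProbe, usedBytes, reclaimedFraction) are made up for illustration and are not part of Quarkus or its CI setup.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.Optional;

// Hypothetical helper, not taken from the Quarkus code base.
class MetaspaceProbe {

    /** Finds the memory pool named "Metaspace" (assumption: HotSpot naming). */
    static Optional<MemoryPoolMXBean> metaspacePool() {
        return ManagementFactory.getMemoryPoolMXBeans().stream()
                .filter(pool -> "Metaspace".equals(pool.getName()))
                .findFirst();
    }

    /** Bytes currently used in metaspace, or -1 if the pool is not available. */
    static long usedBytes() {
        return metaspacePool()
                .map(MemoryPoolMXBean::getUsage)
                .map(MemoryUsage::getUsed)
                .orElse(-1L);
    }

    /** Fraction of metaspace released between two samples (0.0 if nothing to compare). */
    static double reclaimedFraction(long usedBefore, long usedAfter) {
        if (usedBefore <= 0) {
            return 0.0;
        }
        return (double) (usedBefore - usedAfter) / usedBefore;
    }

    public static void main(String[] args) {
        long before = usedBytes();
        System.gc(); // only a request; the JVM is free to ignore it
        long after = usedBytes();
        System.out.printf("metaspace used: before=%d bytes, after=%d bytes, reclaimed=%.0f%%%n",
                before, after, 100 * reclaimedFraction(before, after));
    }
}
```

With the numbers from the report (1500 MB before, 300 MB after), reclaimedFraction gives 0.8. A value near 0 at the point of failure would be consistent with either leaking loaders or a collection that never ran, so this alone does not separate the two hypotheses; it only shows whether anything was reclaimed.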
@stuartwdouglas stuartwdouglas added the kind/bug Something isn't working label Aug 10, 2021
@quarkus-bot quarkus-bot bot added env/windows Impacts Windows machines triage/needs-triage labels Aug 10, 2021
@jaikiran
Member

CI fails on windows sometimes with OutOfMemoryError: metaspace (e.g. #19279)

The windows failures are very particular, in that they always happen at the exact same test. This test is at the point where the first metaspace collection would normally happen (i.e. when we hit the 1500mb metaspace limit).

I haven't checked in detail, but just wanted to note that at least that linked run is using a metaspace max limit of 1g and not 1500mb (it's not that big a difference though). From the logs of that run:

MAVEN_OPTS: -Xmx1500m -XX:MaxMetaspaceSize=1g
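
To put the two limits side by side, here is a small sketch that pulls -XX:MaxMetaspaceSize out of an options string and converts it to bytes. It is a simplified parser that only handles the k/m/g suffixes (treated as binary multiples) and is not how the JVM itself parses flags; JvmSizeFlag and its methods are hypothetical names.

```java
import java.util.Locale;
import java.util.OptionalLong;

// Hypothetical helper for comparing size flags; not part of any real tool.
class JvmSizeFlag {

    /** Converts values such as "1g" or "1500m" to bytes (k/m/g suffixes only). */
    static long parseSize(String value) {
        String v = value.trim().toLowerCase(Locale.ROOT);
        if (v.isEmpty()) {
            throw new IllegalArgumentException("empty size");
        }
        long multiplier;
        switch (v.charAt(v.length() - 1)) {
            case 'k': multiplier = 1024L; break;
            case 'm': multiplier = 1024L * 1024; break;
            case 'g': multiplier = 1024L * 1024 * 1024; break;
            default: return Long.parseLong(v); // no suffix: plain bytes
        }
        return Long.parseLong(v.substring(0, v.length() - 1)) * multiplier;
    }

    /** Returns the last -XX:MaxMetaspaceSize value found in the options string, if any. */
    static OptionalLong maxMetaspaceBytes(String opts) {
        String prefix = "-XX:MaxMetaspaceSize=";
        OptionalLong result = OptionalLong.empty();
        for (String token : opts.trim().split("\\s+")) {
            if (token.startsWith(prefix)) {
                result = OptionalLong.of(parseSize(token.substring(prefix.length())));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String opts = "-Xmx1500m -XX:MaxMetaspaceSize=1g"; // value quoted from the CI log above
        long limit = maxMetaspaceBytes(opts).orElse(-1);
        System.out.println(limit / (1024 * 1024) + " MiB");
    }
}
```

Under these assumptions 1g works out to 1024 MiB, so the linked run would hit its limit roughly a third earlier than a 1500m setting would.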

@jaikiran
Member

jaikiran commented Aug 11, 2021

Perhaps until this issue is resolved, we could run at least the Windows JDK 11 job with GC logging enabled, as follows?

-Xlog:gc*=debug:stdout 

I guess this option will have to be appended to the current MAVEN_OPTS of that job.
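
If changing the job's flags turns out to be awkward, a much coarser in-process check might still show whether any collection ran before the failure. This is a rough sketch using the standard GarbageCollectorMXBean API; it is far less detailed than the -Xlog output suggested above, and GcCounters is a hypothetical name.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; a coarse complement to GC logging, not a replacement for it.
class GcCounters {

    /** Snapshot of collection counts per collector; -1 means the count is undefined. */
    static Map<String, Long> collectionCounts() {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            counts.put(gc.getName(), gc.getCollectionCount());
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println("before: " + collectionCounts());
        System.gc(); // only a request; the JVM is free to ignore it
        System.out.println("after:  " + collectionCounts());
    }
}
```

Collector names depend on the GC in use, so the keys in the map will differ between JVM configurations.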

@geoand
Contributor

geoand commented Jan 24, 2022

Closing this as Windows CI has been remarkably stable lately

@geoand geoand closed this as completed Jan 24, 2022
@geoand geoand added the triage/out-of-date This issue/PR is no longer valid or relevant label Jan 24, 2022