Fails to start on Windows Subsystem for Linux (WSL) #270
Comments
Thanks for opening an issue. I tried it and it did start for me, but it takes a while (30 sec). It seems omrvmem_reserve_memory() is searching for 256MB of memory one page at a time, which takes a long time. |
Yes, I can confirm that some patience does help, just not used to
|
I'm guessing the problem is in findAvailableMemoryBlockNoMalloc() |
@mpirvu The problem allocation is coming from the JIT. First it uses a startAddress, which fails but takes a long time, and then it retries without a specified startAddress. It should work better if the VMEM_ALLOC_QUICK flag is specified. Also it seems the port library can be improved for this case. The allocated address is before the requested startAddress, and the search direction is forward. The port library continuously increments the startAddress until it reaches the endAddress, which takes some time, always getting the same allocated memory which is before the requested startAddress. Opened eclipse-omr/omr#1800. |
I agree in both respects: |
I'm also wondering where the startAddress comes from. I wasn't able to find it in the code. Or why a start address is used at all. It would also work if the requested startAddress was lower in memory such that the memory allocation was in range. |
The start address is determined by |
For the record, in one particular run the startAddress was 00007F66EFA00000 and the OS returned memory at 00007F66CF0D0000. The port library cycled through addresses in 4K increments to 00007F66F1200000 before failing. |
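To make the cost of that run concrete, here is a minimal sketch in C that replays it against a stand-in for the OS. All names (`reserve_fn`, `os_always_same`, `forward_search`) are hypothetical and written for illustration; they are not OMR port library functions. The only values taken from the thread are the three addresses and the 4K step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the OS call: takes a hint, returns the address granted. */
typedef uint64_t (*reserve_fn)(uint64_t hint);

/* Models the behavior seen in this run: the hint is ignored and the same
 * block, below the requested range, is offered every time. */
uint64_t os_always_same(uint64_t hint)
{
    (void)hint;
    return 0x00007F66CF0D0000ULL;
}

/* Forward search in page-sized steps over [start, end).
 * Returns the granted address if it lands in range, or 0 once the range is
 * exhausted. *attempts receives the number of reservation calls made. */
uint64_t forward_search(reserve_fn reserve, uint64_t start, uint64_t end,
                        uint64_t page, unsigned long *attempts)
{
    assert(reserve != NULL && attempts != NULL);
    assert(page > 0 && start < end);
    *attempts = 0;
    for (uint64_t hint = start; hint < end; hint += page) {
        uint64_t got = reserve(hint);
        ++*attempts;
        if (got >= start && got < end) {
            return got; /* in range: keep it */
        }
        /* Out of range: a real implementation would release the block here. */
    }
    return 0;
}
```

Calling `forward_search(os_always_same, 0x00007F66EFA00000ULL, 0x00007F66F1200000ULL, 4096, &attempts)` makes 6144 reservation calls (24 MiB / 4 KiB) and still returns 0, which is consistent with the slow failure described above if every call goes to the OS.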
@mpirvu Hi Marius, could you please point me to the code that sets the VMEM_ALLOC_QUICK flag? |
@harryyu1994 The allocation of the code cache repository happens in this function:
There is a local variable
Then let's verify that this is enough to fix this issue and that there is no startup regression on native Linux platforms due to parsing the smaps file. |
I evaluated a JIT dll compiled by Harry that uses VMEM_ALLOC_QUICK, but there is no improvement. The time to print the version is the same: ~35 seconds.
I will do some tracing. |
I have run with -Xtrace:print="omrport.499-509" and got the following trace:
So what happens is:
Note that the difference between what we want (00007F1965800000) and what mmap returns (00007F1963C80000) is 28,835,840 bytes (~28MB). This should probably be good for us as the code cache is still close to the JIT dll. |
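The gap quoted above can be checked with a few lines of C. `address_gap` is a hypothetical helper written for this illustration, not part of the port library.

```c
#include <stdint.h>

/* Absolute distance in bytes between two addresses. */
uint64_t address_gap(uint64_t a, uint64_t b)
{
    return (a > b) ? (a - b) : (b - a);
}
```

`address_gap(0x00007F1965800000ULL, 0x00007F1963C80000ULL)` evaluates to 28835840 (0x1B80000), which is 27.5 MiB.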
Using a wider range for startAddress may solve this issue, though if mmap really cannot find anything suitable we are going to waste even more time. |
Increasing the search range from 28MB to 64MB reduces the execution time for java -version to 0.375 sec:
mpirvu@LAPTOP-3BI10H90:~/sdks$ time pxa6480sr6-20171027_01/jre/bin/java -XXjitdirectory=VMEM_QUICK_ALLOC -version
real 0m0.375s |
I evaluated the proposed solution on native Linux using Liberty startup as a benchmark. All in all there is a small regression, most likely from opening the smaps file and parsing it. I will think of something else. |
I have instrumented getMemoryInRangeForDefaultPages further, and it seems that WSL either ignores addr or is unable to allocate pages anywhere near the suggested address:
I think we should modify the code so that, if multiple calls to mmap return the same address, it bails out of the loop. |
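A minimal sketch of that bail-out in C, assuming a hypothetical `reserve_fn` callback in place of the real mmap call. None of these names come from the port library, and the two fake allocators exist only to exercise the loop. The loop remembers the previous out-of-range result and stops as soon as the same address comes back again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the OS call: takes a hint, returns the address granted. */
typedef uint64_t (*reserve_fn)(uint64_t hint);

/* Fake allocator that ignores the hint, as observed on WSL in this thread. */
uint64_t os_ignores_hint(uint64_t hint)
{
    (void)hint;
    return 0x00007F1963C80000ULL;
}

/* Fake allocator that grants exactly the address asked for. */
uint64_t os_honors_hint(uint64_t hint)
{
    return hint;
}

/* Page-by-page search over [start, end) that gives up once the allocator
 * returns the same out-of-range address twice in a row. Returns the granted
 * address, or 0 on failure. */
uint64_t search_with_bailout(reserve_fn reserve, uint64_t start, uint64_t end,
                             uint64_t page, unsigned long *attempts)
{
    assert(reserve != NULL && attempts != NULL);
    assert(page > 0 && start < end);
    uint64_t previous = 0; /* 0 means "no earlier result" */
    *attempts = 0;
    for (uint64_t hint = start; hint < end; hint += page) {
        uint64_t got = reserve(hint);
        ++*attempts;
        if (got >= start && got < end) {
            return got;
        }
        if (got == previous) {
            break; /* repeated answer: further hints are unlikely to help */
        }
        previous = got;
    }
    return 0;
}
```

With `os_ignores_hint` the search stops after 2 calls instead of walking the whole range; with `os_honors_hint` it succeeds on the first call.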
My proposal for a fix is the following:
The proposal above is based on the observation that when mmap is asked to allocate memory at a given address it may return an address that is close enough for our purposes. |
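The observation can be tried directly with POSIX mmap: without MAP_FIXED the address argument is only a hint, so the kernel may return a different address. The sketch below is an illustration, not the port library code; `reserve_near` and `reserve_close_enough` are hypothetical names, and reserving with PROT_NONE is an assumption made here to keep the example side-effect free.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Reserve `size` bytes near `hint` without MAP_FIXED, so the kernel is free
 * to pick another address. On success stores the absolute distance between
 * hint and result in *gap and returns the mapping; returns NULL on failure. */
void *reserve_near(void *hint, size_t size, uint64_t *gap)
{
    assert(size > 0 && gap != NULL);
    void *got = mmap(hint, size, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (got == MAP_FAILED) {
        return NULL;
    }
    uintptr_t h = (uintptr_t)hint;
    uintptr_t g = (uintptr_t)got;
    *gap = (g > h) ? (uint64_t)(g - h) : (uint64_t)(h - g);
    return got;
}

/* Accept the mapping only if it is within `limit` bytes of the hint;
 * otherwise release it and report failure. */
void *reserve_close_enough(void *hint, size_t size, uint64_t limit)
{
    uint64_t gap = 0;
    void *got = reserve_near(hint, size, &gap);
    if (got != NULL && gap > limit) {
        munmap(got, size);
        return NULL;
    }
    return got;
}
```

A caller would pass the preferred address as `hint` and the largest acceptable distance as `limit`; a single call then either yields usable memory or fails fast, with no page-by-page retry.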
I evaluated the prototype written by Harry.
I will also verify the behavior when large pages are enabled. |
With large pages turned on, on the same HW setup as above, the prototype improves startup time by 1.2%. I can't put my finger on what could cause an improvement in this case, but I'll take it.
|
On the OMR side: add a new Virtual Memory Option called OMRPORT_VMEM_ADDRESS_HINT
Change the method getMemoryInRangeForDefaultPages() to do the following:
- when OMRPORT_VMEM_ADDRESS_HINT is used, instead of trying page by page to allocate in the desired region, we stop after the first attempt and return whatever default_pageSize_reserve_memory() gives us
- when doing OMRPORT_VMEM_ALLOC_QUICK, do not try the slow search with mmap if the fast search with smaps failed
- when doing OMRPORT_VMEM_ALLOC_QUICK, avoid doing the range check when OMRPORT_VMEM_STRICT_ADDRESS is not set
On the OpenJ9 side: change the JIT to do the following during code cache allocation
- try to allocate memory providing a desired address and setting OMRPORT_VMEM_ADDRESS_HINT
- if the address returned by the OS is not within (2GB - 24MB) of the JIT dll, then try again by setting OMRPORT_VMEM_ALLOC_QUICK and a much larger address range (the full 2GB - 24MB from JIT dll address range), but not setting OMRPORT_VMEM_STRICT_ADDRESS, and accept whatever we get back
- refactor redundant code into functions and change the order of operation inside J9::CodeCacheManager::allocateCodeCacheSegment to make the code cleaner
The OpenJ9 side change is dependent on the OMR side change, therefore pull in the OMR side change first.
Closes: eclipse-openj9#270
Signed-off-by: Harry Yu <[email protected]>
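The two-step allocation on the OpenJ9 side can be sketched in C as follows. This is an illustration under stated assumptions, not the actual JIT code: `reserve_fn`, `allocate_code_cache`, the `SKETCH_*` flags and the fake allocators are hypothetical names that only mirror the options named in the commit message, and the sketch assumes the fallback range extends on both sides of the JIT dll.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags mirroring the two options named in the commit message. */
enum { SKETCH_ADDRESS_HINT = 1, SKETCH_ALLOC_QUICK = 2 };

/* 2GB - 24MB, the distance from the JIT dll quoted in the commit message. */
#define REACH ((2ULL << 30) - (24ULL << 20))

/* Hypothetical allocator: reserves memory in [start, end], returns 0 on failure. */
typedef uint64_t (*reserve_fn)(uint64_t start, uint64_t end, int flags);

/* Fake allocator that always lands close to the JIT dll. */
uint64_t os_near(uint64_t start, uint64_t end, int flags)
{
    (void)start; (void)end; (void)flags;
    return 0x00007F1963C80000ULL;
}

/* Fake allocator: far away for a hint, start of the range for a quick search. */
uint64_t os_far_then_quick(uint64_t start, uint64_t end, int flags)
{
    (void)end;
    return (flags & SKETCH_ALLOC_QUICK) ? start : 0x0000100000000000ULL;
}

static int within_reach(uint64_t addr, uint64_t jit)
{
    uint64_t gap = (addr > jit) ? (addr - jit) : (jit - addr);
    return addr != 0 && gap <= REACH;
}

uint64_t allocate_code_cache(reserve_fn reserve, uint64_t jit,
                             uint64_t preferred, int *used_fallback)
{
    assert(reserve != NULL && used_fallback != NULL);
    *used_fallback = 0;
    /* Step 1: one attempt at the preferred address, treated as a hint. */
    uint64_t got = reserve(preferred, preferred, SKETCH_ADDRESS_HINT);
    if (within_reach(got, jit)) {
        return got;
    }
    /* A real implementation would release the out-of-reach block here. */
    /* Step 2: quick, non-strict search over the whole reachable range;
     * whatever comes back is accepted. */
    *used_fallback = 1;
    uint64_t lo = (jit > REACH) ? (jit - REACH) : 0;
    return reserve(lo, jit + REACH, SKETCH_ALLOC_QUICK);
}
```

With `os_near` the first step succeeds and the fallback is never taken; with `os_far_then_quick` the first result is out of reach, so the second step runs and its result is returned as-is.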
The AdoptOpenJDK builds of OpenJ9+OpenJDK fail to start on WSL; they just hang during startup.
IBM JVM 8 SR5 also fails to start.
For comparison, IBM 8 SR4 FP11 starts up fine, as do Oracle 9 and Zulu 9; see -version output below (all Linux x86_64 binaries):
I also reported this a few weeks back on the WSL GitHub, but only tested with IBM 8 SR5 today, so I figured I had better report it here too.
The WSL report has strace output attached to it for trying to start up OpenJ9.
microsoft/WSL#2498