Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows client/server improvements #262

Merged
merged 1 commit into from
Mar 29, 2018
Merged

Windows client/server improvements #262

merged 1 commit into from
Mar 29, 2018

Conversation

robby-phd
Copy link
Contributor

@robby-phd robby-phd commented Mar 27, 2018

This PR attempts to fix a couple of issues with Windows client/server:

  1. Failure to connect to server manifested as IOException (first commit)

    This is addressed by re-attempting the connection and hard shutdown using
    System.exit after several attempts (currently, the client exits with code 1
    without printing any message).

  2. The more pressing issue is the unpredictable performance under Java 9 when starting mill client/server for the first time (second commit)

    Unfortunately, mill client/server under Windows with Java 9 (or above) runs unpredictably
    when run for the first time. This is because exporting rt.jar on the server side can somehow
    be non-deterministically slow after displaying:

    Compiling (synthetic)/ammonite/predef/interpBridge.sc
    

    This can take a minute (or more, or fast), but then it will eventually make progress.

    Below is the jstack dump as a reference whenever it seems to "hang" before making progress
    Of interest here is Thread-1 where it can spend a long time exporting rt.jar (I sampled this several times during such a hang and they all looked similar, i.e., Thread-1 is in the process of exporting rt.jar).

    One workaround is to ask user to run mill interactively first to trigger rt.jar caching
    (I find this rather unacceptable for users).

    On the other hand, triggering the rt.jar caching early when mill first starts seems to
    avoid this issue. This PR does that for interactive/non-interactive modes
    (needed only for client/server on Windows using Java 9+ but done for all for UX consistency).

    Unfortunately, the changes are somewhat not ideal because we want to keep mill
    client Java only (to avoid slowdown caused by class loading Scala); the main issue
    here is that eagerly caching rt.jar requires it to know mill's home, which implies
    parsing CLI arguments early on.

2018-03-26 21:10:26
Full thread dump OpenJDK 64-Bit Server VM (10+46 mixed mode):

Threads class SMR info:
_java_thread_list=0x000001ae7f9f97d0, length=13, elements={
0x000001ae77f4c800, 0x000001ae7ee00000, 0x000001ae7ee03000, 0x000001ae7ee08800,
0x000001ae7edfb800, 0x000001ae7edfc800, 0x000001ae7ee4e800, 0x000001ae7ee5d800,
0x000001ae7ee71800, 0x000001ae7f01b800, 0x000001ae7f022800, 0x000001ae7f9f7000,
0x000001ae7f9fa000
}

"main" #1 prio=5 os_prio=0 tid=0x000001ae77f4c800 nid=0x1a58 waiting on condition  [0x00000006afbfe000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(java.base@10/Native Method)
        at mill.clientserver.Server.handleRun(Server.scala:112)
        at mill.clientserver.Server.$anonfun$run$2(Server.scala:59)
        at mill.clientserver.Server$$Lambda$3/760563749.apply$mcV$sp(Unknown Source)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
        at mill.clientserver.Server$.lockBlock(Server.scala:134)
        at mill.clientserver.Server.$anonfun$run$1(Server.scala:39)
        at mill.clientserver.Server$$Lambda$2/2017354584.apply$mcV$sp(Unknown Source)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
        at mill.clientserver.Server$.tryLockBlock(Server.scala:141)
        at mill.clientserver.Server.run(Server.scala:36)
        at mill.clientserver.ServerMain.main(Server.scala:15)
        at mill.clientserver.ServerMain.main$(Server.scala:9)
        at mill.ServerMain$.main(Main.scala:12)
        at mill.ServerMain.main(Main.scala)

   Locked ownable synchronizers:
        - None

"Reference Handler" #2 daemon prio=10 os_prio=2 tid=0x000001ae7ee00000 nid=0x7a0 waiting on condition  [0x00000006b08ff000]
   java.lang.Thread.State: RUNNABLE
        at java.lang.ref.Reference.waitForReferencePendingList(java.base@10/Native Method)
        at java.lang.ref.Reference.processPendingReferences(java.base@10/Reference.java:174)
        at java.lang.ref.Reference.access$000(java.base@10/Reference.java:44)
        at java.lang.ref.Reference$ReferenceHandler.run(java.base@10/Reference.java:138)

   Locked ownable synchronizers:
        - None

"Finalizer" #3 daemon prio=8 os_prio=1 tid=0x000001ae7ee03000 nid=0x18fc in Object.wait()  [0x00000006b09ff000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(java.base@10/Native Method)
        - waiting on <0x00000006c320ed38> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(java.base@10/ReferenceQueue.java:151)
        - waiting to re-lock in wait() <0x00000006c320ed38> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(java.base@10/ReferenceQueue.java:172)
        at java.lang.ref.Finalizer$FinalizerThread.run(java.base@10/Finalizer.java:216)

   Locked ownable synchronizers:
        - None

"Signal Dispatcher" #4 daemon prio=9 os_prio=2 tid=0x000001ae7ee08800 nid=0x152c runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"Attach Listener" #5 daemon prio=5 os_prio=2 tid=0x000001ae7edfb800 nid=0xb2c waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"C2 CompilerThread0" #6 daemon prio=9 os_prio=2 tid=0x000001ae7edfc800 nid=0x1630 waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   No compile task

   Locked ownable synchronizers:
        - None

"C2 CompilerThread1" #7 daemon prio=9 os_prio=2 tid=0x000001ae7ee4e800 nid=0x1ff0 waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   No compile task

   Locked ownable synchronizers:
        - None

"C1 CompilerThread2" #8 daemon prio=9 os_prio=2 tid=0x000001ae7ee5d800 nid=0x1a40 waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   No compile task

   Locked ownable synchronizers:
        - None

"Sweeper thread" #9 daemon prio=9 os_prio=2 tid=0x000001ae7ee71800 nid=0x169c runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"Service Thread" #10 daemon prio=9 os_prio=0 tid=0x000001ae7f01b800 nid=0xec4 runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"Common-Cleaner" #11 daemon prio=8 os_prio=1 tid=0x000001ae7f022800 nid=0x694 in Object.wait()  [0x00000006b12ff000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(java.base@10/Native Method)
        - waiting on <0x00000006c320f2c8> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(java.base@10/ReferenceQueue.java:151)
        - waiting to re-lock in wait() <0x00000006c320f2c8> (a java.lang.ref.ReferenceQueue$Lock)
        at jdk.internal.ref.CleanerImpl.run(java.base@10/CleanerImpl.java:148)
        at java.lang.Thread.run(java.base@10/Thread.java:844)
        at jdk.internal.misc.InnocuousThread.run(java.base@10/InnocuousThread.java:134)

   Locked ownable synchronizers:
        - None

"Thread-0" #12 prio=5 os_prio=0 tid=0x000001ae7f9f7000 nid=0x1484 waiting on condition  [0x00000006b13ff000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(java.base@10/Native Method)
        at mill.clientserver.Server$.$anonfun$interruptWith$1(Server.scala:150)
        at mill.clientserver.Server$$$Lambda$22/398110318.run(Unknown Source)
        at java.lang.Thread.run(java.base@10/Thread.java:844)

   Locked ownable synchronizers:
        - None

"Thread-1" #13 prio=5 os_prio=0 tid=0x000001ae7f9fa000 nid=0x8b8 runnable  [0x00000006b14fb000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.fs.WindowsNativeDispatcher.MoveFileEx0(java.base@10/Native Method)
        at sun.nio.fs.WindowsNativeDispatcher.MoveFileEx(java.base@10/WindowsNativeDispatcher.java:327)
        at sun.nio.fs.WindowsFileCopy.move(java.base@10/WindowsFileCopy.java:380)
        at sun.nio.fs.WindowsFileSystemProvider.move(java.base@10/WindowsFileSystemProvider.java:288)
        at java.nio.file.Files.move(java.base@10/Files.java:1413)
        at jdk.nio.zipfs.ZipFileSystem.sync(jdk.zipfs@10/ZipFileSystem.java:1318)
        at jdk.nio.zipfs.ZipFileSystem.lambda$close$2(jdk.zipfs@10/ZipFileSystem.java:269)
        at jdk.nio.zipfs.ZipFileSystem$$Lambda$633/1606098202.run(jdk.zipfs@10/Unknown Source)
        at java.security.AccessController.doPrivileged(java.base@10/Native Method)
        at jdk.nio.zipfs.ZipFileSystem.close(jdk.zipfs@10/ZipFileSystem.java:268)
        at io.github.retronym.java9rtexport.Export.export(Export.java:64)
        - locked <0x00000006c3211108> (a java.lang.Object)
        at ammonite.runtime.Classpath$.classpath(Classpath.scala:72)
        at ammonite.interp.CompilerLifecycleManager.init(CompilerLifecycleManager.scala:79)
        - locked <0x00000006c3211380> (a ammonite.interp.CompilerLifecycleManager)
        at ammonite.interp.CompilerLifecycleManager.preprocess(CompilerLifecycleManager.scala:51)
        - locked <0x00000006c3211380> (a ammonite.interp.CompilerLifecycleManager)
        at ammonite.interp.Interpreter.compileRunBlock$1(Interpreter.scala:486)
        at ammonite.interp.Interpreter.$anonfun$processAllScriptBlocks$15(Interpreter.scala:546)
        at ammonite.interp.Interpreter$$Lambda$554/246820480.apply(Unknown Source)
        at ammonite.util.Res$Success.flatMap(Res.scala:61)
        at ammonite.interp.Interpreter.$anonfun$processAllScriptBlocks$14(Interpreter.scala:543)
        at ammonite.interp.Interpreter$$Lambda$548/1746912130.apply(Unknown Source)
        at ammonite.util.Res$Success.flatMap(Res.scala:61)
        at ammonite.interp.Interpreter.$anonfun$processAllScriptBlocks$12(Interpreter.scala:540)
        at ammonite.interp.Interpreter$$Lambda$538/227274813.apply(Unknown Source)
        at scala.Option.getOrElse(Option.scala:121)
        at ammonite.interp.Interpreter.loop$1(Interpreter.scala:540)
        at ammonite.interp.Interpreter.processAllScriptBlocks(Interpreter.scala:578)
        - locked <0x00000006c3212fe8> (a ammonite.interp.Interpreter)
        at ammonite.interp.Interpreter.$anonfun$processModule$6(Interpreter.scala:374)
        at ammonite.interp.Interpreter$$Lambda$530/1175302998.apply(Unknown Source)
        at ammonite.util.Catching.flatMap(Res.scala:114)
        at ammonite.interp.Interpreter.$anonfun$processModule$5(Interpreter.scala:365)
        at ammonite.interp.Interpreter$$Lambda$529/1180205918.apply(Unknown Source)
        at ammonite.util.Res$Success.flatMap(Res.scala:61)
        at ammonite.interp.Interpreter.processModule(Interpreter.scala:357)
        - locked <0x00000006c3212fe8> (a ammonite.interp.Interpreter)
        at ammonite.interp.Interpreter.$anonfun$initializePredef$1(Interpreter.scala:110)
        at ammonite.interp.Interpreter.$anonfun$initializePredef$1$adapted(Interpreter.scala:110)
        at ammonite.interp.Interpreter$$Lambda$151/2138144710.apply(Unknown Source)
        at ammonite.interp.PredefInitialization$.$anonfun$apply$7(PredefInitialization.scala:65)
        at ammonite.interp.PredefInitialization$$$Lambda$166/776148680.apply(Unknown Source)
        at ammonite.util.Res$.$anonfun$fold$1(Res.scala:31)
        at ammonite.util.Res$$$Lambda$167/324072930.apply(Unknown Source)
        at scala.collection.TraversableOnce.$anonfun$foldLeft$1(TraversableOnce.scala:157)
        at scala.collection.TraversableOnce.$anonfun$foldLeft$1$adapted(TraversableOnce.scala:157)
        at scala.collection.TraversableOnce$$Lambda$168/1119601226.apply(Unknown Source)
        at scala.collection.Iterator.foreach(Iterator.scala:929)
        at scala.collection.Iterator.foreach$(Iterator.scala:929)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1417)
        at scala.collection.IterableLike.foreach(IterableLike.scala:71)
        at scala.collection.IterableLike.foreach$(IterableLike.scala:70)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
        at scala.collection.TraversableOnce.foldLeft(TraversableOnce.scala:157)
        at scala.collection.TraversableOnce.foldLeft$(TraversableOnce.scala:155)
        at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:104)
        at ammonite.util.Res$.fold(Res.scala:29)
        at ammonite.interp.PredefInitialization$.apply(PredefInitialization.scala:53)
        at ammonite.interp.Interpreter.initializePredef(Interpreter.scala:112)
        at ammonite.Main.$anonfun$instantiateInterpreter$1(Main.scala:161)
        at ammonite.Main$$Lambda$93/59396526.apply(Unknown Source)
        at scala.util.Either$RightProjection.flatMap(Either.scala:702)
        at ammonite.Main.instantiateInterpreter(Main.scala:129)
        at mill.main.MainRunner.$anonfun$runScript$2(MainRunner.scala:68)
        at mill.main.MainRunner$$Lambda$89/123978973.apply(Unknown Source)
        at mill.main.RunScript$.runScript(RunScript.scala:38)
        at mill.main.MainRunner.$anonfun$runScript$1(MainRunner.scala:77)
        at mill.main.MainRunner$$Lambda$88/1705323540.apply(Unknown Source)
        at mill.main.MainRunner.watchLoop2(MainRunner.scala:48)
        at mill.main.MainRunner.runScript(MainRunner.scala:63)
        at mill.Main$.main0(Main.scala:126)
        at mill.ServerMain$.main0(Main.scala:24)
        at mill.clientserver.Server.$anonfun$handleRun$1(Server.scala:92)
        at mill.clientserver.Server$$Lambda$23/733672688.run(Unknown Source)
        at java.lang.Thread.run(java.base@10/Thread.java:844)

   Locked ownable synchronizers:
        - <0x00000006c35eea00> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)

"VM Thread" os_prio=2 tid=0x000001ae7edec000 nid=0xbf4 runnable

"GC Thread#0" os_prio=2 tid=0x000001ae77f63800 nid=0x1270 runnable

"GC Thread#1" os_prio=2 tid=0x000001ae77f66000 nid=0x830 runnable

"GC Thread#2" os_prio=2 tid=0x000001ae77f68800 nid=0x13b8 runnable

"GC Thread#3" os_prio=2 tid=0x000001ae77f6a000 nid=0x1fb8 runnable

"G1 Main Marker" os_prio=2 tid=0x000001ae7de26000 nid=0x95c runnable

"G1 Conc#0" os_prio=2 tid=0x000001ae7de29000 nid=0x15fc runnable

"G1 Refine#0" os_prio=2 tid=0x000001ae7df28000 nid=0x185c runnable

"G1 Refine#1" os_prio=2 tid=0x000001ae7df29000 nid=0x990 runnable

"G1 Refine#2" os_prio=2 tid=0x000001ae7df2a000 nid=0x840 runnable

"G1 Refine#3" os_prio=2 tid=0x000001ae7df2d000 nid=0x14bc runnable

"G1 Young RemSet Sampling" os_prio=2 tid=0x000001ae7df30800 nid=0x1b78 runnable
"VM Periodic Task Thread" os_prio=2 tid=0x000001ae7f01d800 nid=0x1dc4 waiting on condition

JNI global references: 61

@lihaoyi
Copy link
Member

lihaoyi commented Mar 28, 2018

@robby-phd if the problem is just that it hangs for a long time confusingly, would an acceptable workaround be to log something to the console when initializing rt.jar for the first time? That seems like an easy "fix" and is something we already do when e.g. we spend time initializing the Zinc compiler interface (which takes 10s of seconds, only once).

I'm not very comfortable with the ad-hoc parsing code we have here; it means that any user code which has the -h flag ends up messing with Mill's rt.jar placement, even if -h is handled by user code (to the right of the Mill command) and not by Mill (to the Left of the Mill command).

@robby-phd
Copy link
Contributor Author

@lihaoyi Ah yes... I forgot that mill supports passing args to user code; that can be taken into account.

On the leaving it hanging, as I mentioned, the time can vary significantly, sometimes fast and sometimes very long (minutes); if you think that's ok for now, I'll limit the PR to just the first problem.

@robby-phd robby-phd force-pushed the eager-rt branch 2 times, most recently from 4df2ae9 to 20a197d Compare March 28, 2018 13:45
@robby-phd robby-phd changed the title Eager Java 9 rt caching for predicable Windows client/server Windows client/server improvements Mar 28, 2018
@lihaoyi
Copy link
Member

lihaoyi commented Mar 28, 2018

@robby-phd yeah I think hanging is ok as long as it is properly messaged to the user so they know what to expect ("currently preparing Java 9 Scala integration jar, this may take a minute or two...")

@robby-phd
Copy link
Contributor Author

robby-phd commented Mar 28, 2018

Hmm.. getting strange errors locally now when I use mill built using mill all __.publishLocal release; somehow it skips compiling the compiler interface (dev.assembly seems fine though):

Preparing Java 9.0.4.1 runtime; this may take a minute or two ...
Compiling (synthetic)/ammonite/predef/interpBridge.sc
Compiling (synthetic)/ammonite/predef/DefaultPredef.sc
Compiling /Users/robby/Temp/mill/shared.sc
Compiling /Users/robby/Temp/mill/upload.sc
Compiling /Users/robby/Temp/mill/build.sc
[273/276] scalajslib.test.runIvyDeps 
11 targets failed
mill.scalalib.ScalaWorkerModule.worker java.lang.NoClassDefFoundError: sbt/internal/inc/CompileFailed
    java.base/java.lang.Class.getDeclaredConstructors0(Native Method)
    java.base/java.lang.Class.privateGetDeclaredConstructors(Class.java:3110)
    java.base/java.lang.Class.getConstructor0(Class.java:3315)
    java.base/java.lang.Class.getConstructor(Class.java:2108)
    mill.scalalib.ScalaWorkerModule.$anonfun$worker$2(ScalaWorkerApi.scala:43)
mill.scalalib.ScalaWorkerModule.worker java.lang.NoClassDefFoundError: sbt/internal/inc/CompileFailed
    java.base/java.lang.Class.getDeclaredConstructors0(Native Method)
    java.base/java.lang.Class.privateGetDeclaredConstructors(Class.java:3110)
    java.base/java.lang.Class.getConstructor0(Class.java:3315)
    java.base/java.lang.Class.getConstructor(Class.java:2108)
    mill.scalalib.ScalaWorkerModule.$anonfun$worker$2(ScalaWorkerApi.scala:43)
mill.scalalib.ScalaWorkerModule.worker java.lang.NoClassDefFoundError: sbt/internal/inc/CompileFailed
    java.base/java.lang.Class.getDeclaredConstructors0(Native Method)
    java.base/java.lang.Class.privateGetDeclaredConstructors(Class.java:3110)
    java.base/java.lang.Class.getConstructor0(Class.java:3315)
    java.base/java.lang.Class.getConstructor(Class.java:2108)
    mill.scalalib.ScalaWorkerModule.$anonfun$worker$2(ScalaWorkerApi.scala:43)
...

@lihaoyi any idea why?

@robby-phd
Copy link
Contributor Author

@lihaoyi Turns out it's an unrelated new bug introduced after 0.1.7-8-b913c6 (that is used in ci); I recreated the issue on travis and appveyor by upgrading the ci test to use 0.1.7-19-680364 (latest, which I was using locally).

Anyway, let me know if the changes in this PR now looks ok to you and I'll merge.

@lihaoyi
Copy link
Member

lihaoyi commented Mar 29, 2018

@robby-phd looks good to me

@robby-phd robby-phd merged commit 41ca6c2 into master Mar 29, 2018
@robby-phd robby-phd deleted the eager-rt branch March 29, 2018 16:32
lihaoyi added a commit that referenced this pull request Apr 2, 2018
shengc pushed a commit to shengc/mill that referenced this pull request Apr 2, 2018
@lefou lefou added this to the 0.2.0 milestone May 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants