Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modifications to classpath by middleware are not visible to user nREPL session #103

Closed
jeffvalk opened this issue Jan 5, 2021 · 50 comments · Fixed by #140
Closed

Modifications to classpath by middleware are not visible to user nREPL session #103

jeffvalk opened this issue Jan 5, 2021 · 50 comments · Fixed by #140
Labels
bug Something isn't working

Comments

@jeffvalk
Copy link
Contributor

jeffvalk commented Jan 5, 2021

Summary

Middleware is loaded in a different classloader context than user interactions with that middleware. The latter shares no modifiable classloader parent with the former, so changes made to the classpath when middleware is loaded are not seen by user interactions.

Impact

The primary example of this is in java.clj, which adds the JDK src.zip to the classpath and sets the jdk-sources variable to the zip file's path if found. In-editor JDK class/method docstrings (and argument names, etc) depends on this feature, as does source navigation (jumping) for JDK classes.

The following issues appear to describe this bug:

Scope

This behavior happens on both JDK8 and 9+. dynapath formerly hacked java.net.URLClassLoader to allow it to be modifiable on JDK8 and below, which would have prevented this issue on JDK8 (albeit in an unsupported way), but recent versions of dynapath removed this hack, so all JDK versions behave the same wrt this issue.

Investigation

Below are the values of jdk-sources, the classpath, and the classloader hierarchy at middleware load time (via print statements in the java middleware), and at runtime in a user REPL session. Note that:

  1. Both contexts reflect that src.zip has been found (jdk-sources is set); however
  2. src.zip is visible on the classpath at load time, but not at runtime in the REPL session, because
  3. there is no shared DynamicClassLoader in their respective classloader hierarchies. The lowest common ancestor of the context classloader for these contexts is an AppClassLoader, which is not modifiable. Hence, any modification to the highest modifiable classloader in the first context is not visible in the second.

At load time:

;; jdk-sources
#object[java.net.URL 0x2199cb1b file:/usr/lib/jvm/java-11-openjdk/lib/src.zip]

;; classpath
(;; ...all identical to below, except this entry is present:
 #object[java.net.URL 0x2199cb1b "file:/usr/lib/jvm/java-11-openjdk/lib/src.zip"])

;; classloaders
(#object[clojure.lang.DynamicClassLoader 0x6d6bc158 "clojure.lang.DynamicClassLoader@6d6bc158"]
 #object[jdk.internal.loader.ClassLoaders$AppClassLoader 0x277050dc "jdk.internal.loader.ClassLoaders$AppClassLoader@277050dc"]
 #object[jdk.internal.loader.ClassLoaders$PlatformClassLoader 0x5825f15a "jdk.internal.loader.ClassLoaders$PlatformClassLoader@5825f15a"])

And at runtime, via the REPL:

;; jdk-sources
#object[java.net.URL 0x2199cb1b file:/usr/lib/jvm/java-11-openjdk/lib/src.zip]

;; classpath
(;; ...all identical to above, but missing the added src.zip classpath entry
)

;; classloaders
(#object[clojure.lang.DynamicClassLoader 0x8424570 "clojure.lang.DynamicClassLoader@8424570"]
 #object[clojure.lang.DynamicClassLoader 0x69433f24 "clojure.lang.DynamicClassLoader@69433f24"]
 #object[clojure.lang.DynamicClassLoader 0x48d3a104 "clojure.lang.DynamicClassLoader@48d3a104"]
 #object[clojure.lang.DynamicClassLoader 0x5367dcfe "clojure.lang.DynamicClassLoader@5367dcfe"]
 #object[jdk.internal.loader.ClassLoaders$AppClassLoader 0x277050dc "jdk.internal.loader.ClassLoaders$AppClassLoader@277050dc"]
 #object[jdk.internal.loader.ClassLoaders$PlatformClassLoader 0x5825f15a "jdk.internal.loader.ClassLoaders$PlatformClassLoader@5825f15a"])

Analysis

This may be as much an nREPL discussion as an orchard one. I suspect it may have to do with *bindings* (including the clojure.lang.Compiler/LOADER) being stored and passed as the :session for nREPL messages -- essentially that these contexts are intentionally separated.

If so, testing for correct behavior requires both contexts. It can't be tested purely within a CIDER/nREPL session (as one would do in development), nor would our current approach to unit tests catch any breakage.

At a high level, there seem to be two ways to make this behave as intended:

  1. Propagate changes to the classpath made at load time into each new nREPL session (likely by pushing the classloader or a descendant classloader into the session); or
  2. Change the logic in orchard such that each nREPL session makes its own changes to the classpath.

@bbatsov What do you think of this?

@jeffvalk
Copy link
Contributor Author

jeffvalk commented Jan 6, 2021

Having thought about it for a short bit, I think option 1 above better achieves the principle of least astonishment. I think any middleware author would expect the user's session to reflect any explicit classpath changes made by the middleware. I know I did. Understanding the current behavior was a brain teaser.

@bbatsov
Copy link
Member

bbatsov commented Jan 6, 2021

Great investigation, @jeffvalk! Thanks for tracking down the root cause of this. I guess this has been broken ever since the update to dynapath 1.0.

I also prefer option 1).

@vemv
Copy link
Member

vemv commented Jan 18, 2021

Hi, I'd like to contribute a question/observation: to what extent is runtime classpath modification a core need of Orchard?

I say this because, like other Clojure developers, I feel a bit wary of fiddling much with classloaders. For example I contributed https://github.com/clojure-emacs/cider-nrepl/pull/668 - a POC showing that the mere presence of CIDER can affect other code.

One can also make the observation that neither Lein or deps.edn officially provide runtime classpath modification. clj-refactor also disabled it at some point.

If the extent of this runtime classpath modification is "provide sources and javadocs", one can consider alternatives. I have completed this Lein plugin which accomplishes so clojure-emacs/cider-nrepl#64 (comment) using traditional means - the classpath is calculated beforehand instead of mutated.

If there are no other use cases, it might be worth considering if Orchard's internals can be simplified. This can have other desirable effects, e.g. newcomers can understand better its execution model (since not everyone has expertise in JDK or classloader intricacies).

@jeffvalk
Copy link
Contributor Author

Personally, I'd say the current use case is something any development environment would want. In-editor documentation and source navigation for JDK classes seem like core requirements to me.

As best I can tell, the alternative to the current approach would be adding the src.zip path manually to your project.clj or deps.edn for every project. If you develop/test against multiple Java versions, you'd have to manage that by hand for each project too.

Regarding using a lein plugin to replace this specific use case:

  • The JDK and its sources are not ordinary project dependencies. Unlike artifacts installed from a maven (or similar) repository, these are effectively user-provided (installed either manually or using an OS package manager). The possible variations in JDK vendor and release is very large. There is no standard way to find and download JDK sources given a java version string, so a plugin would have a difficult time accomplishing this.
  • Leiningen is one of several ways to launch a clojure project. The others would need a solution too.

Regarding avoiding classpath dynamism:

  • Clojure itself depends on dynamic classloading, so regardless, we can't reasonably say we stay away from this. We should definitely not use sneaky hacks, but there's no reason to shy away from using (correctly implemented) dynamism in a dynamic language.
  • The part about sneaky hacks is important here, because it's the root of this issue. Prior to Java 9, sneakily hacking the application classloader was pervasive in Java tooling. Java 9 put an abrupt stop to this: it broke stuff left and right. 😆 People turned broken things off. And here we are cleaning up.

@vemv
Copy link
Member

vemv commented Jan 18, 2021

Thanks for the reply!

There is no standard way to find and download JDK sources given a java version string, so a plugin would have a difficult time accomplishing this.

A Lein plugin can do arbitrary computation, so the existing logic for finding src.zip could be replicated.

There's the precedent of https://github.com/pallet/lein-jdk-tools which does a similar job (although more basic: it doesn't download any files since often they are already included when installing the JDK).

Leiningen is one of several ways to launch a clojure project. The others would need a solution too.

Yes. The plugin I linked to was written just a couple weeks ago. It's a vanilla Clojure program (it doesn't need :eval-in-leiningen) so it could be trivially ported to deps.edn. I think the 'deps.edn way' is simply feeding the output of a program to https://clojure.org/reference/deps_and_cli#_make_classpath_modifiers .

but there's no reason to shy away from using (correctly implemented) dynamism in a dynamic language.

The language may be dynamic, but the platform can have some constraints...

The part about sneaky hacks is important here, because it's the root of this issue.

Agreed. I'd humbly suggest to make sure we're not repeating errors:

  • is the classpath supposed to be mutable?
  • are runtime-extensible classloaders a clean, working reality in JDK9+?
    • e.g. add-lib work started almost 3 years ago - maybe it's not that feasible?

In constrast, Lein's approach of doing classpath calculations "ahead-of-time" seems very much time-proven, well-aligned with JVM's assumptions and compatible with other plugins like deps.edn.

@bbatsov
Copy link
Member

bbatsov commented Jan 18, 2021

One can also make the observation that neither Lein or deps.edn officially provide runtime classpath modification. clj-refactor also disabled it at some point.

I'll just add here that the only reason why this got disabled is that Java 9 broke it. The users loved this functionality and I think it was quite useful.

The part about sneaky hacks is important here, because it's the root of this issue. Prior to Java 9, sneakily hacking the application classloader was pervasive in Java tooling. Java 9 put an abrupt stop to this: it broke stuff left and right. 😆 People turned broken things off. And here we are cleaning up.

Can't agree more!

In constrast, Lein's approach of doing classpath calculations "ahead-of-time" seems very much time-proven, well-aligned with JVM's assumptions and compatible with other plugins like deps.edn.

I agree this approach works fine the use case with sources, although knowing how often users are asking for such functionality, not to mention this is at the heart of nREPL's sideloading, which obviously can't be achieved otherwise.

@bbatsov
Copy link
Member

bbatsov commented Jan 18, 2021

I'll also add that it seems to me our real issue is that lack of time - we're not really facing any insurmountable issues, but there's a lot of work to go around and no one has much time to dedicate to it. That's why breaking changes often go unaddressed for years (e.g. the ClojureScript completion has also been broken for a while, after some changes were done to shadow-cljs and the maintainer of compliment didn't have time to address those). I myself, try to focus more on the Elisp code, as there are fewer people willing to touch it, compared to those that might fix something occassionallly in cider-nrepl and orchard.

@jeffvalk
Copy link
Contributor Author

  • is the classpath supposed to be mutable?
  • are runtime-extensible classloaders a clean, working reality in JDK9+?

Yes and yes. This is core Java platform functionality. Application servers and all manner of "enterprise" Java depend on this design. And every time we load a clojure file or evaluate a statement at the REPL, we use it.

@jeffvalk
Copy link
Contributor Author

Finally tracked this one down.

When the nREPL session middleware creates the session classloader, it passes the context classloader as the parent here and here. At the time middleware is loaded, the context classloader is the application classloader, which is not modifiable.

For the session to see classpath modifications made when middleware is loaded, two things must be true:

  1. The middleware that modifies the classpath must obtain a reference to a dynamic classloader whenever the context classloader is not modifiable. Orchard already does this (see here), though some improvement may be useful.
  2. The session must have this dynamic classloader in its classloader hierarchy. This can be enabled by changing this:
(clojure.lang.DynamicClassLoader.
  (.getContextClassLoader (Thread/currentThread))) ; <- jdk.internal.loader.ClassLoaders$AppClassLoader

to something like this:

(clojure.lang.DynamicClassLoader.
  @clojure.lang.Compiler/LOADER) ; <- clojure.lang.DynamicClassLoader

when creating the nREPL session. I say "something like" because in practice, I'd suggest we use the highest modifiable parent in the compiler's hierarchy, not the lowest.

@jeffvalk
Copy link
Contributor Author

Before making any changes, it probably makes sense to ask again whether this is the best way to address this.

An alternative would be to make modifications to the classpath within the session. Would this better reflect what a session is designed to do? The tradeoff is that this would create a specific dependency on the session middleware, and the session object would have to be passed as an argument to various orchard functions.

@bbatsov
Copy link
Member

bbatsov commented Jan 20, 2021

An alternative would be to make modifications to the classpath within the session. Would this better reflect what a session is designed to do? The tradeoff is that this would create a specific dependency on the session middleware, and the session object would have to be passed as an argument to various orchard functions.

As Orchard should be usable with other REPLs (e.g. a socket REPL) we can't really exponse any session dependencies in it.
Which functions would need the classloader from the session?

Before making any changes, it probably makes sense to ask again whether this is the best way to address this.

The best approach would be one that keeps this functionality reusable between REPLs. If that's not feasible, I guess we'll have to move this fully to nREPL - e.g. we can expose some op to add resources to the classpath.

@jeffvalk
Copy link
Contributor Author

That certainly makes sense.

I pushed a change that addresses this in Orchard without any regard to what the underlying REPL does. It verifies the classpath every time the added resources might be referenced. If the context classloader doesn't share a modifiable ancestor with the previously seen classloader(s), the classpath is updated. It may be a bit brute force in approach, but I think that's a required tradeoff for REPL portability. Regardless, it's simple and works.

@bbatsov
Copy link
Member

bbatsov commented Jan 20, 2021

The change looks good to me, I was just wondering about 3rd party Java library sources. Do we need to change something for them to work if they are present on the classpath?

An alternative would be to make modifications to the classpath within the session. Would this better reflect what a session is designed to do?

Btw, regarding this alternative approach - do we really need the session itself or just it's classloader? I guess it's fine to add explicit/optional classloader params to functions that might benefit from one.

@jeffvalk
Copy link
Contributor Author

I was just wondering about 3rd party Java library sources. Do we need to change something for them to work if they are present on the classpath?

No, everything else is the same. This commit doesn't affect the static classpath at all, and shouldn't impact any other dynamic classpath modifications (e.g. from other middleware) either.

regarding this alternative approach - do we really need the session itself or just it's classloader?

We'd only need its classloader.

bbatsov added a commit that referenced this issue Jan 20, 2021
bbatsov added a commit to clojure-emacs/cider-nrepl that referenced this issue Jan 21, 2021
bbatsov added a commit to clojure-emacs/cider that referenced this issue Jan 21, 2021
This fixes #2732. It also fixes #2687.

The actual fix is in orchard/cider-nrepl. See
clojure-emacs/orchard#103 for
more details.
@bbatsov
Copy link
Member

bbatsov commented Jan 21, 2021

I've cut new Orchard and cider-nrepl releases including the fix. It will soon trickle to CIDER's master.

@bpringe
Copy link

bpringe commented Jan 21, 2021

Is there a concrete example of something that didn't work before that works now? I'm not sure if I understand fully what this newly allows / fixes. I tried going to definition on (System/getenv) before and after the new version and it doesn't work in either case, but maybe this is not related? Apologies if it's a silly question.

@bbatsov
Copy link
Member

bbatsov commented Jan 21, 2021

That's exactly the type of problem that this is supposed to fix. Basically before the fix we'd add the JDK sources to the classpath dynamically, but they weren't in the classloader hierarchy of the REPL sessions and this results in the JDK sources not being available when needed. You do have the JDK sources installed, right?

@bpringe
Copy link

bpringe commented Jan 21, 2021

Not sure... how would I do that? I've installed OpenJDK. If I run java -version:

openjdk version "1.8.0_275"
OpenJDK Runtime Environment (build 1.8.0_275-b01)
OpenJDK 64-Bit Server VM (build 25.275-b01, mixed mode)

Does installing the JDK not install the source? If that's the case, then I guess that means there's a little bit of extra work for a dev to get these features? I assume most people would just install the JDK.

@bbatsov
Copy link
Member

bbatsov commented Jan 21, 2021

On Linux the sources are a separate package (at least on Debian-like distros), I'm not sure how it is on other OS-es. You should have a file named src.zip somewhere. For me it's in /usr/lib/jvm/openjdk-14/src.zip.

@bbatsov
Copy link
Member

bbatsov commented Jan 25, 2021

@jeffvalk Btw, how did you test that this change works? I just got to playing with this myself in CIDER and navigation to code and docs is still broken. E.g. this is what I get for startsWith:

(-->
  id         "62"
  op         "info"
  session    "f27e4d55-15d0-4176-a18f-a855cbfbb207"
  time-stamp "2021-01-25 11:58:23.932800500"
  ns         #("orchard.misc" 0 12 (fontified t help-echo cider--help-echo cider-locals nil cider-block-dynamic-font-lock t face font-lock-type-face))
  sym        ".startsWith"
)
(<--
  id           "62"
  session      "f27e4d55-15d0-4176-a18f-a855cbfbb207"
  time-stamp   "2021-01-25 11:58:23.934852200"
  arglists-str "[this java.lang.String int]
[this java.lang.String]"
  argtypes     ("java.lang.String" "int")
  class        "java.lang.String"
  javadoc      "https://docs.oracle.com/en/java/javase/14/docs/api/java.base/java/lang/String.html#startsWith(java.lang.String,int)"
  member       "startsWith"
  modifiers    "#{:public}"
  returns      "boolean"
  status       ("done")
  throws       nil
)

@jeffvalk
Copy link
Contributor Author

jeffvalk commented Jan 25, 2021

Prior to e3de281, I'd only tested launching via lein. That commit (in response to this thread) made things work when launching via the clojure executable ...but seems to have stopped it working with lein. Annoying.

I just pushed 0768f8e that works with both. I tested it launching both with lein and clojure, under Java 8 and 11. If you could confirm, that would be excellent.

@bbatsov
Copy link
Member

bbatsov commented Mar 7, 2021

@jeffvalk It seems that a lot of people are running into java.lang.ClassNotFoundException: com.sun.tools.javac.util.List with the latest solution. Someone every other day someone mentions this on Slack as well.

@eamonnsullivan
Copy link

I'm surprised to see this is still an issue -- I cannot work in Cider with Java 8, even though that's a perfectly valid deployment platform. Is there anything I can do to help? I have completely changed laptops since first experiencing this, and reinstalled absolutely everything, and I still can only use Cider with Java 11 (Or, I guess I can pin myself to an earlier version.)

@bbatsov
Copy link
Member

bbatsov commented Apr 8, 2021

I'm sorry about the problems this update has caused. I was hoping that @jeffvalk would find time for this, but I guess he has been busy. I won't be able to dive into this any time soon, but I can revert the changes if no one has better ideas.

@vemv
Copy link
Member

vemv commented Apr 8, 2021

I (ref: earlier in this thread) was long thinking about proposing making all the dynapath-related features opt-out. This way, these features would be still the default targeted experience, but users would have a way to make things more conservative if they prefered so.

Earlier this week I actually tracked down the root cause of clojure-emacs/cider-nrepl#668 to dynapath in orchard. So there are at least two completely different problems caused by this dependency/implementation.

@bbatsov: would you appreciate if I created a proposal / problem exposition in a separate GH issue? I think things are pretty self-evident by now, but I can also do some writeup.

@bbatsov
Copy link
Member

bbatsov commented Apr 8, 2021

@vemv yeah, that'd be great.

I agree it'd be best to remove dynapath from Orchard and potentially move this responsibility to the Orchard clients. For me it makes sense for them to be doing such classpath modifications if necessary.

@vemv
Copy link
Member

vemv commented Apr 11, 2021

@vemv yeah, that'd be great.

Done! #113

vemv added a commit to reducecombine/orchard that referenced this issue Apr 11, 2021
vemv added a commit to reducecombine/orchard that referenced this issue Apr 11, 2021
vemv added a commit to reducecombine/orchard that referenced this issue Apr 11, 2021
vemv added a commit to reducecombine/orchard that referenced this issue Apr 13, 2021
vemv added a commit to reducecombine/orchard that referenced this issue Apr 13, 2021
vemv added a commit to reducecombine/orchard that referenced this issue Apr 13, 2021
bbatsov pushed a commit that referenced this issue Apr 13, 2021
@bbatsov
Copy link
Member

bbatsov commented Aug 30, 2021

Useful resource for people who want to learn more about classloaders and Clojure https://lambdaisland.com/blog/2021-08-25-classpath-is-a-lie

@vemv
Copy link
Member

vemv commented Dec 22, 2021

For people tracking this issue because of errors such as java.lang.ClassNotFoundException: com.sun.tools.javac.util.List: they should be fixed in CIDER 1.2 (which was released today) / Orchard 0.8.0.

For the underlying problem (Modifications to classpath by middleware are not visible to user nREPL session), we are not using mutable classloaders anymore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants