Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Register all polyglot java import classes for refective access #9997

Merged
merged 22 commits into from
May 23, 2024

Conversation

JaroslavTulach
Copy link
Member

@JaroslavTulach JaroslavTulach commented May 18, 2024

Pull Request Description

The goal of this PR is to execute ./runner --run test/Base_Tests ^Text. That requires us to properly register all polyglot java import classes for reflective access. That's done via EnsoLibraryFeature - a code that gets executed during native-image build. The feature finds all libraries with JARs on classpath, parses all their sources, finds all their imported Java classes and reports them via native image API as eligible for reflection.

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • All code follows the
    Scala,
    Java,
  • More tests succeeds: ./runner --run test/Base_Tests ^Text
  • Property to include polyglot import from Test/Base_Tests directory

@JaroslavTulach JaroslavTulach added the CI: No changelog needed Do not require a changelog entry for this PR. label May 18, 2024
@JaroslavTulach JaroslavTulach self-assigned this May 18, 2024
@JaroslavTulach
Copy link
Member Author

Alas, the usage of EnsoParser in the native image feature triggers loading of the parser into the JVM. That causes following failure as the EnsoParser is explicitly listed to be initialized at runtime (it has to, as it loads its .so file dynamically).

Could I load the IRs from .bindings of the library (easily) instead, @hubertp?

ex.printStackTrace();
throw new IllegalStateException(ex);
}
System.err.println("Summary for polyglot import java:");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now the Standard.Base library needs 153 classes registered for reflection. Libs guys, @radeusgd, @AdRiley, @jdunkerley, @GregoryTravis - isn't that a bit too much?

Summary for polyglot import java:
  com.fasterxml.jackson.core.JsonProcessingException
  com.fasterxml.jackson.databind.JsonNode
  com.fasterxml.jackson.databind.ObjectMapper
  com.fasterxml.jackson.databind.node.ArrayNode
  com.fasterxml.jackson.databind.node.BooleanNode
  com.fasterxml.jackson.databind.node.DoubleNode
  com.fasterxml.jackson.databind.node.JsonNodeFactory
  com.fasterxml.jackson.databind.node.JsonNodeType
  com.fasterxml.jackson.databind.node.LongNode
  com.fasterxml.jackson.databind.node.NullNode
  com.fasterxml.jackson.databind.node.ObjectNode
  com.fasterxml.jackson.databind.node.TextNode
  com.ibm.icu.lang.UCharacter
  com.ibm.icu.text.BreakIterator
  com.ibm.icu.text.Normalizer2$Mode
  java.io.ByteArrayInputStream
  java.io.ByteArrayOutputStream
  java.io.File
  java.io.FileNotFoundException
  java.io.IOException
  java.io.InputStream
  java.io.OutputStream
  java.io.StringReader
  java.io.UncheckedIOException
  java.lang.ArithmeticException
  java.lang.ClassCastException
  java.lang.Double
  java.lang.Exception
  java.lang.IllegalArgumentException
  java.lang.IllegalStateException
  java.lang.IndexOutOfBoundsException
  java.lang.Integer
  java.lang.Long
  java.lang.Math
  java.lang.NullPointerException
  java.lang.NumberFormatException
  java.lang.OutOfMemoryError
  java.lang.StringBuilder
  java.lang.System
  java.lang.Throwable
  java.math.BigDecimal
  java.math.MathContext
  java.math.RoundingMode
  java.net.InetSocketAddress
  java.net.ProxySelector
  java.net.URI
  java.net.URISyntaxException
  java.net.URLEncoder
  java.net.http.HttpClient
  java.net.http.HttpRequest
  java.net.http.HttpRequest$BodyPublisher
  java.nio.charset.Charset
  java.nio.charset.UnsupportedCharsetException
  java.nio.file.AccessDeniedException
  java.nio.file.DirectoryNotEmptyException
  java.nio.file.FileAlreadyExistsException
  java.nio.file.FileSystemException
  java.nio.file.FileSystems
  java.nio.file.NoSuchFileException
  java.nio.file.NotDirectoryException
  java.nio.file.Path
  java.nio.file.StandardCopyOption
  java.nio.file.StandardOpenOption
  java.nio.file.attribute.PosixFilePermission
  java.text.DecimalFormat
  java.text.DecimalFormatSymbols
  java.text.NumberFormat
  java.text.ParseException
  java.time.DateTimeException
  java.time.DayOfWeek
  java.time.Duration
  java.time.Instant
  java.time.LocalTime
  java.time.Period
  java.time.ZoneId
  java.time.ZoneOffset
  java.time.ZonedDateTime
  java.time.format.DateTimeFormatter
  java.time.format.DateTimeFormatterBuilder
  java.time.format.SignStyle
  java.time.format.TextStyle
  java.time.temporal.ChronoField
  java.time.temporal.ChronoUnit
  java.time.temporal.IsoFields
  java.time.temporal.TemporalAdjuster
  java.time.temporal.TemporalAdjusters
  java.time.temporal.TemporalUnit
  java.util.Base64
  java.util.Locale
  java.util.Random
  java.util.UUID
  java.util.logging.Logger
  javax.net.ssl.SSLContext
  javax.xml.parsers.DocumentBuilder
  javax.xml.parsers.DocumentBuilderFactory
  javax.xml.xpath.XPathConstants
  javax.xml.xpath.XPathFactory
  org.enso.base.Array_Utils
  org.enso.base.CompareException
  org.enso.base.DryRunFileManager
  org.enso.base.Encoding_Utils
  org.enso.base.Environment_Utils
  org.enso.base.FileLineReader
  org.enso.base.Regex_Utils
  org.enso.base.Text_Utils
  org.enso.base.Time_Utils
  org.enso.base.XML_Utils
  org.enso.base.arrays.LongArrayList
  org.enso.base.encoding.ReportingStreamDecoder
  org.enso.base.encoding.ReportingStreamEncoder
  org.enso.base.enso_cloud.AuthenticationProvider
  org.enso.base.enso_cloud.CacheSettings
  org.enso.base.enso_cloud.CloudAPI
  org.enso.base.enso_cloud.CloudRequestCache
  org.enso.base.enso_cloud.DataLinkSPI
  org.enso.base.enso_cloud.EnsoSecretAccessDenied
  org.enso.base.enso_cloud.EnsoSecretHelper
  org.enso.base.enso_cloud.HideableValue
  org.enso.base.enso_cloud.audit.AuditLog
  org.enso.base.file_format.FileFormatSPI
  org.enso.base.file_system.FileSystemSPI
  org.enso.base.net.URITransformer
  org.enso.base.net.URIWithSecrets
  org.enso.base.net.http.MultipartBodyBuilder
  org.enso.base.net.http.UrlencodedBodyBuilder
  org.enso.base.numeric.ConversionResult
  org.enso.base.numeric.Decimal_Utils
  org.enso.base.polyglot.WrappedDataflowError
  org.enso.base.random.RandomInstanceHolder
  org.enso.base.random.Random_Utils
  org.enso.base.statistics.CorrelationStatistics
  org.enso.base.statistics.FitError
  org.enso.base.statistics.Rank
  org.enso.base.statistics.Regression
  org.enso.base.statistics.Statistic
  org.enso.base.text.Replacer_Cache
  org.enso.base.text.TextFoldingStrategy
  org.enso.base.time.CustomTemporalUnits
  org.enso.base.time.Date_Period_Utils
  org.enso.base.time.EnsoDateTimeFormatter
  org.enso.base.time.FormatterCache
  org.enso.base.time.FormatterCacheKey
  org.enso.base.time.FormatterKind
  org.enso.polyglot.common_utils.Core_Math_Utils
  org.graalvm.collections.Pair
  org.w3c.dom.Document
  org.w3c.dom.Element
  org.w3c.dom.Node
  org.w3c.dom.NodeList
  org.w3c.dom.Text
  org.xml.sax.InputSource
  org.xml.sax.SAXException
  org.xml.sax.SAXParseException
Registered 153 classes for reflection

Copy link
Member Author

@JaroslavTulach JaroslavTulach May 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For every registered class, I have to register all its public:

  • methods
  • constructors
  • fields

That may be too much - as not all these methods are really called by Enso code. As such, it might be better to use following pattern. Rather than calling org.w3c.dom.Element.getTagName() you could wrap the call into:

class XML_Utils {
  public static String getTagName(Element e) {
    return e.getTagName();
  }
}

that way we would concentrate and select only the needed methods and the transitive closure of needed code computed by native-image would be smaller.

If we don't create the wrapper methods then we'll be forced to polyglot java import additional classes like in 6f3bfad as the native image needs to know types of classes we want to invoke instance methods on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't that a bit too much?

Looking at the list you provided, every of these classes seems needed to provide some of the functionality of the standard library. The std-lib provides a broad variety of functionalities so it seems expected it may have some dependencies. 153 does not seem that large to me.

I guess we may want to discuss whether we should split off some of these dependencies away from Base into separate libraries. I guess we could indeed do so, to keep a 'lighter' "core". IF we split off, I suspect most of our base workflows will for the time being import most of the separated parts anyway, as they are expected to be available for the users out of the box. So what could be the benefit of splitting?

I expect @jdunkerley may have some opinions on whether we want to keep a 'big' Base library or make it more modularized.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

153 does not seem that large to me.

We shouldn't be counting just the classes, but rather methods. The problem with current approach is that by using a method rich class like HttpClient we "bring in" all its methods (including for example such a megamorphic methods like equals, hashCode and toString), while we are probably using just a few. The

class XML_Utils {
  public static String getTagName(Element e) {
    return e.getTagName();
  }
}

allows us to select only the needed methods and the transitive closure of needed code computed by native-image would be smaller. Hence I'd like you to use this static method wrapper where possible for Standard libraries to be eligible for native image compilation.

Right now the ./runner has 418MB (with Graal.js and GraalPython compiled in). By carefully selecting what goes into the image we might get to 410MB or maybe even below 400MB.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now the ./runner has 418MB (with Graal.js and GraalPython compiled in). By carefully selecting what goes into the image we might get to 410MB or maybe even below 400MB.

That's like 5%? Seems like a lot of hassle and maintenance for reducing the image by that amount.

Copy link
Member Author

@JaroslavTulach JaroslavTulach May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's like 5%?

Yes, it is not going to be much.

Seems like a lot of hassle and maintenance for reducing the image by that amount.

On contrary, I see that (e.g. exactly specifying what methods are used) as a good coding and architecture practice regardless of the expected gain in the size of native executable. Way more responsible than importing everything and hoping someone makes the result efficient.

If this native compilation of libraries gets to production, we should make sure we squeeze Standard libraries to make them as much effective as possible.

@JaroslavTulach
Copy link
Member Author

JaroslavTulach commented May 19, 2024

With e9a55f0 I can:

enso$ sbt engine-runner/buildNativeImage
enso$ ./runner --run test/Base_Tests/src/Data/Text_Spec.enso
...
136 tests succeeded.
0 tests failed.
1 tests skipped.
0 groups skipped.

Heuréka. Let's execute the tests in the CI: 5cc50de

@JaroslavTulach
Copy link
Member Author

JaroslavTulach commented May 20, 2024

We have a failure in generated tests:

Exception in thread "main" java.lang.UnsatisfiedLinkError: 'long org.enso.syntax2.Parser.getUuidHigh(long, long, long)'
at org.enso.syntax2.Parser.getUuidHigh(Native Method)
at org.enso.syntax2.Message.getUuid(Message.java:65)
at org.enso.syntax2.Tree.deserialize(Tree.java:99)
at org.enso.syntax2.GeneratedFormatTests.main(GeneratedFormatTests.java:532)

I wonder how to run those tests locally, @kazcw? Meanwhile 8ab9362 is made as a blind shot fix.

Btw. I have found java-tests comment, but that is giving me the non-interesting suggestion - e.g. how to run Java part. There is no note about running the Rust part! Update, it is probably:

enso/lib/rust/parser/generate-java$ cargo run --bin java-tests > GeneratedFormatTests.java

@JaroslavTulach JaroslavTulach requested a review from radeusgd May 20, 2024 11:25
@kazcw
Copy link
Contributor

kazcw commented May 20, 2024

We have a failure in generated tests:

Exception in thread "main" java.lang.UnsatisfiedLinkError: 'long org.enso.syntax2.Parser.getUuidHigh(long, long, long)'
at org.enso.syntax2.Parser.getUuidHigh(Native Method)
at org.enso.syntax2.Message.getUuid(Message.java:65)
at org.enso.syntax2.Tree.deserialize(Tree.java:99)
at org.enso.syntax2.GeneratedFormatTests.main(GeneratedFormatTests.java:532)

I wonder how to run those tests locally, @kazcw? Meanwhile 8ab9362 is made as a blind shot fix.

Btw. I have found java-tests comment, but that is giving me the non-interesting suggestion - e.g. how to run Java part. There is no note about running the Rust part! Update, it is probably:

enso/lib/rust/parser/generate-java$ cargo run --bin java-tests > GeneratedFormatTests.java

It sounds like you solved it. You can also use the ./run java-gen commands to generate and run those tests.

Co-authored-by: Radosław Waśko <[email protected]>
Copy link
Member

@Akirathan Akirathan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good generally. I am not entirely satisfied with how the classes inside polyglot directories of libraries/projects are looked up, but I guess you are neither. As I said in one of my review comments, without the dependency on runtime, I cannot think of a more elegant way how to query std library for its polyglot directory content.

@hubertp
Copy link
Collaborator

hubertp commented May 22, 2024

Minor comment - I didn't know about Feature in NI, so even if my review wasn't helpful, I feel I learned a lot when looking at this use-case. Thanks.

@JaroslavTulach JaroslavTulach merged commit 16c1b74 into develop May 23, 2024
36 checks passed
@JaroslavTulach JaroslavTulach deleted the wip/jtulach/EnsoLibraryFeature9774 branch May 23, 2024 06:20
@JaroslavTulach
Copy link
Member Author

I didn't know about Feature in NI, ... I learned a lot when looking at this use-case.

Native image can do various tricks. Here is another thing to explore:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: No changelog needed Do not require a changelog entry for this PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Investigate compilation of Standard.Base library with Native Image
5 participants