Speed Standard.Base initialization in simple Hello World example up! #6100

Closed
JaroslavTulach opened this issue Mar 28, 2023 · 24 comments

Comments

@JaroslavTulach
Member

#6062 introduced a testing infrastructure that allows us to verify the consistency of our IR caches more reliably. Now it is time to use it and deliver some caching improvements. Enable an additional check:

enso$ git diff
diff --git engine/runtime/src/test/java/org/enso/compiler/SerdeCompilerTest.java engine/runtime/src/test/java/org/enso/compiler/SerdeCompilerTest.java
index 5eb17f12d5..afcaa4b906 100644
--- engine/runtime/src/test/java/org/enso/compiler/SerdeCompilerTest.java
+++ engine/runtime/src/test/java/org/enso/compiler/SerdeCompilerTest.java
@@ -31,7 +31,7 @@ public class SerdeCompilerTest {
   @Test
   public void testFibTest() throws Exception {
     var testName = "Fib_Test";
-    final String forbiddenMessage = null; // "Parsing module [local.Fib_Test.Arith].";
+    final String forbiddenMessage = "Parsing module [local.Fib_Test.Arith].";
     parseSerializedModule(testName, forbiddenMessage);
   }

and make sure the .ir file for the Arith module isn't read, by storing the necessary information in caches.
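
For illustration only, here is a minimal sketch of what such a forbidden-message check amounts to. It uses plain java.util.logging rather than the real SerdeCompilerTest infrastructure, and the names ForbiddenMessageSketch, Collecting, capture and wasLogged are hypothetical, not part of the Enso code base.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class ForbiddenMessageSketch {
  // Hypothetical handler that remembers every message it sees.
  static final class Collecting extends Handler {
    final List<String> messages = new ArrayList<>();

    @Override public void publish(LogRecord record) { messages.add(record.getMessage()); }
    @Override public void flush() {}
    @Override public void close() {}
  }

  // A null forbidden message disables the check, as in the original test.
  static boolean wasLogged(List<String> messages, String forbidden) {
    return forbidden != null && messages.stream().anyMatch(m -> m.contains(forbidden));
  }

  // Runs the action and returns everything logged to "sketch.compiler" meanwhile.
  static List<String> capture(Runnable action) {
    var logger = Logger.getLogger("sketch.compiler");
    var handler = new Collecting();
    logger.setUseParentHandlers(false);
    logger.setLevel(Level.ALL);
    logger.addHandler(handler);
    try {
      action.run();
    } finally {
      logger.removeHandler(handler);
    }
    return handler.messages;
  }

  public static void main(String[] args) {
    var forbidden = "Parsing module [local.Fib_Test.Arith].";
    var log = Logger.getLogger("sketch.compiler");
    // Only the Main module is "parsed" here, so the forbidden message is absent.
    var messages = capture(() -> log.fine("Parsing module [local.Fib_Test.Main]."));
    System.out.println("forbidden message logged: " + wasLogged(messages, forbidden));
  }
}
```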

@JaroslavTulach JaroslavTulach added this to the Beta Release milestone Mar 28, 2023
@JaroslavTulach JaroslavTulach self-assigned this Mar 28, 2023
@JaroslavTulach JaroslavTulach changed the title Avoid Parsing module [local.Fib_Test.Arith] when just importing it Avoid loading .ir for module [local.Fib_Test.Arith] when just importing it Mar 28, 2023
@JaroslavTulach JaroslavTulach changed the title Avoid loading .ir for module [local.Fib_Test.Arith] when just importing it Avoid loading .ir for module [local.Fib_Test.Arith] when just importing Mar 28, 2023
@jdunkerley jdunkerley moved this from ❓New to 📤 Backlog in Issues Board Mar 28, 2023
@JaroslavTulach JaroslavTulach changed the title Avoid loading .ir for module [local.Fib_Test.Arith] when just importing Speed Standard.Base initialization in simple Hello World example up! Jun 6, 2023
@JaroslavTulach
Member Author

The original title of this issue was "Avoid loading .ir for module [local.Fib_Test.Arith] when just importing", but that doesn't seem to be catchy enough and, moreover, it doesn't express the real problem, namely that a simple:

import Standard.Base.IO
main = [ IO, "Hello World!" ]

or main = IO.println <| "Hello World!" takes too much time!

@jdunkerley jdunkerley moved this from 📤 Backlog to ❓New in Issues Board Oct 17, 2023
@jdunkerley jdunkerley moved this from ❓New to 📤 Backlog in Issues Board Oct 17, 2023
@JaroslavTulach JaroslavTulach moved this from 📤 Backlog to ⚙️ Design in Issues Board Oct 26, 2023
@enso-bot

enso-bot bot commented Oct 26, 2023

Jaroslav Tulach reports a new STANDUP for yesterday (2023-10-25):

Progress:

  • replacing getMetadata(BindingsAnalysis) with Context.getBindingsMap()
  • integrated: Introducing CompilerContext.Module #8144
  • managed to delay IR loading during ImportsExportsResolution phase
  • standups & meeting: Dmitry & Pavel needed help (instrumentation, classloading)

It should be finished by 2023-10-31.

Next Day: Speeding up startup

@enso-bot

enso-bot bot commented Oct 27, 2023

Jaroslav Tulach reports a new STANDUP for yesterday (2023-10-26):

Progress: - runImportsAndExportsResolution takes just a few milliseconds now: #8160

Next Day: Speeding up startup

@enso-bot

enso-bot bot commented Oct 28, 2023

Jaroslav Tulach reports a new STANDUP for yesterday (2023-10-27):

Progress: - Fixes and CI fighting and addressing comments in: #8160

Next Day: Speeding up startup


@JaroslavTulach
Member Author

JaroslavTulach commented Oct 28, 2023

#8160 helps to speed up the runImportsExportsResolution phase, but overall it doesn't have a huge speed impact, as the megabytes of .ir caches are still loaded later.

Idea

One way to address this is to split IR caches into structure and method bodies as method bodies aren't really needed until the method gets executed. That can be done with a replaceObject in ModuleCache:

diff --git engine/runtime/src/main/java/org/enso/compiler/ModuleCache.java engine/runtime/src/main/java/org/enso/compiler/ModuleCache.java
index c23790b688..b4122edece 100644
--- engine/runtime/src/main/java/org/enso/compiler/ModuleCache.java
+++ engine/runtime/src/main/java/org/enso/compiler/ModuleCache.java
@@ -161,6 +166,9 @@ public final class ModuleCache extends Cache<ModuleCache.CachedModule, ModuleCac
 
         @Override
         protected Object replaceObject(Object obj) throws IOException {
+          if (obj instanceof Expression) {
+            return null;
+          }
           if (obj instanceof UUID) {
             return null;
           }

This change lowers the size of the caches to roughly 40% of the original (first column: full cache, second column: body-less cache):

1,5M    640K    Standard/AWS
29M     13M     Standard/Base
28K     28K     Standard/Builtins
19M     7,4M    Standard/Database
19M     7,7M    Standard/Table
1,5M    652K    Standard/Visualization

If we read only the body-less part first, we should cut the initial load time to about 40%. The remaining parts would then be loaded incrementally, as methods get executed.
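
As a quick sanity check of the "roughly 40%" figure, the sizes from the table can be summed up. This is only a back-of-the-envelope sketch: the class name CacheRatioSketch is made up, and it assumes the M values are mebibytes (1M = 1024K) as printed by a du-style tool.

```java
public class CacheRatioSketch {
  // {full cache, body-less cache} in KiB, copied from the table above.
  // Assumption: "M" means 1024 KiB; the comma in "1,5M" is a decimal separator.
  static final double[][] SIZES_KIB = {
    {1.5 * 1024, 640},        // Standard/AWS
    {29 * 1024, 13 * 1024},   // Standard/Base
    {28, 28},                 // Standard/Builtins
    {19 * 1024, 7.4 * 1024},  // Standard/Database
    {19 * 1024, 7.7 * 1024},  // Standard/Table
    {1.5 * 1024, 652},        // Standard/Visualization
  };

  // Ratio of the summed body-less sizes to the summed full sizes.
  static double ratio(double[][] sizes) {
    double full = 0;
    double bodyLess = 0;
    for (double[] row : sizes) {
      full += row[0];
      bodyLess += row[1];
    }
    return bodyLess / full;
  }

  public static void main(String[] args) {
    System.out.printf("body-less / full = %.1f%%%n", 100 * ratio(SIZES_KIB));
  }
}
```

The overall ratio comes out slightly above 40%, with Standard/Base a bit worse than the average and Standard/Database a bit better.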

Implementation

Either we can have two .ir files (one body-less, one full), or we can place the Expression parts at the end of the stream. The following changes in the serialization stream would do:

private Map<Integer,Expression> pendingExpressions;
private int counter;
...

if (obj instanceof Expression exp && counter >= 0) {
   var refExpr = new RefExpression(++counter);
   pendingExpressions.put(counter, exp);
   return refExpr;
}

Store stream.writeObject(entry.moduleIR()) while replacing each Expression with just a delayed reference, and then:

stream.counter = -1; // no more expression replaces
for (var entry : pendingExpressions.entrySet()) {
  stream.writeInt(entry.getKey());
  stream.writeObject(entry.getValue());
}

The good thing is that all references among objects (mostly metadata) in the single stream are going to be shared between the body-less part and the expression-bodies part. We just don't have to read the second part when we are not interested in it.

Reading the second part requires the ObjectInputStream to override resolveObject (the reading-side counterpart of replaceObject) and keep a list of all pending RefExpression references. When reading the integer ID and its associated real expression, just inject that expression into the pending RefExpression.
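
A minimal, self-contained sketch of the scheme described above, under several assumptions: Expression, RefExpression and MethodDef are hypothetical stand-ins for the real IR classes, a count is written before the bodies so the reader knows how many to expect, and the reader uses ObjectInputStream.resolveObject as its hook. One more deviation: ObjectOutputStream remembers earlier replacements, so the sketch calls reset() before writing the bodies. That gives up the sharing of handles between the two parts, so it is not the final design, just a demonstration that the idea round-trips.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class LazyBodiesSketch {
  // Hypothetical stand-ins for the real IR classes.
  static final class Expression implements Serializable {
    final String code;
    Expression(String code) { this.code = code; }
  }

  static final class RefExpression implements Serializable {
    final int id;
    transient Expression target; // stays null until the bodies are loaded
    RefExpression(int id) { this.id = id; }
  }

  static final class MethodDef implements Serializable {
    final String name;
    final Object body; // typed Object so the replacement reference fits the field
    MethodDef(String name, Object body) { this.name = name; this.body = body; }
  }

  static final class Writer extends ObjectOutputStream {
    final Map<Integer, Expression> pending = new LinkedHashMap<>();
    int counter = 0; // set to -1 to stop replacing expressions

    Writer(OutputStream out) throws IOException {
      super(out);
      enableReplaceObject(true);
    }

    @Override
    protected Object replaceObject(Object obj) {
      if (obj instanceof Expression exp && counter >= 0) {
        var ref = new RefExpression(++counter);
        pending.put(counter, exp);
        return ref;
      }
      return obj;
    }
  }

  static final class Reader extends ObjectInputStream {
    final Map<Integer, RefExpression> refs = new HashMap<>();

    Reader(InputStream in) throws IOException {
      super(in);
      enableResolveObject(true);
    }

    @Override
    protected Object resolveObject(Object obj) {
      if (obj instanceof RefExpression ref) {
        refs.put(ref.id, ref); // remember the reference for later injection
      }
      return obj;
    }

    // Optional second step: read the bodies and inject them into the references.
    void loadBodies() throws IOException, ClassNotFoundException {
      int count = readInt();
      for (int i = 0; i < count; i++) {
        int id = readInt();
        refs.get(id).target = (Expression) readObject();
      }
    }
  }

  static byte[] write(Object root) throws IOException {
    var bytes = new ByteArrayOutputStream();
    try (var out = new Writer(bytes)) {
      out.writeObject(root); // body-less part
      out.counter = -1;      // no more expression replacements
      out.reset();           // forget replacements made so far (see lead-in)
      out.writeInt(out.pending.size());
      for (var entry : out.pending.entrySet()) {
        out.writeInt(entry.getKey());
        out.writeObject(entry.getValue());
      }
    }
    return bytes.toByteArray();
  }

  public static void main(String[] args) throws Exception {
    byte[] data = write(new MethodDef("main", new Expression("IO.println \"Hello World!\"")));
    try (var in = new Reader(new ByteArrayInputStream(data))) {
      var method = (MethodDef) in.readObject();
      var ref = (RefExpression) method.body;
      System.out.println("before loadBodies: " + ref.target);
      in.loadBodies();
      System.out.println("after loadBodies: " + ref.target.code);
    }
  }
}
```

In the real IR the fields are typed as Expression, so a RefExpression would have to be a subtype of Expression for the replacement to be assignable. The sketch avoids that question by typing the field as Object.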

@enso-bot

enso-bot bot commented Oct 30, 2023

Jaroslav Tulach reports a new STANDUP for the last Saturday (2023-10-28):

Progress: - Integrated: #8160 (comment)

Next Day: Speeding up startup


@enso-bot

enso-bot bot commented Oct 31, 2023

Jaroslav Tulach reports a new STANDUP for yesterday (2023-10-30):

Progress: - Integrated Node elimination: #8172

Next Day: Speeding up startup

@jdunkerley jdunkerley moved this from ⚙️ Design to 🔧 Implementation in Issues Board Oct 31, 2023
@enso-bot

enso-bot bot commented Nov 1, 2023

Jaroslav Tulach reports a new STANDUP for yesterday (2023-10-31):

Progress: - Checking benchmarks & merging ApplicationSaturation removal: #8181

Next Day: Speeding up startup

@JaroslavTulach
Member Author

The new idea is to store the IR in the .ir files in a mode that can be accessed "lazily" and read on demand when needed.

Let's start with an analysis. By overriding the module cache reading code, one can see that a simple IO.println loads 123 different classes:

@@ -46,7 +50,14 @@ public final class ModuleCache extends Cache<ModuleCache.CachedModule, ModuleCac
 
     @Override
     protected CachedModule deserialize(EnsoContext context, byte[] data, Metadata meta, TruffleLogger logger) throws ClassNotFoundException, IOException, ClassNotFoundException {
-        try (var stream = new ObjectInputStream(new ByteArrayInputStream(data))) {
+        try (var stream = new ObjectInputStream(new ByteArrayInputStream(data)) {
+          @Override
+          protected ObjectStreamClass readClassDescriptor() throws IOException, ClassNotFoundException {
+            var clazz = super.readClassDescriptor();
+            System.err.println("CLASS: " + clazz.getName());
+            return clazz;
+          }
+        }) {
           if (stream.readObject() instanceof Module ir) {
               try {
                   return new CachedModule(ir,CompilationStage.valueOf(meta.compilationStage()), module.getSource());

Here is the list:

java.util.UUID
org.enso.compiler.core.ir.CallArgument$Specified
org.enso.compiler.core.ir.DefinitionArgument$Specified
org.enso.compiler.core.ir.DiagnosticStorage
org.enso.compiler.core.ir.Expression$Binding
org.enso.compiler.core.ir.Expression$Block
org.enso.compiler.core.ir.expression.Application$Force
org.enso.compiler.core.ir.expression.Application$Prefix
org.enso.compiler.core.ir.expression.Application$Sequence
org.enso.compiler.core.ir.expression.Case$Branch
org.enso.compiler.core.ir.expression.Case$Expr
org.enso.compiler.core.ir.expression.Foreign$Definition
org.enso.compiler.core.ir.Function$Lambda
org.enso.compiler.core.ir.IdentifiedLocation
org.enso.compiler.core.ir.Literal$Number
org.enso.compiler.core.ir.Literal$Text
org.enso.compiler.core.ir.MetadataStorage
org.enso.compiler.core.ir.Module
org.enso.compiler.core.ir.module.scope.Definition$Data
org.enso.compiler.core.ir.module.scope.Definition$Type
org.enso.compiler.core.ir.module.scope.definition.Method$Conversion
org.enso.compiler.core.ir.module.scope.definition.Method$Explicit
org.enso.compiler.core.ir.module.scope.Export$Module
org.enso.compiler.core.ir.module.scope.Import$Module
org.enso.compiler.core.ir.module.scope.imports.Polyglot
org.enso.compiler.core.ir.module.scope.imports.Polyglot$Java
org.enso.compiler.core.ir.Name$Blank
org.enso.compiler.core.ir.Name$BuiltinAnnotation
org.enso.compiler.core.ir.Name$GenericAnnotation
org.enso.compiler.core.ir.Name$Literal
org.enso.compiler.core.ir.Name$MethodReference
org.enso.compiler.core.ir.Name$Qualified
org.enso.compiler.core.ir.Name$Self
org.enso.compiler.core.ir.Pattern$Constructor
org.enso.compiler.core.ir.Pattern$Literal
org.enso.compiler.core.ir.Pattern$Name
org.enso.compiler.core.ir.Pattern$Type
org.enso.compiler.core.ir.Type$Error
org.enso.compiler.core.ir.Type$Function
org.enso.compiler.core.ir.type.Set$Union
org.enso.compiler.data.BindingsMap
org.enso.compiler.data.BindingsMap$Cons
org.enso.compiler.data.BindingsMap$ExportedModule
org.enso.compiler.data.BindingsMap$ModuleMethod
org.enso.compiler.data.BindingsMap$ModuleReference$Abstract
org.enso.compiler.data.BindingsMap$PolyglotSymbol
org.enso.compiler.data.BindingsMap$Resolution
org.enso.compiler.data.BindingsMap$ResolvedConstructor
org.enso.compiler.data.BindingsMap$ResolvedImport
org.enso.compiler.data.BindingsMap$ResolvedMethod
org.enso.compiler.data.BindingsMap$ResolvedModule
org.enso.compiler.data.BindingsMap$ResolvedPolyglotField
org.enso.compiler.data.BindingsMap$ResolvedPolyglotSymbol
org.enso.compiler.data.BindingsMap$ResolvedType
org.enso.compiler.data.BindingsMap$SymbolRestriction$All$
org.enso.compiler.data.BindingsMap$SymbolRestriction$AllowedResolution
org.enso.compiler.data.BindingsMap$SymbolRestriction$Hiding
org.enso.compiler.data.BindingsMap$SymbolRestriction$Only
org.enso.compiler.data.BindingsMap$SymbolRestriction$Union
org.enso.compiler.data.BindingsMap$Type
org.enso.compiler.pass.analyse.AliasAnalysis$
org.enso.compiler.pass.analyse.AliasAnalysis$Graph
org.enso.compiler.pass.analyse.AliasAnalysis$Graph$Link
org.enso.compiler.pass.analyse.AliasAnalysis$Graph$Occurrence$Def
org.enso.compiler.pass.analyse.AliasAnalysis$Graph$Occurrence$Use
org.enso.compiler.pass.analyse.AliasAnalysis$Graph$Scope
org.enso.compiler.pass.analyse.AliasAnalysis$Info$Occurrence
org.enso.compiler.pass.analyse.AliasAnalysis$Info$Scope$Child
org.enso.compiler.pass.analyse.AliasAnalysis$Info$Scope$Root
org.enso.compiler.pass.analyse.BindingAnalysis$
org.enso.compiler.pass.analyse.CachePreferenceAnalysis$
org.enso.compiler.pass.analyse.CachePreferenceAnalysis$WeightInfo
org.enso.compiler.pass.analyse.DataflowAnalysis$
org.enso.compiler.pass.analyse.DataflowAnalysis$DependencyInfo
org.enso.compiler.pass.analyse.DataflowAnalysis$DependencyInfo$Type$Dynamic
org.enso.compiler.pass.analyse.DataflowAnalysis$DependencyInfo$Type$Static
org.enso.compiler.pass.analyse.DataflowAnalysis$DependencyMapping
org.enso.compiler.pass.analyse.GatherDiagnostics$
org.enso.compiler.pass.analyse.GatherDiagnostics$DiagnosticsMeta
org.enso.compiler.pass.analyse.TailCall$
org.enso.compiler.pass.analyse.TailCall$TailPosition$NotTail$
org.enso.compiler.pass.analyse.TailCall$TailPosition$Tail$
org.enso.compiler.pass.resolve.DocumentationComments$
org.enso.compiler.pass.resolve.DocumentationComments$Doc
org.enso.compiler.pass.resolve.ExpressionAnnotations$
org.enso.compiler.pass.resolve.GenericAnnotations$
org.enso.compiler.pass.resolve.GlobalNames$
org.enso.compiler.pass.resolve.IgnoredBindings$
org.enso.compiler.pass.resolve.IgnoredBindings$State$Ignored$
org.enso.compiler.pass.resolve.IgnoredBindings$State$NotIgnored$
org.enso.compiler.pass.resolve.MethodCalls$
org.enso.compiler.pass.resolve.MethodDefinitions$
org.enso.compiler.pass.resolve.ModuleAnnotations$
org.enso.compiler.pass.resolve.ModuleAnnotations$Annotations
org.enso.compiler.pass.resolve.Patterns$
org.enso.compiler.pass.resolve.TypeNames$
org.enso.compiler.pass.resolve.TypeSignatures$
org.enso.compiler.pass.resolve.TypeSignatures$Signature
org.enso.pkg.QualifiedName
org.enso.syntax.text.Location
scala.collection.generic.DefaultSerializationProxy
scala.collection.generic.SerializeEnd$
scala.collection.immutable.HashMap$
scala.collection.immutable.HashSet$
scala.collection.immutable.List$
scala.collection.immutable.Map$EmptyMap$
scala.collection.immutable.Map$Map1
scala.collection.immutable.Map$Map2
scala.collection.immutable.Map$Map3
scala.collection.immutable.Map$Map4
scala.collection.immutable.Set$EmptySet$
scala.collection.immutable.Set$Set1
scala.collection.immutable.Set$Set2
scala.collection.immutable.Set$Set3
scala.collection.immutable.Set$Set4
scala.collection.IterableFactory$ToFactory
scala.collection.MapFactory$ToFactory
scala.collection.mutable.HashMap$DeserializationFactory
scala.None$
scala.Option
scala.runtime.ModuleSerializationProxy
scala.Some
scala.Tuple2
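
For anyone who wants to repeat this kind of analysis outside of the engine, the same readClassDescriptor override can be tried on any serialized data. A minimal standalone sketch follows; the class name ClassDescriptorCounter and the sample payload are made up, and it collects the names into a sorted set instead of printing them one by one.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.util.ArrayList;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.UUID;

public class ClassDescriptorCounter {
  // Deserializes the data and returns the names of all class descriptors seen.
  static SortedSet<String> classesIn(byte[] data) throws IOException, ClassNotFoundException {
    var names = new TreeSet<String>();
    try (var in = new ObjectInputStream(new ByteArrayInputStream(data)) {
      @Override
      protected ObjectStreamClass readClassDescriptor() throws IOException, ClassNotFoundException {
        var descriptor = super.readClassDescriptor();
        names.add(descriptor.getName());
        return descriptor;
      }
    }) {
      in.readObject();
    }
    return names;
  }

  public static void main(String[] args) throws Exception {
    // Sample payload standing in for the contents of an .ir file.
    var payload = new ArrayList<Object>();
    payload.add(UUID.randomUUID());
    payload.add(UUID.randomUUID());
    payload.add("text");

    var bytes = new ByteArrayOutputStream();
    try (var out = new ObjectOutputStream(bytes)) {
      out.writeObject(payload);
    }

    var names = classesIn(bytes.toByteArray());
    names.forEach(name -> System.out.println("CLASS: " + name));
    System.out.println(names.size() + " different classes");
  }
}
```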

@enso-bot

enso-bot bot commented Nov 2, 2023

Jaroslav Tulach reports a new 🔴 DELAY for yesterday (2023-11-01):

Summary: There is a 10-day delay in the implementation of the Speed Standard.Base initialization in simple Hello World example up! (#6100) task.
It will cause a 10-day delay in the delivery of this weekly plan.

There has been some progress, but more work is needed.

Delay Cause: IR caches and import resolution are a complicated piece of our code base that nobody wants to touch. I am moving forward, but slowly.

Possible solutions: Yesterday I got a new idea and started #8207 - I am thrilled now as I believe this direction has its merits.

@enso-bot

enso-bot bot commented Nov 2, 2023

Jaroslav Tulach reports a new STANDUP for yesterday (2023-11-01):

Progress: - runtime-compiler project created & merged: #8197

Next Day: "on demand" IR caches


@enso-bot

enso-bot bot commented Nov 3, 2023

Jaroslav Tulach reports a new STANDUP for yesterday (2023-11-02):

Progress: - "on demand" serde: a25bfb6

Next Day: "on demand" IR caches

@enso-bot

enso-bot bot commented Nov 3, 2023

Jaroslav Tulach reports a new STANDUP for today (2023-11-03):

Progress: - Investigating GraalVM for JDK21 update status: https://github.com/enso-org/enso/pull/7991/files#r1381210135

Next Day: "on demand" IR caches


@enso-bot

enso-bot bot commented Nov 9, 2023

Jaroslav Tulach reports a new STANDUP for yesterday (2023-11-08):

Progress: - review of inline evaluation: https://discord.com/channels/@me/955430343308095518/1170940214991130654

Next Day: "on demand" IR caches


@enso-bot

enso-bot bot commented Nov 10, 2023

Jaroslav Tulach reports a new STANDUP for yesterday (2023-11-09):

Progress: - IR persist everything: 5413e2f

Next Day: "on demand" IR caches


@enso-bot

enso-bot bot commented Nov 11, 2023

Jaroslav Tulach reports a new STANDUP for yesterday (2023-11-10):

Progress: - able to write, read & use caches: #8207 (comment)

Next Day: "on demand" IR caches


@enso-bot

enso-bot bot commented Nov 13, 2023

Jaroslav Tulach reports a new 🔴 DELAY for the last Friday (2023-11-10):

Summary: There is a 7-day delay in the implementation of the Speed Standard.Base initialization in simple Hello World example up! (#6100) task.
It will cause a 7-day delay in the delivery of this weekly plan.

Serialization & deserialization works, but we need to benefit from its possibilities.

Delay Cause: Two holidays, one conference day. Changes in restoreFromSerialization

Possible solutions: The idea #8207 - seems to be working, but needs a few more days.

@enso-bot

enso-bot bot commented Nov 13, 2023

Jaroslav Tulach reports a new STANDUP for yesterday (2023-11-12):

Progress: - fixing broken CI build: 01e0d20

Next Day: "on demand" IR caches


@enso-bot

enso-bot bot commented Nov 14, 2023

Jaroslav Tulach reports a new STANDUP for yesterday (2023-11-13):

Progress: - updating Frgaal 21: #8286

Next Day: "on demand" IR caches


@JaroslavTulach JaroslavTulach moved this from 🔧 Implementation to 👁️ Code review in Issues Board Nov 14, 2023
@enso-bot

enso-bot bot commented Nov 15, 2023

Jaroslav Tulach reports a new STANDUP for yesterday (2023-11-14):

Progress:

  • working on #8207
  • review meeting - approved for integration
  • emails, reviews
  • planning meeting & other meetings

It should be finished by 2023-11-17.

Next Day: "on demand" IR caches

@enso-bot

enso-bot bot commented Nov 16, 2023

Jaroslav Tulach reports a new STANDUP for yesterday (2023-11-15):

Progress: - Found bug in PersistableProcessor: 5cb4b68

Next Day: "on demand" IR caches

@enso-bot

enso-bot bot commented Nov 17, 2023

Jaroslav Tulach reports a new STANDUP for yesterday (2023-11-16):

Progress: - persistance project for faster startup: 79482a1

Next Day: "on demand" IR caches


@enso-bot

enso-bot bot commented Nov 18, 2023

Jaroslav Tulach reports a new STANDUP for yesterday (2023-11-17):

Progress:

  • fixing license after Lookup library removal from runtime dependencies
  • cleaning Persistance API up: 76859b9
  • standup
  • rewriting annotation processor
  • documenting the Persistance package: dbf62e2

It should be finished by 2023-11-17.

Next Day: merge "on demand" IR caches

@JaroslavTulach
Member Author

There is a 16% speedup of the simple "hello world" application with the integration of:

@github-project-automation github-project-automation bot moved this from 👁️ Code review to 🟢 Accepted in Issues Board Nov 20, 2023