Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try some more optimizations #9409

Closed
wants to merge 21 commits into from
Closed

Conversation

odersky
Copy link
Contributor

@odersky odersky commented Jul 22, 2020

Based on #9405 and includes #9414.

  • We further reduce context fields, so that I count now 14 fields overall: 06756eb
  • We reduce contexts created with new periods by about 3/4ths: 5aa5c1f
  • We eliminate several allocation hotspots (remaining commits)

Together with #9405 and #9343, this reduces total memory allocated to contexts by 2/3rds and total memory allocated to TypeComparers to practically nothing.

@odersky
Copy link
Contributor Author

odersky commented Jul 22, 2020

test performance please

@dottybot
Copy link
Member

performance test scheduled: 1 job(s) in queue, 0 running.

@dottybot
Copy link
Member

Performance test finished successfully:

Visit http://dotty-bench.epfl.ch/9409/ to see the changes.

Benchmarks is based on merging with master (0f1a23e)

@odersky
Copy link
Contributor Author

odersky commented Jul 23, 2020

test performance please

@dottybot
Copy link
Member

performance test scheduled: 1 job(s) in queue, 0 running.

@odersky odersky changed the title Streamline treatment of withPhase and withSource Try some more optimizations Jul 23, 2020
@dottybot
Copy link
Member

Performance test finished successfully:

Visit http://dotty-bench.epfl.ch/9409/ to see the changes.

Benchmarks is based on merging with master (0f1a23e)

@liufengyun
Copy link
Contributor

Note that stdlib test failed, there's no point for that curve.

@odersky
Copy link
Contributor Author

odersky commented Jul 23, 2020

Why did stdlib fail? Was that a trasient failure?

@odersky
Copy link
Contributor Author

odersky commented Jul 23, 2020

test performance please

@dottybot
Copy link
Member

performance test scheduled: 1 job(s) in queue, 1 running.

@liufengyun
Copy link
Contributor

Why did stdlib fail? Was that a trasient failure?

It is not transient, it has been failing since #8652 . I'll propose a fix ASAP.

@dottybot
Copy link
Member

Performance test finished successfully:

Visit http://dotty-bench.epfl.ch/9409/ to see the changes.

Benchmarks is based on merging with master (eb3b0ac)

@retronym
Copy link
Member

FreshContext is down to 88 bytes... nice improvement!

dotty.tools.dotc.core.Contexts$FreshContext object internals:
 OFFSET  SIZE                                                   TYPE DESCRIPTION                               VALUE
      0     4                                                        (object header)                           05 00 00 00 (00000101 00000000 00000000 00000000) (5)
      4     4                                                        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4                                                        (object header)                           40 70 12 00 (01000000 01110000 00010010 00000000) (1208384)
     12     4                                                    int Context._period                           0
     16     8                                                   long Context.bitmap$0                          0
     24     4                                                    int Context._mode                             0
     28     4             dotty.tools.dotc.core.Contexts.ContextBase Context.base                              null
     32     4                 dotty.tools.dotc.core.Contexts.Context Context.given_Context$lzy1                null
     36     4                 dotty.tools.dotc.core.Contexts.Context Context._outer                            null
     40     4                   dotty.tools.dotc.core.Symbols.Symbol Context._owner                            null
     44     4                        dotty.tools.dotc.ast.Trees.Tree Context._tree                             null
     48     4                     dotty.tools.dotc.core.Scopes.Scope Context._scope                            null
     52     4                       dotty.tools.dotc.core.TyperState Context._typerState                       null
     56     4                    dotty.tools.dotc.typer.TypeAssigner Context._typeAssigner                     null
     60     4                   dotty.tools.dotc.core.GadtConstraint Context._gadt                             null
     64     4                   dotty.tools.dotc.typer.SearchHistory Context._searchHistory                    null
     68     4                       dotty.tools.dotc.util.SourceFile Context._source                           null
     72     4                         scala.collection.immutable.Map Context._moreProperties                   null
     76     4                                     java.lang.Object[] Context._store                            null
     80     4   dotty.tools.dotc.typer.Implicits.ContextualImplicits Context.implicitsCache                    null
     84     4                dotty.tools.dotc.util.SimpleIdentityMap Context.related                           null
Instance size: 88 bytes
Space losses: 0 bytes internal + 0 bytes external = 0 bytes total

@odersky
Copy link
Contributor Author

odersky commented Jul 24, 2020

@retronym I notice there's an unexpected lazy val bitmap and given_Context$lzy1 in there, which can both be avoided. If we could shave off one more word, we'd be down to 72.

@odersky
Copy link
Contributor Author

odersky commented Jul 24, 2020

test performance please

@dottybot
Copy link
Member

performance test scheduled: 1 job(s) in queue, 0 running.

@dottybot
Copy link
Member

Performance test finished successfully:

Visit http://dotty-bench.epfl.ch/9409/ to see the changes.

Benchmarks is based on merging with master (fd18546)

@odersky
Copy link
Contributor Author

odersky commented Jul 24, 2020

test performance please

@dottybot
Copy link
Member

performance test scheduled: 1 job(s) in queue, 0 running.

@dottybot
Copy link
Member

Performance test finished successfully:

Visit http://dotty-bench.epfl.ch/9409/ to see the changes.

Benchmarks is based on merging with master (fd18546)

@odersky
Copy link
Contributor Author

odersky commented Jul 25, 2020

test performance please

@dottybot
Copy link
Member

performance test scheduled: 1 job(s) in queue, 1 running.

@dottybot
Copy link
Member

performance test failed:

Please check http://lamppc37.epfl.ch:8000/pull-9409-07-25-10.55.out for more information

@odersky
Copy link
Contributor Author

odersky commented Jul 25, 2020

test performance please

@dottybot
Copy link
Member

performance test scheduled: 1 job(s) in queue, 0 running.

@dottybot
Copy link
Member

dottybot commented Sep 7, 2020

performance test scheduled: 1 job(s) in queue, 1 running.

@dottybot
Copy link
Member

dottybot commented Sep 7, 2020

Performance test finished successfully:

Visit http://dotty-bench.epfl.ch/9409/ to see the changes.

Benchmarks is based on merging with master (8c94870)

1 similar comment
@dottybot
Copy link
Member

dottybot commented Sep 7, 2020

Performance test finished successfully:

Visit http://dotty-bench.epfl.ch/9409/ to see the changes.

Benchmarks is based on merging with master (8c94870)

Looking at these benchmarks, our own HashSet is clearly the fastest
so we should use it wherever possible.

First tests without GC:
```
java.util.HashMap          took 1977.537756 ms
java.util.IdentityHashMap  took 2123.028843 ms
scala.collection.HashMap   took 2116.062172 ms
scala.collection.HashSet   took 1998.378418 ms
dotty.tools.dotc.HashSet   took 1429.826866 ms
```
Second tests with System.gc() run before every test:
```
java.util.HashMap          took 1980.977419 ms
java.util.IdentityHashMap  took 2082.178216 ms
scala.collection.HashMap   took 2101.882151 ms
scala.collection.HashSet   took 1967.379402 ms
dotty.tools.dotc.HashSet   took 1450.755273 ms
```
Allow to record total sizes of collections on which
some selected operation is performed.

# Conflicts:
#	compiler/src/dotty/tools/dotc/transform/Instrumentation.scala
Add nestedExists extension method for List[List[T]] exists and
optimize it as well as nestedMap.
Avoid closures for copied and simple. Avoid substitutions for simple.
Make all orElse methods that take a call-by-name argument inline methods.
Run stats only at last run. This allows one to warm up and JOT compile code
before measurements start.
If Stats is enabled, track total time spent in

 - implicit search overall
 - searching implicits in context / in type scope
 - typedImplicit
When tranisitioning from dense to hashing, we grow the table more than usual
since otherwise we'd have to grow it again at the very next addEntry. The
condition for this was wrong for HashMaps.
Make criterion when to fill a hole in a HashMap or HashSet remove more robust.
The old criterion relied on fill factor always being less than 0.5.
The new criterion works for arbitrary fill factors.
@odersky
Copy link
Contributor Author

odersky commented Sep 8, 2020

test performance please

@dottybot
Copy link
Member

dottybot commented Sep 8, 2020

performance test scheduled: 4 job(s) in queue, 1 running.

@dottybot
Copy link
Member

dottybot commented Sep 8, 2020

Performance test finished successfully:

Visit http://dotty-bench.epfl.ch/9409/ to see the changes.

Benchmarks is based on merging with master (ce48f5a)

@odersky odersky closed this Nov 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants