Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CP] [ CLI ] Fix DART_VM_OPTIONS usage only resulting in printing of help message #55818

Closed
bkonyi opened this issue May 22, 2024 · 8 comments
Closed
Assignees
Labels
area-meta Cross-cutting, high-level issues (for tracking many other implementation issues, ...). cherry-pick-approved Label for approved cherrypick request cherry-pick-review Issue that need cherry pick triage to approve

Comments

@bkonyi
Copy link
Contributor

bkonyi commented May 22, 2024

Commit(s) to merge

https://dart-review.googlesource.com/c/sdk/+/367720

Target

stable

Prepared changelist for beta/stable

https://dart-review.googlesource.com/c/sdk/+/367721

Issue Description

The DART_VM_OPTIONS environment variable is meant to allow for users to specify Dart VM flags to precompiled Dart executables generated by dart compile exe. Currently, if any flags are provided through this variable, the VM's help message is printed rather than setting flags based on the provided options. Since testing was relying on --help and --verbose to deterministically test parsing of DART_VM_OPTIONS, this bug slipped through.

What is the fix

Fixed the argument parsing logic in the VM to correctly handle lists of VM flags without a terminating script path or CLI command.

Why cherry-pick

This feature was explicitly called out in the 3.4 CHANGELOG. In its current state, this feature is completely broken and is preventing some customers from configuring heap size related flags for use in container environments.

Risk

Low.

Issue link(s)

#55767

Extra Info

No response

@bkonyi bkonyi added the cherry-pick-review Issue that need cherry pick triage to approve label May 22, 2024
@itsjustkevin
Copy link
Contributor

@a-siva could you review this cherry-pick request?

@mit-mit mit-mit added the area-meta Cross-cutting, high-level issues (for tracking many other implementation issues, ...). label May 24, 2024
@mit-mit
Copy link
Member

mit-mit commented May 24, 2024

SGTM

@itsjustkevin itsjustkevin added the cherry-pick-approved Label for approved cherrypick request label May 24, 2024
@itsjustkevin
Copy link
Contributor

Cherry-pick approved, please merge once the CL is reviewed @bkonyi

@athomas
Copy link
Member

athomas commented May 28, 2024

Not reviewed or landed, yet.

copybara-service bot pushed a commit that referenced this issue May 30, 2024
… of help message

Fixes #55767

TEST=pkg/dartdev/test/commands/compile_test.dart

Change-Id: I6a773acbd9fc21c086fc459c7cb983ea1ff11fcd
Cherry-pick: https://dart-review.googlesource.com/c/sdk/+/367720
Cherry-pick-request: #55818
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/367721
Reviewed-by: Siva Annamalai <[email protected]>
Commit-Queue: Ben Konyi <[email protected]>
@bkonyi
Copy link
Contributor Author

bkonyi commented May 30, 2024

Reviewed and landed.

@gmpassos
Copy link
Contributor

gmpassos commented Jun 6, 2024

FYI:

Tested in production: working very well, the GC works much better when a limit is defined.

export DART_VM_OPTIONS=--verbose_gc,--old_gen_heap_size=1500

./bin/server.exe

@a-siva
Copy link
Contributor

a-siva commented Jun 6, 2024

Tested in production: working very well, the GC works much better when a limit is defined.

@gmpassos glad that it worked for you, can you give us some stats or log information about how GC works better

@gmpassos
Copy link
Contributor

gmpassos commented Jun 6, 2024

@a-siva,

A Dart self-executable uses much less memory since it doesn't need to parse, analyze, or compile any code. Our project in the Dart VM uses 800MB just to start, whereas in a self-executable it uses only 80MB. We use multiple server nodes with only 2GB of RAM.

We detected two main changes in the GC for self-executables (which are known in the GC domain):

  • Without GC limit (default limit 30G):

    • The "VM" will always try to allocate more memory (increase heap capacity) until it reaches the server RAM limit and crashes. This may be because, with a capacity of 0.5-1G, it thinks it is far from the 30G limit and prioritizes more capacity.

    • With less release of objects, the amount of "ghost" objects (not referenced) still in memory accumulates more often, which increases the GC time and CPU usage. Since there is a tendency to have more "ghost" objects for a longer time, the average GC time will be longer.

    • IMHO, the tendency to accumulate ghost objects increases the chances of a "dirty" situation, where the GC has many ghost objects. When a new request that demands many allocations occurs, it will add even more ghost objects without triggering a GC cycle capable of releasing a significant amount of them. Since this is a "recursive" behavior, once you reach this situation, it will worsen the "VM" performance until it crashes or, by luck, performs a full GC.

  • With a GC limit (1.5G):

    • The "VM" will try more often to release objects before increasing capacity. This may be because a capacity of 300MB (easily reachable) is near the limit of 1.5GB.

    • Since objects are released more often, the GC time is reduced, as it's faster to compute for a smaller collection of objects. The average number of objects in the heap, with a 1.5GB memory limit, is significantly reduced, and this is the main factor that affects GC behavior and time.

    • The limit of 1.5G of heap capacity (old-gen) forces a cleaner heap with fewer "ghost" objects on average, allowing the "VM" and the GC to perform better.

Note: These behaviors are not simple to see in a simple log, as they depend on an analysis of multiple GC info (GC in verbose mode) and an understanding of the benchmark operations and objects. Essentially, our application is a backend that manipulates a DB and loads everything using ORM abstractions (without any memory leaks, all proven by these benchmarks: after a forced full GC, the number of objects per class returns to the initial count).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-meta Cross-cutting, high-level issues (for tracking many other implementation issues, ...). cherry-pick-approved Label for approved cherrypick request cherry-pick-review Issue that need cherry pick triage to approve
Projects
None yet
Development

No branches or pull requests

7 participants