Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(harvester): add config for maxAge/maxSize for exit uploads #30

Merged
merged 4 commits into from
Jan 13, 2023

Conversation

andrewazores
Copy link
Member

@andrewazores andrewazores commented Jan 12, 2023

Fixes #23

Usual test instructions.

  1. mvn install this PR.
  2. cd quarkus-test ; ./mvnw -U clean package && podman build -t quay.io/andrewazores/quarkus-test:latest -f src/main/docker/Dockerfile.jvm . ; podman image prune -f
  3. cd cryostat ; sh smoketest.sh

Modify smoketest.sh to suit, for example:

diff --git a/smoketest.sh b/smoketest.sh
index 48c0f88c..c2c6cf45 100755
--- a/smoketest.sh
+++ b/smoketest.sh
@@ -132,7 +132,10 @@ runDemoApps() {
         --env CRYOSTAT_AGENT_BASEURI="${protocol}://localhost:${webPort}/" \
         --env CRYOSTAT_AGENT_TRUST_ALL="true" \
         --env CRYOSTAT_AGENT_AUTHORIZATION="Basic $(echo user:pass | base64)" \
-        --rm -d quay.io/andrewazores/quarkus-test:0.0.10
+        --env CRYOSTAT_AGENT_HARVESTER_PERIOD_MS=30000 \
+        --env CRYOSTAT_AGENT_HARVESTER_EXIT_MAX_AGE_MS=60000 \
+        --env CRYOSTAT_AGENT_HARVESTER_EXIT_MAX_SIZE_B="$(echo 1024*1024*1 | bc)" \
+        --rm -d quay.io/andrewazores/quarkus-test:latest
 
     podman run \
         --name quarkus-test-agent-2 \

The Agent should periodically (every 30s with this config) push files into the Cryostat general archives. The file sizes will vary, but in my testing they range between ~350 and ~700 KB.

Then do podman stop quarkus-test-agent-1 to explicitly stop the container. This passes the SIGTERM to the JVM, triggering the Agent to perform the exit upload. This should cause a local Snapshot to be taken, dumped, and closed (closure observable in logs), and those contents uploaded to the archives. The maxSize JFR setting does not work as may be expected when the configured size is smaller than each individual recording chunk, so using this setting with a smaller limit will only force the upload of the single most recent chunk I believe. The harvester period and template can be experimented with to generate larger recordings so that the exit maxSize may have a more noticeable practical effect.

File compression support mentioned in the issue is not implemented here because it is not (yet) supported by the backend.

@andrewazores andrewazores added the feat New feature or request label Jan 12, 2023
@andrewazores andrewazores requested a review from maxcao13 January 12, 2023 14:14
Copy link
Member

@maxcao13 maxcao13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config: harvester period: 30 seconds, on-exit max-age: 1 minute
image

I tried viewing the uploaded on-exit recording to check the max-age limit that way, and it says that the recording is 6.8 minutes long. I assume it is checking the JFR recording metadata the archived recording is getting pieced from. Just something interesting to note. The recording exit data is about ~50 seconds long, when setting the max-age to PT1M.

@maxcao13
Copy link
Member

maxcao13 commented Jan 12, 2023

Does the agent have knowledge of Cryostat templates?
For example, if I wanted to set a specific template can I do this?
-- env CRYOSTAT_AGENT_HARVESTER_TEMPLATE="all"

EDIT:
Well this answers my question...

SEVERE [io.cry.age.Harvester] (cryostat-agent-worker-1) Unable to start recording: java.nio.file.NoSuchFileException: Could not locate configuration with name all

Copy link
Member

@maxcao13 maxcao13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my config:

        --env CRYOSTAT_AGENT_HARVESTER_PERIOD_MS=600000 \ // 10 minutes
        --env CRYOSTAT_AGENT_HARVESTER_EXIT_MAX_AGE_MS=60000 \ // 1 minute
        --env CRYOSTAT_AGENT_HARVESTER_EXIT_MAX_SIZE_B="$(echo 1024*1024*1 | bc)" \

The on-exit max-age settings seem to work and code looks good to me.
image

Copy link
Member

@maxcao13 maxcao13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update README with new config variables?

@andrewazores
Copy link
Member Author

Does the agent have knowledge of Cryostat templates? For example, if I wanted to set a specific template can I do this? -- env CRYOSTAT_AGENT_HARVESTER_TEMPLATE="all"

EDIT: Well this answers my question...

SEVERE [io.cry.age.Harvester] (cryostat-agent-worker-1) Unable to start recording: java.nio.file.NoSuchFileException: Could not locate configuration with name all

Yep, it does not support this - not yet, anyway. I don't know if I want the Agent to become aware of the existing Cryostat custom event templates that are stored and managed by Cryostat. I think what seems more idiomatic is for any such custom templates to be packaged together with the application that will also house the Agent (ex. included as a layer in the application Dockerfile). This way the template is available locally to the Agent and there is less need to transfer information over the network. Eventually if/when Cryostat can talk to the Agent to list event templates and start recordings on demand then it can do that based on the event templates that are available to the particular Agent instance.

@andrewazores
Copy link
Member Author

I tried viewing the uploaded on-exit recording to check the max-age limit that way, and it says that the recording is 6.8 minutes long. I assume it is checking the JFR recording metadata the archived recording is getting pieced from. Just something interesting to note. The recording exit data is about ~50 seconds long, when setting the max-age to PT1M.

I think you are correct, this value comes from the difference between the recording start time in the recording metadata and the current timestamp. Pretty sure I've had this conversation with @tthvo before. The more accurate way to determine the recording length would be to actually parse the recording and find the minimum and maximum timestamps on each event in the recording, but that's a bit more intensive to do.

@andrewazores
Copy link
Member Author

cryostatio/jfr-datasource#134

Copy link
Member

@maxcao13 maxcao13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works well. Looks good to me!

@andrewazores
Copy link
Member Author

andrewazores commented Jan 13, 2023

I think what seems more idiomatic is for any such custom templates to be packaged together with the application that will also house the Agent (ex. included as a layer in the application Dockerfile).

https://github.com/cryostatio/cryostat/blob/51c790c1511e5d36af0066e30f2ca70793783541/src/container/Dockerfile#L19

https://github.com/cryostatio/cryostat/blob/main/src/container/include/cryostat.jfc

That's how Cryostat's own customized event template works, to show a concrete example.

image

This is probably better anyway because if there are application-specific custom events registered in the application's source code, then there should also be customizations to the event template .jfc file(s), which need to be synchronized with the events as defined in the sources. So from the application developer's perspective, it makes sense for the template to be bundled with the packaged application, so that they are versioned together. This should really be the preferred way most of the time, but Cryostat's custom templates are meant to enable the "no rebuild enablement" story. If the user is using the Cryostat Agent then an application rebuild is already required, so also asking them to bundle event templates at the same time is little extra work.

@andrewazores andrewazores merged commit ee930c6 into cryostatio:main Jan 13, 2023
@andrewazores andrewazores deleted the exit-max-age-size branch January 13, 2023 19:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat New feature or request
Projects
No open projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[Task] Intercept SIGTERM and upload recording before clean shutdown
2 participants