`tmpnet`: Enable collection of logs and metrics #2820
Merged
Commits
a0f75c3 `tmpnet`: Write config enabling metrics collection by prometheus (marun)
2496a58 fixup: Add links to metrics (marun)
50b7e23 fixup: Further refine metrics links (marun)
c99de8a fixup: Enable filtering metrics by owner (marun)
0783d59 fixup: Ensure network-shutdown-delay reflects the prometheus scrape i… (marun)
90191fc fixup: Add mention of metrics configuration to tmpnet README (marun)
e88b83d fixup: Avoid collecting prometheus.yaml in artifact (marun)
d78c0cc fixup: Fix shellcheck error (marun)
6d73484 `tmpnet`: Enable log collection with promtail (marun)
bbeb4b3 fixup: s/grafana_url/prometheus_url/ (marun)
9380137 fixup: More promtail cleanup (marun)
fbd8386 fixup: Ensure the correct filename on x86 (marun)
4d72be9 fixup: Update shutdown delay to use time.Duration (marun)
762032e fixup: Respond to review feedback (marun)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
#!/usr/bin/env bash | ||
|
||
set -euo pipefail | ||
|
||
# Timestamps are in seconds | ||
from_timestamp="$(date '+%s')" | ||
monitoring_period=900 # 15 minutes | ||
to_timestamp="$((from_timestamp + monitoring_period))" | ||
|
||
# Grafana expects milliseconds, so pad timestamps with 3 zeros | ||
metrics_url="${GRAFANA_URL}&var-filter=gh_job_id%7C%3D%7C${GH_JOB_ID}&from=${from_timestamp}000&to=${to_timestamp}000" | ||
|
||
# Optionally ensure that the link displays metrics only for the shared | ||
# network rather than mixing it with the results for private networks. | ||
if [[ -n "${FILTER_BY_OWNER:-}" ]]; then | ||
metrics_url="${metrics_url}&var-filter=network_owner%7C%3D%7C${FILTER_BY_OWNER}" | ||
fi | ||
|
||
echo "::notice links::metrics ${metrics_url}" |
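The link construction in the script above can be sketched as a reusable function. This is a minimal sketch, not part of the PR: the function name `build_metrics_url` and its positional arguments are hypothetical, and it assumes the base URL already carries a query string, as `GRAFANA_URL` appears to in the script. The `gh_job_id` and `network_owner` filter names are taken from the script itself.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of the PR): builds a Grafana link for a job.
# Arguments: base URL, job id, start time in seconds, period in seconds,
# and an optional network owner.
build_metrics_url() {
  local base_url="$1" job_id="$2" from_s="$3" period_s="$4" owner="${5:-}"
  local to_s=$((from_s + period_s))
  # Grafana's from/to parameters are epoch milliseconds, hence the "000" suffix.
  local url="${base_url}&var-filter=gh_job_id%7C%3D%7C${job_id}&from=${from_s}000&to=${to_s}000"
  # Only append the owner filter when an owner was supplied.
  if [[ -n "${owner}" ]]; then
    url="${url}&var-filter=network_owner%7C%3D%7C${owner}"
  fi
  echo "${url}"
}
```

Passing the start time as an argument instead of calling `date` inside the function keeps the output deterministic, which makes the link format easy to check in isolation.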
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,120 @@ | ||
#!/usr/bin/env bash | ||
|
||
set -euo pipefail | ||
|
||
# Starts a prometheus instance in agent-mode, forwarding to a central | ||
# instance. Intended to enable metrics collection from temporary networks running | ||
# locally and in CI. | ||
# | ||
# The prometheus instance will remain running in the background and will forward | ||
# metrics to the central instance for all tmpnet networks. | ||
# | ||
# To stop it: | ||
# | ||
# $ kill -9 `cat ~/.tmpnet/prometheus/run.pid` && rm ~/.tmpnet/prometheus/run.pid | ||
# | ||
|
||
# e.g., | ||
# PROMETHEUS_ID=<id> PROMETHEUS_PASSWORD=<password> ./scripts/run_prometheus.sh | ||
if ! [[ "$0" =~ scripts/run_prometheus.sh ]]; then | ||
echo "must be run from repository root" | ||
exit 255 | ||
fi | ||
|
||
PROMETHEUS_WORKING_DIR="${HOME}/.tmpnet/prometheus" | ||
PIDFILE="${PROMETHEUS_WORKING_DIR}"/run.pid | ||
|
||
# First check if an agent-mode prometheus is already running. A single instance can collect | ||
# metrics from all local temporary networks. | ||
if pgrep --pidfile="${PIDFILE}" -f 'prometheus.*enable-feature=agent' &> /dev/null; then | ||
echo "prometheus is already running locally with --enable-feature=agent" | ||
exit 0 | ||
fi | ||
|
||
PROMETHEUS_URL="${PROMETHEUS_URL:-https://prometheus-experimental.avax-dev.network}" | ||
if [[ -z "${PROMETHEUS_URL}" ]]; then | ||
echo "Please provide a value for PROMETHEUS_URL" | ||
exit 1 | ||
fi | ||
|
||
PROMETHEUS_ID="${PROMETHEUS_ID:-}" | ||
if [[ -z "${PROMETHEUS_ID}" ]]; then | ||
echo "Please provide a value for PROMETHEUS_ID" | ||
exit 1 | ||
fi | ||
|
||
PROMETHEUS_PASSWORD="${PROMETHEUS_PASSWORD:-}" | ||
if [[ -z "${PROMETHEUS_PASSWORD}" ]]; then | ||
echo "Please provide a value for PROMETHEUS_PASSWORD" | ||
exit 1 | ||
fi | ||
|
||
# This was the LTS version when this script was written. Probably not | ||
# much reason to update it unless something breaks since the usage | ||
# here is only to collect metrics from temporary networks. | ||
VERSION="2.45.3" | ||
|
||
# Ensure the prometheus command is locally available | ||
CMD=prometheus | ||
if ! command -v "${CMD}" &> /dev/null; then | ||
# Try to use a local version | ||
CMD="${PWD}/bin/prometheus" | ||
if ! command -v "${CMD}" &> /dev/null; then | ||
echo "prometheus not found, attempting to install..." | ||
|
||
# Determine the arch | ||
if which sw_vers &> /dev/null; then | ||
echo "on macos, only amd64 binaries are available so rosetta is required on apple silicon machines." | ||
echo "to avoid using rosetta, install via homebrew: brew install prometheus" | ||
DIST=darwin | ||
else | ||
ARCH="$(uname -i)" | ||
if [[ "${ARCH}" != "x86_64" ]]; then | ||
echo "on linux, only amd64 binaries are available. manual installation of prometheus is required." | ||
exit 1 | ||
else | ||
DIST="linux" | ||
fi | ||
fi | ||
|
||
# Install the specified release | ||
PROMETHEUS_FILE="prometheus-${VERSION}.${DIST}-amd64" | ||
URL="https://github.com/prometheus/prometheus/releases/download/v${VERSION}/${PROMETHEUS_FILE}.tar.gz" | ||
curl -s -L "${URL}" | tar zxv -C /tmp > /dev/null | ||
mkdir -p "$(dirname "${CMD}")" | ||
cp /tmp/"${PROMETHEUS_FILE}/prometheus" "${CMD}" | ||
fi | ||
fi | ||
|
||
# Configure prometheus | ||
FILE_SD_PATH="${PROMETHEUS_WORKING_DIR}/file_sd_configs" | ||
mkdir -p "${FILE_SD_PATH}" | ||
|
||
echo "writing configuration..." | ||
cat >"${PROMETHEUS_WORKING_DIR}"/prometheus.yaml <<EOL | ||
# my global config | ||
global: | ||
# Make sure this value takes into account the network-shutdown-delay in tests/fixture/e2e/env.go | ||
scrape_interval: 10s # Scrape targets every 10 seconds. The default is every 1 minute. | ||
evaluation_interval: 10s # Evaluate rules every 10 seconds. The default is every 1 minute. | ||
scrape_timeout: 5s # The default is 10s. | ||
|
||
scrape_configs: | ||
- job_name: "avalanchego" | ||
metrics_path: "/ext/metrics" | ||
file_sd_configs: | ||
- files: | ||
- '${FILE_SD_PATH}/*.json' | ||
|
||
remote_write: | ||
- url: "${PROMETHEUS_URL}/api/v1/write" | ||
basic_auth: | ||
username: "${PROMETHEUS_ID}" | ||
password: "${PROMETHEUS_PASSWORD}" | ||
EOL | ||
|
||
echo "starting prometheus..." | ||
cd "${PROMETHEUS_WORKING_DIR}" | ||
nohup "${CMD}" --config.file=prometheus.yaml --web.listen-address=localhost:0 --enable-feature=agent > prometheus.log 2>&1 & | ||
echo $! > "${PIDFILE}" | ||
echo "running with pid $(cat "${PIDFILE}")" |
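The `file_sd_configs` entry in the configuration above makes the agent scrape whatever targets appear in JSON files under `${FILE_SD_PATH}`. As a rough sketch of what such a file might look like, the hypothetical helper below writes one in Prometheus' file-based service discovery format (a JSON list of objects with `targets` and optional `labels`). The function name, the file name, and the use of `network_owner` as a label are assumptions for illustration; the code that actually writes these files is not shown in this excerpt.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: writes a file_sd target file for a single node.
# Arguments: output directory, node name, host:port target, network owner.
write_sd_config() {
  local dir="$1" node="$2" target="$3" owner="$4"
  mkdir -p "${dir}"
  # One target group: the address to scrape plus a label attached to its metrics.
  cat >"${dir}/${node}.json" <<EOF
[
  {
    "targets": ["${target}"],
    "labels": {"network_owner": "${owner}"}
  }
]
EOF
}
```

Because the scrape config matches `*.json` in that directory, adding or removing a file like this should be enough for the running agent to pick up or drop a target without a restart.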
It probably makes sense to simplify this so that the test scripts optionally start prometheus and promtail internally. Having this many steps and environment variables to worry about does not seem ideal, and every job in all our repos that wants to collect metrics will need similar configuration.
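One way to read this suggestion is a small wrapper that a test script calls, starting a collector only when its credentials are present. The sketch below is an assumption about what that could look like, not something proposed in the PR; `have_credentials` and `maybe_start` are hypothetical names.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeeds only when both credentials are non-empty.
have_credentials() {
  [[ -n "${1:-}" && -n "${2:-}" ]]
}

# Hypothetical wrapper: runs the given start command only when credentials exist.
# Arguments: collector name, id, password, then the command to run.
maybe_start() {
  local name="$1" id="$2" password="$3"
  shift 3
  if have_credentials "${id}" "${password}"; then
    echo "starting ${name}"
    "$@"
  else
    echo "skipping ${name}: credentials not provided"
  fi
}
```

A test script could then call, for example, `maybe_start prometheus "${PROMETHEUS_ID:-}" "${PROMETHEUS_PASSWORD:-}" ./scripts/run_prometheus.sh`, so that jobs without credentials skip collection instead of failing.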