save kibana exports

As explained in #10853, we recently lost our ES cluster. While I'm not
planning on trusting Google's "rolling restart" feature ever again, we
can't exclude the possibility of future similar outages (without a
significant investment in the cluster, which I don't think we want to
do).

Losing the cluster itself is not a huge issue, as we can always reingest
the data; worst case, we lose visibility for a few days. At least, that
is true as far as the Bazel logs are concerned.

Losing the Kibana data is a lot more annoying, as that is not derived
data and thus cannot be reingested. This PR adds a backup mechanism for
our Kibana configuration.

CHANGELOG_BEGIN
CHANGELOG_END
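
For the record, restoring from one of these backups would go through
Kibana's saved-objects import API. A minimal sketch, not part of this
commit: it assumes Kibana 7.x, the gs://daml-data bucket used below,
and a reachable $KIBANA_IP as in the scripts in the diff.

  # Fetch the most recent hourly export and feed it back to Kibana.
  # The import API expects an .ndjson upload and the kbn-xsrf header.
  latest=$(gsutil ls gs://daml-data/kibana-export/ | sort | tail -1)
  gsutil cp "$latest" - | gunzip > /tmp/kibana-export.ndjson
  curl "http://$KIBANA_IP/api/saved_objects/_import?overwrite=true" \
    -XPOST \
    -H 'kbn-xsrf: true' \
    --form file=@/tmp/kibana-export.ndjson \
    --fail --silent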
garyverhaegen-da committed Sep 13, 2021
1 parent c113954 commit 7591b0d
Showing 1 changed file with 61 additions and 2 deletions.

infra/es_cluster.tf
@@ -391,6 +391,27 @@ resource "google_project_iam_member" "es-feed" {
member = "serviceAccount:${google_service_account.es-feed.email}"
}

resource "google_project_iam_custom_role" "es-feed-write" {
  role_id     = "es_feed_write"
  title       = "es-feed-write"
  description = "es-feed-write"
  # Each hourly run creates a fresh object, so create is all we need.
  permissions = [
    "storage.objects.create",
  ]
}

resource "google_project_iam_member" "es-feed-write" {
  project = local.project
  role    = google_project_iam_custom_role.es-feed-write.id
  member  = "serviceAccount:${google_service_account.es-feed.email}"

  # Scope the grant to objects under the kibana-export/ prefix of the data bucket.
  condition {
    title       = "es_feed_write"
    description = "es_feed_write"
    expression  = "resource.name.startsWith(\"projects/_/buckets/${google_storage_bucket.data.name}/objects/kibana-export\")"
  }
}

resource "google_compute_instance_group_manager" "es-feed" {
provider = google-beta
name = "es-feed"
@@ -844,8 +865,8 @@ index() {
}
pid=$$
-exec 2> >(while IFS= read -r line; do echo "$(date -uIs) [$pid] [err]: $line"; done)
-exec 1> >(while IFS= read -r line; do echo "$(date -uIs) [$pid] [out]: $line"; done)
+exec 2> >(while IFS= read -r line; do echo "$(date -uIs) [ingest] [$pid] [err]: $line"; done)
+exec 1> >(while IFS= read -r line; do echo "$(date -uIs) [ingest] [$pid] [out]: $line"; done)
LOCK=/root/lock
@@ -888,9 +909,46 @@ for tar in $todo; do
done
CRON
cat <<'HOURLY' >/root/hourly.sh
#!/usr/bin/env bash
set -euo pipefail
pid=$$
exec 2> >(while IFS= read -r line; do echo "$(date -uIs) [kibex] [$pid] [err]: $line"; done)
exec 1> >(while IFS= read -r line; do echo "$(date -uIs) [kibex] [$pid] [out]: $line"; done)
# One object per hour: ISO timestamp truncated to the hour (YYYY-MM-DDTHH).
HOUR="$(date -u -Is | cut -c 1-13)"
TMP=$(mktemp)
TARGET="gs://daml-data/kibana-export/$HOUR.gz"
echo "Starting Kibana export..."
# Kibana export API does not support wildcard, so we list all of the object
# types that exist as of Kibana 7.13.
curl http://$KIBANA_IP/api/saved_objects/_export \
-XPOST \
-d'{"excludeExportDetails": true,
"type": ["visualization", "dashboard", "search", "index-pattern",
"config", "timelion-sheet"]}' \
-H 'kbn-xsrf: true' \
-H 'Content-Type: application/json' \
--fail \
--silent \
| gzip -9 > "$TMP"
echo "Pushing $TARGET"
$GSUTIL -q cp "$TMP" "$TARGET"
rm -f "$TMP"
echo "Done."
HOURLY
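# hourly.sh is wired into /etc/crontab below to run at minute 1 of every hour.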
chmod +x /root/cron.sh
chmod +x /root/hourly.sh
ES_IP=${google_compute_address.es[0].address}
KIB_IP=${google_compute_address.es[1].address}
DATA=/root/data
mkdir -p $DATA
@@ -938,6 +996,7 @@ fi
cat <<CRONTAB >> /etc/crontab
* * * * * root GSUTIL="$(which gsutil)" DONE="$DONE" DATA="$DATA" ES_IP="$ES_IP" /root/cron.sh >> /root/log 2>&1
1 * * * * root GSUTIL="$(which gsutil)" KIBANA_IP="$KIB_IP" /root/hourly.sh >> /root/log 2>&1
CRONTAB
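# Both jobs append to /root/log; the [ingest] and [kibex] prefixes tell them apart.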
echo "Waiting for first run..." > /root/log
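Once this lands, the hourly exports can be spot-checked from any machine
with access to the bucket; a quick look at the most recent objects
(assuming the gs://daml-data bucket name hardcoded in hourly.sh):

  gsutil ls gs://daml-data/kibana-export/ | tail -3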
