Testing upgrade flow from 7.17 to 8.0 #7087

Closed
6 tasks done
simitt opened this issue Jan 17, 2022 · 8 comments

@simitt
Contributor

simitt commented Jan 17, 2022

Conduct dedicated upgrade testing from 7.17 to 8.0 for the following scenarios.

ESS

  • Standalone mode
  1. Create 7.17 deployment with APM&Fleet Server and ingest data
  2. Upgrade stack to 8.0 and wait until all components are up
  3. Verify that the 8.0 APM Integration is installed and used in the deployment
  4. Verify that data ingestion still works
  5. Verify that APM Server is still running in standalone mode (by navigating to APM Settings and having the migration button available)
  6. Optional: Trigger the migration to managed mode and ensure APM Server is restarted and data ingestion works.
  • Managed mode
  1. Create 7.17 deployment with APM&Fleet Server and ingest data
  2. Navigate to Kibana/APM/Settings/Schema page and switch to Elastic Agent
  3. Upgrade stack to 8.0 and wait until all components are up
  4. Verify that data ingestion still works
  5. Verify that APM Server is still running in managed mode (by navigating to APM Settings)

Additionally try to break things through timing, manual configuration changes etc. and ensure deployments are eventually recovered to a working state when all components are upgraded.
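One way to make the "verify that data ingestion still works" steps less subjective is to compare document counts before and after sending test traffic. The following is a minimal sketch, assuming the counts come from the Elasticsearch `_count` API queried against whichever APM indices or data streams the deployment writes to; the helper names and the sample response bodies are hypothetical and not part of the test plan.

```python
import json


def parse_count(response_body: str) -> int:
    """Return the "count" field from an Elasticsearch _count API response body."""
    return int(json.loads(response_body)["count"])


def ingestion_still_works(count_before: int, count_after: int) -> bool:
    """True if new documents arrived between the two samples."""
    return count_after > count_before


# Hypothetical response bodies captured before and after sending test traffic.
before = parse_count('{"count": 120, "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0}}')
after = parse_count('{"count": 175, "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0}}')
print(ingestion_still_works(before, after))
```

A strictly increasing count only shows that something was indexed; checking the APM UI, as the steps describe, is still needed to confirm the data is usable.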

ECE

Repeat the ESS steps with ECE 3.0. Upgrading to 8.0 should not be supported in earlier versions.

On-Prem

  • Standalone mode (in preferred upgrade order):
  1. Run Elastic Stack and APM Server 7.17 and ingest data
  2. Upgrade the Elastic Stack to 8.0
  3. Verify that data ingestion still works
  4. Manually install the APM Integration via Kibana Fleet Integration UI

Screenshot 2022-01-17 at 10 33 41

  5. Upgrade APM Server to 8.0
  6. Ingest data and ensure they show up in the APM UI
  • Standalone mode (in any upgrade order):
  1. Run Elastic Stack and APM Server 7.17 and ingest data
  2. Upgrade the Elastic Stack and APM Server to 8.0
  3. Verify that APM Server issues logs about preconditions not being met when trying to ingest data
  4. Manually install the APM Integration via Kibana Fleet Integration UI
  5. Ingest data and ensure they show up in the APM UI
  • Managed mode (in preferred upgrade order):
  1. Run Elastic Stack, Elastic Agent w. Fleet Server and Elastic Agent enrolled to an agent policy with an APM Integration, all in 7.17 and ingest data
  2. Upgrade the Elastic Stack to 8.0
  3. Verify that data ingestion still works
  4. Upgrade the APM Integration to 8.0
  5. Verify that data ingestion still works
  6. Upgrade the Elastic Agent including the APM Server to 8.0
  7. Ingest data and ensure they show up in the APM UI
  • Managed mode (in any upgrade order):
    Change the upgrade order, but essentially repeat the above steps and ensure that eventually the data ingestion works as expected when all components are upgraded.
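
The difference between the "preferred" and "any" upgrade orders above can be sketched as a small check. This is an illustrative sketch only: the component labels (`stack`, `integration`, `apm-server`) are hypothetical names chosen here, and the preferred order is read off the On-Prem steps (Elastic Stack first, then the APM Integration, then APM Server or the Elastic Agent that runs it).

```python
# Preferred order as read from the On-Prem steps; labels are illustrative.
PREFERRED_ORDER = ["stack", "integration", "apm-server"]


def follows_preferred_order(steps):
    """True if the given components are upgraded in the preferred relative order.

    Raises ValueError for a component label that is not in PREFERRED_ORDER.
    """
    ranks = [PREFERRED_ORDER.index(step) for step in steps]
    return ranks == sorted(ranks)


# Preferred flow: stack, then integration, then APM Server.
print(follows_preferred_order(["stack", "integration", "apm-server"]))
# "Any order" flow: APM Server upgraded before the integration is installed.
print(follows_preferred_order(["stack", "apm-server", "integration"]))
```

Sequences that fail this check are not invalid; per the plan they are expected to recover once all components are upgraded.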
simitt added this to the 8.0 milestone Jan 17, 2022
marclop self-assigned this Jan 17, 2022
@marclop
Contributor

marclop commented Jan 19, 2022

On-Prem (Local docker-compose with Elasticsearch data volume)

Architecture: linux/arm64

7.17.0 hash: 7.17.0-079761a0

8.0.0 hash: 8.0.0-b44c2e43

  • Standalone mode (in preferred upgrade order): ✅
  1. Run Elastic Stack and APM Server 7.17 and ingest data ✅
  2. Upgrade the Elastic Stack to 8.0 ✅
  3. Verify that data ingestion still works ✅
  4. Manually install the APM Integration via Kibana Fleet Integration UI ✅ (8.0.0-dev6)
  5. Upgrade APM Server to 8.0 ✅
  6. Ingest data and ensure they show up in the APM UI ✅

image

  • Standalone mode (in any upgrade order):
  1. Run Elastic Stack and APM Server 7.17 and ingest data ✅
  2. Upgrade the Elastic Stack and APM Server to 8.0 ✅
  3. Verify that APM Server issues logs about preconditions not being met when trying to ingest data ✅
...
{"log.level":"error","@timestamp":"2022-01-19T10:45:34.923+0800","log.logger":"beater","log.origin":{"file.name":"beater/waitready.go","file.line":62},"message":"precondition 'apm integration installed' failed: error querying Elasticsearch for integration index templates: unexpected HTTP status: 404 Not Found ({\"error\":{\"root_cause\":[{\"type\":\"resource_not_found_exception\",\"reason\":\"index template matching [traces-apm.sampled] not found\"}],\"type\":\"resource_not_found_exception\",\"reason\":\"index template matching [traces-apm.sampled] not found\"},\"status\":404}): to remediate, please install the apm integration: https://ela.st/apm-integration-quickstart","service.name":"apm-server","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2022-01-19T10:45:37.461+0800","log.logger":"beater","log.origin":{"file.name":"beater/waitready.go","file.line":62},"message":"precondition 'apm integration installed' failed: error querying Elasticsearch for integration index templates: unexpected HTTP status: 404 Not Found ({\"error\":{\"root_cause\":[{\"type\":\"resource_not_found_exception\",\"reason\":\"index template matching [logs-apm.error] not found\"}],\"type\":\"resource_not_found_exception\",\"reason\":\"index template matching [logs-apm.error] not found\"},\"status\":404}): to remediate, please install the apm integration: https://ela.st/apm-integration-quickstart","service.name":"apm-server","ecs.version":"1.6.0"}
...
  4. Manually install the APM Integration via Kibana Fleet Integration UI ✅ (8.0.0-dev6)

apm-server stopped logging precondition errors.

  5. Ingest data and ensure they show up in the APM UI ✅

image
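
The precondition errors logged in step 3 are JSON lines, so the missing index templates can be pulled out mechanically. The sketch below is one way to do that with the standard library; the function name is hypothetical, and the matching relies only on the message text visible in the two log lines quoted above.

```python
import json
import re

# Matches the Elasticsearch error reason embedded in the apm-server message.
TEMPLATE_RE = re.compile(r"index template matching \[([^\]]+)\] not found")


def missing_templates(log_lines):
    """Return the sorted, de-duplicated template names reported as missing."""
    missing = set()
    for line in log_lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip elisions such as "..." and other non-JSON lines
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        message = entry.get("message", "")
        if "precondition 'apm integration installed' failed" not in message:
            continue
        # The template name appears twice per message (root_cause and reason);
        # the set collapses the duplicates.
        missing.update(TEMPLATE_RE.findall(message))
    return sorted(missing)
```

Feeding it the apm-server log before and after installing the integration gives a quick confirmation that the errors have stopped, assuming the log format stays as shown.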

  • Managed mode (in preferred upgrade order):
  1. Run Elastic Stack, Elastic Agent w. Fleet Server and Elastic Agent enrolled to an agent policy with an APM Integration, all in 7.17 and ingest data ✅ (7.16.0 integration). I encountered a Kibana error when trying to upgrade the integration to 7.16.1 🔴, which I thought we had fixed. I added a comment on kibana#121238 ("Can't upgrade APM integration to version 7.16.1 on ESS").

image

  2. Upgrade the Elastic Stack to 8.0 ✅
  3. Verify that data ingestion still works ✅
  4. Upgrade the APM Integration to 8.0 ✅ (The integration was automatically upgraded for me, didn't need to take action)
  5. Verify that data ingestion still works ✅
  6. Upgrade the Elastic Agent including the APM Server to 8.0 ✅
  7. Ingest data and ensure they show up in the APM UI ✅

image

image

  • Managed mode (in any upgrade order): ✅
    Change the upgrade order, but essentially repeat the above steps and ensure that eventually the data ingestion works as expected when all components are upgraded.
  1. Run Elastic Stack, Elastic Agent w. Fleet Server and Elastic Agent enrolled to an agent policy with an APM Integration, all in 7.17 and ingest data ✅ (7.16.0 integration)
  2. Upgrade the Elastic Stack and Elastic Agent to 8.0 ✅
  3. Verify that data ingestion still works ✅ (The integration was automatically upgraded for me, didn't need to take action)

@marclop
Contributor

marclop commented Jan 19, 2022

ESS (QA)

Using 7.17.0 and 8.0.0-rc2

  • Managed mode ✅
  1. Create 7.17 deployment with APM&Fleet Server and ingest data ✅
  2. Navigate to Kibana/APM/Settings/Schema page, switch to Elastic Agent ✅ Ingest data ✅
  3. Upgrade stack to 8.0.0-rc2 and wait until all components are up ✅
  4. Verify that data ingestion still works ✅
  5. Verify that APM Server is still running in managed mode (by navigating to APM Settings) ✅

image

image

  • Standalone mode ✅
  1. Create 7.17 deployment with APM&Fleet Server and ingest data ✅

image

  2. Upgrade stack to 8.0 and wait until all components are up ✅
  3. Verify that the 8.0 APM Integration is installed and used in the deployment ✅
  4. Verify that data ingestion still works ✅

image

  5. Verify that APM Server is still running in standalone mode (by navigating to APM Settings and having the migration button available) ✅

image

  6. Optional: Trigger the migration to managed mode and ensure APM Server is restarted and data ingestion works. ✅

image

@marclop
Contributor

marclop commented Jan 19, 2022

ECE (3.0.0-latest)

  • Create 8.0.0-SNAPSHOT deployment: ✅
  1. Create a new deployment using the 8.0.0-SNAPSHOT
  2. Verify that the APM Integration is automatically installed ✅

image

image

  3. Verify that data can be ingested and is shown in the APM UI ✅

image

@simitt
Contributor Author

simitt commented Jan 24, 2022

Testing on ESS non-production env is currently blocked due to missing upgrade option to 8.0.0-rc candidates.

@marclop
Contributor

marclop commented Jan 27, 2022

ESS (Staging)

Using 7.17.0 (0a126886) and 8.0.0-rc2 (5e6da84e)

  • Standalone mode
  1. Create 7.17 deployment with APM&Fleet Server and ingest data ✅
  2. Upgrade stack to 8.0 and wait until all components are up ✅
  3. Verify that the 8.0 APM Integration is installed and used in the deployment ✅
  4. Verify that data ingestion still works ✅
  5. Verify that APM Server is still running in standalone mode (by navigating to APM Settings and having the migration button available) ✅
  6. Optional: Trigger the migration to managed mode and ensure APM Server is restarted and data ingestion works. ✅
  • Managed mode ✅
  1. Create 7.17 deployment with APM&Fleet Server and ingest data ✅
  2. Navigate to Kibana/APM/Settings/Schema page and switch to Elastic Agent ✅
  3. Upgrade the APM Integration to 7.16.2 ✅
  4. Check that the bug that set the host in the APM Integration to localhost isn't reproducible ✅
  5. Upgrade stack to 8.0 and wait until all components are up ✅
  6. Verify that data ingestion still works ✅
  7. Verify that APM Server is still running in managed mode (by navigating to APM Settings) ✅
  • Create a deployment using version 8.0.0-rc2
  1. Verify that APM server is running in managed mode and can ingest data ✅

@marclop
Contributor

marclop commented Jan 27, 2022

ECE (3.0.0-latest)

Using 7.17.0 (0a126886) and 8.0.0-rc2 (5e6da84e)

  • Standalone mode
  1. Create 7.17 deployment with APM&Fleet Server and ingest data ✅
  2. Upgrade stack to 8.0 and wait until all components are up ✅
  3. Verify that the 8.0 APM Integration is installed and used in the deployment ✅
  4. Verify that data ingestion still works ✅
  5. Verify that APM Server is still running in standalone mode (by navigating to APM Settings and having the migration button available) ✅
  6. Optional: Trigger the migration to managed mode and ensure APM Server is restarted and data ingestion works. ✅
  • Managed mode
  1. Create 7.17 deployment with APM&Fleet Server and ingest data ✅
  2. Navigate to Kibana/APM/Settings/Schema page and switch to Elastic Agent ✅ Ingest data ✅
  3. Upgrade the APM Integration to 7.16.2 ✅ Ingest data ✅
  4. Check that the bug that set the host in the APM Integration to localhost isn't reproducible ✅
  5. Upgrade stack to 8.0 and wait until all components are up ✅
  6. Verify that data ingestion still works ✅
  7. Verify that APM Server is still running in managed mode (by navigating to APM Settings) ✅
  • Create a deployment using version 8.0.0-rc2
  1. Verify that APM server is running in managed mode and can ingest data ✅

@simitt
Contributor Author

simitt commented Jan 31, 2022

Conducted some additional testing last week with the BC-2 8.0.0-rc2 and today with BC-3:

Testcase 1

Env: staging
Version: 7.17.0
Mode: standalone

Changing config options to:

apm-server.rum.enabled: false
apm-server.ilm.enabled: false
apm-server.ilm.setup.overwrite: true
  1. Ingesting data ✅
  2. Data are indexed into apm indices (not ILM) ✅
  3. Stack Monitoring for 7.17 broken 🔴 (see beats#29920, "metricbeat monitoring of Elasticsearch and Kibana broke on recent 7.17.0-SNAPSHOT rpm")
  4. Upgrading to 8.0.0-rc2 ✅
  5. Ingesting data successfully ✅
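
The three overrides in Testcase 1 use dotted keys, which are shorthand for nested settings. As a rough illustration of how they nest, here is a minimal sketch that expands simple `key: value` lines into a dictionary. It is a hypothetical helper written for this example, not how APM Server itself loads configuration, and it handles only flat lines with string or boolean values.

```python
def parse_flat_settings(text):
    """Expand dotted `key: value` lines into a nested dict (booleans recognised)."""
    nested = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        parsed = {"true": True, "false": False}.get(value.lower(), value)
        parts = key.strip().split(".")
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = parsed
    return nested


overrides = """
apm-server.rum.enabled: false
apm-server.ilm.enabled: false
apm-server.ilm.setup.overwrite: true
"""
settings = parse_flat_settings(overrides)
print(settings["apm-server"]["ilm"])
```

Seeing the nested form makes it easier to compare these overrides with the user settings that show up after migrating to managed mode.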

Testcase 2

Testing stack monitoring with latest 8.0.0-rc2 (BC3)
Env: staging
Version: 8.0.0-rc2 (BC3)
Mode: managed
logs&metrics enabled

  1. Ingesting data ✅
  2. Stack Monitoring UI working for APM ✅

Testcase 3

Testing the switch to managed mode followed by an upgrade
Env: staging
Version: 7.17.0 -> 8.0.0-rc2 (BC2)

  1. Start with standalone 7.17.0 deployment and index data ✅
  2. Edit config to disable ILM, index data ✅
  3. Migrate (note the user configs showing up) ✅
  4. Index data via integration now, using data streams ✅
  5. Upgrade apm-package to 7.16.2 ✅
  6. Index data ✅
  7. Update APM Integration via UI; verify that information is really updated (via apm policy editor and via view policy that is passed to elastic-agent) ✅
  8. Index data ✅
  9. Upgrade to 8.0.0-rc2 (BC2) ✅
  10. Index data ✅

BUG: previous user configurations are lost 🔴 elastic/kibana#123945

@simitt
Contributor Author

simitt commented Jan 31, 2022

Closing this issue after thorough testing from the APM Server team.

simitt closed this as completed Jan 31, 2022