Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test enable and disable dr #1012

Closed
wants to merge 4 commits into from
Closed

Test enable and disable dr #1012

wants to merge 4 commits into from

Conversation

nirs
Copy link
Member

@nirs nirs commented Jul 30, 2023

Change the basic test to test the more generic flow - deploying an application using OCM, and enable DR for the deployed application. Finally test disabling DR after the application was failed over and relocated.

This change depends on ocm-ramen-samples change splitting the channel and drpc from the subscription.

Issues:

  • Need to test delete timeout with ramen without Fix disable DR for regional-dr and RBD storage #1007
  • Need to split to smaller commits
  • Do we want to test enable/disable in all states (Deployed,
    FailedOver, Relocated)?
  • some settings ("busybox", "busybox-placement") can move to config.yaml
  • With this change we don't test deploy of DR enabled application - is
    this important to test? (looks less useful use case)

@ShyamsundarR
Copy link
Member

We keep tinkering on the python script based tests without coming to a conclusion on if we are going to write e2e tests in go or otherwise. This work will hence be a duplicate of the same. @raghavendra-talur and @nirs any idea why we are going down this path...?

@nirs
Copy link
Member Author

nirs commented Jul 31, 2023

We keep tinkering on the python script based tests without coming to a conclusion on if we are going to write e2e tests in go or otherwise. This work will hence be a duplicate of the same. @raghavendra-talur and @nirs any idea why we are going down this path...?

From my point of view, there is no reason to duplicate this test in another way, and this
type of test which I can easily run all of it or only parts of it is a critical for working
on ramen.

@raghavendra-talur
Copy link
Member

From my point of view, there is no reason to duplicate this test in another way, and this
type of test which I can easily run all of it or only parts of it is a critical for working
on ramen.

I disagree with this statement, what we need from E2E tests is the following:

  1. We need to be able to test the complete failover and relocate scenarios.
  2. We need to test the error paths for which we need to be able to modify the responses of the adjacent systems, like CSI, ACM etc
  3. We need to test the API that is provided by Ramen in the go api system.

Using a different language for tests is going to be very detrimental. We will not be able to use the API provided by Ramen directly. We should not be writing glue code to make that happen. We are taxing the new contributors by forcing them to learn two different systems to add even a basic feature. In my opinion, drenv should be limited to only bring up of the test environment, even which I would have preferred to have in go but that ship has sailed.

Please don't add any more tests to this directory.

@nirs
Copy link
Member Author

nirs commented Jul 31, 2023

From my point of view, there is no reason to duplicate this test in another way, and this
type of test which I can easily run all of it or only parts of it is a critical for working
on ramen.

I disagree with this statement, what we need from E2E tests is the following:

  1. We need to be able to test the complete failover and relocate scenarios.

There is no problem to test more scenarios in the same way we do now.

  1. We need to test the error paths for which we need to be able to modify the responses of the adjacent systems, like CSI, ACM etc

Testing error paths is not a good idea in system tests in general, unless
the error path is the core of the tested program. System tests are very slow,
we don't want tests that takes hours to run. We want to test only the most
important flows.

But regardless testing error paths is not related to how we write the tests.

  1. We need to test the API that is provided by Ramen in the go api system.

Good point, but we can have specific tests for the API - these can be very quick
unit tests.

Using a different language for tests is going to be very detrimental. We will not be able to use the API provided by Ramen directly.

We can test the API as user consume - the resources.

We should not be writing glue code to make that happen.

Can you show such code in the current test?

We are taxing the new contributors by forcing them to learn two different systems to add even a basic feature. In my opinion, drenv should be limited to only bring up of the test environment, even which I would have preferred to have in go but that ship has sailed.

But already need to use python for drenv, ramenctl and the addons scripts, and
contributors need to do this anyway. For example for supporting new features like
application sets that are not integrated yet in drenv yet.

We can port drenv, ramenctl, and the addons to Go, but I'm not sure this is the best
way to spend our time.

Please don't add any more tests to this directory.

This is my only way to work on the code now. To work on disable dr, I must
improve the current test since this test does not exists e2e framework (which is also
only in review).

When the e2e framework is ready we can port basic-test to it but we are not there yet.

@nirs
Copy link
Member Author

nirs commented Jul 31, 2023

Example test with the current version and ramen built from #1007

$ basic-test/run $env
2023-08-01 00:27:20,967 INFO    [deploy] Creating temporary directory /tmp/ramen-test/basic-test
2023-08-01 00:27:20,968 INFO    [deploy] Cloning ocm-ramen-samples
2023-08-01 00:27:20,968 INFO    [deploy] Deploying busybox channel
2023-08-01 00:27:21,384 INFO    [deploy] Deploying busybox subscription
2023-08-01 00:27:21,911 INFO    [deploy] waiting for namespace busybox-sample
2023-08-01 00:27:21,992 INFO    [deploy] Waiting for placement decision
2023-08-01 00:27:22,141 INFO    [deploy] Waiting until busybox subscription reports phase
2023-08-01 00:27:22,224 INFO    [deploy] Waiting until busybox subscription is propagated
2023-08-01 00:27:22,396 INFO    [deploy] Waiting until busybox is rolled out on cluster 'dr2'
2023-08-01 00:27:22,455 INFO    [deploy] Application was deployed successfully
2023-08-01 00:27:22,574 INFO    [protect] Enabling DR for the busybox application
2023-08-01 00:27:22,574 INFO    [protect] Setting placement scheduler to ramen
2023-08-01 00:27:22,792 INFO    [protect] Creating kustomization for using cluster 'dr2'
2023-08-01 00:27:22,793 INFO    [protect] Deploying drpc for busybox application
2023-08-01 00:27:23,151 INFO    [protect] Waiting until busybox drpc reports phase
2023-08-01 00:28:04,458 INFO    [protect] Waiting until busybox drpc is deployed
2023-08-01 00:28:04,742 INFO    [protect] Waiting until application is replicated
2023-08-01 00:29:23,135 INFO    [protect] Application was protected successfully
2023-08-01 00:29:23,365 INFO    [failover] Failing over from cluster 'dr2' to cluster 'dr1'
2023-08-01 00:29:23,366 INFO    [failover] Waiting until application is replicated
2023-08-01 00:29:23,439 INFO    [failover] Starting failover
2023-08-01 00:29:23,564 INFO    [failover] Waiting until application is failed over...
2023-08-01 00:29:53,172 INFO    [failover] Waiting until application is replicated
2023-08-01 00:32:53,261 INFO    [failover] Application was failed over to cluster dr1 successfully
2023-08-01 00:32:53,435 INFO    [relocate] Relocate from cluster 'dr1' to cluster 'dr2'
2023-08-01 00:32:53,435 INFO    [relocate] Waiting until peer is ready
2023-08-01 00:32:53,602 INFO    [relocate] Waiting until application is replicated
2023-08-01 00:32:53,675 INFO    [relocate] Starting relocate
2023-08-01 00:32:53,793 INFO    [relocate] Waiting until application is relocated...
2023-08-01 00:35:23,216 INFO    [relocate] Waiting until relocation completes...
2023-08-01 00:35:54,270 INFO    [relocate] Application was relocated to cluster dr2 successfully
2023-08-01 00:35:54,427 INFO    [unprotect] Disabling DR for the busybox application
2023-08-01 00:35:54,532 INFO    [unprotect] Deleting busybox drpc
2023-08-01 00:36:23,136 INFO    [unprotect] Removing placement rule scheduler
2023-08-01 00:36:23,421 INFO    [unprotect] Application was unprotected successfully
2023-08-01 00:36:23,625 INFO    [undeploy] Deleting busybox subscription
2023-08-01 00:36:29,275 INFO    [undeploy] Deleting busybox channel
2023-08-01 00:36:34,684 INFO    [undeploy] Application was undeployed successfully

@ShyamsundarR
Copy link
Member

@nirs we need to develop and add tests to the e2e framework not in drenv. This only forks the tests and needs to be redone. It is better if this was worked on based on golang based e2e.

ramenctl would go the same way BTW, once we move it to a kubectl plugin as discussed earlier.

@nirs
Copy link
Member Author

nirs commented Aug 1, 2023

@nirs we need to develop and add tests to the e2e framework not in drenv. This only forks the tests and needs to be redone. It is better if this was worked on based on golang based e2e.

I'm not sure the go test framework allows the kind of control the basic test allows. Maybe this is possible, but this requires weeks of development to get back to the current state.

Please review the current code and the related ocm-ramen-samples pr, this is needed for testing the disable dr fix. Without this change you need to test manually, something like
#1007 (comment)
(which does not include verification steps added to basic-test/{protect,unprotect}).

ramenctl would go the same way BTW, once we move it to a kubectl plugin as discussed earlier.

ramenctl work only when run in ramen source directory. I'm not sure what it does is good match for a kubectl plugin, but we can rewrite this in Go if it is better for the team.

@nirs nirs force-pushed the disable-dr-test branch 3 times, most recently from 921bbb5 to 28052a3 Compare August 6, 2023 21:10
nirs added a commit to nirs/ocm-ramen-samples that referenced this pull request Aug 30, 2023
The way test in upstream is not compatible with the way openshift works.
In openshift we install the application using ACM, and then we assign a
drpolicy, which adds a drpc resource.

Change the example yamls so we can test the same flow in minikube,
making upstream flows similar to downstream and possible upstream usage.

This way we can also test enabling and disabling DR, which was not
tested before.

This change should be merged together with this ramen pr:
RamenDR/ramen#1012

Signed-off-by: Nir Soffer <[email protected]>
nirs added a commit to nirs/ocm-ramen-samples that referenced this pull request Aug 30, 2023
The way test in upstream is not compatible with the way openshift works.
In openshift we install the application using ACM, and then we assign a
drpolicy, which adds a drpc resource.

Change the example yamls so we can test the same flow in minikube,
making upstream flows similar to downstream and possible upstream usage.

This way we can also test enabling and disabling DR, which was not
tested before.

This change should be merged together with this ramen pr:
RamenDR/ramen#1012

Signed-off-by: Nir Soffer <[email protected]>
Documenting latest test version is useful for new contributors or for
rebuilding a developer machine.

Signed-off-by: Nir Soffer <[email protected]>
We are still using 1.10, updated 6 month ago.

Signed-off-by: Nir Soffer <[email protected]>
To dump kubeconfigs on clusters created with the --name-prefix option we
need to prefix the cluster names in the environment. The best place to
do this is in envfile.load() when we prefix all other names.

Without this fix, starting with --name-prefix cannot locate the clusters
in the kubeconfig:

    drenv.commands.Error: Command failed:
       command: ('kubectl', 'config', '--context', 'hub', 'view', '--flatten', '--minify')
       exitcode: 1
       error:
          error: cannot locate context hub

Signed-off-by: Nir Soffer <[email protected]>
Change the basic test to test the more generic flow - deploying an
application using OCM, and enable DR for the deployed application.
Finally test disabling DR after the application was failed over and
relocated.

This change depends on ocm-ramen-samples change splitting the channel
and drpc from the subscription.

Issues:
- Need to split to smaller commits
- We test disable dr only in Relocate phase. We will test all disable
  dr flows in other tests.
- With this change we don't test deploy of DR enabled application. We
  will add another test for this use case.

Signed-off-by: Nir Soffer <[email protected]>
@nirs
Copy link
Member Author

nirs commented Sep 7, 2023

Replacing with a cleaner version based on #1056.

@nirs nirs closed this Sep 7, 2023
@nirs nirs deleted the disable-dr-test branch September 7, 2023 23:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants