[Helm] GP Tests failing on first run when upgrading from v14.1.1 to v15 due to WS issues between TTK and SDKs #3164

Open
mdebarros opened this issue Mar 15, 2023 · 4 comments
Labels
bug Something isn't working or it has wrong behavior on a Mojaloop Core service oss-core This is an issue - story or epic related to a feature on a Mojaloop core service or related to it

Comments

@mdebarros
Member

mdebarros commented Mar 15, 2023

Summary:

GP Tests fail intermittently when upgrading a release from v14.1.1 to v15 due to WS issues between the TTK and the Mojaloop Simulators' SDK-Scheme-Adapters.

Refer to the following GP Test Report.

This only impacts the Mojaloop Simulator SDK-Scheme-Adapter component (specifically the TEST API using WebSockets), which is why the GP Tests fail assertions.

The issue appears to be a connectivity problem caused by the "restarting" of the SDK-Scheme-Adapter components during the upgrade process.

NOTE: It should not impact any "live" transactions through the system.

Severity:
Low

Priority:
Medium

Expected Behavior

GP Tests should pass with 100% assertion checks.

Steps to Reproduce

  1. Deploy Mojaloop v14.1.1
  2. Upgrade Mojaloop v14.1.1 to v15
  3. Execute Helm Tests
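
The three steps above might be scripted roughly as follows. This is a minimal sketch, not a documented Mojaloop procedure: the release name `moja`, the chart reference `mojaloop/mojaloop`, and the target version string `15.0.0` are assumptions (the report only says "v15"). `helm install`, `helm upgrade`, and `helm test` are standard Helm subcommands. The function only builds the commands so they can be reviewed before being run.

```python
import shlex


def build_repro_commands(release, chart, from_version, to_version):
    """Return the Helm commands for the three reproduction steps.

    All arguments are placeholders supplied by the caller; nothing here
    is read from the Mojaloop charts themselves.
    """
    return [
        # Step 1: deploy the older release.
        ["helm", "install", release, chart, "--version", from_version],
        # Step 2: upgrade the same release in place.
        ["helm", "upgrade", release, chart, "--version", to_version],
        # Step 3: run the chart's test hooks (the GP Tests in this report).
        ["helm", "test", release],
    ]


if __name__ == "__main__":
    # Hypothetical values: release name, chart reference and "15.0.0" are assumptions.
    for cmd in build_repro_commands("moja", "mojaloop/mojaloop", "14.1.1", "15.0.0"):
        print(shlex.join(cmd))
```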

Specifications

  • Component (if known): Helm
  • Version: v14.1.1 --> v15
  • Platform: n/a
  • Subsystem: n/a
  • Type of testing: n/a
  • Bug found/raised by: @mdebarros

Notes:

  • Severity when opened: Low
  • Priority when opened: Medium

Important Update on 2023-05-23:

This issue can also occur in general, i.e. not only when the environment has been upgraded.

E.g. --> https://mojaloop.slack.com/archives/CG3MAJZ5J/p1684839469352009

@mdebarros mdebarros added the bug Something isn't working or it has wrong behavior on a Mojaloop Core service label Mar 15, 2023
@mdebarros
Member Author

mdebarros commented Mar 15, 2023

Investigation

Findings

Looking at the failed Active and inactive participant GP Test Collection, one can see that ALL assertions that require requests or callbacks to be "collected" by the WS Notification mechanism between the TTK and the Simulator SDK-Scheme-Adapters fail.

Taking the failing Quote requests, one can see that the Quote was successful by looking at the quoting-service, sim-testfsp1-scheme-adapter, and sim-testfsp2-scheme-adapter logs:

  1. Quote Requests and Callbacks can be seen in the quoting-service and the sim-testfsp1-scheme-adapter logs
  2. The Quote request being processed, with Callbacks sent in response, can be seen in the sim-testfsp2-scheme-adapter logs.

Thus I can only conclude that there is an issue with the WS client/server connectivity between the TTK and the Simulator's SDK-Scheme-Adapter.

Artifacts:

Work Around

Restarting the Simulator SDK-Scheme-Adapters resolves the WS connectivity issue between the TTK and the Simulator SDK-Scheme-Adapters, thereby allowing the GP Tests to pass with 100% assertions.
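
One way to apply this workaround is a rolling restart through `kubectl`. The sketch below only builds the commands. It assumes the adapters run as Kubernetes Deployments whose names match the log names mentioned above (`sim-testfsp1-scheme-adapter`, `sim-testfsp2-scheme-adapter`) and uses a hypothetical namespace `demo`; actual resource names depend on the Helm release and should be checked first.

```python
def build_restart_commands(namespace, deployments):
    """Return kubectl commands that restart each deployment and wait for it."""
    commands = []
    for name in deployments:
        target = f"deployment/{name}"
        # Trigger a rolling restart of the adapter pods.
        commands.append(
            ["kubectl", "rollout", "restart", target, "--namespace", namespace]
        )
        # Wait until the restarted pods are ready before re-running the tests.
        commands.append(
            ["kubectl", "rollout", "status", target, "--namespace", namespace]
        )
    return commands


if __name__ == "__main__":
    # Assumed deployment names and namespace; verify against the cluster first.
    adapters = ["sim-testfsp1-scheme-adapter", "sim-testfsp2-scheme-adapter"]
    for cmd in build_restart_commands("demo", adapters):
        print(" ".join(cmd))
```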

@elnyry-sam-k elnyry-sam-k changed the title [Helm] GP Tests fail intermitantly when upgrading a release from v14.1.1 to v15 due to WS issues between TTK and SDKs [Helm] GP Tests fail intermittently when upgrading from v14.1.1 to v15 due to WS issues between TTK and SDKs Mar 20, 2023
@elnyry-sam-k elnyry-sam-k added the oss-core This is an issue - story or epic related to a feature on a Mojaloop core service or related to it label Mar 20, 2023
@elnyry-sam-k elnyry-sam-k changed the title [Helm] GP Tests fail intermittently when upgrading from v14.1.1 to v15 due to WS issues between TTK and SDKs [Helm] GP Tests failing on first run when upgrading from v14.1.1 to v15 due to WS issues between TTK and SDKs Mar 20, 2023
@elnyry-sam-k
Member

Not high-priority, removing from this Sprint; to be re-prioritized at a later time if there is a repro

@mdebarros
Member Author

mdebarros commented Oct 30, 2023

Important Update on 2023-10-30

A temporary work-around has been attempted in the Mojaloop Helm v15.2.0-rc release, tracked by issue #3597.

The Test-cases that show this issue have been temporarily switched to HTTP API calls instead of WS subscribers.
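
To illustrate the difference: a WS subscriber waits for a pushed notification and sees nothing if its connection has gone stale, whereas an HTTP call opens a fresh request each time. The sketch below shows a generic polling helper of the kind such a test could use. It is an assumption about the general approach, not the TTK's actual implementation; `poll_until` and the `fetch` callable are hypothetical names.

```python
import time


def poll_until(fetch, attempts=5, delay=0.5, sleep=time.sleep):
    """Call fetch() until it returns a non-None value or attempts run out.

    fetch is any callable that performs one HTTP lookup and returns the
    recorded request/callback, or None if it is not available yet.
    """
    for attempt in range(attempts):
        result = fetch()
        if result is not None:
            return result
        # Do not sleep after the final attempt.
        if attempt < attempts - 1:
            sleep(delay)
    return None


if __name__ == "__main__":
    # Stand-in for an HTTP lookup: the callback shows up on the third call.
    replies = iter([None, None, {"quoteId": "example"}])
    print(poll_until(lambda: next(replies), delay=0.01))
```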

The WS issue has been verified as a result, and is most likely caused by stale connections that are not being properly recycled (i.e. reconnected on failures, etc). This follows from the WS v18.x lib changes that were introduced; Mojaloop Core services will need to include a "health ping" capability to ensure that connections are recycled properly.
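
A "health ping" of this kind could look roughly like the sketch below. It is transport-agnostic and does not use any real WebSocket library API: the caller is assumed to send the actual ping frame and to report pongs. `HeartbeatMonitor`, `tick`, and the interval values are hypothetical names and numbers chosen for illustration only.

```python
import time


class HeartbeatMonitor:
    """Tracks ping/pong liveness for one connection."""

    def __init__(self, ping_interval, pong_timeout, clock=time.monotonic):
        self.ping_interval = ping_interval  # seconds between pings
        self.pong_timeout = pong_timeout    # max seconds to wait for a pong
        self._clock = clock
        self.reset()

    def reset(self):
        """Start fresh, e.g. right after a (re)connect."""
        self._last_ping = self._clock()
        self._awaiting_since = None  # time of the first unanswered ping

    def should_ping(self):
        return self._clock() - self._last_ping >= self.ping_interval

    def record_ping(self):
        now = self._clock()
        self._last_ping = now
        if self._awaiting_since is None:
            self._awaiting_since = now

    def record_pong(self):
        self._awaiting_since = None

    def is_stale(self):
        # Stale once a ping has gone unanswered for longer than the timeout.
        if self._awaiting_since is None:
            return False
        return self._clock() - self._awaiting_since > self.pong_timeout


def tick(monitor, send_ping, reconnect):
    """One step of a supervision loop; returns the action taken."""
    if monitor.is_stale():
        reconnect()
        monitor.reset()
        return "reconnect"
    if monitor.should_ping():
        send_ping()
        monitor.record_ping()
        return "ping"
    return "idle"


if __name__ == "__main__":
    # Fake clock so the demo is deterministic; no pong is ever reported.
    now = [0.0]
    monitor = HeartbeatMonitor(ping_interval=10, pong_timeout=5, clock=lambda: now[0])
    for t in (0, 10, 12, 16):
        now[0] = float(t)
        print(t, tick(monitor, lambda: None, lambda: None))
```

Injecting the clock keeps the staleness logic testable without real timers; in a service, `tick` would be driven by a periodic timer and `record_pong` by the transport's pong event.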
