Staging LC picks up PFS's from the disjoint RSBs at random #2521

agatsoh · 2021-01-21T11:30:40Z

Thanks for filing a bug report :-)

Steps to Reproduce

Environment setup:
node1(LC) <----> hub(python client) <------> node2 (LC)

establish bi-directonal channels between node1 and hub, node2 and hub
Hub has been specifically configured to run on the following
2.1 PFS on pfs.demo001.env.raiden.network
2.2 Transport on transport.demo001.env.raiden.network
Node1 and Node2 select the
3.1 Transport as transport.demo001.env.raiden.network but
3.2 PFS they sometimes select
3.2 a pfs.demo001.env.raiden.network and this is time when the tranfer has correct route and the transfer goes through. At the time the nodes dont see the other PFSs and act as if pfs.demo001.env.raiden.network is the only PFS available. Unfortunately i did not take a screenshot for this.
3.2 b Sometimes the nodes use all the other PFS's like pfs.transport0x.raiden.network (where x is a number) but do not see the pfs.demo001.env.raiden.network as if it never exists.

The next step they get No route between you and target

This causes one way transfers even though capacity exists either way.
Lots of users on gitter are facing the same issue and are utterly confused why they cannot make a simple transfer and why the hub is always offline or something which can happen when hub is on transport.transport0x.raiden.network and pfs.transport0x.raiden.network

Expected Result

Each time depending on the transport server that has been picked up the correct PFS should be picked up.

Actual Result

Has been described above

The text was updated successfully, but these errors were encountered:

agatsoh · 2021-01-26T14:42:08Z

Observations using local LC nodes as opposed to the staging ones
using the staging environment file .env.staging for locally deployed nodes
yarn workspace raiden-dapp run serve --mode staging

the nodes pickup transport02 server for transport and query all the PFS's at the time of route selection.
The hub needs to be on the transport0x servers otherwise the again we would get No route between you and the target above.

Finally if we use a environment file for example like this

VUE_APP_PFS=https://pfs.transport02.raiden.network
VUE_APP_MATRIX_SERVER=https://transport.transport02.raiden.network
VUE_APP_HUB=hub.raiden.eth
VUE_APP_ALLOW_MAINNET=false
VUE_APP_IMPRINT=https://raiden.network/privacy.html

and start the servers
yarn workspace raiden-dapp run serve --mode testnet

in this way the PFS selection is forced in the dapp to use only that one PFS which we have specified in the environment file.

I reach the conclusion that till we have this disjoint transport servers with staging environment not explicitly specifying a PFS this confusion is there to remain
@Talel

taleldayekh · 2021-02-09T13:46:27Z

@weilbith Describe issue and our solution to the problem.

weilbith · 2021-02-10T09:43:25Z

Alight, let's give this a try.

The root of all the described symptoms is that while the development and staging environment share the same contract deployment, their associated RSB deployments are different. Thereby we get an awkward 50% connection of both environments.

In our discussions we pointed out two possible solutions. Either make both fully connected (100%) or completely separate them as two (0% connection). Merging both environments completely seems to be a bad idea as they have different purposes. The staging environment is meant to be used for test network demonstrations of the Raiden clients. As a environment for external developers to try the system and get familiar with the libraries to build their own products on top. Therefore it is meant to be stable and reliably working. Though it has not as strict reliability constraints as the mainnet.
The development environment is meant to be used for internal engineering. Basically the step after the end-to-end tests, like for the Scenario Player. An environment where the RSB is always on latest version. Where unstable/unreleased features of the services can be tested. Thereby it is less stable and reliable than the staging environment. But all eventual issues stay internal and only hinder our own daily development flow. Deployments can be way less formal and hacky if necessary.

Therefore it is the currently favored approach to separate both networks. As the stable version the staging environment should stay as is and the development environments pops out. This requires us to

make a new contract deployment on Goerli
the current active services for the development need to change their contract pointers and probably a fresh state
all client configurations related to the different environments need to adopted

The following steps on this topic are:

align on the decision and solution approach
complete the task list
decide if each client implementation needs its own development environment
create actual issues for the tasks in the list
assign issues and prioritize them in the imperative order

@palango @andrevmatos could you eventually provide some feedback, especially regarding the task list, as I'm not familiar with the Raiden service bundle.

palango · 2021-02-10T11:35:08Z

Looks good to me, I think it's important to first align on the approach, we can work out the details afterwards.

andrevmatos · 2021-02-10T12:23:34Z

I think that should be about it. Some observations:

I'd like the both clients to use the same network, it makes it easier to cross-test.
at least on LC, the state key/db/name uses contains the TokenNetworkRegistry address, so a new deployment automatically generates a new state
Please, let's try to re-use as much contracts as possible; e.g. at least the tokens and service token contracts can be reused without issues, and save us calls to mint them again.
- Maybe SecretRegistry is safe to be reused as well.
the clients (at least LC) picks automatically the goerli addresses from the JSON file from the raiden-contracts repo. I think the file there should remain pointing to the stable deployment, but then we need to ensure that the right contracts are used on development/staging servings of the LC and PC. The LC supports auto-detecting the addresses from a single entry-point contract (UDC address), we just need to expose this on dApp to pick from the env vars, and have them set in the development/staging .envs.

weilbith · 2021-02-10T13:29:13Z

Looks good to me, I think it's important to first align on the approach, we can work out the details afterwards.

@palango who would you suggest to take into account? So we could just mention them here.

Please, let's try to re-use as much contracts as possible; e.g. at least the tokens and service token contracts can be reused without issues, and save us calls to mint them again.

Good idea. Should be safe to re-use the tokens from my perspective. In my opinion they are actually not part of the Raiden contracts themselves.

The LC supports auto-detecting the addresses from a single entry-point contract (UDC address), we just need to expose this on dApp to pick from the env vars, and have them set in the development/staging .envs.

Yes. I remember. We just have to do it. 😬

weilbith · 2021-02-10T14:16:23Z

@eorituz @ezdac @karlb @ulope could you have a look from here and provide some feedback? We could also schedule a meeting if the opinions diverge too much or there are many questions.

karlb · 2021-02-11T09:19:50Z

I'd like the both clients to use the same network, it makes it easier to cross-test.

I agree.

Maybe SecretRegistry is safe to be reused as well.

It is and there are plans to reuse it even on mainnet for a long time: raiden-network/raiden-contracts#1182

e.g. at least the tokens and service token contracts can be reused without issues

Reusing the tokens is definitively a good idea and has been advocated multiple times already. It would cause less confusion and make the testnets more similar to the mainnet, since the mainnet also uses the same tokens on each deployment. The necessary changes to the deployment have just never been done.

karlb · 2021-02-11T09:23:03Z

Using different contract deployments will provide a nice clean split. Coordinating the contract changes across all the different repos could be error-prone, though. Does anyone have a clear view on how that would work in detail, yet?

taleldayekh · 2021-02-12T14:03:08Z

@taleldayekh organise meeting about this.

karlb · 2021-02-16T14:58:27Z

Another approach would be to reject PFSs that are not operating in the same federation (we could add a federation id to the info endpoint if the existing info is not sufficient) and look for another PFS. That behavior is useful anyway and would avoid the need to add any new configuration. This would be less prone to strange errors due to misconfiguration.

konradkonrad · 2021-02-22T14:59:21Z

I would suggest to move the demo environment to rinkeby.

Pros:

completely separate deployment
no extra code necessary

Cons:

don't know?

Allow switching between the two contract deployments on goerli and default to the old/demo contracts. See raiden-network/light-client#2521 on why we want to do this. Closes #6862

Relates to raiden-network/light-client#2521 (comment)

Part of raiden-network/light-client#2521 Merge at the same time as raiden-network/raiden-services#951.

Relates to raiden-network/light-client#2521 (comment)

Since raiden-network#951, we can use the unstable contracts environment. We want to do that for the development machines that are run using the docker-compose. Since that environment only exists for goerli, we only enable it for the goerli services. Relates to raiden-network/light-client#2521

Since #951, we can use the unstable contracts environment. We want to do that for the development machines that are run using the docker-compose. Since that environment only exists for goerli, we only enable it for the goerli services. Relates to raiden-network/light-client#2521

karlb · 2021-04-12T07:54:43Z

The new contracts are set up and used in the services-dev services and the services on transport05 as well as by the SP CI development.json environment.

Remaining TODOs:

Update services on transport01-04 to use these contracts
Allow LC to use these contracts
Configure LC to use these contracts when run in SP CI
Demo env and development_environment: unstable work. Service discovery does not work in demo env (until the next contract deployment), but is also not used, usually. Decide if that is enough.
Should we try a different service when there is a contract mismatch between PFS and raiden node? Right now we fail with an error.
Update docs if neccessary

Allow switching between the two contract deployments on goerli and default to the old/demo contracts. See raiden-network/light-client#2521 on why we want to do this. Closes #6862

taleldayekh · 2021-05-12T13:30:13Z

Can we close this ones our next branch has been merged to master? @taleldayekh

taleldayekh · 2021-06-23T13:38:45Z

Closing, next is noe merged to master.

taleldayekh added the bug 🕷️ Something isn't working label Jan 21, 2021

taleldayekh added this to the Product Backlog milestone Jan 21, 2021

taleldayekh removed this from the Product Backlog milestone Jan 21, 2021

taleldayekh assigned agatsoh Jan 22, 2021

taleldayekh assigned weilbith and unassigned agatsoh Feb 9, 2021

weilbith removed their assignment Feb 12, 2021

taleldayekh added this to the Product Backlog milestone Feb 17, 2021

karlb mentioned this issue Mar 5, 2021

Allow selection of contracts from unstable env raiden-network/raiden#6862

Closed

2 tasks

karlb mentioned this issue Mar 25, 2021

Allow selection of unstable env/contracts raiden-network/raiden#6905

Merged

karlb self-assigned this Mar 26, 2021

karlb added a commit to karlb/raiden-services that referenced this issue Mar 26, 2021

Allow selection of unstable contracts env

ec0a57a

Relates to raiden-network/light-client#2521 (comment)

karlb mentioned this issue Mar 26, 2021

Allow selection of unstable env raiden-network/raiden-services#951

Closed

karlb added a commit to karlb/scenario-player that referenced this issue Mar 26, 2021

Use unstable contract env for development env

aef40bd

Part of raiden-network/light-client#2521 Merge at the same time as raiden-network/raiden-services#951.

karlb mentioned this issue Mar 26, 2021

Use unstable contract env for development env raiden-network/scenario-player#663

Merged

karlb added a commit to raiden-network/scenario-player that referenced this issue Apr 1, 2021

Use unstable contract env for development env

d693760

Part of raiden-network/light-client#2521 Merge at the same time as raiden-network/raiden-services#951.

karlb added a commit to raiden-network/raiden-services that referenced this issue Apr 8, 2021

Allow selection of unstable contracts env

4bc5c40

Relates to raiden-network/light-client#2521 (comment)

karlb mentioned this issue Apr 8, 2021

Use unstable contract env for goerli docker setup raiden-network/raiden-services#956

Merged

karlb removed their assignment Apr 12, 2021

taleldayekh modified the milestones: Product Backlog, Product Backlog - post mainnet Jun 17, 2021

taleldayekh closed this as completed Jun 23, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Staging LC picks up PFS's from the disjoint RSBs at random #2521

Staging LC picks up PFS's from the disjoint RSBs at random #2521

agatsoh commented Jan 21, 2021

agatsoh commented Jan 26, 2021

taleldayekh commented Feb 9, 2021

weilbith commented Feb 10, 2021

palango commented Feb 10, 2021

andrevmatos commented Feb 10, 2021 •

edited

Loading

weilbith commented Feb 10, 2021 •

edited

Loading

weilbith commented Feb 10, 2021

karlb commented Feb 11, 2021

karlb commented Feb 11, 2021

taleldayekh commented Feb 12, 2021

karlb commented Feb 16, 2021

konradkonrad commented Feb 22, 2021 •

edited

Loading

karlb commented Apr 12, 2021 •

edited by palango

Loading

taleldayekh commented May 12, 2021

taleldayekh commented Jun 23, 2021

Staging LC picks up PFS's from the disjoint RSBs at random #2521

Staging LC picks up PFS's from the disjoint RSBs at random #2521

Comments

agatsoh commented Jan 21, 2021

Thanks for filing a bug report :-)

Steps to Reproduce

Expected Result

Actual Result

agatsoh commented Jan 26, 2021

taleldayekh commented Feb 9, 2021

weilbith commented Feb 10, 2021

palango commented Feb 10, 2021

andrevmatos commented Feb 10, 2021 • edited Loading

weilbith commented Feb 10, 2021 • edited Loading

weilbith commented Feb 10, 2021

karlb commented Feb 11, 2021

karlb commented Feb 11, 2021

taleldayekh commented Feb 12, 2021

karlb commented Feb 16, 2021

konradkonrad commented Feb 22, 2021 • edited Loading

karlb commented Apr 12, 2021 • edited by palango Loading

taleldayekh commented May 12, 2021

taleldayekh commented Jun 23, 2021

andrevmatos commented Feb 10, 2021 •

edited

Loading

weilbith commented Feb 10, 2021 •

edited

Loading

konradkonrad commented Feb 22, 2021 •

edited

Loading

karlb commented Apr 12, 2021 •

edited by palango

Loading