doc: Add concept document for Bidirectional Data Transfer #1398
DSP start message containing the response channel `DataAddress` is received by the client. This is due to the nature of
asynchronous communications. In this case, the client would either need to skip sending a response or store the response
messages to send when it receives the response channel `DataAddress`.
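The second option in the excerpt above (storing responses until the response channel `DataAddress` arrives) could be sketched roughly as follows. This is only an illustration under stated assumptions: `PendingResponseBuffer`, its methods, and the plain-string address and message types are hypothetical and are not part of EDC or of this PR.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.BiConsumer;

// Hypothetical helper (not an EDC class): holds response messages until the
// response channel address from the DSP start message is known.
final class PendingResponseBuffer {
    private final Queue<String> pending = new ArrayDeque<>();
    private final BiConsumer<String, String> sender; // (address, message)
    private String responseChannelAddress;           // null until the start message arrives

    PendingResponseBuffer(BiConsumer<String, String> sender) {
        this.sender = sender;
    }

    // Called when the client wants to respond to data received on the forward channel.
    synchronized void respond(String message) {
        if (responseChannelAddress == null) {
            pending.add(message); // address not known yet: store for later
        } else {
            sender.accept(responseChannelAddress, message);
        }
    }

    // Called once the DSP start message delivers the response channel address.
    synchronized void onResponseChannelAddress(String address) {
        responseChannelAddress = address;
        while (!pending.isEmpty()) {
            sender.accept(address, pending.poll()); // flush in arrival order
        }
    }

    synchronized int pendingCount() {
        return pending.size();
    }
}
```

A client that prefers the first option (skipping the response) would simply drop the message in `respond` instead of queueing it.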
Some remarks:
- The concept concentrates on the interaction between two connectors on how they exchange the data. I would expect some information on how the data is processed, i.e., how it is stored and connected to the transfer, as well as how it can be retrieved from the connectors.
- In your remark on the race conditions in the push case, it becomes clear that there are separate messages/streams: one that pushes the data to the consumer and one that contains the return channel. I assume that the latter is part of the negotiation the two data planes carry out to start the transfer. Is this the relation to DataPlaneManagerImpl, which would, in the push case, create an EDR token? I assume this addition means that the EDR representing the response channel is simply an add-on to the existing protocol message.
- Are there any implications for a combination where two data planes communicate but only one of them is capable of handling a response channel? If the consumer side wants to use that, I do not see an issue, but in the other direction the consumer suddenly gets more information. Does it handle that gracefully?
Comments:
- How the data is processed does not concern the fact that communication is bi-directional. Bi-directional channels do not need to be concerned with what happens to the data after it is received or any qualities of service (e.g., reliability) associated with the particular wire protocol.
- The transfer type is advertised in the DCAT `Distribution` linked to the `Offer`, and that carries the fact that the wire protocol is bi-directional. Hence, there is a need for Catena-X (or another dataspace/project, etc.) to standardize a transfer type. The response channel endpoint information is contained within the forward `DataAddress`.
- A client data plane must support one of the wire protocols associated with an offer via DCAT `Distributions`. Otherwise, it will not have access to the data.
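As a rough illustration of the last point, a client could compare the transfer types advertised by an offer's distributions against the ones its data plane supports. The class name and the transfer type strings below are hypothetical; no standardized bi-directional transfer type is assumed here.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper (not an EDC class): picks the first advertised transfer
// type that the client data plane supports.
final class TransferTypeSelector {
    private TransferTypeSelector() {
    }

    // advertised: transfer types taken from the offer's DCAT Distributions, in order
    // supported: transfer types the client data plane can handle
    static Optional<String> select(List<String> advertised, Set<String> supported) {
        return advertised.stream()
                .filter(supported::contains)
                .findFirst(); // empty means the client cannot access the data
    }
}
```

An empty result corresponds to the case described above, where the client has no access to the data.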
Do I understand correctly that a data transfer with a response channel would require a new transfer type, i.e., it doubles the number of transfer types?
As far as I understood the original requirement, the channel is about giving feedback on the received data, e.g., to indicate that the data quality is poor. Is this really related to the data transfer? What is actually observed is a mismatch on the consumer side between the expectations based on the offer and the concrete data received. Wouldn't that rather be a concept on the DSP level, as the feedback is about contract fulfillment?
If the data is broken or incomplete, the consumer could simply reinitiate the data transfer, so that is not really a reason to use the response channel, right? So it is really about a higher-level concept on the received data, imho.
On the other hand, there could be many data transfers on the same contract, so if one transfer led to poor quality, there is reason not to mark the whole contract with the feedback issue.
Still, the concept only describes a form of sending data back to the provider, but the intention of the requirement was to give feedback on the received data. In my opinion, this still requires a reaction to the data on the provider side, something like a label on the data transfer or a special state. Even if the message is not formalized at all, an indicator that there is feedback on the data transfer should be part of the concept. In the current state, the relation between the sent data and the metadata on the feedback channel gets lost after it is received.
No, the concept is about how to represent a bidirectional data transfer. It does not involve qualities of service such as reliability, which involve retransmission (for example, all reliable messaging protocols require idempotency). Qualities of service are implemented by the underlying wire protocol used for the forward and response channels, for example, AMQP. The response channel would never be used to send quality of service information back to the provider. Rather, one use could be to send information about errors in the data sent via the forward channel.
The scope of this concept should be only to describe how forward and back channels are established between a consumer and producer. It should not discuss what purposes clients and producers use those channels for. That is the job of the particular transfer protocol that would use this feature.
- `DataPlaneAuthorizationServiceImpl` must be enhanced to support `responseChannel` generation. This should be keyed off
  of the transfer type. As part of this process, a `DataPlaneAuthorizationServiceImpl.createEndpointDataReference` must
  generate a `responseChannel` endpoint by delegating to a new
  method `PublicEndpointGeneratorService.generateCallbackFor(sourceDataAddress).` Access Tokens can be generated
What exactly is sourceDataAddress in this case? I struggle with the synchronicity between the push and pull scenarios here.
Understanding this requires specialized EDC knowledge. The sourceDataAddress is the reference to the backend asset being transferred. This address is internal to the EDC deployment and not available externally (e.g., to a consumer). The endpoint generation service is responsible for interpreting the address and mapping it to a publicly available endpoint that is associated with retrieving the data.
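To make the mapping more concrete, here is a minimal sketch of how response channel generation could be keyed off the transfer type and delegated to a generator. It deliberately uses simplified stand-ins: `SourceAddress`, `CallbackEndpointGenerator`, and `ResponseChannelResolver` are hypothetical and do not reproduce the real EDC types or signatures. Only the method name `generateCallbackFor` is taken from the concept text.

```java
import java.util.Optional;
import java.util.Set;

// Simplified stand-in for the internal source address (assumption, not the EDC DataAddress).
final class SourceAddress {
    final String type;    // backend type, internal to the deployment
    final String assetId; // internal reference to the asset

    SourceAddress(String type, String assetId) {
        this.type = type;
        this.assetId = assetId;
    }
}

// Hypothetical generator: interprets the internal address and returns a public endpoint.
interface CallbackEndpointGenerator {
    String generateCallbackFor(SourceAddress source);
}

// Hypothetical resolver: decides, based on the transfer type, whether a response
// channel endpoint is generated at all.
final class ResponseChannelResolver {
    private final Set<String> bidirectionalTransferTypes;
    private final CallbackEndpointGenerator generator;

    ResponseChannelResolver(Set<String> bidirectionalTransferTypes, CallbackEndpointGenerator generator) {
        this.bidirectionalTransferTypes = bidirectionalTransferTypes;
        this.generator = generator;
    }

    // Returns a public response endpoint only for transfer types marked as bi-directional.
    Optional<String> resolve(String transferType, SourceAddress source) {
        if (!bidirectionalTransferTypes.contains(transferType)) {
            return Optional.empty();
        }
        return Optional.of(generator.generateCallbackFor(source));
    }
}
```

The internal address never leaves the deployment in this sketch; only the generated public endpoint would be handed to the consumer.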
WHAT
Adds a concept for Bidirectional Data Transfers. Associated with this issue.
Closes #1397