Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for transfer summary information. #1

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

paulmillar
Copy link
Member

Motivation:

We want to know why a transfer failed.

Modification:

Add ability for the active party to provide summary information about a transfer. This is in the form of ancillary lines: key-value pairs after the one-line summary.

Add a data-over-HTTP specific ancillary line that carries the HTTP status line from the GET/PUT request.

These are all optional, so existing servers are conforming.

Result:

A client can discover more information about failing transfers.

Motivation:

We want to know why a transfer failed.

Modification:

Add ability for the active party to provide summary information about a
transfer.  This is in the form of ancillary lines: key-value pairs after
the one-line summary.

Add a data-over-HTTP specific ancillary line that carries the HTTP
status line from the GET/PUT request.

These are all optional, so existing servers are conforming.

Result:

A client can discover more information about failing transfers.
@pskopnik
Copy link

I'm sure you're considering this already: The ancillary lines match the format of the progress markers.
Could we hence unify the two and define a model and format which cover both?

I propose describing the text/perf-marker-stream format as a stream of individual reports.
Each report is "opened" by a single line - indicating the report's type - followed by a list of properties in the Key . ":" . Value format (with optional spaces) and then finished by a single line containing End.
There would be a Performance Marker Report type, starting with a Perf Marker line and a Summary Report type starting either with Success or Failure .*.
Properties and their semantics are then clearly related to the report type.

Of course this primarily affects naming and structure of the specification, the only effect on the wire protocol would be to (optionally?) add an End line to conclude the summary report.

@paulmillar
Copy link
Member Author

Hi @pskopnik

I like your idea: I think describing the output as you describe makes a lot of sense.

However, I'd like to pursue that as a separate activity: not to do too much in a single commit and to avoid mixing refactoring (updating existing descriptions) and adding new features.

Therefore, I've created an new issue, #2, that (I hope) captures your intent here.

@pskopnik
Copy link

pskopnik commented Oct 5, 2022

Thanks for considering, that sounds sensible 👍

The only things which remains to be discussed from my previous comment is whether to add an End line to conclude the summary report/ancillary lines. I'm in favour of this change to bring the format in line with the progress markers.

@paulmillar
Copy link
Member Author

I'm a little worried about making breaking changes, unless there's a good reason. Requiring an End would make all existing services non-compliant.

So, one possibility would be make End optional (server MAY include it) if the event is a single line, but the server MUST include End if the event has multiple lines?

@pskopnik
Copy link

I see your point, of course the protocol evolution has to be managed carefully.

The 92476c2 (current main) version of the specification requires clients to ignore anything after the Success/Failure: * line.
That means any existing clients will remain compatible with servers implementing summary reports (regardless of whether End makes it into the spec).
Any clients implementing summary reports must retain backwards compatibility with servers at 92476c2 version by also accepting a Success/Failure: * without any property lines and End line.

So there are some alternatives for this PR:

  1. Servers must not send an End line.
  2. Servers may send an End line at the end of the summary report.
  3. Servers must sent an End line if properties are transmitted as part of the summary report and may send an End line if not.
  4. Additionally, if an End line is allowed: Servers should always send an End line, with the goal of deprecating not sending End.

My favoured alternative would be 3. + 4., for bringing the summary report in line with the format of the progress markers.
Clients must still accept a Success/Failure: * line without any further data.

In your comment #2 (comment), you also raise the benefit of a client being certain that the summary report is concluded.

However, it's not complete redundant. The TCP connection could be prematurely closed, which risks that information was lost. This scenario would be identified by the HTTP client not receiving the final empty block from chunked encoding. However, the code processing the event stream might not have access to this information (whether the TCP connection was closed prematurely). In this case, sending End could help identify that information may have been lost.

@paulmillar
Copy link
Member Author

Thanks for you input. I think you've summarised the different options very well.

I've given this some more thought.

I would go with something similar to 3., but phrased a little stronger:

Server MUST send End if final properties are transmitted and SHOULD send End if no final properties are transmitted.

When I get a spare moment, I'll update the pull request accordingly.

This way, we have a bit of pressure on the storage-elements to update their output, without forcing anything. We need to check that the storage-element providers are happy with this change.

@pskopnik
Copy link

Alright, sounds good. This matches my preference of 3. + 4. as well.

This way, we have a bit of pressure on the storage-elements to update their output, without forcing anything. We need to check that the storage-element providers are happy with this change.

Yes, that's true. If the major HTTP-TPC server software projects are changed, I guess the operators will be happy to gradually update to new versions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants