Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support streaming snapshot over fd #39

Merged
merged 3 commits into from
Apr 28, 2023
Merged

Conversation

mhofman
Copy link
Contributor

@mhofman mhofman commented Apr 27, 2023

Refs: Agoric/agoric-sdk#6363

This introduces support for using an existing fd to read and write the snapshot, allowing streaming to/from the parent process.

A filename starting with @ will denote a fd to use.
This also changes the internal structure used when reading/writing the snapshot to keep track of the bytes read/written if applicable. (read limits not currently used as it hasn't been necessary).

int writeError = fxWriteOkay(toParent, meterIndex, machine, "", 0);
// Allows us to format up to 999,999,999,999 bytes (1TiB - 1)
char fsize[13];
int fsizeLength = snprintf(fsize, sizeof(fsize) - 1, "%d", stream.size);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why sizeof(fsize) - 1? snprintf won't NUL-terminate the buffer if it hits this limit, so you have 1 byte of initialized stack in it. In that case, you might actually mean:

Suggested change
int fsizeLength = snprintf(fsize, sizeof(fsize) - 1, "%d", stream.size);
int fsizeLength = snprintf(fsize, sizeof(fsize) - 1, "%d", stream.size);
fsize[fsizeLength] = '\0';

Or on the other hand, if you don't need fsize to be NUL-terminated, maybe you mean:

Suggested change
int fsizeLength = snprintf(fsize, sizeof(fsize) - 1, "%d", stream.size);
int fsizeLength = snprintf(fsize, sizeof(fsize), "%d", stream.size);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing this was copied from the snprintf() in fxWriteOkay(), around line 813, which is probably mine. My read of the snprintf man page (on my Mac):

The snprintf() and vsnprintf() functions will write at most size-1 of the characters printed into the output string (the size'th character then gets the terminating ‘\0’); if the return value is greater than or equal to the size argument, the string was too short and some of the printed characters were discarded. The output is always null-terminated, unless size is 0.

is that using sizeof(fsize)-1 means fsize will always be NUL-terminated, even if the argument is too big (the string will be truncated in that case, but it will still be NUL-terminated). I remember worrying about clobbering the stack, and wanted to make sure that even if I miscalculated how large the buffer needed to be, the worst case would be corrupted meter data and not crashing or corrupting the execution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another man quote:

The functions snprintf() and vsnprintf() write at most size bytes (including the terminating null byte ('\0')) to str.

So I assume that yes this is being overly cautious. It does also mean it will only write 11 chars right now, aka 100GiB not 1TiB...

I'll fix both.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, looks like C99 defines the always-terminate behaviour. I wrote C where not all platforms were C99, so had to be more defensive against the various footguns.

Copy link
Collaborator

@warner warner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, although consider the size_t remaining change

SnapshotStream stream;
if (path[0] == '@') {
int fd = atoi(path + 1);
int tmpfd = dup(fd);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the dup() for? Does it allow us to fclose(stream.file) but then re-use the original fd later? Do we have a use for that, or is this a generic pattern that we do need for the writing side?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do need it for the writing side, and I figured in the future we may want the ability to reuse the load snapshot stream.

int writeError = fxWriteOkay(toParent, meterIndex, machine, "", 0);
// Allows us to format up to 999,999,999,999 bytes (1TiB - 1)
char fsize[13];
int fsizeLength = snprintf(fsize, sizeof(fsize) - 1, "%d", stream.size);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing this was copied from the snprintf() in fxWriteOkay(), around line 813, which is probably mine. My read of the snprintf man page (on my Mac):

The snprintf() and vsnprintf() functions will write at most size-1 of the characters printed into the output string (the size'th character then gets the terminating ‘\0’); if the return value is greater than or equal to the size argument, the string was too short and some of the printed characters were discarded. The output is always null-terminated, unless size is 0.

is that using sizeof(fsize)-1 means fsize will always be NUL-terminated, even if the argument is too big (the string will be truncated in that case, but it will still be NUL-terminated). I remember worrying about clobbering the stack, and wanted to make sure that even if I miscalculated how large the buffer needed to be, the worst case would be corrupted meter data and not crashing or corrupting the execution.

static int fxSnapshotRead(void* stream, void* address, size_t size)
{
return (fread(address, size, 1, stream) == 1) ? 0 : errno;
SnapshotStream* snapshotStream = stream;
if (snapshotStream->size >= 0 && (signed)(snapshotStream->size - size) < 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I believe the snapshot format is self-delimiting: unless something is confused or corrupted, the fxSnapshotRead() call should exactly consume the last byte of the snapshot, and it will never read beyond the end.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW the (signed) cast is a little bit unnerving. I think the logic is correct, but it might be more obvious if you used a separate boolean flag to mean "we know a size", use size_t remaining instead of int size, and then do purely unsigned comparisons like if (snapshotStream->hasSize && size > snapshotStream->remaining) return EIO;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah since we're not using this "EOF" style check, I might rip it out entirely.

mhofman added 3 commits April 28, 2023 20:13
Introduce a snapshot stream struct to store the size of the data written
Respond to snapshot write with written size
@mhofman mhofman force-pushed the mhofman/stream-snapshot branch from 9e78492 to 9d408aa Compare April 28, 2023 20:21
@mhofman mhofman merged commit 056faf8 into Agoric Apr 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants