Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix session corruption #45786

Merged
merged 3 commits into from
Aug 28, 2024
Merged

fix session corruption #45786

merged 3 commits into from
Aug 28, 2024

Conversation

fspmarshall
Copy link
Contributor

@fspmarshall fspmarshall commented Aug 23, 2024

Fixes an issue where session recording could become corrupted during multipart upload if two conditions were met simultaneously:

  • A slice needed to be uploaded with padding applied (occurs when the event source is taking too long to push events).
  • An error occurs on the first upload attempt.

In this case, the slice would have its reader() method called multiple times. The reader() method determines the length of the slice based on the length of the underlying buffer and appends padding as needed, noting the length of the payload and length of the padding in the slice header. On subsequent calls to reader() this logic would be re-run and would count the padding bytes as part of the payload, resulting in corruption of the header and preventing the session recording from being read. Attempts to read the session recording would fail with the error gzip: invalid header.

This PR updates the multipart upload logic to prevent this condition from being hit, and updates the session recording read logic to be able to read existing corrupted sessions. Test coverage has been added to catch future issues with paddings/retry conditions (neither of which were properly covered), and the test coverage has been added to the reader to ensure it can decode a fake corrupted session that was generated manually using some of our existing session recording test utilities (lib/events/testdata/corrupted-session).

This issue is loosely related to #45877 as the two issues tend to happen together since they are both triggered when the session recording system is under extremely high load.

changelog: fixed an issue where attempts to play/export certain session recordings would fail with gzip: invalid header.

lib/events/stream.go Outdated Show resolved Hide resolved
@fspmarshall fspmarshall force-pushed the fspmarshall/fix-session-corruption branch 3 times, most recently from c76b0ef to 4d5631f Compare August 26, 2024 15:37
@rosstimothy
Copy link
Contributor

/excludeflake *

lib/events/stream.go Outdated Show resolved Hide resolved
@public-teleport-github-review-bot public-teleport-github-review-bot bot removed the request for review from camscale August 28, 2024 14:48
@fspmarshall fspmarshall force-pushed the fspmarshall/fix-session-corruption branch from 88231e1 to 1590d87 Compare August 28, 2024 16:42
@fspmarshall fspmarshall enabled auto-merge August 28, 2024 16:54
@fspmarshall fspmarshall added this pull request to the merge queue Aug 28, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 28, 2024
@rosstimothy rosstimothy added this pull request to the merge queue Aug 28, 2024
github-merge-queue bot pushed a commit that referenced this pull request Aug 28, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 28, 2024
@fspmarshall fspmarshall added this pull request to the merge queue Aug 28, 2024
Merged via the queue into master with commit 83daf69 Aug 28, 2024
41 checks passed
@fspmarshall fspmarshall deleted the fspmarshall/fix-session-corruption branch August 28, 2024 21:43
@public-teleport-github-review-bot

@fspmarshall See the table below for backport results.

Branch Result
branch/v14 Create PR
branch/v15 Create PR
branch/v16 Create PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants