[DASH] Possibility of playback freezes when switching streams in the middle of StreamingEngine updates (fetchAndAppend_()). #7156

Closed
JulianDomingo opened this issue Aug 15, 2024 · 3 comments · Fixed by #7157 or #7217
@JulianDomingo

JulianDomingo commented Aug 15, 2024

Have you read the FAQ and checked for duplicate open issues?
Yes.

If the problem is related to FairPlay, have you read the tutorial?

N/A.

What version of Shaka Player are you using?

main

Can you reproduce the issue with our latest release version?
Yes.

Can you reproduce the issue with the latest code from main?
Yes.

Are you using the demo app or your own custom app?
Custom cast receiver app for Paramount+.

If custom app, can you reproduce the issue using our demo app?
No, as constructing test content to reproduce this scenario is difficult outside of the test environment provided directly from Paramount+.

The content is a live, multi-period DASH stream with DRM, DAI, and a 3-hour DVR window.

What browser and OS are you using?
Chrome / CastOS (reproducible on all Cast devices).

For embedded devices (smart TVs, etc.), what model and firmware version are you using?
Google Nest Hub Max / Google Chromecast with TV (4K), latest production firmware.

What are the manifest and license server URIs?

N/A (this was reproduced using internal test content generated by Paramount+).

What configuration are you using? What is the output of player.getConfiguration()?

{
  "abr": {
    "defaultBandwidthEstimate": 2000000
  },
  "drm": {
    "retryParameters": {
      "maxAttempts": 4,
      "baseDelay": 400,
      "backoffFactor": 2,
      "timeout": 30000
    },
    "servers": {
      "com.widevine.alpha": [REDACTED]
    },
    "advanced": {
      "com.widevine.alpha": {
        "audioRobustness": "HW_SECURE_CRYPTO",
        "videoRobustness": "HW_SECURE_ALL"
      }
    }
  },
  "manifest": {
    "retryParameters": {
      "maxAttempts": 4,
      "baseDelay": 400,
      "backoffFactor": 2,
      "timeout": 30000
    },
    "dash": {
      "disableXlinkProcessing": true
    }
  },
  "streaming": {
    "rebufferingGoal": 10,
    "retryParameters": {
      "maxAttempts": 4,
      "baseDelay": 400,
      "backoffFactor": 2,
      "timeout": 30000
    },
    "stallThreshold": 5,
    "stallSkip": 1.1
  },
  "mediaSource": {
    "codecSwitchingStrategy": "reload"
  }
}

What did you do?

  1. Cast the Paramount+ live content on any Cast device.

What did you expect to happen?

  1. The live stream to play without playback freezing / stalling indefinitely.

What actually happened?

Playback freezes when an ABR switch occurs in the middle of an ongoing StreamingEngine.fetchAndAppend_().

To be more specific, when the above happens, the switchInternal_() logic (which gets triggered from the ABR switch) closes the SegmentIndex of the old stream:

// Releases the segmentIndex of the old stream.
// Do not close segment indexes we are prefetching.
if (!this.audioPrefetchMap_.has(mediaState.stream)) {
  if (mediaState.stream.closeSegmentIndex) {
    mediaState.stream.closeSegmentIndex();
  }
}

For DASH, this eventually clears the references array, which holds per-segment data such as each segment's URIs:

this.references = [];

This is a problem if the old stream's segment index is closed before the fetch happens, because the data behind reference no longer exists:

const fetchSegment = this.fetch_(mediaState, reference);
const result = await fetchSegment;

As a result, a PendingRequest object with no URI will be created downstream:

const request = shaka.util.Networking.createSegmentRequest(
    reference.getUris(),
    reference.startByte,
    reference.endByte,
    retryParameters,
    streamDataCallback);
request.contentType = stream.type;
shaka.log.v2('fetching: reference=', reference);
return netEngine.request(requestType, request, {type, stream, segment});

goog.asserts.assert(
    request.uris && request.uris.length, 'Request without URIs!');

When the presentation time eventually reaches the media start time of the failed segment request, playback stalls.
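
To make the ordering problem concrete, here is a minimal sketch of the race using toy stand-ins. FakeSegmentIndex and buildRequest are hypothetical names invented for illustration, not Shaka Player classes, and the model assumes the segment data is looked up after the index has been closed. The real code path is more involved.

```javascript
// Hypothetical, simplified stand-ins; not Shaka Player's real classes.
class FakeSegmentIndex {
  constructor(uris) {
    this.references = uris.map((uri) => ({uris: [uri]}));
  }

  // Mirrors the effect described above: closing empties the references.
  release() {
    this.references = [];
  }

  get(position) {
    return this.references[position] || null;
  }
}

// Builds a request-like object from whatever the index holds right now.
function buildRequest(segmentIndex, position) {
  const reference = segmentIndex.get(position);
  return {uris: reference ? reference.uris : []};
}

const oldStreamIndex = new FakeSegmentIndex(['seg1.mp4', 'seg2.mp4']);

// Normal ordering: the request is built while the index is still open.
const okRequest = buildRequest(oldStreamIndex, 0);

// Racy ordering: an ABR switch closes the old index first.
oldStreamIndex.release();
const brokenRequest = buildRequest(oldStreamIndex, 0);

console.log(okRequest.uris.length);      // 1
console.log(brokenRequest.uris.length);  // 0
```

The second request carries no URIs, which is the condition the assertion quoted above guards against.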


Here is a snippet of the console logs illustrating the error I see:
[Image: freeze_logs_explanation]

Are you planning to send a PR to fix it?
Yes. The plan is to defer closing a stream's SegmentIndex to the next onUpdate_() call when an ABR switch happens during an update, similar to how Shaka can defer clearing the buffer:

} else if (mediaState.performingUpdate) {
  // We are performing an update, so we have to wait until it's finished.
  // onUpdate_() will call clearBuffer_() when the update has finished.
  // We need to save the safe margin because its value will be needed when
  // clearing the buffer after the update.
  mediaState.waitingToClearBuffer = true;
  mediaState.clearBufferSafeMargin = safeMargin;
  mediaState.waitingToFlushBuffer = true;
} else {

I will guard this change behind a new config option that defaults to off, since it is a core StreamingEngine change and the problem may not occur on every platform.
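
A rough sketch of the deferral idea follows. Every name in it (switchStream, onUpdateFinished, pendingCloses, and the deferCloseSegmentIndex flag) is hypothetical and chosen only for illustration. The actual config name and implementation are not given in this issue.

```javascript
// Hypothetical sketch; these names are illustrative, not Shaka Player's API.
function switchStream(mediaState, newStream, config) {
  const oldStream = mediaState.stream;
  if (config.deferCloseSegmentIndex && mediaState.performingUpdate) {
    // An update is in flight: remember the old stream and close it later.
    mediaState.pendingCloses.push(oldStream);
  } else {
    oldStream.closeSegmentIndex();
  }
  mediaState.stream = newStream;
}

// Called once the in-flight update has finished.
function onUpdateFinished(mediaState) {
  mediaState.performingUpdate = false;
  for (const stream of mediaState.pendingCloses) {
    stream.closeSegmentIndex();
  }
  mediaState.pendingCloses = [];
}

// Toy stream that only records whether its index was closed.
function makeStream(name) {
  return {
    name,
    closed: false,
    closeSegmentIndex() {
      this.closed = true;
    },
  };
}

const streamA = makeStream('A');
const streamB = makeStream('B');
const mediaState = {
  stream: streamA,
  performingUpdate: true,
  pendingCloses: [],
};

switchStream(mediaState, streamB, {deferCloseSegmentIndex: true});
const closedDuringUpdate = streamA.closed;  // false: close was deferred

onUpdateFinished(mediaState);
const closedAfterUpdate = streamA.closed;   // true: closed after the update
```

With the flag off, switchStream closes the old index immediately, which matches the existing behavior.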

@JulianDomingo JulianDomingo added type: bug Something isn't working correctly priority: P1 Big impact or workaround impractical; resolve before feature release component: DASH The issue involves the MPEG DASH manifest format platform: Cast Issues affecting Cast devices labels Aug 15, 2024
@JulianDomingo JulianDomingo added this to the v4.11 milestone Aug 15, 2024
@JulianDomingo JulianDomingo self-assigned this Aug 15, 2024
avelad pushed a commit that referenced this issue Aug 19, 2024
avelad pushed a commit that referenced this issue Aug 19, 2024
@joeyparrish joeyparrish reopened this Aug 26, 2024
@joeyparrish

joeyparrish commented Aug 26, 2024

A regression was reported in video-dev.org Slack, and they narrowed it down to the PR that closed this issue. Playing https://d24rwxnt7vw9qb.cloudfront.net/v1/dash/e6d234965645b411ad572802b6c9d5a10799c9c1/All_Reference_Streams/4577dca5f8a44756875ab5cc913cd1f1/index.mpd with this PR, reportedly:

playback stops rendering with a warning "cannot find segment endTime" just a few seconds prior to the rendering halt.

This was initially observed on Samsung TVs, but then later on LG and even in Chrome.

I'm reverting the PR and reopening the issue.

@joeyparrish

Another note from the reporter:

It happens on any content on our end as long as it had many variants. When the player starts bumping to the upper variant in resolution, after the buffer time is exhausted, the playback halts at this frame.

@JulianDomingo

JulianDomingo commented Aug 27, 2024

Thanks for reverting it, Joey. I was able to reproduce it as well. The root cause is that handleDeferredCloseSegmentIndexes_() only calls segmentIndex.release() and never sets segmentIndex = null.

Since StreamingEngine decides whether to create a new segmentIndex for the active stream by checking whether segmentIndex is null, the "cannot find segment endTime" error can commonly occur if:

1. We switch from stream A -> B during an update
2. Update finishes, so call A.release() only
3. Regular playback resumes...
4. We decide to switch back to A (regardless of whether an update is currently happening or not)
5. StreamingEngine sees A.segmentIndex is NOT null, since we never actually set it to null
6. StreamingEngine therefore never populates A.segmentIndex with new references, 
   so A.segmentIndex is completely empty for the rest of the playback session
7. We get the "cannot find segment endTime" error perpetually
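
The steps above can be reproduced with a toy model. The stream object and helper names here (makeStream, ensureSegmentIndex, releaseOnly, releaseAndNull) are hypothetical stand-ins, not Shaka Player code. They only mimic the "create the index if it is missing" check and the two ways of closing.

```javascript
// Hypothetical stand-in for a stream; not Shaka Player's real Stream type.
function makeStream() {
  return {
    segmentIndex: null,
    createSegmentIndex() {
      this.segmentIndex = {references: ['seg1', 'seg2']};
    },
  };
}

// Mimics the "create only if missing" check, then reports how many
// references the active stream ends up with.
function ensureSegmentIndex(stream) {
  if (!stream.segmentIndex) {
    stream.createSegmentIndex();
  }
  return stream.segmentIndex.references.length;
}

// Buggy deferred close: empties the index but leaves the object in place.
function releaseOnly(stream) {
  stream.segmentIndex.references = [];
}

// Corrected deferred close: empties the index and clears the field.
function releaseAndNull(stream) {
  stream.segmentIndex.references = [];
  stream.segmentIndex = null;
}

const buggy = makeStream();
ensureSegmentIndex(buggy);                     // initial playback of A
releaseOnly(buggy);                            // A -> B, deferred close ran
const buggyCount = ensureSegmentIndex(buggy);  // switch back to A

const fixed = makeStream();
ensureSegmentIndex(fixed);
releaseAndNull(fixed);
const fixedCount = ensureSegmentIndex(fixed);

console.log(buggyCount, fixedCount);  // 0 2
```

In the buggy path the index object survives the close, so the null check never triggers and the stream keeps an empty index.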

IIRC, we decided to call only release() because of this past conversation within the reverted PR. But my response there was incorrect: even if we switch in an A -> B -> A manner during an update, and considering the statement in the second paragraph, there will only ever be one distinct instance of segmentIndex for A.

In other words, the initial iteration of the reverted PR would be the correct version. There is no risk of accidentally closing the active stream's segmentIndex, because we would be closing old streams right before the check to create a new one if needed:

// Make sure the segment index exists. If not, create the segment index.
if (!mediaState.stream.segmentIndex) {
  const thisStream = mediaState.stream;
  try {
    await mediaState.stream.createSegmentIndex();

I will send out another PR in a bit with better unit tests to check against this behavior.
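
As a rough idea of what such a check could look like, the sketch below models the A -> B -> A case. The names onUpdate and deferredCloses are hypothetical, and createSegmentIndex is synchronous here for brevity, whereas the real method is awaited. It assumes the deferred closes run immediately before the existence check, as described above.

```javascript
// Hypothetical sketch of the ordering; not Shaka Player's implementation.
function onUpdate(mediaState) {
  // 1. Close every stream whose close was deferred during the last update.
  for (const stream of mediaState.deferredCloses) {
    stream.segmentIndex = null;
  }
  mediaState.deferredCloses = [];

  // 2. Right afterwards, make sure the active stream has a segment index.
  if (!mediaState.stream.segmentIndex) {
    mediaState.stream.createSegmentIndex();
  }
}

// Toy stream that counts how often its index is (re)created.
function makeStream(name) {
  return {
    name,
    segmentIndex: null,
    created: 0,
    createSegmentIndex() {
      this.created++;
      this.segmentIndex = {references: [name + '-seg1', name + '-seg2']};
    },
  };
}

const a = makeStream('A');
const b = makeStream('B');
a.createSegmentIndex();
b.createSegmentIndex();

// A -> B -> A during one update: both old streams were queued for closing,
// and A is the active stream again when the update finishes.
const state = {stream: a, deferredCloses: [a, b]};
onUpdate(state);

console.log(a.segmentIndex.references.length);  // 2
console.log(b.segmentIndex);                    // null
console.log(a.created);                         // 2
```

Even though the active stream A was in the deferred list, its index is rebuilt in the same pass, so it is never left closed or empty.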

JulianDomingo added a commit to JulianDomingo/shaka-player that referenced this issue Aug 28, 2024
…pdates (shaka-project#7217)

Resolves the issues reported by
shaka-project#7213, which correctly
fixes shaka-project#7156.

The latest comment
shaka-project#7156 (comment)
goes further into detail on the problems of the initial PR.
avelad pushed a commit that referenced this issue Aug 29, 2024
avelad pushed a commit that referenced this issue Aug 29, 2024
@shaka-bot shaka-bot added the status: archived Archived and locked; will not be updated label Oct 27, 2024
@shaka-project shaka-project locked as resolved and limited conversation to collaborators Oct 27, 2024