Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix FPS and latency bugs with VP9 screenshare. #1069

Merged
merged 3 commits into from
Mar 13, 2024
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions .changeset/smooth-glasses-jam.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
---
"livekit-client": patch
---

Fix FPS and latency issues with VP9 screenshare
11 changes: 10 additions & 1 deletion src/room/participant/LocalParticipant.ts
Original file line number Diff line number Diff line change
Expand Up @@ -736,7 +736,16 @@ export default class LocalParticipant extends Participant {
if (isSVCCodec(videoCodec)) {
// vp9 svc with screenshare has problem to encode, always use L1T3 here
if (track.source === Track.Source.ScreenShare && videoCodec === 'vp9') {
opts.scalabilityMode = 'L1T3';
// Chrome does not allow more than 5 fps with L1T3, and it has encoding bugs with L3T3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it allows 30 fps. The array in Chrome code is for three temporal layers. If we do T3, it should do 30 fps if I am not mistaken (that layer array is for temporal layers as it is FPS). But, my interpretation could be incorrect.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it would if L3T3 functioned correctly. I think both you and @cnderrauber confirmed that it does not?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can see, they are using spatial layer as the index into the array: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/codecs/vp9/svc_config.cc;l=63;drc=b8a4daa31c86ef61f7a78ea73711b86ab732db32

So I think the T3 is indicating 3 temporal bounded by selected maxFramerate

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could not get either L3T3 or L3T3_KEY to work. I must have done something incorrect with L3T3_KEY as it is working for you.

I meant that the array is setting FPS for temporal layers. With L1T3, it should still be one spatial and three temporal layers. So, it should do 30 fps and allow temporal scale down to lower frame rate. I am confused as to how it is limiting to only at 5 fps (which looks like temporal layer 0 in that array). But, if it is actually doing three temporal layers with one spatial, it should still do 30 fps. That's how I was reading that array.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can see, they are using spatial layer as the index into the array: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/codecs/vp9/svc_config.cc;l=63;drc=b8a4daa31c86ef61f7a78ea73711b86ab732db32

So I think the T3 is indicating 3 temporal bounded by selected maxFramerate

oh okay, that is interesting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this branch work for you? the main difference that's needed is setting contentHint to motion.. which bypasses the screenshare code path in libwebrtc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can see, they are using spatial layer as the index into the array: https://source.chromium.org/chromium/chromium/src/+/main:third_party/webrtc/modules/video_coding/codecs/vp9/svc_config.cc;l=63;drc=b8a4daa31c86ef61f7a78ea73711b86ab732db32
So I think the T3 is indicating 3 temporal bounded by selected maxFramerate

oh okay, that is interesting.

Did not realise that the fps is scoped to the spatial layer. Sorry, my bad.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this branch work for you? the main difference that's needed is setting contentHint to motion.. which bypasses the screenshare code path in libwebrtc

Definitely much better. Seemed to start off with low fps/high latency, but got much smoother after a few seconds.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initial choppiness could be due to some server side changes I am trying to test.

// It has a different path for screenshare handling and it seems to be untested/buggy
// As a workaround, we are setting contentHint to force it to go through the same
lukasIO marked this conversation as resolved.
Show resolved Hide resolved
// path as regular camera video. While this is not optimal, it delivers the performance
// that we need
if ('contentHint' in track.mediaStreamTrack) {
track.mediaStreamTrack.contentHint = 'motion';
davidzhao marked this conversation as resolved.
Show resolved Hide resolved
} else {
lukasIO marked this conversation as resolved.
Show resolved Hide resolved
opts.scalabilityMode = 'L1T3';
}
}
// set scalabilityMode to 'L3T3_KEY' by default
opts.scalabilityMode = opts.scalabilityMode ?? 'L3T3_KEY';
Expand Down
2 changes: 1 addition & 1 deletion src/room/track/RemoteAudioTrack.ts
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
import { TrackEvent } from '../events';
import { computeBitrate } from '../stats';
import type { AudioReceiverStats } from '../stats';
import { computeBitrate } from '../stats';
import type { LoggerOptions } from '../types';
import { isReactNative, supportsSetSinkId } from '../utils';
import RemoteTrack from './RemoteTrack';
Expand Down
Loading