Leak of goroutines with WHEP sources #3118

Closed
1 of 13 tasks
RouquinBlanc opened this issue Mar 7, 2024 · 4 comments
Labels
bug, webrtc

Comments

@RouquinBlanc
Contributor

Which version are you using?

Issue seen on v1.6.0 and on master after the merge of #3110.
Issue not present on v1.5.1, as far as I can see.

Which operating system are you using?

  • Linux amd64 standard
  • Linux amd64 Docker
  • Linux arm64 standard
  • Linux arm64 Docker
  • Linux arm7 standard
  • Linux arm7 Docker
  • Linux arm6 standard
  • Linux arm6 Docker
  • Windows amd64 standard
  • Windows amd64 Docker (WSL backend)
  • macOS amd64 standard
  • macOS amd64 Docker
  • Other (please describe)

Describe the issue

The configuration contains a mix of WHEP sources (fewer than 10): 3 connect successfully (the remote endpoints are up) and the others do not.

The number of goroutines related to pion/webrtc rises very quickly into the thousands (it reached 11K in less than 24 hours); see the attached goroutine dump from pprof.

Describe how to replicate the issue

  1. Start the server with some WHEP paths pointing to nothing. It should start leaking quickly.

Did you attach the server logs?

goroutines.txt

yes, goroutines

Did you attach a network dump?

no

@aler9 added the bug and webrtc labels Mar 9, 2024
@aler9
Member

aler9 commented Mar 9, 2024

Hello, I tried replicating the issue, but in my case the number of goroutines remained constant, so there must be some specific combination of settings that triggers the leak. This is my configuration:

paths:
  nonexisting_url:
    source: whep://127.0.0.1:8889/nonexisting/whep

  nonexisting_host:
    source: whep://nonexisting:8889/nonexisting/whep

  working:
    source: whep://127.0.0.1:8889/stream/whep

Can you provide a configuration that triggers the leak?

Also, can you provide a goroutine dump using the integrated pprof server? You need to set pprof: yes in the configuration and post the output of

go tool pprof -text http://localhost:9999/debug/pprof/goroutine

@RouquinBlanc
Contributor Author

Hello, the initial report was far from enough, apologies!

In fact, while trying to isolate what's happening, I ended up with the following minimal config:

pprof: yes

paths:
    m400_dltv_whep:
        source: whep://1.2.3.4:8889/m400_dltv/whep

The other side is a mediamtx instance (v1.5.1) on another machine with the following relevant config:

paths:
    m400_dltv:
        source: rtsp://5.6.7.8:5000/dltv1

The camera in question is a shitty one in terms of RTSP. It sends RTP/H264 but:

  • SPS and PPS are not part of the stream; they are only sent in the SDP.
  • The last RTP packet of a NAL should be marked, but the camera does not mark it.

I noticed that in the mediamtx log, I get the following error in a loop, which may hint at where the issue lies:

2024/03/09 19:07:52 ERR [path m400_dltv_whep] [WebRTC source] deadline exceeded while waiting tracks
2024/03/09 19:07:58 INF [path m400_dltv_whep] [WebRTC source] peer connection established, local candidate: host/udp/172.16.212.245/58274, remote candidate: host/udp/1.2.3.4/8189
[... again and again ...]

Attached is a dump of goroutines after a few minutes (basically the time it took to write this message):
goroutine_3118.txt

If that's not enough, I can try to take an anonymized recording of the video in question, or find a way to craft one.

It looks like it does connect successfully with WHEP, but because the video format is crap, it ends up retrying to connect, and probably misses some cleanup in the process?
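To illustrate the kind of missed cleanup suspected here, this is a self-contained sketch of the general pattern. It is not the mediamtx or pion code, and it does not show what #3124 changed. All names (`startSession`, `retry`, `workersPerAttempt`) are hypothetical. Each simulated attempt starts a few background goroutines and then fails; when the failure path does not stop them, every retry leaves them behind.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
)

// workersPerAttempt is an arbitrary number of background goroutines
// started by each simulated connection attempt.
const workersPerAttempt = 3

// startSession starts the background workers of one attempt and returns a
// stop function that cancels them and waits until they have exited.
// live counts workers that have been started and have not yet exited.
func startSession(live *int64) (stop func()) {
	ctx, cancel := context.WithCancel(context.Background())
	var wg sync.WaitGroup
	for i := 0; i < workersPerAttempt; i++ {
		wg.Add(1)
		atomic.AddInt64(live, 1)
		go func() {
			defer wg.Done()
			defer atomic.AddInt64(live, -1)
			<-ctx.Done() // a worker exits only once its context is cancelled
		}()
	}
	return func() {
		cancel()
		wg.Wait()
	}
}

// retry simulates attempts that always fail (standing in for the
// "deadline exceeded while waiting tracks" error) and returns how many
// workers are still alive afterwards.
func retry(attempts int, cleanup bool) int64 {
	var live int64
	for i := 0; i < attempts; i++ {
		stop := startSession(&live)
		if cleanup {
			stop() // failure path releases everything the attempt started
		}
		// Without cleanup, the workers of this attempt are never stopped.
	}
	return atomic.LoadInt64(&live)
}

func main() {
	fmt.Println("without cleanup:", retry(5, false)) // prints "without cleanup: 15"
	fmt.Println("with cleanup:", retry(5, true))     // prints "with cleanup: 0"
}
```

With a retry every few seconds, as in the log above, a handful of goroutines left behind per failed attempt would be enough to reach thousands within hours.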

@aler9
Member

aler9 commented Mar 10, 2024

fixed by #3124

@aler9 aler9 closed this as completed Mar 10, 2024
Contributor

This issue is being locked automatically because it has been closed for more than 6 months.
Please open a new issue in case you encounter a similar problem.

@github-actions github-actions bot locked and limited conversation to collaborators Sep 12, 2024