Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Shoutcast live sending not to break stream #665

Merged
merged 81 commits into from
Oct 28, 2015

Conversation

illuusio
Copy link
Contributor

@illuusio illuusio commented Aug 1, 2015

Shoutcast streams are breaking heavily on Mixxx 1.12 beta (launchpad bug #1277274). Mainly the problem is that Mixxx is not using shout_sync() (and output can't block) function and this the reason why it gets out of the sync.
This one creates new thread inside Shoutcast engine that handles writing things from cache and so there is not big changes needed to Mixxx engine.

   - There is another thread that handles sending
   - Write only adds stuff to cache
   - Delay is kept minimal as possible (It can be made shorter
                                        but this is on the safe side)
@@ -455,6 +496,45 @@ bool EngineShoutcast::serverConnect() {

void EngineShoutcast::write(unsigned char *header, unsigned char *body,
int headerLen, int bodyLen) {
struct shoutcastCacheObject *l_SCacheObj = (struct shoutcastCacheObject *) malloc(sizeof(struct shoutcastCacheObject));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

malloc and free are potentially blocking system calls. We should not use them here regularly.
Can we preallocate the memory?
Nit: break long lines <= cloumn 80

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes if we can stand little bit over head and 1 MB pre-allocated memory.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should be no problem. But do we actually need that much? Can't we pass all data instantly to the server? In which case the buffer is required?

@daschuer
Copy link
Member

daschuer commented Aug 1, 2015

You cannot use shout_sync(), because the Mixxx engine runs at the master sound card Crystal and not at the shoutcast server clock.

It is unlikely that the master sound card and the shoutcast server clocks are fully in sync.
So after "some time" either the master soundcard or the shoutcast server is one frame ahead.
After repeated "some times" the buffers are consumed and the stream may drop or click.

In the case the master sound card is faster, it will fill up your m_pShoutcastCache until the device is out of memory. This will not happen in the age of GB memory, but once you have removed the mallocs, m_pShoutcastCache will overflow, leading to a hear-able click.

In the case the server is faster, it will empty the server cache until the stream drops.

Workaround:

  • remove shout_sync()
  • put as many samples that the shout_queuelen(m_pShout) returns a value between two reasonable limits.
  • if the queuelen is becoming too long, drop 1 frame
  • if the queuelen is becoming to short, duplicate one frame.

Advanced thoughts:

Since the shoutcast stream is the "product", messing around with the samples it probably not desired.

If is this is identified as a problem, we need to drop/add a frame from the master sound card stream. I am afraid this is hard with the current architecture, since this would made the Mixxx engine audio buffer non const.

We can get around this by registering the Shoutcast server as master soundcard.
Shoutcast becomes a SoundDeviceShoutcast and the shoutcast thread becomes the Mixxx engine thread running at real time prio. (instead of the PA thread)
Than we can again use shout_sync() as the master clock of Mixxx

I have no clue if this would be hard or easy. :-/

@ywwg
Copy link
Member

ywwg commented Aug 1, 2015

The documentation also says "Alternatively, the caller may use shout_delay to determine the number of milliseconds to wait and delay itself." Maybe this is a way we can track the current latency between sound card and stream?

I think that if we do any correction to a stream, it should be to the shoutcast stream and not the master soundcard output. Broadcasters should expect pristine output from their own speakers, and if they are recording the stream they will also want that to be pristine (recording either inside mixxx or with an external device). Whereas, people listening to a stream are going to more tolerant of slight glitches and latency issues in the stream (which they already experience due to rebuffering and so forth). Also someone might be playing to a live audience while simulcasting to a streaming station.

@ywwg
Copy link
Member

ywwg commented Aug 1, 2015

Since the shoutcast stream is the "product", messing around with the samples it probably not desired.

(This is the part I am disagreeing with in my previous comment, since the stream is already more unstable anyway and it's likely that the actual soundcard output will have listeners present)

@daschuer
Copy link
Member

daschuer commented Aug 2, 2015

Yes thank you, shout_delay() should work better, since you do not have to deal with the compressed data.

Sure the recording feature should be able to record a bit perfect stream in any case. (We have other bugs that prevent this)
If a listeners has a stable connection he should also be able to record a bit perfect stream.
The stream sound should be not messed by design.

Once you pass the samples to the DAC from the master sound card it can't be bit perfect.
Since you do not hear a single missing or duplicated frame, it does not matter if we pass a bit perfect stream to the DAC or not.

But I agree, that using the master sound card as master Mixxx clock is a good band aid for 1.12 until someone does the effort to allow to use the shoutcast clock for the Mixxx engine.
IMHO this will be required if we try to compete with pure broadcasting player.

@daschuer
Copy link
Member

daschuer commented Aug 2, 2015

We may also still use shout shout_sync() and add/drop the frame if the m_pShoutcastCache level is moving out of reasonable regions.

@illuusio
Copy link
Contributor Author

illuusio commented Aug 2, 2015

Actually I don't understand this at all. Problem ain't crystals or delays problem is about the way the things works with compressed audio (which is very different world than playing with soundcard). Soundcard is rather easy it just clocks you and you just feed asked amout of bytes but then this Icecast hustle breaks loose most times Icecast server can't estimate how much data you are about to send and how much PCM data it will generate.
Ok we can use shout_delay but I just can see the reason. Documentation says very sharply (http://www.aelius.com/njh/libshout-doc/libshout.html) that shout_sync should be used. If you see how it's done https://git.xiph.org/?p=icecast-libshout.git;a=blob;f=src/shout.c;h=6b7752ee01e7133b6cf07288857af6be918282a6;hb=HEAD you'll notice it's little bit brain dead and actually I should make a small patch that shout_sync uses shout_delay as calculating values because there is copy'n'paste problem.
Playing with shout_queue_len is not good idea as it sound like because it doesn't tell you the shout real need of packets (ok you can estimate when you queue is empty you can write again but using shout_sync or shout_delay you get better idea)
Main problem with this is that Mixxx encoding comes little bit out-of-sync (like @daschuer said) what shout wants. It's not much something like -30 ms with 320 kbs (this is still in good-zone with shoutcast but you can cache anything to client because there is nothing to cache because those bytes are too old to serve for client).Problems starts to rise when Mixxx is doing something else like opening file or/and if we use filters.
We could use (As I though my plan B) shout_delay and trust Mixxx to feed 2-3 times in second new compressed audio packets. Mixxx caches a bit first and send first big amount of packets and start waiting shout_delay to go 0 and caching then do it again (headers and bodys separated). I tried to KISS this and keep it out of trouble for this B-plan that I had because it's little bit complicated (and my version ain't simplest of them all). Good thing in this would be that FIFO can be used to store bytes because we send everything that we have (sake to keep encoded packets integrity).
And please explain shortly or I'm too lazy to read why on earth we must drop packet? I just see this as slow transfer problem. Best solution would be that every sideengines live their own life and they are feeded from Mixxx as they can do their thing (if they block or not). This was more lengthy that I though and I assume you are very pro with compressed audio but if someone else reading this trying to make my points clear.
But before I/We do anything somebody should test this as this is and see am I wrong or right with my assumptions where problem is. I can be very very wrong or correct there is no middle way here.

@daschuer
Copy link
Member

daschuer commented Aug 2, 2015

Sorry for my confusing mails, I was myself confused by the fact that we have to deal with compressed data.

I just had a look at the libshout code. It uses internally the gettimeofday() to sync the stream. This is driven by the cmos wall clock crystal of the host machine and should be corrected by NTP servers, to be synced world wide. Clock changes by NTP are allowed by normalizing the time to the stream start time.

Icecast server can't estimate how much data you are about to send and how much PCM data it will generate.

It can, since all streaming formats have a fixed data consume rate. Even if you broadcast with VBR.
In the later case a bit reservoir is used to fake a constant data consume rate.
The related code for mp3 is located here:
https://github.com/codders/libshout/blob/a17fb84671d3732317b0353d7281cc47e2df6cf6/src/mp3.c#L138

Playing with shout_queue_len is not good idea

Right, I was wrong in this. It reflects only the network buffer, which is somhow independent from the sample timing.

Problems starts to rise when Mixxx is doing something else like opening file or/and if we use filters.

This is a problem of a slightly differ kind. I was talking about the clock crystal miss match which is stacking up during a long broadcast session. It looks like you experience here a audio buffer underflow of the main soundcard, which makes the issue worse, since this is stacked up much faster.
In case of such a underflow, Mixxx does not fill the missing samples with silence internally. The silence if filled inside PA or ALSA. This has the advantage that the recoded stream is not effected from the underflow. But it is a no go in case of broadcast, because instead of filling silence samples the network buffer is consumed, until it is finally empty and breakes the stream.

This issue is also coverd by this bug https://bugs.launchpad.net/mixxx/+bug/1198306 since the audience also dislikes such behaviour because the track slows down by the extra buffer of silence and the dances bay fall out of rhythm.

And please explain shortly or I'm too lazy to read why on earth we must drop packet?

We must not drop compressed packages. We have to add or drop frames (a single sample for each R/L channel) of uncompressed data. It is require once the sound card stream is ahead or bejond the uncompressed streaming stream.

Possibles solution:

  • Use a broadcasting task cycling using shout_sync().
  • prepare a buffer of exactly two times of the uncompressed samples needed by one shout_sync() cycle.
  • Compress always one half of the buffer.
  • keep track of the second half. If it is running low, add silence.
  • if it is running high, remove samples.

:-/ this sounds like a hacky band aid solution. but it will work for 1.12.
A much better solution would be to setup a realtime thread driven by gettimeofday() that drived the mixxx engine. In this case we have no clock crystal issue and we are able to fill missed updates by silence inside the Mixxx engine and advance the input stream by that amount to avoid the dancers to get out of rhythm.

I can be very very wrong or correct there is no middle way here.

Our latest discussion clarifies a lot to me. I hope you feel the same.

@illuusio
Copy link
Contributor Author

illuusio commented Aug 3, 2015

Yes.. this clears things and now everyone is talking same thing. It will be hacky even it done the way that I have done it. I think we have to rearrange engines little bit in 1.13 like you can just pop them from them list and every side engine is independent what mixxx does.
My dream is to have way you can send Ogg/Mp3 and record FLAC/mp3 same stream and time.

@illuusio
Copy link
Contributor Author

illuusio commented Aug 6, 2015

I was re-reading QT QObject documentation and started to thing should this be fixed with Signals and slots? Then it can be get rid of Thread and other unwanted things if there would be 'write' signal which caches and send signal which send stuff to server when there is enough material. Only thing I didn't understand do signals block each other (are they real callbacks?) it they are not returned.

@daschuer
Copy link
Member

daschuer commented Aug 6, 2015

I would not use signal and slots to pass audio data. This may involve locking and slow malloc / free calls-
I have already made bad experience with it in a real time context.

@daschuer
Copy link
Member

daschuer commented Aug 6, 2015

Only thing I didn't understand do signals block each other (are they real callbacks?) it they are not returned.

If you use moveToThread(), the thread runs its on main loop with a Qt message queue.
Singals from other threads are queued and the connected slots are processed one by one after each other. If you enter a endless loop inside a slot, the queued signals are not processed.
Slots connected to signals in the same thread are processed immediately as a callback before returning from emmit (signal).

I would prefer to control the cycle of the Shoutcast thread manually.

@illuusio
Copy link
Contributor Author

illuusio commented Aug 6, 2015

Main problems are thread and malloc/free? Does making Qthread instace it free from mainloop?

@daschuer
Copy link
Member

daschuer commented Aug 6, 2015

I am not sure if I understand your last comment but I will try to answer anyway.

If you inherit from QThread, override the run() function and newer exit it, the thread has no other main loop that processes signals. If you move Qt Objects to such a thread, their slots are never processed.

@daschuer
Copy link
Member

daschuer commented Aug 6, 2015

IMHO processing the event queue is the default run() implementation. Once it is overridden, it is gone.

@illuusio
Copy link
Contributor Author

illuusio commented Aug 6, 2015

If malloc is considered as blocking is new the same?

@daschuer
Copy link
Member

daschuer commented Aug 6, 2015

new calls malloc

@illuusio
Copy link
Contributor Author

Gotta figure out some non blocking non malloc hack to get this working..

@ywwg
Copy link
Member

ywwg commented Aug 19, 2015

I'd really like all hands on deck for this bug -- this is the thing holding up the release right now. Who else is able to help make this bulletproof? @uklotzde ?

@illuusio
Copy link
Contributor Author

There is problem also that it's not bulletproof that problem is libshout (which I highly hope). I can provide testing server for those how are willing to work on this (but you have to write me directly) because I currently lost my Windows dev enviroment.

@ywwg
Copy link
Member

ywwg commented Aug 23, 2015

buildbot: test this please

@ywwg
Copy link
Member

ywwg commented Aug 23, 2015

retest this please

@illuusio
Copy link
Contributor Author

Now I updated some problems away and added signals for connected and disconnected.

@daschuer
Copy link
Member

Thank you for the fixes.
Is you last commit missing?

@illuusio
Copy link
Contributor Author

It seems so I missed this push.. but now there is signals

@ywwg
Copy link
Member

ywwg commented Oct 27, 2015

How do you all feel about this? If it's pretty much done, I would like to merge this in, produce another beta build, and get this to testers.

@daschuer
Copy link
Member

IMHO it is already much better than the 1.12 branch.

Open, but no blocker:

  • possible deadlock on wait()
  • reconnect on any failure

@illuusio: Are you currently working on the reconnect feature?

@ywwg
Copy link
Member

ywwg commented Oct 27, 2015

I would consider reconnect a nice-to-have. @illuusio if you think this is pretty good as is, we can merge. Then it's just up to me to get the build server to cooperate :(

@illuusio
Copy link
Contributor Author

@daschuer Is it ok that we exit without all threads are dead? I think this wait() have caused that hang in microphone bug so it's real deadlock issue should we change it just some couple of sleep() and because disconnect should only take couple ms or it's not going to happen and we can just release socket our side and server will do so in a while also.
@ywwg Re-connection would be really nice-to-have. I'm just wondering where to but that. Is it shoutcast dialog which listens disconnect-signal or shoutacast-sideengine which it's own re-connects if it notices that auto-reconnect is on and can I just add auto-reconnect button to Dialog and to config without making any changes to SQL. Do you have opinions on these and I'll implement re-connect at weekend.

@daschuer
Copy link
Member

we change it just some couple of sleep()

IMHO we have to make sure the thread is gone at this point. Instead of healing the symptoms at wait(), lets analyses all places where the thread is waiting and add timeouts there.

Re-connection:

This seams to be a working approach. However the solution I had in my mind was like that:

Just reconnect from inside EngineShoutcast, if the connection was lost for any reason except the user has changed the "enable" CO.

  • Don't change the "enable" CO from EngineShoutcast.
  • If the connection is dropped and "enable" is still 1.0, reconnect.
    IMHO that does not require an additional preference option, because we do it anyway for some issues and reconnecting when the user has manually disabled the connection feels wrong.

@ywwg
Copy link
Member

ywwg commented Oct 28, 2015

This PR is already gigantic. Let's do reconnection in another PR

@illuusio
Copy link
Contributor Author

@ywwg yes let's merge this and make new Release Candidate or beta to have more testers.
@daschuer let's keep it simple because it was the same idea that I had. I think I'll just add some notification for user with that signal that your connection is dropping or generate some warning.

ywwg added a commit that referenced this pull request Oct 28, 2015
Fix Shoutcast live sending not to break stream
@ywwg ywwg merged commit 9cfc164 into mixxxdj:1.12 Oct 28, 2015
@Be-ing
Copy link
Contributor

Be-ing commented Oct 28, 2015

It could help to make an announcement on the blog calling for Internet broadcast users to test this.

@ywwg
Copy link
Member

ywwg commented Oct 28, 2015

first we have to fix the build server :(

@ywwg
Copy link
Member

ywwg commented Oct 28, 2015

This PR fails to build on windows, please fix asap:

http://builds.mixxx.org:8081/jenkins/job/1.12-release/architecture=i386,platform=windows/144/console

/IC:\usr\include\taglib /Ilib\hidapi-0.8.0-pre\hidapi /Ilib\xwax /Ilib
\scratchlib /Ilib\vamp-2.3
engineshoutcast.cpp
src\engine\sidechain\engineshoutcast.cpp(24) : warning C4005: 'WIN32' :
macro redefinition
command-line arguments : see previous definition of 'WIN32'
src\engine\sidechain\engineshoutcast.cpp(778) : error C2065:
'sigset_t' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(778) : error C2146: syntax
error : missing ';' before identifier 'sigpipe_mask'
src\engine\sidechain\engineshoutcast.cpp(778) : error C2065:
'sigpipe_mask' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(779) : error C2065:
'sigpipe_mask' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(779) : error C3861:
'sigemptyset': identifier not found
src\engine\sidechain\engineshoutcast.cpp(780) : error C2065:
'sigpipe_mask' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(780) : error C2065: 'SIGPIPE' :
undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(780) : error C3861:
'sigaddset': identifier not found
src\engine\sidechain\engineshoutcast.cpp(781) : error C2065:
'sigset_t' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(781) : error C2146: syntax
error : missing ';' before identifier 'saved_mask'
src\engine\sidechain\engineshoutcast.cpp(781) : error C2065:
'saved_mask' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(782) : error C2065:
'SIG_BLOCK' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(782) : error C2065:
'sigpipe_mask' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(782) : error C2065:
'saved_mask' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(782) : error C3861:
'pthread_sigmask': identifier not found
scons: *** [win32_build\engine\sidechain\engineshoutcast.obj] Error 2
scons: building terminated because of errors.

@daschuer
Copy link
Member

I will have a look

@daschuer
Copy link
Member

I have just pushed a possible fix.
If this doe snot work, we may try this:
https://github.com/fanwenyi0529/qemu-fvm/blob/3e6cce0e54f6a52ed1fc81d4e7e0ad03dbbccd6d/hosts/w32/include/signal.h
Or we can #define out the whole function.
Does windows send SIGPIPE?

@sblaisot
Copy link
Member

There is no SIGPIPE under Windows.
// TODO: find reference for this ;)

@ywwg
Copy link
Member

ywwg commented Oct 29, 2015

still failing:

http://builds.mixxx.org:8081/jenkins/job/1.12-release/architecture=amd64,platform=windows/146/console
cl : Command line warning D9002 : ignoring unknown option '/arch:SSE2'
engineshoutcast.cpp

C:\mixxx\environments\2.0-x64-Release\include\shout/shout.h(163) : error C2146: syntax error : missing ';' before identifier 'shout_send_raw'
C:\mixxx\environments\2.0-x64-Release\include\shout/shout.h(163) : error C4430: missing type specifier - int assumed. Note: C++ does not support default-int
C:\mixxx\environments\2.0-x64-Release\include\shout/shout.h(167) : error C2146: syntax error : missing ';' before identifier 'shout_queuelen'
C:\mixxx\environments\2.0-x64-Release\include\shout/shout.h(167) : error C4430: missing type specifier - int assumed. Note: C++ does not support default-int

C:\mixxx\environments\2.0-x64-Release\include\shout/shout.h(167) : error C2086: 'int ssize_t' : redefinition
C:\mixxx\environments\2.0-x64-Release\include\shout/shout.h(163) : see declaration of 'ssize_t'
src\engine\sidechain\engineshoutcast.cpp(488) : error C2146: syntax error : missing ';' before identifier 'queuelen'
src\engine\sidechain\engineshoutcast.cpp(488) : error C2065: 'queuelen' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(489) : error C2065: 'queuelen' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(490) : error C2065: 'queuelen' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(491) : error C2065: 'queuelen' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(771) : error C2065: 'sigset_t' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(771) : error C2146: syntax error : missing ';' before identifier 'sigpipe_mask'
src\engine\sidechain\engineshoutcast.cpp(771) : error C2065: 'sigpipe_mask' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(772) : error C2065: 'sigpipe_mask' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(772) : error C3861: 'sigemptyset': identifier not found
src\engine\sidechain\engineshoutcast.cpp(773) : error C2065: 'sigpipe_mask' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(773) : error C2065: 'SIGPIPE' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(773) : error C3861: 'sigaddset': identifier not found
src\engine\sidechain\engineshoutcast.cpp(774) : error C2065: 'sigset_t' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(774) : error C2146: syntax error : missing ';' before identifier 'saved_mask'
src\engine\sidechain\engineshoutcast.cpp(774) : error C2065: 'saved_mask' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(775) : error C2065: 'SIG_BLOCK' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(775) : error C2065: 'sigpipe_mask' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(775) : error C2065: 'saved_mask' : undeclared identifier
src\engine\sidechain\engineshoutcast.cpp(775) : error C3861: 'pthread_sigmask': identifier not found
scons: *** [win64_build\engine\sidechain\engineshoutcast.obj] Error 2
scons: building terminated because of errors.

Build step 'Execute Windows batch command' marked build as failure

Recording test results

ERROR: Publisher 'Publish JUnit test result report' failed: Test reports were found but none of them are new. Did tests run?
For example, c:\mixxx\workspace\1.12-release\architecture\amd64\platform\windows\test_results.xml is 1 day 0 hr old

SSH: Current build result is [FAILURE], not going to run.

Finished: FAILURE

@daschuer
Copy link
Member

It does not really make sense to fix it remote. Who can try this:

// shout.h checks for WIN32 to see if we are on Windows.
#ifdef WIN64
#define WIN32
#endif
#include <shout/shout.h>
#ifdef WIN64
#undef WIN32
#endif

And make ignoreSigpipe() conditional by

#ifndef __WINDOWS__

@daschuer
Copy link
Member

I have just pushed dedcd27

@sblaisot
Copy link
Member

#define WIN23Did you mean 32?

@daschuer
Copy link
Member

Stupid me! Thank you @ywwg for fixing.
94957bb

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants