Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Triple buffer mechanism and vc_dispmanx_resource_write_data() deadlock #120

Closed
vanfanel opened this issue Nov 11, 2012 · 6 comments
Closed

Comments

@vanfanel
Copy link

Hello there

I'm trying to implement a triple buffer mechanism for my dispmanx libSDL backend.
The basic mechanism works: it's a producer-consumer system, where the producer is tha SDL app,
wich runs wild without waits and renders as many fps as it wants to one of two alternating dispmanx
resources, and the consumer is the dispmanx vsync callback function, wich, If I'm not mistaken,
runs on it's own thread (GDB at least says so, and a look at the userland dispmanx functions seem to confirm it).

The triple buffering logic works, and my logs show the last complete frame (resource) from the producer is picked correctly
by the consumer, put on screen, and the producer writes alternating to the other two.

However, I'm having a deadlock after a while. This is what GDB has to say if I ask for a backtrace:
#0 0x400d8258 in __lll_lock_wait () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1 0x400d2cac in pthread_mutex_lock () from /lib/arm-linux-gnueabihf/libpthread.so.0
#2 0x402ebab8 in vc_dispmanx_resource_write_data () from /opt/vc/lib/libbcm_host.so
#3 0x400883cc in DISPMANX_FlipHWSurface (this=0x12008, surface=0x1ad70)

at ./src/video/dispmanx/SDL_fbvideo.c:860

#4 0x400780ac in SDL_Flip (screen=0x1ad70) at ./src/video/SDL_video.c:1169
#5 0x00008b9c in main (argc=2, argv=) at parallax2.c:254

It's always locked in the same point after a while.

And these are the functions. The first one is the consumer and the second one runs on the producer thread.
NOTE that I don't really know if using the vsync callback to issue the changes to take place in the next vsync signal arrival is a good idea: I don't see WHY it wouldn't be a good idea, but seems strange to me for some reason. Anyway, I didn't find a better way to issue the changes at the correct time (ie, when previous changes are SURE to be completed).

DISPMANX_CALLBACK_FUNC_T DISPMANX_VsyncEnds(){
DISPMANX_RESOURCE_HANDLE_T temp1 = writable;
DISPMANX_RESOURCE_HANDLE_T temp2 = onscreen;
onscreen = written;
writable = temp1;
written = temp2;
dispvars->update = vc_dispmanx_update_start( 0 );
vc_dispmanx_element_remove(dispvars->update, dispvars->element);
dispvars->element = vc_dispmanx_element_add( dispvars->update,
dispvars->display, 0 /layer/, &(dispvars->dst_rect),
onscreen, &(dispvars->src_rect),
DISPMANX_PROTECTION_NONE, dispvars->alpha, 0 /clamp/,
/VC_IMAGE_ROT0/ 0 );
vc_dispmanx_update_submit( dispvars->update,
(DISPMANX_CALLBACK_FUNC_T) DISPMANX_VsyncEnds,
dispvars );
return 0;
}
static int DISPMANX_FlipHWSurface(_THIS, SDL_Surface *surface)
{
pthread_mutex_lock(&rlock);
vc_dispmanx_resource_write_data(
writable,
dispvars->pix_format, dispvars->pitch, dispvars->pixmem,
&(dispvars->bmp_rect) );
pthread_mutex_unlock(&rlock);
DISPMANX_RESOURCE_HANDLE_T temp = writable;
writable = written;
written = temp;
return (0);
}

Note that I tried to isolate some parts with mutex locks. They made no differente at all..
Any ideas are welcome.

@popcornmix
Copy link
Contributor

I don't believe it is safe to call vchiq sending functions from a callback.
The callback is called from dispmanx_notify_func which is the task that consumes vchiq messages from GPU.
If the callback is blocked, it will cause the GPU to be blocked waiting to send a message which could cause the update you are submitting to block.

I'd recommend creating a new task for submitting updates, and waking that task from DISPMANX_VsyncEnds callback.

@vanfanel
Copy link
Author

Ok, thanks a lot for your helpful response, popcornmix.

I came up with another implementation where I create a new pthread as a consumer. This thread blocks (pthread_cond_wait) until signaled from the vsync callback function (pthread_cond_signal).

However, I'm still getting a deadlock in vc_dispmanx_resource_write_data(), as shown by this gdb backtrace:

#0  0x400d8258 in __lll_lock_wait () from /lib/arm-linux-gnueabihf/libpthread.so.0
#1  0x400d2cac in pthread_mutex_lock () from /lib/arm-linux-gnueabihf/libpthread.so.0
#2  0x402ebab8 in vc_dispmanx_resource_write_data () from /opt/vc/lib/libbcm_host.so
#3  0x40088394 in DISPMANX_FlipHWSurface (this=0x12008, surface=0x1ad50)
    at ./src/video/dispmanx/SDL_fbvideo.c:854
#4  0x400780ac in SDL_Flip (screen=0x1ad50) at ./src/video/SDL_video.c:1169
#5  0x00008b9c in main (argc=2, argv=<optimized out>) at parallax2.c:254

I can "fix" it by using I use my own mutex, locking it while updating in the consumer thread (update start - update submit) and locking it too while writing data (vc_dispmanx_resource_write_data) in the producer.

This is the function passed to pthread_create, wich starts the "consumer" thread: this thread, as you can see, blocks on the pthread_cond_wait() call and is waken by the vsync callback:

void *DISPMANX_SendUpdates (){
        DISPMANX_RESOURCE_HANDLE_T temp1;
        DISPMANX_RESOURCE_HANDLE_T temp2;

        //Temporal solution: we should be able to set the vsync cb wihout this, but...how?
        dispvars->update = vc_dispmanx_update_start( 0 );
        vc_dispmanx_update_submit( dispvars->update, 
             (DISPMANX_CALLBACK_FUNC_T) DISPMANX_VsyncEnds,
             dispvars );

        while (keep_consumer_alive){
                pthread_cond_wait (&vsync_completed, &rlock);
                temp1 = writable;
                temp2 = onscreen;
                onscreen = written;
                writable = temp1;
                written = temp2;

                printf ("\nIn the next VSYNC %d will be put", onscreen);

                pthread_mutex_lock (&rlock);
                dispvars->update = vc_dispmanx_update_start( 0 );
                vc_dispmanx_element_remove(dispvars->update, dispvars->element);

                dispvars->element = vc_dispmanx_element_add( dispvars->update,
                dispvars->display, 0 , &(dispvars->dst_rect),
                onscreen, &(dispvars->src_rect),
                DISPMANX_PROTECTION_NONE, dispvars->alpha, 0 , 0 );

                vc_dispmanx_update_submit( dispvars->update,
                        (DISPMANX_CALLBACK_FUNC_T) DISPMANX_VsyncEnds, dispvars );
                pthread_mutex_unlock (&rlock);
        }
        pthread_exit (NULL);
}

This is the function that runs on the "producer" thread, wich writes to alternating resources:

static int DISPMANX_FlipHWSurface(_THIS, SDL_Surface *surface)
{
        pthread_mutex_lock (&rlock);
        vc_dispmanx_resource_write_data(
           writable,
           dispvars->pix_format, dispvars->pitch, dispvars->pixmem,
           &(dispvars->bmp_rect) );
        pthread_mutex_unlock (&rlock);

        DISPMANX_RESOURCE_HANDLE_T temp = writable;
        writable = written;
        written = temp;
        return (0);
}

And this is the vsync callback function, wich awakes the "consumer" thread by signaling vsync_completed:

DISPMANX_CALLBACK_FUNC_T DISPMANX_VsyncEnds(){
        pthread_cond_signal (&vsync_completed);
        return 0;
}

So, my question is: why do I have to lock and unlock in those points to avoid the deadlock? Is it impossible for update_* and write_data to cooperate internally? I took a look at the dispmanx userland code and it seems they use their own pthreads, wich they lock/unlock as needed.
I'd like to avoid using my own mutexes this way, because I'm blocking in souch a way that things don't work as expected: an vsync callback can be blocked, an update can be blocked, etc...

@popcornmix
Copy link
Contributor

I don't immediately see why the locks are needed.
Can I suggest you use vc_dispmanx_element_change_source rather than remove and add.

@vanfanel
Copy link
Author

I don't immediately see why the locks are needed.
They are needed because, if I don't use them, vc_dispmanx_resource_write_data() deadlocks with the vc_dispmanx_update_start() and/or vc_dispmanx_update_submit() calls, internally.

Can I suggest you use vc_dispmanx_element_change_source rather than remove and add.
I'm now using vc_dispmanx_element_change_source() instead, thanks!
However, vc_dispmanx_resource_write_data() still deadlocks with vc_dispmanx_update_*. I find it strange and it shouldn't happen at all, but I have to use the mutex anyway because, as you can see, both thread share some variables that are accessed in those code segments.
If you think they could be blocking something they shouldn't, please tell me.

Even if with the mutex this implementation is stable, I'm still getting a lot of tearing, wich makes me think I'm using incomplete/dirty buffers. That shouldn't happen unless vc_dispmanx_resource_write_data() returns immediately: if we suppose it returns only after write has completed, I should'n get the "on screen" buffer overwritten... So, how is vc_dispmanx_resource_write_data() supposed to work? I mean, does it wait immediately or after write is completed?

@popcornmix
Copy link
Contributor

This doesn't look right:

                temp1 = writable;
                temp2 = onscreen;
                onscreen = written;
                writable = temp1;
                written = temp2;

This doesn't change writable and swaps written and onscreen. Did you intend it to do a three element rotate?

@popcornmix
Copy link
Contributor

Closed as no response.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants