Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

emscripten main loop hack #843

Closed
voidware opened this issue Jun 19, 2023 · 21 comments
Closed

emscripten main loop hack #843

voidware opened this issue Jun 19, 2023 · 21 comments

Comments

@voidware
Copy link

Hi,

Just got the latest code. A few minor changes needed here, no problems. Seems working fine. thanks.

So;

I'd really like to eliminate my Sokol hacks in sokol_app.h so to use the official codebase. Firstly, thanks for fixing the emsc_event->code[0] keyboard problem. One less hack of mine needed!

This is an emscripten build and I thought I'd try to eliminate another hack. I've had this hack in my code for over a year and it works fine, but the official code just produces a blank screen with no errors.

_SOKOL_PRIVATE void _sapp_emsc_run(const sapp_desc* desc) {
...

    /* start the frame loop */

    // this is the sokol version 
    ///emscripten_request_animation_frame_loop(_sapp_emsc_frame, 0);

    // this is my version
    emscripten_set_main_loop(_sapp_emsc_frame_void, 0, true);
}

and;

_SOKOL_PRIVATE void _sapp_emsc_frame_void()
{
    // XX HACK!
    _sapp_emsc_frame(0, 0);
}

I'm at a loss to explain why the official main loop doesn't work and mine does because i can't see anything wrong with it.

Any ideas? Thanks.

@floooh
Copy link
Owner

floooh commented Jun 19, 2023

Don't know tbh, I just tried the reverse by putting your code into sokol_app.h and that worked (even though it passes zero to the time parameter of the _sapp_emsc_frame() function (which I would expect to cause problems, e.g. divisions by zero).

Is your _sapp_emsc_frame() function otherwise identical? I would check next to put some printf() into the frame callback to check whether this is actually called for each frame.

PS: compiling and linking also works without problems? At first I had linker problems when adding emscripten_set_main_loop because of this problem in emsdk 3.1.38: emscripten-core/emscripten#19363, I worked around by removing the NO_FILESYSTEM linker option.

@voidware
Copy link
Author

Hi. Thanks for your reply.

Standard Sokol emsc flow, i get one frame called and that's it.

Here's what happens;

int main(int argc, char* argv[]) {
    Sp_desc desc = sokol_main(argc, argv);
    _sapp_emsc_run(&desc);
    return 0;
}

then;

_SOKOL_PRIVATE void _sapp_emsc_run(const sapp_desc* desc) {
...               
    emscripten_request_animation_frame_loop(_sapp_emsc_frame, 0);
}

This renders a single frame, and then returns from _sapp_emsc_run which returns to main, which return.

Then everything stops!

However

Doing this;

_SOKOL_PRIVATE void _sapp_emsc_run(const sapp_desc* desc) {
...               
    emscripten_set_main_loop(_sapp_emsc_frame_void, 0, true);
}

Does not return! and does not return to main.

So;

Is it something i'm doing that means returning to main halts everything (although it doesn't crash). or should there be something else stopping return to main ?

Also, FYI, I do have (and use) a filesystem which means i've removed the NO_FILESYSTEM link option.

@floooh
Copy link
Owner

floooh commented Jun 20, 2023

Does your _sapp_emsc_frame() return EM_TRUE? E.g. it should look like this:

_SOKOL_PRIVATE EM_BOOL _sapp_emsc_frame(double time, void* userData) {
    // ....
    return EM_TRUE;
}

Looking at git-blame, the old frame function which used emscripten_set_main_loop didn't have a return value, but now with the request-animation-frame version, the return value indicates whether the frame loop should continue or not.

For instance see here: if _sapp.quit_ordered is set, the function will return EM_FALSE, and the loop stops:

sokol/sokol_app.h

Lines 5804 to 5810 in 177e2c7

if (_sapp.quit_ordered) {
_sapp_emsc_unregister_eventhandlers();
_sapp_call_cleanup();
_sapp_discard_state();
return EM_FALSE;
}
return EM_TRUE;

(...so the other thing to check would be if _sapp.quit_ordered or before that _sapp.quit_requested is set already in the first frame for some reason.

@voidware
Copy link
Author

Thanks for the suggestion.

I tried that, but it wasn't the problem. I commented the whole thing out in sokol_app.h. No change

Then;

I had a go at building the sokol-sample html5/imgui-emsc.cc using the same build options I'm using for my app.

IMGDIR=b:/tools/imgui
SOKDIR=b:/tools/sokol
OPTS="-std=c++17 -Os -DNDEBUG"
EMS="-s USE_WEBGL2=1"
EMS+=" -s ASYNCIFY=1"
EMS+=" -s ALLOW_MEMORY_GROWTH=1"
LDFLAGS=" -lidbfs.js"
em++ imgui.cpp $1.cc -o web/index.html -I$IMGDIR -I$IMGDIR/misc/single_file -I$SOKDIR/util $OPTS -DSOKOL_IMPL -DSOKOL_GLES3 -I$SOKDIR $EMS -fno-rtti -fno-exceptions $LDFLAGS
cp shell-async.html web/index.html

I thought perhaps stuff like ASYNCIFY might be connected.

The imgui-emsc app works fine (as you'd expect).

So then i went in and changed the code of imgui-emsc to use sokol_app, adding sokol_main and so on.

Also works fine!

As you see from the script above, I'm pointing it to the same imgui and sokol I'm using for my app (any hacks included).

This is really weird; build my app and blank screen, build imgui-emscc, works fine!

Put my hack, emscripten_set_main_loop(_sapp_emsc_frame_void, 0, true) into sokol_app.h - both work fine!

Now I'm stumped.

@floooh
Copy link
Owner

floooh commented Jul 10, 2023

@voidware any new info on this? Can the ticket be closed?

@guillaumeblanc
Copy link

Hi,

I'm glad to find an existing discussion on the topic, cos I'm also struggling to make sense of what I'm experimenting. I'll share my findings.

Using saap with SOKOL_NO_ENTRY, I noticed that saap_run returns (ie is not blocking) even before init cb is called.
From my understanding, emscripten_request_animation_frame_loop is not a blocking function. It registers the emsc frame callback, and returns. The frame cb is called by the browser, later.

I think it works with sokol examples, because of the -NO_EXIT_RUNTIME=1 linker option, which means the program will continue to execute even after main (and saap_run) has exited.

emscripten_set_main_loop with simulate_infinite_loop at true fixes the issue, which completely makes sense.

I couldn't make NO_EXIT_RUNTIME work, but that's another discussion.
I'm not really happy with NO_EXIT_RUNTIME anyway, as I have objects on the stack (thanks to SOKOL_NO_ENTRY) I need to keep alive.

I'll test emscripten_set_main_loop, thanks for the tips.

Hope it helps,
Guillaume

@floooh
Copy link
Owner

floooh commented Sep 14, 2023

Hmm, looking at https://github.com/emscripten-core/emscripten/blob/c65a321b603de497aa0168f542a88e95e88264a4/src/settings.js#L83-L93, NO_EXIT_RUNTIME=1 is actually the default setting (because it's derived from EXIT_RUNTIME setting, the NO_ is just the inversion. I think I'll go ahead and just remove the NO_EXIT_RUNTIME setting from fips, since it is redundant.

I can confirm that the sokol samples have trouble when the exit runtime is enabled though (most likely because the samples keep state in global variables and this might be zeroed out when main() returns and the exit runtime runs.

The behaviour that the main() function returns immediately is entirely normal on Emscripten, otherwise the browser event loop would appear to be stuck and the tab would be killed after a few seconds (unless tricks like ASYNCIFY are used), and from that perspective it's also correct that the exit runtime is disabled by default, because global state must stay alive after main() returns.

Also any actions on exit don't make a lot of sense anyway when running in the browser, because the user can close the tab at any time, and this will just stop the application without it having a chance to run any "exit code" (there's a beforeunload event, but this isn't reliable (see https://developer.mozilla.org/en-US/docs/Web/API/Window/beforeunload_event#usage_notes) - it's only useful for the "Leave site?" popup to prevent unsaved data loss.

@guillaumeblanc
Copy link

guillaumeblanc commented Sep 15, 2023

Hi,

Removing NO_EXIT_RUNTIME (as it's the default behavior) is a good idea indeed.

I'm starting understanding my main blocker: -sASSERTIONS=1 seems to force EXIT_RUNTIME (at least throw assertions when crt functions are called after exit). For now I need to force -sASSERTIONS=0 even in debug.

The behaviour that the main() function returns immediately is entirely normal on Emscripten

This is not what I was expecting after reading this comment about sapp_run : https://github.com/floooh/sokol/blob/master/sokol_app.h#L970

Also any actions on exit don't make a lot of sense anyway

The code executed after sapp_run is actually destructors from objects on the stack, which works ok on other platforms.

Cheers,
Guillaume

@floooh
Copy link
Owner

floooh commented Sep 17, 2023

This is not what I was expecting after reading this comment about sapp_run :

Ooops, yeah that comment is most likely a leftover from when sokol_app.h was using emscripten_set_main_loop() with the simulate_infinite_loop arg. This is throwing a JS exception in order to exit the emscripten_set_main_loop() function so that execution is passed back to the browser, but without crawling back up the C call stack.

which works ok on other platforms.

Be aware though that on iOS, it also isn't guaranteed that exit code is called. There are situations where the OS will simply terminate the application (even without calling the iOS specific applicationWillTerminate method).

@floooh
Copy link
Owner

floooh commented Sep 18, 2023

@guillaumeblanc I have clarified sapp_run() behaviour in that outdated comment section:

sokol/sokol_app.h

Lines 952 to 993 in d10614a

OPTIONAL: DON'T HIJACK main() (#define SOKOL_NO_ENTRY)
======================================================
NOTE: SOKOL_NO_ENTRY and sapp_run() is currently not supported on Android.
In its default configuration, sokol_app.h "hijacks" the platform's
standard main() function. This was done because different platforms
have different entry point conventions which are not compatible with
C's main() (for instance WinMain on Windows has completely different
arguments). However, this "main hijacking" posed a problem for
usage scenarios like integrating sokol_app.h with other languages than
C or C++, so an alternative SOKOL_NO_ENTRY mode has been added
in which the user code provides the platform's main function:
- define SOKOL_NO_ENTRY before including the sokol_app.h implementation
- do *not* provide a sokol_main() function
- instead provide the standard main() function of the platform
- from the main function, call the function ```sapp_run()``` which
takes a pointer to an ```sapp_desc``` structure.
- from here on```sapp_run()``` takes over control and calls the provided
init-, frame-, event- and cleanup-callbacks just like in the default model.
sapp_run() behaves differently across platforms:
- on some platforms, sapp_run() will return when the application quits
- on other platforms, sapp_run() will never return, even when the
application quits (the operating system is free to simply terminate
the application at any time)
- on Emscripten specifically, sapp_run() will return immediately while
the frame callback keeps being called
This different behaviour of sapp_run() essentially means that there shouldn't
be any code *after* sapp_run(), because that may either never be called, or in
case of Emscripten will be called at an unexpected time (at application start).
An application also should not depend on the cleanup-callback being called
when cross-platform compatibility is required.
Since sapp_run() returns immediately on Emscripten you shouldn't activate
the 'EXIT_RUNTIME' linker option (this is disabled by default when compiling
for the browser target), since this would be called immediately at
application start, causing any global objects to be destroyed and global
variables to be zeroed.

...I also wonder if it would make sense to add a config-boolean to sapp_desc to select between emscripten_request_animation_frame_loop() and emscripten_set_main_loop() with the simulate_infinite_loop enabled, the latter would prevent any code being executed after sapp_run() (and thus no destructors or exit runtime being called).

@guillaumeblanc
Copy link

Hi,

Fixing the documentation was the most important part. It's pretty clear now.

Why do you say that "An application also should not depend on the cleanup-callback being called" ?
I didn't experience that. It becomes pretty difficult to implement a portable solution if it's the case.

Using emscripten_set_main_loop would make emscripten behavior closer to other platforms, with 2 code paths instead of 3.
Why did you switch from emscripten_set_main_loop to emscripten_request_animation_frame_loop ?

Thanks a lot !!
Guillaume

@floooh
Copy link
Owner

floooh commented Sep 19, 2023

An application also should not depend on the cleanup-callback being called

There's (at least) two platforms where this can happen:

  • on the web, the user can close the tab at any time, and that's it, the application is simply gone
  • likewise, Chrome has started to 'discard' background tabs more aggressively: https://developer.chrome.com/blog/memory-and-energy-saver-mode/, this also mentions that the various 'unload callbacks' are unreliable (and I can confirm this because I tinkered with them for exactly that reason)
  • on iOS, there's the applicationWillTerminate event, but this also is rarely called (see comment here:

    sokol/sokol_app.h

    Lines 4503 to 4512 in b803c9a

    /* NOTE: this method will rarely ever be called, iOS application
    which are terminated by the user are usually killed via signal 9
    by the operating system.
    */
    - (void)applicationWillTerminate:(UIApplication *)application {
    _SOKOL_UNUSED(application);
    _sapp_call_cleanup();
    _sapp_ios_discard_state();
    _sapp_discard_state();
    }
    )

In general, you should put your cleanup code into the sokol_app.h cleanup callback, because that is the most portable solution. sokol_app.h just cannot guarantee that the cleanup function will be called on all platforms (generally in cases where the 'system' simply terminates the application, without giving the application an opportunity to react).

As for why emscripten_request_animation_loop: IIRC the main reason was because the frame callback of that function gives me the timestamp of the current frame which I'm directly using as input to the frame duration computation.

(note that emscripten_set_main_loop() doesn't change anything about cleanup code, with the simulate_infinite_loop param it will simply never call any code following the call - its only purpose is to avoid that any cleanup or exit-runtime code is accidentally called at startup).

@guillaumeblanc
Copy link

All clear.

I agree that cleanup cb is the most portable solution, I switched to it.

If you ask me, I'd switch to emscripten_set_main_loop without adding a new option. It would make sapp_run behavior closer to other platforms, the documentation simpler and the api less error-prone. You can get timestamp with performance.now() I suppose.

Hope it helps.
Thanks a lot !
Guillaume

@floooh
Copy link
Owner

floooh commented Sep 20, 2023

My hope with getting the timestamp from the requestAnimationFrame() callback is that it contains less jitter then calling performance.now() at the start of my own callback. Ideally, the rAF timestamp would be the time of when the last frame was presented, similar to native swapchain APIs, but that may be a little much to hope for.

Also, there this an interesting property of the timestamp that even though it has limited precision on some browsers (e.g. on Safari it's only 1ms granularity), taking the average over a couple of frames gives an exact result (e.g. 16.667ms on 60Hz displays or 8.333ms on 120Hz displays), I don't know if this is an intended feature of the RAF timestamp, or just the regular sideeffect of filtering a noisy random signal though (but since Safari only returns full milliseconds, e.g. 16.0 or 17.0, but simply rounding a jittery signal would always tend towards 17.0, I'm thinking it's intentional.

@Dvad
Copy link

Dvad commented Feb 22, 2024

My experience on this topic:

I am using the PROXY_TO_PTHREAD option which is usually recommended by emscripten:
https://emscripten.org/docs/porting/pthreads.html#blocking-on-the-main-browser-thread
Does anybody have any experience with this option?

I can not make the current main loop setup (using emscripten_request_animation_frame_loop) work with that option on. Whatever combination of EXIT_RUNTIME and other option I use. Switching to emscripten_set_main_loop() works well.

To add to the discussion about the timestamp. From:
https://emscripten.org/docs/api_reference/html5.h.html#c.emscripten_request_animation_frame_loop
it is specified that the time passed to the callback is actually performance.now() called just before calling the callback. Does that imply there should not be much more jitter with the set_main_loop + performance.now() approach?

On the other end MDN states hand wavily:

The timestamp value is also similar to calling performance.now() at the start of the callback function, but it is never the same value.

I wonder if the jitter is something that could be measured.

@floooh
Copy link
Owner

floooh commented Feb 22, 2024

My current preferred solution would be to add a new config bool to sapp_desc called html5_use_emscripten_set_main_loop which would then switch to a different main loop code path, but keep the default behaviour on emscripten_request_animation_frame(). I can try to squeeze this in after the big sokol-gfx render pass cleanup is merged, or if anybody wants to pick this up as PR I'd be fine with that too.

Here's a different PR for reference (which is missing the selection between rAF and set_main_loop though):

#915

@voidware
Copy link
Author

option on sapp_desc;

Yes please. That would be splendid. Thanks.

@Dvad
Copy link

Dvad commented Feb 22, 2024

What about something like this?
#997

Edit: Oops, I just realized you were talking about a runtime bool and not compiled time option as in #997. I wonder if the ability to choose at runtime option is useful though. Specially since some of the other compile time options can make the whole app freeze if not using set_main_loop .

@floooh
Copy link
Owner

floooh commented Feb 23, 2024

@Dvad I would really prefer a runtime bool because it keeps the number of builds down (e.g. I can add a sample which uses set-main-loop method without having to mess around in the build system and/or build a separate sokol library).

@floooh floooh added the html5 label Mar 2, 2024
@floooh
Copy link
Owner

floooh commented Mar 2, 2024

Ok, PR #997 has been merged. I also added a separate config option for the simulate_infinite_loop arg to sapp_desc (not sure if it's useful but in any case it doesn't hurt).

@floooh floooh closed this as completed Mar 2, 2024
@voidware
Copy link
Author

voidware commented Mar 3, 2024

Thanks for this. I can report everything is working here spiffingly.

Also the other breaking changes i needed to fix. Thanks for the excellent docs on migration. All good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants