-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove global state #68
base: master
Are you sure you want to change the base?
Conversation
This is really cool; I've tested this with -instances:2 on midi files that had polyphony >28 such that parts would cut out or go flat on a real sc-55 mk2, and this plays those beautifully. |
Yes please, make it easier to use as lib. |
Very nice @jcmoyer, this makes it a lot easier to turn this into a plugin that can be instantiated multiple times. Then clear separation and boundaries between the processing engine and the frontend is just good design. You should be able to run the processing engine completely "headless", without any UI, just from the command line. |
// Should be called after loading roms | ||
void EMU_Reset(emu_t& emu); | ||
|
||
void EMU_SetSampleCallback(emu_t& emu, mcu_sample_callback callback, void* userdata); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EMU_SetRenderCallback
and mcu_render_callback
would be better.
} | ||
} | ||
|
||
void MCU_Init(void) | ||
void MCU_DefaultSampleCallback(void* userdata, int* sample) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest MCU_DefaultRenderCallback
} | ||
} | ||
|
||
void FE_ReceiveSample(void* userdata, int *sample) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing something like an AudioFrame
struct we use in DOSBox Staging would be nicer than these raw unsafe pointers:
https://github.com/dosbox-staging/dosbox-staging/blob/main/include/audio_frame.h
Another general comment: using std::array
or std::vector
is zero-cost compared to C arrays but safer and there's potential to handle them in a nicer way (e.g., using iterators instead of error prone indexing and oldschool for loops).
These are all zero-cost and make for readable and more maintainable code. On MSVC on Windows you even get bounds checking for free with the standard STL containers in debug builds which is great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think this function should be changed (as well as MCU_PostSample) but I left it as I found it. The int*
is a performance pessimization and it's not an intuitive API.
using std::array or std::vector is zero-cost compared to C arrays but safer
I totally agree with you and I would have written it that way myself, but I didn't want to introduce a bunch of C++ features since the current code is not using them. As far as I can tell the only STL type used anywhere is std::string, and std::vector made it into the rtmidi module only because it's required for the rtmidi callback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
performance pessimization and it's not an intuitive API
Haha, okay, I can see I'm among friends—we're 100% on the same page 😏 I get it, you didn't want to change too much.
else if (sample[1] < INT16_MIN) | ||
sample[1] = INT16_MIN; | ||
RB_Write(fe.sample_buffer, sample[0], sample[1]); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, you're overthinking this a bit and micro-optimising early... You could just simply pass AudioFrame
structs around (or similar). Doing something like this is perfectly fine, so just using simple values:
AudioFrame RenderNextSample() { ... }
The compiler is more than capable of getting rid unnecessary copies, plus you'll get guaranteed copy elision and return value optimisation unless you're using some archaic compiler.
https://en.wikipedia.org/wiki/Copy_elision
So no need to complicate life with pointers and such (which can make things run slower in fact, because unsafe pointers can point to well anything, which makes the compiler's job a lot harder than just expressing intent by returning simple structs/values).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, you could simply try it and re-run your nice test suite that you used for the measurements. My guess is you won't see any performance degradation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice improvement by introducing audio_frame_t
👍🏻
src/ringbuffer.h
Outdated
struct ringbuffer_t { | ||
int16_t* samples; | ||
size_t frame_count; | ||
size_t sample_count; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My suggestion is to always use the terms frames, always deal with frames and frame counts, unless you absolutely need to do things at sample level. Simplifies the code a bit and makes terms less ambiguous.
In my own code, the only exception to this is using "sample rate" because that's such an established term (instead of the technically correct "frame rate").
So std::vector<AudioFrame>
would make this a lot simpler, then you don't need separate count vars which can be error-prone to maintain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a good suggestion, and it fixes some parts of the API I was kind of unhappy with. I will try to get around to simplifying this tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice job @jcmoyer , left a few comments. You might take them onboard, or completely ignore them 😄 I'll probably use your code for my own stuff and do the suggested refactorings anyway (and I will credit you, of course).
This PR is an important step to take @nukeykt 's excellent work to the next level and introduce proper separation of concerns and proper multi-instance support by getting rid of globals.
mcu_timer_t* timer; | ||
lcd_t* lcd; | ||
pcm_t* pcm; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another general comment: could use std::unique_ptr
for these, and in general it would be good to get rid of pointers and use references and standard containers instead. All that would be zero-cost, no performance impact, or even could make things faster by expressing intent to the compiler instead of attempting to outsmart it by direct pointer access (doesn't always end well).
I saw you started using emu_t& emu
and similar stuff, so probably you have the same views as myself, just didn't want to blast the whole codebase apart in a single PR 😎
static const int SRAM_SIZE = 0x8000; | ||
static const int NVRAM_SIZE = 0x8000; // JV880 only | ||
static const int CARDRAM_SIZE = 0x8000; // JV880 only | ||
static const int ROMSM_SIZE = 0x1000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: could be static constexpr uint32_t
. Using uint32_t
(or uint16_t
) for hex addresses probably is a bit more correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, but I just moved these constants from elsewhere. I didn't change anything that didn't need to be.
void* callback_userdata; | ||
mcu_sample_callback sample_callback; | ||
|
||
SDL_mutex *work_thread_lock; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While you're at it, you could take it further and initialise every var. So for example:
uint16_t ad_val[4] = {};
uint8_t ad_nibble = {};
uint8_t sw_pos = {};
uint8_t io_sd = {};
submcu_t* sm = nullptr;
pcm_t* pcm = nullptr;
mcu_timer_t* timer = nullptr;
lcd_t* LCD = nullptr;
Not initialising can lead to obscure bugs, and I wasted too much time on that, so these days I always initialise everything, it's cheap to do so and eliminated weird bugs that manifest only sometime, depending on the concrete memory layout (e.g. obscure bugs that crash your program every 10th run or something because some var you forgot to init is sometimes zero, and sometimes some random number).
Same comment for every var everywhere—always initialising is just good practice to save some headache later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, but since memset has been used throughout the codebase to clear state I don't want to do anything that would make these types non-trivial. I thought it would be best to submit a PR as more-or-less C-style C++ in case @nukeykt doesn't approve of C++ features, and if he does then these things can be pretty easily cleaned up later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, but since memset has been used throughout the codebase to clear state
Right, I actually missed that 😄 Not used to seeing memset much these days 😄
void MCU_WorkThread_Lock(void); | ||
void MCU_WorkThread_Unlock(void); | ||
void MCU_WorkThread_Lock(mcu_t& mcu); | ||
void MCU_WorkThread_Unlock(mcu_t& mcu); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how you feel about const-correctness, but a lot of args and vars could be made const
.
|
||
int romset = ROM_SET_MK2; | ||
|
||
frontend_t frontend; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm reading this correctly, this would end up allocating a giant chunk of memory in stack, which could cause problems. I did this exact refactor in my branch (https://github.com/giulioz/Nuked-SC55/tree/audiounit_jv880) and I started to have some stack overflows in some conditions by using this method. I think using a malloc here would be a bit safer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This structure is only 1688 bytes and you'll get 1-2mb of stack space by default on windows depending on the compiler, usually more on linux. At worst it's 0.16% of the total stack reservation. I did run into some stack overflow problems with the larger structures but those are malloc'd now and this type mostly contains pointers to those ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I misunderstood how the structures are created! You are actually creating the mcu and pcm data structures with a malloc, sorry :)
struct audio_frame_t { | ||
int16_t left; | ||
int16_t right; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I approve 😎
Does it mean it can be more easily integrated as a VST form? |
Yes. |
Hi,
|
Hi @Falcosoft I think you are building the EDIT: 1 is a bug, thanks for reporting it. As for 2, |
OK, thanks! |
This also reorders main() so that the mcu_t structure isn't zeroed after loading ROMs. ROM loading will probably need to be extracted from main at some point, especially for library usage.
The existing SDL application becomes the frontend. The backend consists of all the emulator modules (mcu, timer, pcm, etc). The goal here is to make the backend reusable between different frontend applications as a library.
emu is a collection of all the emulator subsystems that simplifies initialization and allows multiple emulators to be easily instantiated
Pass `-instances:<n>` to run `n` backend emulators. Up to 16 emulators are supported in a single process. Events are routed to the emulator modulo the number of instances, so if `n=2` even MIDI channels will go to the first emulator and odd MIDI channels to the second. - Rom loading code has been moved into emu module - LCDs now only handle events for their own window - MIDI module now forwards events to the frontend which in turn routes it to the correct backend emulator
This commit routes sysex messages to all emulators and fixes a bug in the win32 midi input where sysex messages would be duplicated.
- Proper error handling - Deal with all the warnings these changes introduced
It is not always zero when LCD_Init is called with memory from malloc()
This allows roms to be loaded on multiple threads simultaneously.
The midi input callback might be invoked after MIDI_Quit(). If this happens, midi_frontend is null and is not valid for FE_RouteMidi.
Missed this one, this state is now part of fe_emu_instance_t.
begin_step and wait callbacks are needlessly complicated, and controlling how often to step the MCU is really the responsibility of the frontend. These two callbacks have been removed and the logic baked into FE_RunInstance. The sample callback is fine since it provides a zero-copy way to get sample data out of the emulator.
@jcmoyer Please read your email 😏 (the one linked to your GitHub account) |
Hello, first of all thank you for releasing this project. The SC-55 is legendary and it is great to see such an important piece of history preserved.
The main goal of this PR is to remove global state from the emulator so that multiple instances can be created in a single process. This PR splits the emulator into a frontend (the SDL application) and a backend (the SC-55 emulator). A frontend can create and manage as many backends as it sees fit. As a side effect this also turns the backend into a library that can be linked into new frontends.
Global state is blocking a couple of issues:
In an effort to make this PR as mergeable as possible, I tried to make a minimum number of changes to existing code. New code attempts to follow existing patterns.
mcu_timer
had some variable shadowing (globalmcu_timer_t timer
+ localfrt_t* timer
) - this was resolved by renaming the localfrt_t* timer
toftimer
<module>_Init
function. For example,sw_pos
is initialized to 3 inMCU_Init
. They are not initialized inline because doing so would cause the compiler to generate a default constructor, which makes the type non-trivial so it cannot bememset
.That said, the changes weren't all simple and some problems took a bit more effort to solve. Other contributors might like to be aware of these points:
lcd_width
is now a field inlcd_t
.memset
to reset their state.TIMER_Reset
wasn't used anywhere. I renamed itTIMER_Init
and use it to initialize a timer and link it to its MCU.LCD_Init
instead of two places.m_back_path
was removed because moving astd::string
intolcd_t
would make it non-trivial.LCD_SetBackPath
was replaced withLCD_LoadBack
and should be called afterLCD_Init
.lcd_init
was removed. IfLCD_Init
succeeded you have an LCD.LCD_Update
intoLCD_HandleEvent
. Since multiple LCDs are supported, each LCD needs to handle only events targeted at it.main
is now located inmain_frontend.cpp
.emu_t
type was introduced that is conceptually a single, complete backend instance. It contains all of the SC-55 emulator state and links the individual modules together. Romset detection and loading has been moved here since romsets need to be loaded into multiple emulator components.ringbuffer_t
. The frontend creates a ringbuffer for each emu instance and fills it as the frontend receives sample data. All instance buffers are then read and mixed in the SDL audio callback. Right nowringbuffer_t
is specialized for signed 16-bit sample data but I would like for it to eventually become a template type to handle different types of sample data (useful for Output 32-bit float samples to preserve dynamic range and prevent clipping #59 for instance).As an example of what this PR enables I have modified the frontend SDL application so that you can pass
-instances:<n>
to openn
instances of the emulator. Non-sysex MIDI events are routed to theirmidi_channel_id % n
(so if n=2, even channels go to the first emulator, and odd channels go to the second one). Sysex events are broadcast to all emulators. This is a quick and easy way to solve polyphony problems at the expense of twice the computing power.Performance impact
Performance seems to be unchanged or marginally better. I benchmarked running the emulator as fast as possible without locking or waiting for the ringbuffer to drain for 100_000_000 iterations. These are the time measurements for 5 consecutive runs: