Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to upstream pthread #27

Closed
wants to merge 3 commits into from
Closed

Update to upstream pthread #27

wants to merge 3 commits into from

Conversation

Meziu
Copy link
Member

@Meziu Meziu commented Apr 12, 2023

Closes #26

This is an update to base our implementation on top of the upstread pthread that has been recently added.
Most calls have been either removed (if they were fully re-implemented by newlib, like those for Mutexes and Condvars) or have been morphed into "syscalls" (as they call them), which are downstream implementations of the pthread functions.

There are a few things to note here:

  • The package size decreased immensely because a big chunk of the implementations is already implemented.
  • All of libc seems to be already compatible with the changes, so there are no changes we need to do over there.
  • The new __syscall_thread_create function CANNOT receive priority or cpu affinity as its parameters, since it doesn't receive a pthread_attr_t as argument (very infuriating)

The last part is absolutely horrendous, since it would mean a BIG step back on our progress to "normalize" priority and affinity. However, we can't ignore the massive importance of these new impls, and that probably libctru itself will one day support them directly.

As such, I made this PR to show you what a "new" pthread-3ds may look like, and to discuss its future.

I'll start the discussion on 2 questions:

  1. If we can set the priority when a thread is already running, maybe we don't need to pass the priority during creation?
  2. Since, to support cpu affinity (especially on the sys_core), we would need a custom thread creation process regardless, can we simply avoid passing the special arguments to pthread-3ds and instead use a different solution? (something like a temporary thread local variable which we can set from ctru-rs and read in pthread-3ds).

I know these solutions are pretty ugly and unstable, but I honestly don't think we can stop the changes made in newlib or libctru simply because "we are against that". There is progress being made on those fronts and perhaps we should just adapt. However, we should try to PR some changes regarding to “sched_yield” and “thread_create”, because as it stands we just lose functionality.

Edit: This PR has been thoroughly tested with std::thread::spawn and std::sync::Mutex, so I am 100% certain it works. However, it requires the latest dependencies from dkARM and this ctru-sys.

src/misc.rs Outdated Show resolved Hide resolved
@ian-h-chamberlain
Copy link
Member

ian-h-chamberlain commented Apr 13, 2023

The new __syscall_thread_create function CANNOT receive priority or cpu affinity as its parameters, since it doesn't receive a pthread_attr_t as argument (very infurianting)

Ugh, this is really unfortunate, especially since the pthread_create etc. are not weakly linked so we can't even override those implementations. I wonder if we can set priority / affinity just after thread creation instead? One of the comments on the new proposal I opened suggested an API like that on the Builder, so maybe that would be a good direction for us to go, if we can't upstream changes to newlib?

I am glad we will have less to maintain here in favor of the "canonical" upstream implementation, but the loss of some functionality is definitely unfortunate. I think we'd also need to implement stuff like pthread_setschedparam (as opposed to pthread_attr_*) pthread_setaffinity_np for the above to work as well.

Edit: closed PR by mistake

@Meziu
Copy link
Member Author

Meziu commented Apr 13, 2023

I think we'd also need to implement stuff like pthread_setschedparam (as opposed to pthread_attr_*) pthread_setaffinity_np for the above to work as well.

  1. Is there any way to test if we can actually change the priority during runtime? We already have an implementation of pthread_setschedparam, but I remember that the OS specifics (around svcSetThreadPriority) are still a bit undocumented.
  2. The same goes for svcSetThreadIdealProcessor: does it do anything?

If we are sure we're able to set priority and ideal processor after thread creation, it wouldn't hurt as much to just leave to set afterwards, like most Thread implementations do.

I could try to run some tests, but I'm very low on time.

@ian-h-chamberlain
Copy link
Member

Is there any way to test if we can actually change the priority during runtime? We already have an implementation of pthread_setschedparam, but I remember that the OS specifics (around svcSetThreadPriority) are still a bit undocumented.
...
I could try to run some tests, but I'm very low on time.

I can try to do some testing today, and see what all the various functions actually do. My guess is that those functions wouldn't exist if they didn't do something, but it's also possible they are just backports or something from Switch (or they only work in kernelspace and not userspace, require privileged app etc).

@ian-h-chamberlain
Copy link
Member

ian-h-chamberlain commented Apr 13, 2023

Ok, got some preliminary results based on the thread-info example we already had. It looks like (on my New 3DS XL anyway) the affinity and processor ID calls are not supported, but priority setting is:

test app setting all the thread properties
FIRM version: 11.16.0-49U
main thread ID: 0x16df
main thread running on processor: 0
main thread ideal processor: 0
ctru_sys::svcSetThreadIdealProcessor error: libctru result code 0xF8C007F4: [fatal kernel] not_supported: not_implemented
main thread running on processor: 0
main thread ideal processor: 0
main thread affinity: 0b0001
ctru_sys::svcSetThreadAffinityMask error: libctru result code 0xF8C007F4: [fatal kernel] not_supported: not_implemented
main thread affinity: 0b0001
main thread priority: 0x30
ctru_sys::svcSetThreadPriority success
main thread priority: 0x2f
ctru_sys::svcSetThreadPriority success
Rust thread ID: 0x16e4
Rust thread running on processor: 0
Rust thread ideal processor: 0
ctru_sys::svcSetThreadIdealProcessor error: libctru result code 0xF8C007F4: [fatal kernel] not_supported: not_implemented
Rust thread running on processor: 0
Rust thread ideal processor: 0
Rust thread affinity: 0b0001
ctru_sys::svcSetThreadAffinityMask error: libctru result code 0xF8C007F4: [fatal kernel] not_supported: not_implemented
Rust thread affinity: 0b0001
Rust thread priority: 0x30
ctru_sys::svcSetThreadPriority success
Rust thread priority: 0x2f
Rust thread exited
libctru thread ID: 0x16e5
libctru thread running on processor: 0
libctru thread ideal processor: -1
ctru_sys::svcSetThreadIdealProcessor error: libctru result code 0xF8C007F4: [fatal kernel] not_supported: not_implemented
libctru thread running on processor: 0
libctru thread ideal processor: -1
libctru thread affinity: 0b1111
ctru_sys::svcSetThreadAffinityMask error: libctru result code 0xF8C007F4: [fatal kernel] not_supported: not_implemented
libctru thread affinity: 0b1111
libctru thread priority: 0x2b
ctru_sys::svcSetThreadPriority success
libctru thread priority: 0x2f
libctru thread exited

(I spawned the libctru thread using ctru_sys::threadCreate just to show that its affinity / ideal processor could be set).

I found some references indicating that this has probably been stubbed out for a long time in the kernel:
https://www.3dbrew.org/wiki/SVC#System_calls
https://www.3dbrew.org/wiki/8.0.0-18

So, this leaves us in a tough position. I guess the best way would probably be to ask for some kind of parameter support in the __syscall_thread_create upstream (maybe by passing _attr directly instead of separate stack size?). This would most likely invalidate the switchbrew code that seems to be the only user of this syscall https://github.com/switchbrew/libnx/blob/master/nx/source/runtime/newlib.c#L172 so getting that support might be tricky.

Edit: actually, since switchbrew is devkitA64 maybe it doesn't matter as much... I believe the patch may have originated from there but devkitARM could in theory have different signatures for pthread support. Still, if they are using the same sources it might become a pain to maintain different version (although some well placed #defines might solve the issue).

As far as the rust-lang thread proposal is concerned... we could probably try to continue down the road of a callback-type API to set stuff just after thread creation (instead of in the thread syscall itself), but obviously that would only help us with priority. I'm also not sure how well this type of stuff translates to other OSes, but maybe the callback type API is more flexible for full blown operating systems like Linux etc.

@Meziu
Copy link
Member Author

Meziu commented Apr 13, 2023

Yeah, after some rudimentary tests I got to the same conclusions. Priority can and should be handled by std. The callback model seems to work fine so that's not a concern.

About the processorId: the first thing we should do is try to contact (via an issue or whatever) dkArm. If we manage to change the syscall signature we are back on having our original functionality. Otherwise, we should look in ways to circumvent the walls via ctru-rs, but that's our last solution as it would involve some pretty ugly and unstable code.

Regardless, I wasn't able to find the changes to pthread in their devkitARM branch, did they publish them anywhere?

@ian-h-chamberlain
Copy link
Member

Regardless, I wasn't able to find the changes to pthread in their devkitARM branch, did they publish them anywhere?

I have been looking into that too, the only place I can find existence of it is https://github.com/devkitPro/buildscripts/blob/cd5e224d6a14f7d32712ab10cfc08e0c6a2daea3/dkarm-eabi/patches/newlib-4.3.0.20230120.patch so far, but perhaps the intention is deliver the same patch to https://github.com/devkitPro/newlib/tree/devkitARM eventually (I see an older commit from when they updated to 4.2.0, and the patch to buildscripts is newer than any commit on the newlib/devkitARM branch).

I think probably what we should do is open an issue against https://github.com/devkitPro/newlib, asking what the general status / expectations of pthreads on 3DS/devkitARM are, without asking for anything just yet. Depending on how that goes we can try to figure out the best path forward, but I don't want to rush into asking for features from the devkitPro team especially since we are trying to do some pretty nonstandard / unsupported stuff as far as they are concerned.

@Meziu
Copy link
Member Author

Meziu commented Apr 13, 2023

I think probably what we should do is open an issue against https://github.com/devkitPro/newlib, asking what the general status / expectations of pthreads on 3DS/devkitARM are, without asking for anything just yet. Depending on how that goes we can try to figure out the best path forward, but I don't want to rush into asking for features from the devkitPro team especially since we are trying to do some pretty nonstandard / unsupported stuff as far as they are concerned.

Can I leave the diplomacy to you? You seem pretty experienced on this side of things, lol.

@ian-h-chamberlain
Copy link
Member

Can I leave the diplomacy to you? You seem pretty experienced on this side of things, lol.

Haha, sure. I don't know that I'm so experienced, just know they have discouraged people from trying to use unsupported languages etc. in the past. I'll try to write something up in the next few days.

src/thread.rs Outdated Show resolved Hide resolved
src/thread.rs Outdated Show resolved Hide resolved
src/misc.rs Outdated Show resolved Hide resolved
@Meziu Meziu marked this pull request as ready for review April 15, 2023 14:21
@Meziu Meziu changed the title Test update to upstream pthread Update to upstream pthread Apr 15, 2023
@Meziu
Copy link
Member Author

Meziu commented Jul 6, 2023

@ian-h-chamberlain Well, it seems like this issue got where we predicted. I've just noticed that @fincs has officially started supporting the new signature in libctru via this commit devkitPro/libctru@e9fa000. It hasn't made its way into a release yet, but it's only a matter of days.

In their implementation the cpu-id issue got "ignored" (obviously, there isn't much to do about it), and instead they just default to spawning threads on the main core. Those changes also invalidate the need for even more functions which are still present in this PR, so my guess is that this repo as a whole will end up exactly like shim-3ds.

From one point of view it's great, we need to maintain less code thanks to the advancements in the upstream toolchain, which is exactly what I wanted when I wrote this crate. However, unlike shim-3ds, where we basically got a free pass on the whole crate, I feel like pthread wasn't handled in the best way possible. I immensely respect the effort put into the whole project, but I believe the merge between dkA64 and dkARM made the support for the 3DS a tiny bit less precise. I just wished there was a way to pass pthread_attr to the underlying implementation (in a standard way).

At the end of the day, I can't really complain. Our best bet for now is to just adapt and stabilize our tools as much as possible. I believe we should port what remains of this repo over to ctru-rs and maybe work on system-specific functionality in there as well.

@AzureMarker
Copy link
Member

Maybe I missed a thread, but I don't see much follow-up from our side besides this issue:
devkitPro/newlib#26

We can reply again and ask for this feature request to be considered, and/or open an PR ourselves.

@Meziu
Copy link
Member Author

Meziu commented Jul 16, 2023

We can reply again and ask for this feature request to be considered, and/or open an PR ourselves.

I was thinking about re-discussing the issue in that thread/possible PRs and to generally work on it, but there is some complexity regarding the cross-support with Nintendo Switch code (which doesn't need our changes but would get impacted by them) and the newly added implementations in libctru which is making me desist from getting involved. It's still a very dynamic situation and I'm personally prioritizing working on our packages while I can (stuff such as the gazillion lines of documentation I'm currently writing for ctru-rs after the cargo-3ds release, in the hopes of closing rust3ds/ctru-rs#32).

@fincs
Copy link

fincs commented Jul 16, 2023

Fwiw, the current impl in libctru is still experimental/unfinished, do not expect it to ship in the near future. As mentioned previously, specifying custom thread priorities/affinity masks is currently not supported. The 3DS's version of the Horizon kernel has some limitations that are not a good match for the pthreads interface, and do not play nice with typical pthread-using code that expects a mainstream Unix-like kernel. For example:

  • 3DS Horizon only supports SCHED_FIFO policy, for 0x40 given priority levels. Switch Horizon additionally supports SCHED_OTHER, and only within a single (low) priority value. Most pthreads apps assume SCHED_OTHER is always available and used. More on scheduling policies
  • 3DS Horizon does not support automatic core migration at all (i.e. the kernel automatically selecting the best core to run a thread on at a certain time), and threads are always pinned to a given core. Changing the core assigned to a thread after creation does not make much sense either, due to the core layout used on 3DS (see below point). Switch Horizon on the other hand supports core migration, and we currently specify all created threads to be runnable on cores 0,1,2.
  • 3DS Horizon needs to take into account the fact that core1 is used by system processes, and applications can only create a single thread on core1, and only access a limited timeslice of it. On New3DS systems, core2 acts as a second core available to apps (but remember: no core migration), while core3 is generally unavailable to apps.

With this said, we are nonetheless open to improving the configurability of standard threading APIs in order to better exploit Horizon's threading system and facilitate compatibility with ported software. Any suggestions or proposals are of course welcome.

@ian-h-chamberlain
Copy link
Member

As an update, it looks like an initial implementation has landed in libctru: https://github.com/devkitPro/libctru/releases/tag/v2.3.0

It might be a good time to revive this work — we could probably eliminate even more of this library now that the syscalls are implemented upstream in https://github.com/devkitPro/libctru/blob/master/libctru/source/system/syscalls.c

@Meziu
Copy link
Member Author

Meziu commented Oct 29, 2023

Alright, it happened. I will look into the changes once I get a break from university exams. Thanks for the heads up @ian-h-chamberlain.

@Meziu
Copy link
Member Author

Meziu commented Aug 29, 2024

After looking into it a bit more, I've chosen not to merge the changes in the current state, since we are able to avoid using libctru's definitions (which still lack functionality compared to our bindings).

If we ever need to use them in the future, we should work on those upstream C definitions to regain compatibility, rather than change stuff here.

@Meziu Meziu closed this Aug 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

can we use upstream pthread instead of this library?
4 participants