Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-111965: Using critical sections to make io.StringIO thread safe. #112116

Merged
merged 12 commits into from
Nov 19, 2023

Conversation

aisk
Copy link
Contributor

@aisk aisk commented Nov 15, 2023

Added the "critical sections" tags for most methods of the StringIO class, except the __new__ and __init__. The reference implementation from colesbury/nogil-3.12@6323ca60f9 dosen't do it also. I think it's because with the __new__ and __init__ process, other threads will not touch the newly created instance, so it does not need lock.

And for the getter closed and linebuffering, there are only a few read-only field access for conditioning (and will raise exception if failed), so it's not need to protect.

I don't know if I'm right for these two questions. And do we need some performance benchmark for the change? If so, I can provide some help later.

Copy link
Contributor

@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @aisk.

I think we want critical sections for stringio_closed and stringio_closed for the accesses to closed because the object might be concurrent closed. You'll need to be careful because the CHECK_ macros may return and it's not safe to return from the middle of a critical section. It may be easiest to split those functions in two.

Similarly for stringio_newlines we want the critical section to cover at least the CHECK_INITIALIZED and CHECK_CLOSED.

Some basic performance measurements would not be a bad idea, but don't go overboard. They'll be more important after other changes to I/O objects.

@colesbury
Copy link
Contributor

We don't need critical sections in __new__. The __init__ functions is a bit more debatable; in theory one can call __init__ on an already initialized object. For now I'd like to avoid locking/critical_sections in __init__, but we may want to revisit that later.

@aisk
Copy link
Contributor Author

aisk commented Nov 16, 2023

Updated, and made a small performance test on my local machine, and not seen any noticeable performance difference before and after the commits. I just realized that in the default build (no --disable-gil flag), the critical sections macros are just dummy macro, so there should not be performance issue.

@aisk aisk requested a review from colesbury November 16, 2023 15:36
Copy link
Contributor

@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @aisk. What's the performance impact on --disable-gil builds using your benchmark?

I suggested a different structure for the three getter functions that I think will make them more consistent with the rest of the file (in particular, not having to inline CHECK_INITIALIZED/CHECK_CLOSED).

Modules/_io/stringio.c Outdated Show resolved Hide resolved
@aisk aisk requested a review from colesbury November 17, 2023 09:33
@aisk
Copy link
Contributor Author

aisk commented Nov 17, 2023

I found that it's really common for the getters and setters to add the critical section guard, and makes the work repeatability and error-prone.

Can we add some code generation process, like the argument clinic stuff, or just introduce a C macro to reduce the repeatability, like:

static PyObject *
xxx_getter_impl(PyObject *self, void * context) {
    // original codes.
}

Py_CRITICAL_SECTION_GETTER(xxx_getter);  // Using this macro to generate the `xxx_getter` function and call the old `xxx_getter_impl`

@colesbury

@colesbury
Copy link
Contributor

@aisk, Yeah I think adding support for getters/setters to Argument Clinic is the way to do it.

Copy link
Contributor

@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks great!

@corona10, would you please review this as well?

@@ -1,6 +1,7 @@
#include "Python.h"
#include <stddef.h> // offsetof()
#include "pycore_object.h"
#include "pycore_critical_section.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 864 to 866

state: object

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
state: object
state: object
/

Copy link
Member

@corona10 corona10 Nov 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@colesbury
Out of curiosity:
Should user explicitly care about thread-safe when implementing dunder methods of object for each case?
Or Will it be handled from interpreter side?

"\n");

#define _IO_STRINGIO___SETSTATE___METHODDEF \
{"__setstate__", _PyCFunction_CAST(_io_StringIO___setstate__), METH_FASTCALL|METH_KEYWORDS, _io_StringIO___setstate____doc__},
Copy link
Member

@corona10 corona10 Nov 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__setstate__ should be METH_O not METH_FASTCALL|METH_KEYWORDS
See: https://github.com/python/cpython/pull/112116/files#r1398121900

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realized the difference, thanks a lot for point out

@bedevere-app
Copy link

bedevere-app bot commented Nov 18, 2023

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

#include "pycore_object.h"
#include "pycore_critical_section.h" // Py_BEGIN_CRITICAL_SECTION()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we update AC tool, we don't have to add a header manually from now on.
#112251
Please rebase the PR and run make clinic one more time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

@corona10 corona10 merged commit 77d9f1e into python:main Nov 19, 2023
31 of 32 checks passed
@aisk aisk deleted the critical-sections-stringio branch November 19, 2023 12:50
aisk added a commit to aisk/cpython that referenced this pull request Feb 11, 2024
Glyphack pushed a commit to Glyphack/cpython that referenced this pull request Sep 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants