Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-119506: fix _io.TextIOWrapper.write() write during flush #119507

Merged
merged 14 commits into from
Jun 3, 2024

Conversation

chgnrdv
Copy link
Contributor

@chgnrdv chgnrdv commented May 24, 2024

Fixes #119506

  • check if call to _textiowrapper_writeflush() has left any data in self->pending_bytes. If so, store them in self->pending_bytes after b
  • add test

* check if call to `_textiowrapper_writeflush()` has left any data in `self->pending_bytes`. If so, store them in `self->pending_bytes` after `b`
* add test
@chgnrdv
Copy link
Contributor Author

chgnrdv commented May 24, 2024

cc @methane @eryksun

@methane methane added 3.12 bugs and security fixes 3.13 bugs and security fixes 3.14 new features, bugs and security fixes needs backport to 3.12 bug and security fixes needs backport to 3.13 bugs and security fixes labels May 24, 2024
@methane
Copy link
Member

methane commented May 24, 2024

Thank you. The patch looks good to me.

@chgnrdv
Copy link
Contributor Author

chgnrdv commented May 24, 2024

Now it crashes if any of these calls fails: length ofself->pending_bytes doesn't match self->pending_bytes_count on return.

PyObject *list = PyList_New(2);

if (PyList_Append(self->pending_bytes, b) < 0) {

Comment on lines 1730 to 1736
if (self->pending_bytes) {
// gh-119506: call to _textiowrapper_writeflush()
// can write new data to pending_bytes
pending = self->pending_bytes;
self->pending_bytes = b;
b = pending;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you swap data here?

  • self->pending can be list
  • call from other thread might be returned already.
    • "first returned first write" seems correct than "first called first write".

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at the test case, where flushing the stream triggers a write, it's not a threading thing, it's because of write being called recursively.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even single thread case, it is not possble to guarantee "first called, first written."
Simple implementation is more important.

else if (self->pending_bytes_count + bytes_len > self->chunk_size) {
PyObject *pending = self->pending_bytes;

if (self->pending_bytes_count + bytes_len > self->chunk_size) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about while instead of if?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would also need a check to make sure that bytes_len > self->chunk_size isn't true.

@elistevens
Copy link

This PR seems to be addressing a similar issue as #119107 but I don't think that this PR fixes the race condition present when multi-threaded writes happen.

@elistevens
Copy link

After fixing my original test case, I think that this minimal patch addresses the issue in practical terms:

diff --git a/Modules/_io/textio.c b/Modules/_io/textio.c
index 3de4c06704..e084d04428 100644
--- a/Modules/_io/textio.c
+++ b/Modules/_io/textio.c
@@ -1701,16 +1701,16 @@ _io_TextIOWrapper_write_impl(textio *self, PyObject *text)
         bytes_len = PyBytes_GET_SIZE(b);
     }
 
-    if (self->pending_bytes == NULL) {
-        self->pending_bytes_count = 0;
-        self->pending_bytes = b;
-    }
-    else if (self->pending_bytes_count + bytes_len > self->chunk_size) {
+    if (self->pending_bytes_count + bytes_len > self->chunk_size) {
         // Prevent to concatenate more than chunk_size data.
         if (_textiowrapper_writeflush(self) < 0) {
             Py_DECREF(b);
             return NULL;
         }
+    }
+
+    if (self->pending_bytes == NULL) {
+        assert(self->pending_bytes_count == 0);
         self->pending_bytes = b;
     }
     else if (!PyList_CheckExact(self->pending_bytes)) {

there is still an issue where since there isn't a lock AFAIK, another thread could modify self->pending_bytes between check and assignment here:

    if (self->pending_bytes == NULL) {
        assert(self->pending_bytes_count == 0);
        self->pending_bytes = b;
    }

but that doesn't seem to be an issue in practice.

@elistevens
Copy link

elistevens commented May 30, 2024

I should note that there's a similar race condition issue in _textiowrapper_writeflush where we don't ensure that the list hasn't been modified between for (Py_ssize_t i = 0; i < PyList_GET_SIZE(pending); i++) { and self->pending_bytes = NULL;.

That's a bit more scary, given that there's an entire list iteration between that PyList_GET_SIZE(pending) call and nuking the entire list. #119107 tries to address that, though there's still a potential race with that change as well (but I have yet to actually provoke that issue with or without the change to writeflush, so it's largely theoretical at this point).

@methane
Copy link
Member

methane commented May 31, 2024

I should note that there's a similar race condition issue in _textiowrapper_writeflush where we don't ensure that the list hasn't been modified between for (Py_ssize_t i = 0; i < PyList_GET_SIZE(pending); i++) { and self->pending_bytes = NULL;.

Note that default Python build with GIL doesn't have such race.

_textiowrapper_writeflush releases GIL when it calls buffer.write():

    self->pending_bytes_count = 0;
    self->pending_bytes = NULL;
    Py_DECREF(pending);

    PyObject *ret;
    do {
        ret = PyObject_CallMethodOneArg(self->buffer, &_Py_ID(write), b);
    } while (ret == NULL && _PyIO_trap_eintr());

Other thread may call TextIOWrapper.write(b) and it appends b to self->pending_bytes.
We can not assume self->pending_bytes == NULL and self->pending_bytes_count == 0 after _textiowrapper_writeflush. That was the bug.

@elistevens
Copy link

Note that default Python build with GIL doesn't have such race.
_textiowrapper_writeflush releases GIL when it calls buffer.write()

Ah, that makes sense. Thank you for explaining!

@methane methane force-pushed the io-textiowrapper-fix-write-during-flush branch from d8c203e to b2eb9d0 Compare June 1, 2024 04:02
@methane methane force-pushed the io-textiowrapper-fix-write-during-flush branch from b2eb9d0 to 483975c Compare June 1, 2024 04:03
@methane methane merged commit 52586f9 into python:main Jun 3, 2024
36 checks passed
@miss-islington-app
Copy link

Thanks @chgnrdv for the PR, and @methane for merging it 🌮🎉.. I'm working now to backport this PR to: 3.12, 3.13.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jun 3, 2024
…ythonGH-119507)

(cherry picked from commit 52586f9)

Co-authored-by: Radislav Chugunov <[email protected]>
Co-authored-by: Inada Naoki <[email protected]>
@bedevere-app
Copy link

bedevere-app bot commented Jun 3, 2024

GH-119964 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label Jun 3, 2024
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jun 3, 2024
…ythonGH-119507)

(cherry picked from commit 52586f9)

Co-authored-by: Radislav Chugunov <[email protected]>
Co-authored-by: Inada Naoki <[email protected]>
@bedevere-app
Copy link

bedevere-app bot commented Jun 3, 2024

GH-119965 is a backport of this pull request to the 3.12 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.12 bug and security fixes label Jun 3, 2024
mliezun pushed a commit to mliezun/cpython that referenced this pull request Jun 3, 2024
@elistevens
Copy link

Thanks for bringing this home! Does this qualify as a security issue? I'm wondering if it will be backported to 3.10.

@@ -1737,6 +1747,9 @@ _io_TextIOWrapper_write_impl(textio *self, PyObject *text)
Py_DECREF(b);
return NULL;
}
// Since Python 3.12, allocating GC object won't trigger GC and release
// GIL. See https://github.com/python/cpython/issues/97922
assert(!PyList_CheckExact(self->pending_bytes));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about this patch will be backported to security branches.
If it will, this line needs additional fixes. So backport is not simple.

@bedevere-app
Copy link

bedevere-app bot commented Jun 10, 2024

GH-120314 is a backport of this pull request to the 3.11 branch.

kumaraditya303 pushed a commit that referenced this pull request Jun 19, 2024
…H-119507) (#119964)

gh-119506: fix `_io.TextIOWrapper.write()` write during flush (GH-119507)
(cherry picked from commit 52586f9)

Co-authored-by: Radislav Chugunov <[email protected]>
Co-authored-by: Inada Naoki <[email protected]>
kumaraditya303 pushed a commit that referenced this pull request Jun 19, 2024
…H-119507) (#119965)

gh-119506: fix `_io.TextIOWrapper.write()` write during flush (GH-119507)
(cherry picked from commit 52586f9)

Co-authored-by: Radislav Chugunov <[email protected]>
Co-authored-by: Inada Naoki <[email protected]>
noahbkim pushed a commit to hudson-trading/cpython that referenced this pull request Jul 11, 2024
estyxx pushed a commit to estyxx/cpython that referenced this pull request Jul 17, 2024
hugovk pushed a commit that referenced this pull request Aug 9, 2024
…119507) (#120314)

Co-authored-by: Hugo van Kemenade <[email protected]>
fix _io.TextIOWrapper.write() write during flush (#119507)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes 3.13 bugs and security fixes 3.14 new features, bugs and security fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

_io.TextIOWrapper.write: write during flush causes pending_bytes length mismatch
3 participants