Superfluous call to __init__ on str(x)/bytes(x) when __new__ returns an instance of str/bytes' subclass #104231
@gvanrossum This issue is about long-standing |
All the examples are using |
Reproducible with

class mycls:
    def __init__(self, text):
        print('mycls.__init__', type(self), id(self), repr(text))
        self.text = text
    def __bytes__(self):
        print('mycls.__bytes__', type(self), id(self))
        return self.text

class mybytes(bytes):
    def __new__(cls, obj):
        print('mybytes.__new__', cls, repr(obj))
        return super().__new__(cls, obj)
    def __init__(self, obj):
        print('mybytes.__init__', type(self), id(self), repr(obj))
        super().__init__()

out = bytes(mycls(mybytes(b'hello')))
print('out', type(out), id(out), repr(out))

In theory the

Here's an example using no builtin types, demonstrating the second

class foo:
    def __new__(cls, obj):
        print('foo.__new__', cls, repr(obj))
        if isinstance(obj, foo):
            return obj # Mimicking `__str__` returning an instance of `str`'s subclass
        else:
            return super().__new__(cls)
    def __init__(self, obj):
        print('foo.__init__', repr(self), repr(obj))

class bar1(foo):
    def __new__(cls, obj):
        print('bar1.__new__', cls, repr(obj))
        return super().__new__(cls, obj)
    def __init__(self, obj):
        print('bar1.__init__', repr(self), repr(obj))

class bar2(foo):
    def __new__(cls, obj1, obj2):
        print('bar2.__new__', cls, repr(obj1), repr(obj2))
        return super().__new__(cls, obj1)
    def __init__(self, obj1, obj2):
        print('bar2.__init__', repr(self), repr(obj1), repr(obj2))

out1 = foo(bar1(123))
print('out1', repr(out1))
print()
out2 = foo(bar2(123, 456))
print('out2', repr(out2))

Sample output on Python 3.11:
|
The behavior here is caused by the inconsistency of the:
This is indeed a bug. Change 1. breaks things, and change 2. means adding some ad-hoc rules for
Personal thoughts: I am in favor of changing 1., by adding one more conversion in
But on the other hand, changing 2. affects users' code in a minimal way. People may consider
Then please tone down the issue title to limit it to Alas, I am no expert on the Unicode or bytes implementations. Let's see if @serhiy-storchaka is interested. |
@sunmy2019 I don’t think we can change (1). The expectation that str(x) returns an exact str instance should not be violated. Also we should stop editing comments to add new information/opinions. |
Acknowledged. Changed the title. |
Changing (1), I mean: Changing (2), I mean: |
Maybe you can submit pull requests for each option so we can compare. |
Similar issues were discussed for numeric types. Added @mdickinson as an expert. It was decided that

Example:

class mycls:
    def __init__(self, value):
        print('mycls.__init__', type(self), id(self), repr(value))
        self.value = value
    def __index__(self):
        print('mycls.__index__', type(self), id(self))
        return self.value

class myint(int):
    def __new__(cls, obj):
        print('myint.__new__', cls, repr(obj))
        return super().__new__(cls, obj)
    def __init__(self, obj):
        print('myint.__init__', type(self), id(self), repr(obj))
        super().__init__()

out = int(mycls(myint(42)))
print('out', type(out), id(out), repr(out))

Output:
I am not sure that it was the best solution (an alternative is Now, for consistency, we need to deprecate |
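For anyone who wants to check the numeric precedent without reading printed traces, here is a minimal sketch that records the warning and the result type. The names MyInt and Indexable are hypothetical stand-ins for myint and mycls above, and the behavior shown is what I would expect on CPython 3.8 and later; it may change if the deprecation is ever turned into an error.

```python
import warnings


class MyInt(int):
    """Hypothetical int subclass that counts how often __init__ runs."""

    init_calls = 0

    def __init__(self, value):
        MyInt.init_calls += 1


class Indexable:
    """Hypothetical wrapper whose __index__ returns an int subclass instance."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value


# Record warnings instead of letting the default filters hide them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    out = int(Indexable(MyInt(42)))

deprecations = [w for w in caught if issubclass(w.category, DeprecationWarning)]

print(type(out), out, MyInt.init_calls, len(deprecations))
```

Because int() hands back an exact int here, MyInt.__init__ is not invoked a second time, which is the difference from the str/bytes case in this issue.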
I opened a draft PR for changing (1) #104247. Turns out that changing (2) is not feasible without breaking some APIs. Currently, there is no easy way to find out if the
This also makes sense. I think the agreement here would be |
FWIW, if consistency is deemed important here I wouldn't have much objection to |
So that would mean dropping the existing deprecation again. I have a gut reaction against such reversals, but we should think a bit harder about what's the best thing to do here. Should we just add the deprecation to Also, I should probably research this but I'm lazy -- is the current deprecation issued from |
The deprecation warnings are issued from
I don't have any preference on the given solutions, since all of them are compliant with the documentation and will break existing code in some way or another. If I have to choose, I'm in favor of the idea presented by the current PR - that is, make However I do wish to suggest that we clarify the documentation on the usage of subclasses in Python data models. IMHO the current wording on actual types used by the dunder methods is ambiguous to say the least. To quote the documentation on
Okay, then I am slightly leaning towards changing A separate PR with doc changes for subclass usage in the data model would also be welcome. Let's aim for Python 3.13. |
I will do that. And I think we should also clarify in the warning that "if not doing so, the behavior could be wrong". I will drop the implicit conversion.
Great, this will give us plenty of time. |
Created #108814 |
…d bytes() (pythonGH-112551) (cherry picked from commit 2223899) Co-authored-by: Serhiy Storchaka <[email protected]>
@serhiy-storchaka Your backports are ready to merge. |
Bug report
According to the documentation on object.__new__:

And, indeed, packages exist that utilize this feature to provide str-compatible custom types. However, when trying to get a "string representation" of such a str-compatible instance with str(x), the following happens:

- str is a type, so calling it essentially creates a str instance (__new__ and __init__).
- str implements __new__, so it is called.
- The return value of x.__str__ is checked to be an instance of str (or its subclasses), which is the case here.
- str.__new__.
- type.__call__ checks if this object is an instance of str or its subclasses, which is, again, the case here.
- __init__ is called and the object is returned.

The problem is that the return value of x.__str__ can be already initialized, or even have an __init__ signature incompatible with str. This is not an issue for str itself since str.__init__ does nothing and has a wildcard signature (according to tests I've done), but it is trivial to have a custom (and incompatible) __init__ and break things.

Proof of concept that shows __init__ was called the second time by str():

Sample output on Python 3.9:

A real-world example that breaks tomlkit:

Sample output:
This behavior was introduced by commit 8ace1ab 22 years ago, released in Python 2.3 and kept to this day.
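The steps listed above can be sketched in pure Python. This is only a rough approximation of type.__call__, not the actual C implementation, and Token, Holder and emulate_type_call are hypothetical names made up for illustration. It assumes the behavior reported in this issue, where str.__new__ hands back the str subclass instance unchanged.

```python
def emulate_type_call(cls, *args, **kwargs):
    """Rough pure-Python approximation of what type.__call__ does."""
    obj = cls.__new__(cls, *args, **kwargs)
    # __init__ is looked up on the type of the returned object and is only
    # skipped when that object is not an instance of cls (subclasses count).
    if isinstance(obj, cls):
        type(obj).__init__(obj, *args, **kwargs)
    return obj


class Token(str):
    """Hypothetical str-compatible type that counts its __init__ calls."""

    def __init__(self, *args):
        # getattr() because the attribute does not exist on the first call.
        self.init_calls = getattr(self, "init_calls", 0) + 1


class Holder:
    """Hypothetical object whose __str__ returns a str subclass instance."""

    def __init__(self, token):
        self.token = token

    def __str__(self):
        return self.token


token = Token("hello")  # normal construction
calls_after_construction = token.init_calls

result = str(Holder(token))  # builtin path
calls_after_str = token.init_calls

emulated = emulate_type_call(str, Holder(token))  # emulated path
calls_after_emulation = token.init_calls

print(result is token, calls_after_construction, calls_after_str, calls_after_emulation)
```

On the interpreter versions discussed in this thread I would expect the counter to go up by one for each of the two conversions, even though no new Token is created. Giving Token.__init__ a signature that does not accept a single positional argument turns the same calls into a TypeError.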
A possible solution is to check for the exact class (instead of also accepting subclasses) in type.__call__; however, I'm not sure if this behavior is compliant with the documentation. Changing str.__new__ to only allow str (and not its subclasses) to be returned by __str__ could also work around this issue, but may break even more stuff.

Your environment

Linked PRs

- str(x) a str, bytes(x) a bytes #104247
- __bytes__ and __str__ when returning strict subclass #108814
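As a rough illustration of the first possible solution mentioned in the bug report (checking for the exact class in type.__call__), here is a sketch using a metaclass. ExactInitMeta, Base and Derived are hypothetical names; Base stands in for str by passing an existing instance through __new__ unchanged. It does not show how CPython would implement the change, only what the exact-class rule means for user code.

```python
class ExactInitMeta(type):
    """Hypothetical metaclass: run __init__ only for exact instances of cls."""

    def __call__(cls, *args, **kwargs):
        obj = cls.__new__(cls, *args, **kwargs)
        # Exact class check instead of the isinstance() check that
        # type.__call__ performs by default.
        if type(obj) is cls:
            obj.__init__(*args, **kwargs)
        return obj


class Base(metaclass=ExactInitMeta):
    """Stand-in for str: an existing instance is returned as-is by __new__."""

    def __new__(cls, obj=None):
        if isinstance(obj, Base):
            return obj
        return super().__new__(cls)

    def __init__(self, obj=None):
        self.init_calls = getattr(self, "init_calls", 0) + 1


class Derived(Base):
    pass


d = Derived()   # exact instance of Derived, so __init__ runs once
same = Base(d)  # __new__ returns d; type(d) is not Base, so __init__ is skipped

print(same is d, d.init_calls)
```

With the default type.__call__ the second call would initialize d again. The trade-off is that any code relying on __init__ running for subclass instances returned from __new__ would stop working, which is presumably why it is unclear whether this is compliant with the documentation.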