Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-106008: Make implicit boolean conversions explicit #106003

Merged
merged 18 commits into from
Jun 29, 2023

Conversation

brandtbucher
Copy link
Member

@brandtbucher brandtbucher commented Jun 23, 2023

...and specialize them!

This adds a new TO_BOOL bytecode that prefixes all UNARY_NOT, POP_JUMP_IF_TRUE, and POP_JUMP_IF_FALSE instructions, which now require an exact boolean. We also use a spare bit in COMPARE_OP's oparg to indicate whether the result should be converted to bool (this saves a TO_BOOL for most branches, and is a no-op for all COMPARE_OP specializations).

"0% faster". Stats show a 93.5% hit rate for the new instructions.


📚 Documentation preview 📚: https://cpython-previews--106003.org.readthedocs.build/

@brandtbucher brandtbucher added performance Performance or resource usage interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Jun 23, 2023
@brandtbucher brandtbucher self-assigned this Jun 23, 2023
@brandtbucher brandtbucher changed the title Make implicit boolean conversions explicit GH-106008: Make implicit boolean conversions explicit Jun 23, 2023
@brandtbucher brandtbucher marked this pull request as ready for review June 23, 2023 03:53
Copy link
Member

@markshannon markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor comments, otherwise LGTM.

}

inst(TO_BOOL_BOOL, (unused/1, unused/2, value -- value)) {
// Coolest (and dumbest-named) specialization ever:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but the not the most useful comment for someone trying to understand the code 🙂


inst(TO_BOOL_NONE, (unused/1, unused/2, value -- res)) {
// This one is a bit weird, because we expect *some* failures...
// it might be worth combining with TO_BOOL_ALWAYS_TRUE somehow:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we decided it wasn't, as it reflects the underlying type instability when doing if x: as a stand in for if x is None.

inst(TO_BOOL_STR, (unused/1, unused/2, value -- res)) {
DEOPT_IF(!PyUnicode_CheckExact(value), TO_BOOL);
STAT_INC(TO_BOOL, hit);
if (Py_Is(value, &_Py_STR(empty))) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use value == &_Py_STR(empty). The semantics is value == "", not value is "".
In general I wouldn't use Py_Is except when you want the exact semantics of Python's x is y.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I mean, we are checking for the identity of the singleton string here. I'll just change it, though.

Copy link
Member

@markshannon markshannon Jun 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hypothetically we could have more than one "" object. I don't think that " ".strip() is "" is part of the language spec, so Py_Is doesn't add any safety, just obfuscation.

We can also potentially have tagged ints, in which case Py_Is would need to become a lot more complex, but value == &_Py_STR(empty) would remain efficient.


inst(TO_BOOL_ALWAYS_TRUE, (unused/1, version/2, value -- res)) {
// This one is a bit weird, because we expect *some* failures...
// it might be worth combining with TO_BOOL_NONE somehow:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above.

}
}
assert(PyBool_Check(cond));
JUMPBY(oparg * Py_IsFalse(cond));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so much more pleasing 🙂

@@ -107,6 +107,8 @@ _Py_GetSpecializationStats(void) {
err += add_stat_dict(stats, COMPARE_OP, "compare_op");
err += add_stat_dict(stats, UNPACK_SEQUENCE, "unpack_sequence");
err += add_stat_dict(stats, FOR_ITER, "for_iter");
err += add_stat_dict(stats, TO_BOOL, "to_bool");
err += add_stat_dict(stats, SEND, "send");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I tend to forget about the stats dict .

@markshannon
Copy link
Member

Note for possible future PR:

We could, at the cost of two bits in tp_flags avoid the version number and combine the ALWAYS_TRUE and NONE specializations.
We need a ALWAYS_TRUE_OR_FALSE and a IS_TRUE bit.
Classes that don't override __bool__ or __len__ and None would set the ALWAYS_TRUE_OR_FALSE bit. The IS_TRUE bit would be set to 0 for None, and to 1 for always true objects.

Rather than check the version number, check the ALWAYS_TRUE_OR_FALSE, then res = (tp_flags & IS_TRUE) ? Py_True : Py_False;

For abi4, we could add a per-object bit to handle immutable objects like ints and strings.

}
}
Py_DECREF(newconst);
return index;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

windows compiler warning here, implicit cast from Py_ssize_t to int

@brandtbucher
Copy link
Member Author

Merging is currently blocked on #106250.

@brandtbucher brandtbucher merged commit 7b2d94d into python:main Jun 29, 2023
@gvanrossum
Copy link
Member

Hey @brandtbucher, I have a question about this PR. In #106393 I had to change the code in POP_JUMP_IF_TRUE/FALSE from

JUMPBY(oparg * Py_IsFalse(cond));

to

if (Py_IsFalse(cond)) {
    JUMP_POP_DISPATCH(oparg, 1);  // Macro that wraps JUMPBY()
}

The reason is that the uop executor currently exits whenever it jumps, and your original code from this PR always jumps.

Did you (or @markshannon) have a reason to prefer the oparg * Py_IsFalse(cond) version over the conditional?

If there's no deep reason I'll keep it the way I coded it up; but if there is (maybe it came out faster in a micro-benchmark?) then I suppose I could fix it another way in the uop interpreter (e.g. only exiting if the jump offset is nonzero).

@markshannon
Copy link
Member

That the implementation of branches is itself branchless has a certain aesthetic appeal 🙂. TBH, that's the main reason.

The multiplication form will be quicker for unpredictable branches, and slower for predictable ones in the tier 1 interpreter.
I doubt it makes any difference in terms of overall performance, and will likely be irrelevant once the tier 2 interpreter does most of the work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants