-
Notifications
You must be signed in to change notification settings - Fork 7.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Include the source location in Closure names #13550
Conversation
This change makes stack traces involving Closures, especially multiple different Closures, much more useful, because it's more easily visible *which* closure was called for a given stack frame. The implementation is similar to that of anonymous classes which already include the file name and line number within their generated classname.
Zend/zend_compile.c
Outdated
if (op_array->fn_flags & ZEND_ACC_CLOSURE) { | ||
zend_string *filename = op_array->filename; | ||
uint32_t start_lineno = decl->start_lineno; | ||
unqualified_name = zend_strpprintf(0, "{closure:%s:%" PRIu32 "}", ZSTR_VAL(filename), start_lineno); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this something that may just be emitted for stack traces, instead of changing the closure name? Otherwise, this breaks de-duplication of file names for interned strings, storing the file name many many times. They can be very long too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having the information readily available can also be useful for debuggers and tracers as outlined by my mention of Xdebug in the PR description and the adjustments to the phpdbg and observer tests.
How would this be different from including the filename for anonymous classes which is already done? Would truncating the filename to just the basename help? This would only slightly reduce the usefulness within the name, but could greatly reduce the memory consumption.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anonymous classes are arguably many factors less common than closures. The default interned string buffer is only 8MB, so this change may potentially fill a significant amount of this memory with repeating strings.
If Xdebug already solves this problem, is there an aspect of it that does not suffice? Similarly, it seems to me observers and phpdbg can access this information from C, and output it to the user.
Note that I'm not questioning the usefulness of displaying the information in general. This is very clearly a win in that regard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default interned string buffer is only 8MB, so this change may potentially fill a significant amount of this memory with repeating strings.
I don't know how it works, but is this not an issue with regular function and method names as well? I expect regular functions and methods to exceed the number of closures and to also have mostly distinct names. Or is this more about the length of the resulting function name where truncating the path to the basename would help reduce the impact?
If Xdebug already solves this problem, is there an aspect of it that does not suffice?
That's probably where the “readily available” bit comes into play. By natively storing the information it does not need to be reimplemented in every tool and it also avoids the redundant computation of the extended closure name in hot loops. It's not particularly expensive, but at will likely require repeated allocations of a zend_string
(unless somehow cached, but then we're back to memory usage).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know how it works, but is this not an issue with regular function and method names as well? I expect regular functions and methods to exceed the number of closures and to also have mostly distinct names. Or is this more about the length of the resulting function name where truncating the path to the basename would help reduce the impact?
Right, the base name for all the file names is not deduplicated in the interned string buffer, nor are namespaces in class/function names, so there's already some redundancy. References to the same class/function are deduplicated. If a file contains hundreds of closures, the same name will be stored hundreds of time for just this file, one time for each closure.
Maybe this is a non-issue, it's just something to think about. Maybe we just need to bump the default interned string buffer size. 8MB is a fairly low amount. It would be good to know much this increases memory consumption for something like Symfony Demo, or Wordpress.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for checking! That's roughly +0.8%, which indeed doesn't sound too bad. Of course, it's hard to say whether this is representative, but at least we have a rough idea of the increase now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also not sure, why do we need to keep the names for closures.
We may always reconstruct them, concatenating "filename" and "start_lineno" when necessary (in zend_fetch_debug_backtrace).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may always reconstruct them, concatenating "filename" and "start_lineno" when necessary (in zend_fetch_debug_backtrace).
The “full” name of the function is not just useful for backtraces. This PR also adds the name to the __debugInfo()
output of closures, allowing users to easily see the closure origin when dumping a variable containing it. A reusable function for closure naming could be provided in php-src, but that would still require everyone to use it to achieve consistent naming results for the developer. By including it in the function name field there would be no inconsistencies. The additional function calls in various locations would also increase machine code size.
See also the second paragraph of my previous comment: #13550 (comment)
The question is whether the time-memory-tradeoff is worth it – both from a CPU time (for the repeated recomputation) and from a developer time (for the improved consistency) perspective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In production use we should rarely have calls to debug_backtrace
and __debugInfo()
, so that closure names are going to be never used (just kept in the memory). Both debug_backtrace
and __debugInfo()
already construct arrays and the additional name reconstruction won't make a big difference.
I think moving this name reconstruction code from zend_compile.c
into zend_closures.c
will only make the things more clear. This will also remove the new zend_string_copy()/zend_string_release()
calls that you introduced there for each op_array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've realized that adding the filename to the function name made the namespace effectively redundant. As Closures previously had a static “unqualified name” there should be no risk of collision either. Dropping the namespace reduced the total memory usage with the Symfony demo to
["interned_strings_usage"]=>
array(4) {
["buffer_size"]=>
int(8388608)
["used_memory"]=>
int(5067336)
["free_memory"]=>
int(3321272)
["number_of_strings"]=>
int(48739)
}
which is 209 additional interned strings and 33816 bytes of additional memory usage. Truncating the filename to just the basename would result in 13840 bytes of additional memory usage (but just the basename without the namespace might be too unspecific).
This change also allowed me to simplify the implementation in zend_compile.c
to just an additional if (op_array->fn_flags & ZEND_ACC_CLOSURE) {
, without touching the logic for regular functions at all.
What do you think?
This is redundant with the included filename.
Finally I'm not sure if this feature is necessary at all. The description of the patch notices only the changes of the Closure names in stack traces. But stack traces already include file names and lines numbers. So it's npt a big problem to locate the closure. The compound closure names may make sense but really not necessary, especially if they eat memory. Introduction of How will this new |
Closure is a final class, so this last point shouldn't be a concern. |
agree. |
From experience, I disagree. While the information is technically there, one needs to look into the callee stack frame to get the line number of the closure and even then it's not easy to see if two closures within a single stack trace are the same or different, because multiple different lines might refer to the same closure if the closure somehow calls itself recursively. Consider the following example: <?php
function error() {
throw new \Exception();
}
function cleanup(array $array) {
return array_map(function ($item) {
if (is_array($item)) {
return cleanup($item);
}
if ($item === 'error') {
error();
}
return $item;
}, $array);
}
cleanup([['asd', 'error']]); It results in this stack trace:
I see that the closures perform a call in lines 10 and 13 respectively. But I can't easily see whether that's the same closure or not. With this PR it looks like this:
It's clearly visible that both closures are the closure defined in test2.php on line 8, despite appearing twice in the stack trace with different line numbers of the callee frame.
Besides Gina's correct statement that |
I'm happy to write an RFC for this, if you feel that this is necessary. Of course I'd prefer to avoid the work if possible. |
I don't care about this change a lot and I don't need RFC myself. |
Two questions:
I'd still prefer avoiding changing the name and manually "fixing" this as part of |
related to php/php-src#13550 see analog symfony change: symfony/symfony#54614
related to php/php-src#13550 see analog symfony change: symfony/symfony#54614
related to php/php-src#13550 see analog symfony change: symfony/symfony#54614
related to php/php-src#13550 see analog symfony change: symfony/symfony#54614
related to php/php-src#13550 see analog symfony change: symfony/symfony#54614
related to php/php-src#13550 see analog symfony change: symfony/symfony#54614
* Adjust pretty name of closures on PHP 8.4 related to php/php-src#13550 see analog symfony change: symfony/symfony#54614 * php php_cs to 3.7.2 temporarily
related to php/php-src#13550 see analog symfony change: symfony/symfony#54614
This change makes stack traces involving Closures, especially multiple
different Closures, much more useful, because it's more easily visible which
closure was called for a given stack frame.
The implementation is similar to that of anonymous classes which already
include the file name and line number within their generated classname.
/cc @derickr, as Xdebug includes a similar functionality and he might've opinions.