Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve performance in simplexml #2855

Closed
wants to merge 1 commit into from

Conversation

danslo
Copy link
Contributor

@danslo danslo commented Jun 4, 2014

When casting a SimpleXML element to boolean, we're only concerned with
the object having an XML node, or having attributes. Building
attributes is done by sxe_get_prop_hash that also recursively builds the
entire element tree starting from that element.

We shortcircuit when the object has a valid XML node to prevent excessive
amount of object creation when all your program is doing is loading
XML strings and doing read/toBool operations.

Before: http://i.imgur.com/TFFj97u.gif
After: http://i.imgur.com/3kNpwIM.gif

@JoelMarcey JoelMarcey self-assigned this Jun 4, 2014
Array properties = Array::Create();
sxe_get_prop_hash(sxe, true, properties);
return node != nullptr || properties.size();
if (node) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style nit: it's a bit easier to read to lower the amount of nesting if you remove the else block and just do:

if (node) return true;
Array properties ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danslo I can make this style change for you in the internal pull.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already fixed it :)

When casting a SimpleXML element to boolean, we're only concerned with
the object having an XML node, *or* having attributes. Building
attributes is done by sxe_get_prop_hash that also recursively builds the
entire element tree starting from that element.

We shortcircuit when the object has a valid XML node to prevent excessive
amount of object creation when all your program is doing is loading
XML strings and doing read/toBool operations.

Before: http://i.imgur.com/TFFj97u.gif
After:  http://i.imgur.com/3kNpwIM.gif
@JoelMarcey
Copy link
Contributor

@danslo Thanks! From my end, this looks like a nice fix. Excellent. This has been pulled in.

Internal Diff Number: D1364022

@fredemmott
Copy link
Contributor

@danslo: not directly related to this issue, but how much are you planning on working on libxml stuff?
The memory model used in DOM* and SimpleXML in HHVM is broken; it's currently an owner flag. This causes read-after-free and double free (if you're creative) if the owner goes out of scope (and decref-freed), but you have another variable containing a reference that's still in scope. I'll probably be addressing this in a few weeks - but it'll be a fairly substantial change to both. #2408 contains some example code

@JoelMarcey JoelMarcey closed this in ea535b8 Jun 5, 2014
@danslo danslo deleted the simplexml-bool-perf branch June 5, 2014 04:52
@danslo
Copy link
Contributor Author

danslo commented Jun 5, 2014

@fredemmott SimpleXML is a priority for me because its heavily used in Magento. You're right, in php-src there's a shared object between SimpleXML and DOMDocument, stuff like php_libxml_increment_node_ptr. I did not port that over to limit the scope of my work.

Between SimpleXMLElement and SimpleXMLElementIterator, I work around the issue by just incReffing and decReffing in the right places.

I can work with you on that, but prefer spending time in other areas (currently a bunch of SOAP stuff). I also still need to get around converting SimpleXML to HNI, do the OSX fastcgi and JIT stuff. So much to do, so little time :-)

danslo added a commit to danslo/hhvm that referenced this pull request Jun 6, 2014
This is a follow up to PR facebook#2855.

When we need to build our element's child tree (sxe_get_prop_hash) to
determine if it casts to true/false, we can get away with bailing after
finding the first element.

In my small Magento siege tests these result in significant
improvements:

Before:
Transactions:               1551 hits
Transaction rate:           51.79 trans/sec
Lowest response time:       97ms
Callgrind:                  http://i.imgur.com/npxMo0q.png

After:
Transactions:               1730 hits
Transaction rate:           58.43 trans/sec
Lowest response time:       89ms
Callgrind:                  http://i.imgur.com/Dgojb9Y.png
facebook-github-bot pushed a commit that referenced this pull request Jun 6, 2014
Summary: This is a follow up to PR #2855.

When we need to build our element's child tree (sxe_get_prop_hash) to
determine if it casts to true/false, we can get away with bailing after
finding the first element.

In my small Magento siege tests these result in significant
improvements:

Before:
Transactions:               1551 hits
Transaction rate:           51.79 trans/sec
Lowest response time:       97ms
Callgrind:                  http://i.imgur.com/npxMo0q.png

After:
Transactions:               1730 hits
Transaction rate:           58.43 trans/sec
Lowest response time:       89ms
Callgrind:                  http://i.imgur.com/Dgojb9Y.png
Closes #2868

Reviewed By: @ptarjan

Differential Revision: D1368727

Pulled By: @JoelMarcey
@danslo danslo restored the simplexml-bool-perf branch August 1, 2014 16:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants