Reduce memory requirements #169

Open
lkwdwrd opened this issue Dec 29, 2015 · 6 comments
Comments

@lkwdwrd (Contributor) commented Dec 29, 2015

Right now the parser has very large memory requirements. I had to raise my local install's available memory to run the parsing for core: 2GB on the VM and 1GB for PHP. That leaves some headroom, but it regularly takes around 400MB to parse core initially and 320MB on a subsequent update run. The import script can run hundreds of thousands of queries, and, simply put, PHP isn't meant to run for long periods of time, so it leaks.

I would love to see some work done to audit where the memory leaks are and shore up the worst ones. I did a bit of exploring and determined that parsing should never be done with the SAVEQUERIES constant set to true. That should have been obvious, but I have it on for all my dev sites and didn't think about the saved queries eating up memory during the CLI call. Turning it off alone halves the memory requirement (~800MB before turning it off).
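
Something roughly like this at the start of the CLI run could at least surface the problem. It's only a rough sketch; SAVEQUERIES and $wpdb->queries are core WordPress, but where exactly the check would live in the parser is an open question:

```php
// Hypothetical guard near the start of the parser's CLI run. SAVEQUERIES and
// $wpdb->queries are WordPress core; the placement of this check is an assumption.
if ( defined( 'SAVEQUERIES' ) && SAVEQUERIES ) {
	WP_CLI::warning( 'SAVEQUERIES is enabled; every query will be kept in memory for the whole run.' );

	global $wpdb;
	$wpdb->queries = array(); // discard anything logged so far
}
```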

Nothing else I have explored has helped much, and some of it even increased the requirements. It'll take some digging because memory tools in PHP are rudimentary at best.

@atimmer (Member) commented Dec 30, 2015

From looking into this earlier I think it comes down to having a representation of all parsed content in a giant object that is then passed to a function that imports it into WordPress.

I think the only meaningful way to reduce the memory footprint is to stop doing that.

@lkwdwrd (Contributor, Author) commented Dec 30, 2015

It definitely starts that way. Then usage doubles or more during the import itself.


@JDGrimes (Contributor) commented:

What if we parsed and stored the data for each file separately? That way we'd only ever need to have one file in memory at a time. We could then import the files one at a time too. It might also make it easier to allow partial parsing in the future, only reparsing the files that have changed since the last run, for example.

So, instead of one giant JSON file, we'd have a bunch of smaller ones in a directory.
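
A rough sketch of the idea, using get_source_files() and parse_file() as hypothetical stand-ins for whatever the parser actually exposes per file (the output layout is just illustrative):

```php
// Parse one file, write its JSON, free it, move on, so only a single file's
// parsed data is ever held in memory. Helper names here are placeholders.
foreach ( get_source_files( $source_dir ) as $file ) {
	$data   = parse_file( $file );                       // only this file in memory
	$target = trailingslashit( $output_dir ) . md5( $file ) . '.json';
	file_put_contents( $target, json_encode( $data ) );
	unset( $data );                                      // release before the next file
}
```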

@JDGrimes (Contributor) commented:

Or we could keep one file but stream to/from it instead of pulling the whole thing into memory at once (sort of like what's being done with the WordPress Importer plugin).
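
One way to sketch that for JSON would be a line-delimited export (one object per line) read back with fgets(), so only a single record is ever decoded at a time. That would change the export format, so this is just an illustration of the idea, not what the plugin does today:

```php
// Read a line-delimited export one record at a time instead of decoding the
// whole file at once. import_item() is a hypothetical stand-in for importing
// a single parsed file's data.
$handle = fopen( $export_file, 'r' );
while ( ( $line = fgets( $handle ) ) !== false ) {
	$item = json_decode( $line, true );
	if ( null !== $item ) {
		import_item( $item );
	}
	unset( $item );
}
fclose( $handle );
```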

@lkwdwrd (Contributor, Author) commented Dec 30, 2015

Great ideas! This could definitely go hand in hand with making the import process a lot more modular as well. Right now it's kind of an importing god object with a massive god function inside it.

Doing the import in pieces would also let us do some cleanup between each file, even in the create workflow. We'll likely still need to work out why we're leaking so much memory on the WP side of things; I don't think that's anything the importer is doing, but it's a good start.
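
For the in-between cleanup, something like this between files might help. Both calls are core WordPress, though how much memory they actually win back here would need profiling:

```php
// Possible cleanup between imported files; whether it recovers enough
// memory in practice would need measuring.
function free_memory_between_files() {
	global $wpdb;

	$wpdb->queries = array(); // drop logged queries if SAVEQUERIES is on
	wp_cache_flush();         // empty the in-memory object cache
}
```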

@JDGrimes (Contributor) commented:

https://blackfire.io/ might be helpful here.
