-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add pooling and request limiting to prevent memory leaks #110
Add pooling and request limiting to prevent memory leaks #110
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is looking good.
- there's use of the
/**
JSDoc comment opener, but we aren't particularly following JSDoc conventions. it would be nice if we were consistent here. - do you think this same issue is likely present in the web version of the Playground? maybe it's rarer since people won't keep it open as long as they keep
wp-now
running? is there an easy way you can think of to refresh the PHP instance from within the Playground itself so that tools likewp-now
don't also need to apply this kind of fix?
packages/wp-now/src/pool.ts
Outdated
maxRequests = 2000, | ||
maxJobs = 5, | ||
} = {}) { | ||
Object.assign(this, { spawner, maxRequests, maxJobs }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a style note, but I think we would generally prefer to set these like this.spawner = spawner
did you see this pattern elsewhere here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's just something I use as shorthand for multiple assignments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dmsnell This has been refactored:
playground-tools/packages/wp-now/src/pool.ts
Lines 160 to 166 in 75bd3eb
Object.defineProperties(this, { | |
maxRequests: { value: maxRequests }, | |
maxJobs: { value: maxJobs }, | |
spawn: { value: spawn }, | |
fatal: { value: fatal }, | |
reap: { value: reap }, | |
}); |
packages/wp-now/src/pool.ts
Outdated
} = {}) { | ||
Object.assign(this, { spawner, maxRequests, maxJobs }); | ||
this[Reap](); | ||
this[Spawn](); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the purpose of using these symbols instead of simply using reap
, spawn
, and fatal
as names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's just a pattern I use for private methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dmsnell This has been refactored:
playground-tools/packages/wp-now/src/pool.ts
Lines 36 to 85 in 75bd3eb
const spawn = async (pool: Pool) => { | |
const newInstances = new Set(); | |
if (pool.maxJobs <= 0) return newInstances; | |
while (pool.instanceInfo.size < pool.maxJobs) { | |
const info = new PoolInfo(); | |
const instance = await pool.spawn(); | |
pool.instanceInfo.set(instance, info); | |
info.active = true; | |
newInstances.add(instance); | |
} | |
return newInstances; | |
}; | |
/** | |
* Reaps children if they've passed the maxRequest count. | |
* @param pool the pool object to work on | |
* @private | |
*/ | |
const reap = (pool: Pool) => { | |
for (const [instance, info] of pool.instanceInfo) { | |
if (pool.maxRequests > 0 && info.requests >= pool.maxRequests) { | |
info.active = false; | |
pool.instanceInfo.delete(instance); | |
pool.reap(instance); | |
continue; | |
} | |
} | |
}; | |
/** | |
* Handle fatal errors gracefully. | |
* @param pool the pool object to work on | |
* @param instance the php instance to clean up | |
* @param error the actual error that got us here | |
* @private | |
*/ | |
const fatal = (pool: Pool, instance: PhpInstance, error: Error) => { | |
console.error(error); | |
if (instance && pool.instanceInfo.has(instance)) { | |
const info = pool.instanceInfo.get(instance); | |
info.active = false; | |
pool.instanceInfo.delete(instance); | |
} | |
return pool.fatal(instance, error); | |
}; |
packages/wp-now/src/pool.ts
Outdated
const idleInstanceNext = this.getIdleInstance(); | ||
|
||
const next = this.backlog.shift(); | ||
const info = this.instances.get(idleInstanceNext); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
couldn't ideInstanceNext
here be false too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this function is kicked off by the resolution of a previous promise that SHOULD free up the instance its using, I could see how it would be possible for a FALSE to come in here if we get a request at JUST the wrong millisecond.
This is a little complicated because we're actually managing TWO pools here, one of backlogged requests and one of idle/running PHP instances, but I think I can come up with something robust to that case.
packages/wp-now/src/pool.ts
Outdated
// instance is done spawning. | ||
request = next(await nextInstance); | ||
} | ||
catch(error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dmsnell Let me know what you think of this logic. The alternative to re-queuing after a failed spawn is to return a 502 bad gateway.
Interested to know your thoughts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused on what happens in these cases and probably need to think about it more. One thing that worries me is that if there is something intrinsic in the request that led to the failure, and we unshift
it to the beginning of the queue, could we get into situations that lead to infinite and infinitely-fast re-death of newly spawned processes. I'm not sure if that worry is sound though, as I don't fully understand what's happening with these workers. It's not like they are OS processor or have the same characteristics of overhead and startup time.
In general, a 500
seems like an acceptable response. Why specifically a 502? The lines here are blurry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I recall correctly an HTTP daemon will serve a 502 if it can't properly instantiate PHP to handle a request, but a 500 seems valid as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adamziel @dmsnell Refactored to return the error to calling code:
playground-tools/packages/wp-now/src/pool.ts
Lines 267 to 273 in 75bd3eb
// Grab the reject handler from the notfier | |
// promise and run it if the request rejects. | |
request.catch((error) => { | |
const resolver = this.resolvers.get(next); | |
this.resolvers.delete(next); | |
resolver.reject(fatal(this, nextInstance, error)); | |
}); |
The comment style is a force of habit, I can look over the existing style and make it consistent. If this code need to be upstreamed into wordpress-playground, that is 100% possible. The Pool class is not coupled to anything in particular, so we can absolutely provide it out of the box in wordpress-playground. |
Is this ready for another review now? |
I saw this came up in the review queue, but there are some failing CI checks and open conversations so it seems like it's not ready for a review yet. |
@adamziel That's going to fail until we plan a new release of |
const nodePHPOptions: PHPLoaderOptions = { | ||
requestHandler: { | ||
documentRoot, | ||
absoluteUrl: options.absoluteUrl, | ||
isStaticFilePath: (path) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing this triggers a lint error above – seemsLikeAPHPFile
is now dead code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strange, I was getting linter errors on my end as if it were deprecated. That's what prompted me to remove it to begin with.
I'll double check that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adamziel Looks like that type doesn't allow for the isStaticFilePath
key. Is the isStaticFilePath
key not expected to be removed? If I'm missing something, lets revisit this one tomorrow morning.
GitHub search for isStaticFilePath
: https://github.com/search?q=org%3AWordPress+isStaticFilePath&type=code
Current Commit (30a561d97bdd7dee6674528dfff95c912bc734d3
) https://github.com/WordPress/wordpress-playground/blob/30a561d97bdd7dee6674528dfff95c912bc734d3/packages/php-wasm/universal/src/lib/php-request-handler.ts#L17-L27
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adamziel That's been removed.
packages/wp-now/src/pool.ts
Outdated
*/ | ||
class PoolInfo { | ||
id: number; // Unique ID for debugging purposes. | ||
started: number; // Time spawned. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be unused – let's remove it.
Looking through this I didn't see any kind of manual memory allocation for spawning a new process, but I didn't check behind the diff to see if that's there or if the question even makes sense. Still, I was thinking about the comment about the linear memory and fragmentation (thanks for adding that) and wondered if we have any control over this, and whether we could add in a later PR a custom allocator for each PHP process so we could ensure reuse and a memory limit. E.g. if we set PHP's max memory, we could preallocate a multiple of that for every worker we build, then let new workers use the now-reaped memory from the previous worker. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for your continued efforts on this @seanmorris. I thought it was closer to being ready, but in my review here I uncovered some things that I think do warrant some extra attention:
- an architectural question about the approach we're taking for eagerly pre-spawning instances and using new instances vs. reusing existing ones.
- several questions about names that don't fully match function or intent, and lots of documentation that's going 99% of the way to being useful but misses out because it's not following the JSdoc conventions.
- missing types that could thread documentation and consistency through the multiple layers of the system that PHP instances and HTTP requests take through this. specifically we should have some type that describes the
PhpInstance
. if it's different for Node and the web that's fine, we can add a type parameter toclass Pool<PhpInstance extends BaseInstance> { }
and then have theNodePool
and default pool extendBaseInstance
however they want, but it should be there on all the functions insidePool
that work with requests. therequest
type is similar, and should take one of thesePhpInstance
values as its input.
I'm trying to summarize the issues here, but please check on all the comments as there are other items in there that would be good to address.
@@ -43,6 +45,8 @@ export interface WPNowOptions { | |||
wpContentPath?: string; | |||
wordPressVersion?: string; | |||
numberOfPhpInstances?: number; | |||
maxRequests?: number; | |||
maxJobs?: number; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the interplay here between maxJobs
and numberOfPhpInstances
?
we didn't recreate the same config value did we?
could we use numberOfPhpInstances
to set maxJobs
or are these somehow indicating different things: number of workers in a pool vs. number of pools? would it make sense to only use one and then to leave the existing config for WPNowOptions
?
@@ -20,6 +20,8 @@ export interface CliOptions { | |||
port?: number; | |||
blueprint?: string; | |||
reset?: boolean; | |||
maxRequests?: number; | |||
maxJobs?: number; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it's late in the process but now as this is basically ready I'm questioning how clear these names are. we're not limiting the total number of requests, and what is a job?
what if we either renamed them or created them as sub-options like poolOptions
?
something like maxRequestsPerWorkerLifespan
is verbose but does far more to communicate the role of the config, and maxPhpWorkers
(or even the already existing numberOfPhpInstances
) skips the term for one that's already in scope when creating one of these config values.
class PoolInfo { | ||
id: number; // Unique ID for debugging purposes. | ||
requests = 0; // Total requests processed. | ||
active = false; // Whether instance is considered active. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems clear that active
conveys some sense of being active
, but I am confused on two things:
- what does "active" mean? e.g. "// An instance is considered active if it's currently handling a request."
- this is a property of a
PoolInfo
class, so which instance is considered active? we've been discussingPhpInstance
and other instances. should this be "Whether this pool is [whatever active means]"?
as noted in the next comment, we would get a lot for almost no effort by moving these into JSDoc comments
class PoolInfo { | ||
id: number; // Unique ID for debugging purposes. | ||
requests = 0; // Total requests processed. | ||
active = false; // Whether instance is considered active. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we're already explaining these, let's use JSDoc syntax so that the comments follow the values.
/** Unique ID for debugging purposes. */
id: number;
/** Total requests processed by this pool/instance/something? */
requests = 0;
/** This pool is considered active if X Y Z */
active = false;
}; | ||
|
||
/** | ||
* Reaps children if they've passed the maxRequest count. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a description of what the code currently does, but it would be less fragile as a comment if we described why it does what it does.
something like this…
/**
* Reaps (terminates) all of the PHP instances
* in a given pool that are deemed "too old."
*
* In order to prevent memory leaks inherent in
* a long-living PHP worker, this function reaps
* PHP instances after they have handled more
* than the configured number of maximum requests.
*
* There could be different conditions for reaping,
* but the intent is to limit the extent of a
* memory leak and ensure any proper cleanup happens
* even if a process goes astray.
*
* @private
*
* @param pool Reap old PHP instances in this pool.
*/
const reap = (pool: Pool) => { … }
|
||
request = next(nextInstance); | ||
} catch (error) { | ||
// Grab the reject handler from the notfier |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: notfier
-> notifier
this.running.add(nextInstance); | ||
info.requests++; | ||
|
||
let request; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
notice that at this point, if we don't rename the request
type, we will have a type and a variable with the same name in the same scope. they occupy different namespaces in TypeScript, which means there's no clash, but that's needlessly confusing if someone is trying to figure out what's happening
try { | ||
const newInstances = await spawn(this); | ||
|
||
// ... but, if we've just spanwed a fresh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are we preferring a newly-spawned instance over the one that just finished? is there any advantage to this inside our WASM world?
I would have guessed that passing the next request to the existing worker would exhaust the worker faster and prevent needing to create additional workers, which would lead to a lower overall memory use.
this also leads me to wonder about the pre-allocation of workers and if that's necessary. if we have maxJobs
(or any more suitable name) and we're going to eagerly pay the costs of allocating them, why not defer allocation and initialization until those workers are required: that is, until we have a request come in when there are no non-active workers.
would you see any problem in reaping workers as soon as they complete a request and are over the max-requests threshold, and then only spawning new workers once there are no available non-active workers in the pool?
return; | ||
} | ||
|
||
const completed = onCompleted(nextInstance); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this line has no explanation, but it's calling onCompleted
from within onCompleted
, and is a bit dizzying trying to follow the flow of data inside of it, and inside the recursion.
I think here a different model that examines one request at a time could potentially remove a fair amount of logic and simplify everything in this flow. I'm curious to see if you also see the value and opportunity there.
@@ -0,0 +1,322 @@ | |||
type PhpInstance = any; | |||
|
|||
type request = (instance: PhpInstance) => Promise<any>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's leave our types capitalized: request
-> Request
additionally, it would help if we had a proper type for an instance and for a request. they don't need to be here, but somewhere would be good. here is a start I tossed together.
import { HTTPMethod } from '@php-wasm/universal';
interface HttpRequest {
url: string;
headers: Record<string, string>;
method: HTTPMethod;
files: Record<string, File>;
body: string;
}
interface BaseInstance {
request(data: HttpRequest): Promise<void>;
}
it could be that these already exist somewhere and I didn't see them. they will help later on, in places like start-server.ts
For posterity – this would be useful in php-wasm core as the web version of Playground suffers from the same memory leak issue. |
…a certain number of requests Supersedes WordPress/playground-tools#110 Prevents PHP's memory leaks from causing the system to stop working once memory has been exhausted. It follows in the footsteps of systems like PHP-FPM by limiting the number of requests a PHP "thread" can handle before it is discarded and restarted. Implementing a solution in core enables all downstream consumers of the PHP WASM package to benefit. ## How it works Adds a rotatedPHP function that takes a PHP factory function and returns a Proxy that manages the PHP instance internally. The Proxy will discard the PHP instance after a certain number of requests, and will create a new instance when the next request comes in. Example: ```ts const php = rotatedPHP({ createPhp: () => WebPHP.load('8.0'), maxRequests: 50, }); // After 50 request() calls, the PHP instance will be discarded and a new one will be created. // It all happens internally. The caller doesn't know anything about this. ``` I started by porting the "Process Pool" idea from WordPress/playground-tools#110, but I realized that: * The logic was complicated * We only need to manage a single PHP instance, not a pool of them * Worker thread is built to a single PHP instance and not architected to handle a pool So I decided to simplify the logic and just manage a single PHP instance. ## Remaining work * Unit tests * Comments * Better names perhaps, `rotatedPHP` doesn't seem ideal CC @dmsnell
I'm exploring an alternative in Playground core here: |
…a certain number of requests (#990) Swaps the internal PHP runtime for a fresh one after a certain number of requests are handled. ## Rationale PHP and PHP extension have a memory leak. Each request leaves the memory a bit more fragmented and with a bit less available space than before. Eventually, new allocations start failing. Rotating the PHP instance may seem like a workaround, but it's actually what PHP-FPM does natively: Implementing this solution in core enables all downstream consumers of the PHP WASM package, like `wp-now`, to benefit. ## How it works Adds a `rotatePHP` function that keeps track of the `BasePHP.run()` calls and replaces the Emscripten runtime with a new one after a certain number of calls. Example: ```ts const loadRuntime = async () => { return await NodePHP.loadRuntime("8.0"); } const php = await rotatedPHP({ php: new NodePHP(await loadRuntime(), { documentRoot: '/test-root', }), recreateRuntime: loadRuntime, maxRequests: 50 }); // After 50 request() calls, the PHP runtime will be discarded and a new one will be created. // It all happens internally. The caller doesn't know anything about this. ``` I started by porting the "Process Pool" idea ([see the branch](https://github.com/WordPress/wordpress-playground/compare/harvested-memory-pool?expand=1)) from WordPress/playground-tools#110, but I realized that: * The logic was complicated * We only need to manage a single PHP instance, not a pool of them * Worker thread is built to a single PHP instance and not architected to handle a pool So I decided to simplify the logic and just manage a single PHP instance. ## Other changes * Implements `BasePHP.hotSwapPHPRuntime()` that opens the door to switching PHP versions dynamically and without a full page reload. * Makes the `BasePHP.semaphore` property public to enable conflict-free runtime swaps * Adds the `runtime.initialized` and `runtime.beforedestroy` events for OPFS handlers to re-bind on runtime swap. * Tiny unrelated change: Migrates the OPFS handler from monkey-patching the `php.run` method to relying on event listeners. The `rotatePHP` function went through a similar journey and I updated that other code path while I was on it. Supersedes WordPress/playground-tools#110
WordPress/wordpress-playground#990 just landed and enables a drop-in solution to the memory leak problem. Instead of this: const php = await NodePHP.load("8.0"); You would now do this: import { rotatePHPRuntime } from '@php-wasm/universal';
const loadRuntime = async () => await NodePHP.loadRuntime("8.0");
const php = new NodePHP(await loadRuntime(), {
documentRoot: '/test-root',
});
await rotatePHPRuntime({
php,
recreateRuntime: loadRuntime,
maxRequests: 400
}); And the PHP runtime would be magically cleaned up and recreated every 400 requests. Adapting |
Nice! @adamziel , I think we need to release a new version of |
I created the PR for wp-now: #152 Currently is on version 0.5.6 that was released 18 days ago, and the new function was introduced 4 days ago on WordPress/wordpress-playground#990 https://www.npmjs.com/package/@php-wasm/node |
Agreed, I started a new issue to track it at WordPress/wordpress-playground#998
You're right! v0.6.0 release is in progress now |
…xport-functionality Add import and export logic
What is this PR doing?
packages/wp-now/src/start-server.ts
.--inspect
and--inspect-brk
switches have been added to allow the debugging of nodejs based WASM scripts in chrome dev tools.--trace-warnings
,--trace-uncaught
, and--trace-exit
flag have also been added to aid in debugging.Strucklines moved to #132What problem is it solving?
PHP will eventually hit a memory leak when handling many requests one after another.
How is the problem addressed?
This PR adds behavior to track the number of requests a given PHP instance has handled, and discards it after a certain amount. A pool of PHP instances can be maintained so a fresh instance is always available, while discarded instances are replaced.
Testing Instructions
in
wordpress-playground
This requires the latest development version of
wordpress-playground
, so we'll need to check out thetrunk
branch and link some packages first:Then, switch to the
dist/packages/playground/blueprints/
directory and link the package:In another terminal, switch to the
dist/packages/php-wasm/node/
directory and link that package as well:In this project:
Run:
Navigate to http://localhost:8881/
Set up auto refresh to refresh every few seconds and leave it running indefinitely.
You can use the this extenstion to do that in chrome: https://chromewebstore.google.com/detail/amazing-auto-refresh/lgjmjfjpldlhbaeinfjbgokoakpjglbn