Run PHP using the official integration API on AWS #100
Update:
Building our own runtime doesn't sound like such a bad idea honestly, we already have a good build script. |
Do you have a WIP branch that we can try out & contribute to? |
If you've managed to create a PHP 7.2 runtime easily enough, with everything Bref currently supports, and less buggy than the 3rd party version, then I can't see any real wins from depending on another project. |
Here is a summary.

### Intro

To have PHP support on Lambda with the new API we need 2 parts:

- the PHP binary & extensions
- the bootstrap file

Both are IMO loosely related. I'll talk about them separately.

### PHP binary & extensions

The Stackery binary in their "layer" is disappointing for now, I consider it unusable at the moment. Using Bref's scripts I have compiled PHP 7.2 and published a layer that provides the binary + extensions (+ a bootstrap file, but you can ignore it). The ARN of the layer is `…`.

This part is IMO the "easiest" to do and the less interesting one. It's basically about compiling PHP and publishing an AWS layer.

### Bootstrap file

Now THIS is where there are VERY interesting opportunities. This is where I'm incredibly excited! The bootstrap file is basically the process manager of the lambda. When the lambda starts, the bootstrap file is called (it can be a PHP script or anything executable). It is responsible for calling an HTTP API that stalls until an event is available to process by the Lambda. When this HTTP call finally returns the event data, the code of our PHP app should execute and process the event. Then the HTTP API should be called again to signal the end of the lambda's execution and provide the response to return.

Let's compare with PHP-FPM:
- With Bref 0.2 (before the new integration): …
- With the new Lambda integration: …
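For reference, the scenario examples below use two placeholder helpers, `waitForEventFromLambdaApi()` and `signalSuccessToLambdaApi()`. Here is a minimal sketch of how they might be implemented against the Lambda runtime HTTP API (the `2018-06-01/runtime/invocation` endpoints); this is an illustrative assumption, not Bref's actual code, and the scenario snippets gloss over the request-ID plumbing for brevity.

```php
<?php
// Minimal sketch, not production code. Assumes the Lambda runtime API
// endpoints from the AWS custom-runtime announcement; the helper names are
// the placeholders used in the scenario examples below.

function waitForEventFromLambdaApi(): array
{
    $api = getenv('AWS_LAMBDA_RUNTIME_API');
    $ch = curl_init("http://$api/2018-06-01/runtime/invocation/next");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 0); // this call blocks until an event is available
    $raw = curl_exec($ch);
    $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
    curl_close($ch);

    // Split headers from body and extract the invocation ID.
    $headers = substr($raw, 0, $headerSize);
    $body = substr($raw, $headerSize);
    preg_match('/Lambda-Runtime-Aws-Request-Id:\s*(\S+)/i', $headers, $matches);

    return [
        'requestId' => trim($matches[1] ?? ''),
        'event' => json_decode($body, true),
    ];
}

function signalSuccessToLambdaApi(string $requestId, $result): void
{
    $api = getenv('AWS_LAMBDA_RUNTIME_API');
    $ch = curl_init("http://$api/2018-06-01/runtime/invocation/$requestId/response");
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($result));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_exec($ch);
    curl_close($ch);
}
```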
The new-integration flow above is what we get if we simply port Bref's behavior to the new runtime API. But we can explore other solutions. For example, Stackery's runtime takes a different approach (its bootstrap starts PHP's built-in web server). This is an idea worth exploring. However, again, I do not consider this bootstrap usable at the moment (because of its quality, basically). BUT we are not limited to those options. Here are a few I tried.

### Scenario A

We run the PHP code in the same process as the `bootstrap`. This is very fast, both for cold starts and warm requests! We can get response times below 10ms with that. However, just like when using such frameworks outside of Lambda, we have disadvantages: the memory is shared between all requests. That means we can have memory leaks, we have to be careful about global state, etc. Also, a fatal error will kill the whole lambda (a new lambda will be started by AWS but that means a new cold start). This is a very interesting option that can be worth proposing, but it cannot be the solution that will work with all apps/frameworks.

Example of a `bootstrap`:

```php
<?php
// ...
require __DIR__ . '/vendor/autoload.php';
// BOOT Symfony BEFORE a request comes in!
$kernel = new Kernel('prod', false);
$kernel->boot();
$symfonyAdapter = new SymfonyAdapter($kernel);
while (true) {
// This is a blocking HTTP call until an event is available
$event = waitForEventFromLambdaApi();
$request = RequestFactory::fromLambdaEvent($event);
// REUSE the same Symfony Kernel, meaning fast response time!
$response = $symfonyAdapter->handle($request);
$lambdaResponse = LambdaResponse::fromPsr7Response($response);
signalSuccessToLambdaApi($lambdaResponse);
}
```

### Scenario B

The `bootstrap` waits for an event and then runs the PHP code in a new, separate process for each event. That allows us to protect the `bootstrap` from fatal errors and shared state in the application code. This is similar to how PHP-FPM works (in the spirit at least).

Example of a `bootstrap`:

```php
<?php
// ...
while (true) {
// This is a blocking HTTP call until an event is available
$event = waitForEventFromLambdaApi();
$process = new Process(['/opt/bin/php', 'index.php', /* pass the event too */]);
$process->setTimeout(null);
// This waits for the process to finish
$process->run();
// [fetch response ...]
signalSuccessToLambdaApi($lambdaResponse);
}
```

Example of an `index.php`:

```php
<?php
// ...
require __DIR__ . '/vendor/autoload.php';
// [fetch event from process args]
$kernel = new Kernel('prod', false);
$kernel->boot();
$symfonyAdapter = new SymfonyAdapter($kernel);
$request = RequestFactory::fromLambdaEvent($event);
$response = $symfonyAdapter->handle($request);
$lambdaResponse = LambdaResponse::fromPsr7Response($response);
// [return response to bootstrap somehow]
// DIE!
exit(0);
```

### Scenario C

Just like B, except that `index.php` fetches the event from the Lambda API itself: the `bootstrap` simply restarts the PHP process in a loop.

Example of a `bootstrap`:

```php
<?php
// ...
while (true) {
$process = new Process(['/opt/bin/php', 'index.php']);
$process->setTimeout(null);
// This waits for the process to finish (i.e. waits until an event has been processed)
$process->run();
}
```

Example of an `index.php`:

```php
<?php
// ...
require __DIR__ . '/vendor/autoload.php';
// BOOT Symfony BEFORE a request comes in!
$kernel = new Kernel('prod', false);
$kernel->boot();
$symfonyAdapter = new SymfonyAdapter($kernel);
// This is a blocking HTTP call until an event is available
$event = waitForEventFromLambdaApi();
$request = RequestFactory::fromLambdaEvent($event);
$response = $symfonyAdapter->handle($request);
$lambdaResponse = LambdaResponse::fromPsr7Response($response);
signalSuccessToLambdaApi($lambdaResponse);
// DIE!
exit(0);
```

### Scenario D

How about, instead of creating a new process, we fork the `bootstrap` process? The app would bootstrap once in total, but there is still no shared state between events (because each event is processed by a fork).

Example of a `bootstrap`:

```php
<?php
// ...
require __DIR__ . '/vendor/autoload.php';
// BOOT Symfony ONLY ONCE for all the requests!
$kernel = new Kernel('prod', false);
$kernel->boot();
$symfonyAdapter = new SymfonyAdapter($kernel);
while (true) {
$pid = pcntl_fork();
if ($pid) {
// Root process
// Wait for the child to process the event
pcntl_wait($status);
} else {
// Child process
// Here the autoloader is already loaded and Symfony initialized!
// This is a blocking HTTP call until an event is available
$event = waitForEventFromLambdaApi();
$request = RequestFactory::fromLambdaEvent($event);
$response = $symfonyAdapter->handle($request);
$lambdaResponse = LambdaResponse::fromPsr7Response($response);
signalSuccessToLambdaApi($lambdaResponse);
// The fork DIES! The root process will resume its execution
exit(0);
}
}
```

I find this scenario very interesting. This is something I've always wanted to try to implement with PHP-FPM (boot the app before a request comes in) but was never able to, because it requires knowing C.

### Conclusion

With Lambda's execution model and API it is now possible to basically recreate PHP-FPM, but with any language and without having to care about the load (because we handle only 1 event at a time). A new world of possibilities is opening! How about other ideas? How about workers or other types of events? How about websockets?

### Performances?

Here are a few benchmarks I did. I admit being disappointed by the performances of the new integration. These are Lambda execution times for handling one HTTP event from API Gateway (cold start excluded):
Symfony's performance is quite bad in general here, I don't know why (I did run in the prod environment with the cache generated). I don't trust my own benchmarks (I was a bit tired) so don't take them too seriously. I did not spend time measuring cold starts but they were between 100ms and 500ms. |
@nealio82 agreed. Also, if you want to try all of that out, see https://gist.github.com/mnapoli/573e4f36a241e458fe9395b779f87511 This is very rough for now, sorry about that. For the next weeks I have 2 full days every week to work on all of that. There is something awesome to be done here, I can feel it… Any help is welcome! I also tried diving into PHP's source code (mainly PHP-FPM and PHP's built-in webserver). If only I knew more about C, I'm sure a PHP-FPM clone that would work with AWS's API instead of FastCGI would kill everything. |
Did you compare the performance to the current Bref implementation? I guess ideally, scenario B would be faster, or at least equal to the current Nodejs based version, but with the benefit of being in PHP and not having to ship the binary, right? Would that be a good thing to start with, getting this approach on par with the current Bref? Good job! |
Maybe I don't understand Lambda's execution model well enough - but my first question is this: if the bootstrap is happening before a request has come in & then waiting for requests, does that mean the Lambda function is running all the time? In that case, how does this differ from running an EC2 box with PHP installed on it? Or does the bootstrap happen once for each cold start when the container boots, but you're only charged by AWS between request and response when the function code is running? Also, will scenario D still be relevant after PHP pre-loading becomes a thing? In userland we could pre-load the entire framework. https://wiki.php.net/rfc/preload Should we document each method and allow the end-user to choose which scenario they want to implement? Maybe for some C would be the better option, whereas others might find A suits their needs the most. |
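(On the pre-loading point: that RFC later shipped in PHP 7.4 as the `opcache.preload` ini setting, which points to a script PHP runs once at startup. A hedged sketch of what such a script could look like, assuming a hypothetical `/var/task/vendor/symfony` path:)

```php
<?php
// Hypothetical preload script (preloading shipped later, in PHP 7.4).
// php.ini would point at it with: opcache.preload=/var/task/preload.php
// It compiles the framework's classes into opcache once, when PHP starts,
// so requests no longer pay for parsing/compiling them.

$files = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator('/var/task/vendor/symfony', FilesystemIterator::SKIP_DOTS)
);

foreach ($files as $file) {
    if ($file->getExtension() === 'php') {
        opcache_compile_file($file->getPathname());
    }
}
```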
@hectorj made me realize the example of scenario D wasn't optimal. I have edited it to reflect what I actually tested: I fork first, then the child waits for the event (instead of forking after the event arrives). That removes the forking overhead from the response time. The example wasn't showing that, so I edited my comment.
@barryvdh No, that's a good point! I just tested the same demo that I deployed earlier this year with Bref and I see an average execution time of 80ms. So better than scenario C (and logically B too), but worse than the fork option. It's surprising that the new integration is slower than the Node-based one 🤔 There must be something going on here. What I should do is create a repository with all the test cases laid out so that it is automated, reproducible and reviewable.
@nealio82 from what I understand, yes: the container and the `bootstrap` process stay alive between events.
You only pay for the actual execution time, not the time it's waiting for requests. Also scaling of each "worker"/container is handled by AWS (no limit compared to an EC2 machine).
The Node model works like this:

```js
// init the lambda: this is the cold start (plus the container start time obviously)
exports.myHandler = function(event, context, callback) {
// this is the part executed on each event
}
```

The Node process stays alive and waits for events to execute the callback. This is basically the same in PHP now if we create a `while (true)` loop in the `bootstrap`.
Exactly. |
Oh, I always assumed it just invoked a new process each time :S |
Could this also potentially open the door for some interesting stuff like installing the blackfire.io daemon? |
Yep, now that they've opened the engine and explained the internals, everything makes much more sense! This is actually simple in the end: the container starts and executes the `bootstrap` file.
Exactly! (at least that's what I understood!) This is why they boast New Relic integration (and other monitoring tools) with the layer thing: you can add the New Relic layer which will add the daemon binary. I guess then it's up to you to start it in your bootstrap. Clever! Oh, and I just want to note before I forget: I did not compile and enable opcache in the new layer! Maybe this is what is affecting performance so much 🤔 (it is enabled in Bref, but there is a warning in my Symfony-deployed lambda with Bref that opcache is not loading… I need to fix this and publish it. Or if anyone wants to do it, go ahead!) |
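(On that opcache note: since these runtimes invoke the PHP CLI binary, opcache also has to be enabled for the CLI. A hedged example of settings a layer's `php.ini` could ship; the values are illustrative assumptions, not the actual layer config:)

```ini
; Illustrative opcache settings for a Lambda layer (assumed values, not the layer's actual config)
opcache.enable=1
opcache.enable_cli=1          ; the runtime invokes the php CLI binary
opcache.memory_consumption=128
opcache.validate_timestamps=0 ; deployed code never changes inside a given Lambda version
```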
About the current performance being better than the native layer, remember that the current Bref is basically a web server written in NodeJS delegating requests to the PHP binary. |
Yes that's right! Roadrunner is indeed something to consider but AFAIK not different from React/Aerys/Amp/PHP-PM in the sense that a single PHP process will handle many requests/events. So there is shared memory to care about (memory leaks, etc.). This is the same thing as solution A, except with another language involved. I don't see a benefit over solution A 🤔 |
I didn't think Roadrunner would keep the PHP process up. I thought it was more like Apache: spawn child processes that handle the events (take requests) and run PHP. |
If you're using something like https://github.com/php-pm/php-pm for option A, there is the benefit of already having the framework integrations (Symfony, Laravel etc). But I think it probably still needs to be optional, because of shared memory (doesn't always work well enough). |
@barryvdh I'm not sure I see what we would gain with php-pm, it already works with all (major) frameworks right now. |
I meant if you want to go with option A, which shares the Symfony/Laravel etc bootstrap code. Right now it works because you start a 'fresh' PHP process, so no shared memory/containers etc, right? php-pm would be faster after the cold start, because it doesn't have to autoload/boot the container etc. But I think you have to make sure some container stuff is reset etc. Then again, I might just be misunderstanding what Bref/php-pm exactly does. |
Did you mean „no concurrent requests“ instead of „no current requests“ in the PR header? |
@OskarStark yes I fixed that thanks.
@barryvdh yes, but that's the same with solution A because the `bootstrap` boots the framework once and then keeps it in memory for the following events. |
@mnapoli php-pm will restart a worker when its memory usage is too high, so I think this is the better choice |
I also see other problems with solution A:
Basically, everything that this function does: https://github.com/php/php-src/blob/67e0138c0dfd966624223911a0821f6c294ad1c6/main/main.c#L1857 will not be done with solution A. It may be OK in some use cases, but I think it's dangerous as a default behavior. (Maybe Bref can have multiple bootstrapping solutions?) |
@joelwurtz completely agree, these are exactly the same problems as with any other long-running PHP web app (PHP-PM/React/Amp/etc.). So yes, it will not be the default. But it's great that people already running these technologies (i.e. they have apps developed with that in mind) can benefit from AWS Lambda. So yeah, A is not the default but will be offered (as the benefits are very real). Now the default solution could be B, C, D or something else. I think there is something worth exploring with the forks, and I wonder if there's not something else to do with PHP-FPM: can we take the code of PHP-FPM and make it work with Lambda's integration instead of FastCGI? We could even do that in another language (Go, even PHP?). I think the key thing here is to understand how PHP-FPM reuses the same PHP processes without them sharing memory between requests. How do they "reset" the memory of those processes? How does PHP-FPM work? This is key to avoiding the overhead of booting a PHP process on every event. |
This is basically how PHP-FPM works (maybe not 100% right, but close): FPM starts a master process, which forks the child (worker) processes.
Each child process then loops: accept a request, execute the script, reset the per-request state, and wait for the next request.
You can see the child request loop here: https://github.com/php/php-src/blob/67e0138c0dfd966624223911a0821f6c294ad1c6/sapi/fpm/fpm/fpm_main.c#L1878 |
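A rough conceptual sketch of that master/worker model, purely illustrative (real FPM is written in C, and the helper functions named below are hypothetical placeholders):

```php
<?php
// Purely conceptual sketch of the FPM-style model described above.
// acceptNextRequest(), executeScript() and resetPerRequestState() are
// hypothetical placeholders, not real functions.

const WORKER_COUNT = 4;

for ($i = 0; $i < WORKER_COUNT; $i++) {
    $pid = pcntl_fork();
    if ($pid === 0) {
        // Worker (child): handle one request per loop iteration, forever.
        while (true) {
            $request = acceptNextRequest();   // block on the shared listening socket
            executeScript($request);          // run the PHP script for this request
            resetPerRequestState();           // free request-scoped memory, globals, etc.
        }
    }
}

// Master: wait for workers and replace any that die.
while (true) {
    pcntl_wait($status);
    // ... fork a replacement worker here
}
```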
Instead of replicating FPM, can't you use it directly? Or does that remove the performance gain? |
@sandrokeil yes, I place it in the same category as A and the related solutions (Amp, React, PHP-PM, etc.). The reason is that there is no isolation between events/requests (unless I'm mistaken). Solution D allows a complete separation between requests, as the whole state of the PHP process is reset every time. @barryvdh yes, good point, let's call this Scenario E :) (we have to test it if we want the comparison to be serious). I suspect it won't have awesome performances because we need a |
I studied Lambda for the PHP scenario a bit during the last few months (sporadic research) and the reason I dropped most of my interest was not performance, but rather price. Although Lambda is extremely cheap, when I realized that I had to support 1 million requests per 5 days I concluded that Fargate is cheaper than Lambda. Sorry if I sounded too crazy here, the idea just popped up and I got carried away 🤣 |
@deleugpn that's an interesting idea but clearly out of scope for now ^^ Maybe later once everything is stable! And TBH I don't reach the same conclusions as you regarding pricing but I'd rather keep this thread on topic 😉 so let's discuss that at another time. |
Sorry if I'm not making sense, I have little experience with Lambda, so I'm not sure how the execution model works for cold/hot starts. I did some testing on my local Mac with a very simplistic test case using https://github.com/hollodotme/fast-cgi-client (Scenario B = runExec or runProcess). This seems to reduce the overhead of calling a new process quite a lot.
See script: https://gist.github.com/barryvdh/75667b91f4cd9820ae9c746d752166b7 Obviously that depends on how many times the VM is re-used etc. and perhaps on some configuration (I guess you would just want 1 FPM worker in this case). (Note: a cold start is probably not really a cold start here, because opcache etc. has already loaded things and PHP-FPM is already running.) |
Great news! 🎉 I have created a repository to document the benchmarks and share the results: https://github.com/mnapoli/bref-bootstrap-benchmarks I have published new numbers there now that I run with opcache, and they are very interesting! Solution D is actually twice as fast as current Bref! There are also now solution E (PHP-FPM), F (built-in webserver) and G (custom PHP SAPI). @joelwurtz thinks solution G is doable (but it requires coding it in C) and if so, it might be very interesting in terms of performance! It would be the equivalent of PHP-FPM (which works with FastCGI) but working with AWS Lambda's API. Let's see how we can get that ball rolling in the coming days. Please also help with the missing scenarios, I can run the benchmarks myself. I'm not sure anyone will be able to provide E without having to compile PHP and all, so it's not easy. But if someone can do F and also B, that would be awesome. Running the benchmarks on a LAMP stack would be very helpful too, to compare AWS Lambda to LAMP! Also, please review the benchmarks if you can, especially the Symfony code! I might be missing some optimizations! I have also opened 2 issues on that repo if you want to help. |
I cannot wait to try this - but I won’t have time until after Wednesday :( FTR, I love how quickly you’ve / we’ve adapted to the changes from AWS and how much better the project is becoming because of everyone’s excitement and enthusiasm. |
I don't really have time to create the benchmark, I still need to set up Docker/AWS SAM etc. But this is what I tried for the webserver:

```php
$server = new Process("php -S localhost:8000 'index.php'");
$server->setTimeout(null);
$server->start(function ($type, $output) {
if ($type === Process::ERR) {
echo($output);
exit(1);
}
});
// Wait for the server to start
sleep(1);
register_shutdown_function(function() use($server) {
$server->stop();
});
```

From there it should be easy to use Guzzle or any other PSR-7 client to send the request to localhost (perhaps with a modified host header). Response times seem pretty fast, but the downside is that I can't reliably detect whether the webserver has started, so you have to wait a while (1 sec is what Stackery does). I can't detect the output, I can only retry while the connection is refused. Not ideal. |
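One way to avoid the fixed `sleep(1)` could be to poll the port until the server accepts connections. A minimal sketch, assuming the server listens on `localhost:8000` as in the snippet above:

```php
<?php
// Minimal sketch: wait until the built-in server accepts connections
// instead of sleeping for a fixed amount of time.
$start = microtime(true);
while (microtime(true) - $start < 5) {                        // give up after ~5 seconds
    $socket = @fsockopen('localhost', 8000, $errno, $errstr, 0.1);
    if ($socket !== false) {
        fclose($socket);                                      // the server is up, stop waiting
        break;
    }
    usleep(10000);                                            // retry every 10 ms
}
```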
For those following here: the benchmarks have spoken. We'll have 2 stable runtimes:
We'll have an extra experimental runtime that I'll document later when I get more time. I've started building the PHP CLI runtime in #106. The second part is in #115. |
You can start testing 🎉 See #116 if you are interested in testing! |
AWS announced the possibility of using any programming language on Lambda. This is awesome! That means a simplification of Bref, (probably) better performance and more official support of PHP.
Stackery announced they are working on the PHP runtime and this is available in this repository: https://github.com/stackery/php-lambda-layer
The questions are:
Let's use this issue to track information about all this.
At the moment I have been trying Stackery's PHP layer and here is what I noted:
- `json`, `intl`, etc.

Update: this runtime does not seem to be made or maintained by PHP developers judging from the discussions in the issues/PRs. I don't consider it viable at the moment.
What's interesting is that creating a runtime for AWS is in the end pretty easy. Our build script is almost ready, and more powerful than what can be found there.
I'll be trying out more things, if you have info to share feel free to post it here.