Proof of concept for Datasette on AWS Lambda with EFS #850

Open
simonw opened this issue Jun 16, 2020 · 25 comments

simonw commented Jun 16, 2020

https://aws.amazon.com/about-aws/whats-new/2020/06/aws-lambda-support-for-amazon-elastic-file-system-now-generally-/

If Datasette can run on Lambda with access to EFS it could both read AND write large databases there.

simonw commented Jun 16, 2020

File locking is interesting here. https://docs.aws.amazon.com/lambda/latest/dg/services-efs.html

Amazon EFS supports file locking to prevent corruption if multiple functions try to write to the same file system at the same time. Locking in Amazon EFS follows the NFS v4.1 protocol for advisory locking, and enables your applications to use both whole file and byte range locks.

SQLite can apparently work on NFS v4.1. I think I'd rather set things up so there's only ever one writer - so a Datasette instance could scale reads by running lots more lambda functions but only one function ever writes to a file at a time. Not sure if that's feasible with Lambda though - maybe by adding some additional shared state mechanism like Redis?
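
A minimal sketch of what an NFS-style advisory lock looks like from Python, assuming a POSIX system. The helper names (`exclusive_lock`, `append_line`) and the `.lock` sidecar file are hypothetical, chosen for illustration. SQLite manages its own locks internally, so this only illustrates the advisory-locking idea and is not something Datasette would need to call.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    # Open (or create) the lock file; its contents are irrelevant.
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # lockf() takes a POSIX byte-range lock; with no length given it
        # covers the whole file. This call blocks until the lock is granted.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        yield
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN)
        os.close(fd)


def append_line(data_path, line):
    """Append one line to data_path while holding the sidecar lock."""
    with exclusive_lock(data_path + ".lock"):
        with open(data_path, "a") as f:
            f.write(line + "\n")
```

Because the lock is advisory, it only protects against other writers that take the same lock; a process that skips `exclusive_lock` can still write to the file.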

simonw commented Jun 16, 2020

Easier solution to this might be to have two functions - a "read-only" one which is allowed to scale as much as it likes, and a "write-only" one which can write to the database files but is limited to running a maximum of one Lambda instance. https://docs.aws.amazon.com/lambda/latest/dg/invocation-scaling.html
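
One way the read-only function could enforce its role at the SQLite level is to open the file with `mode=ro`. This is a sketch using the standard library `sqlite3` module, not Datasette's own connection code; the function names and the temporary file path are hypothetical.

```python
import os
import sqlite3
import tempfile


def open_read_only(path):
    """Open an existing SQLite file so that any write attempt fails."""
    # mode=ro is part of SQLite's URI filename syntax; uri=True enables it.
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def open_read_write(path):
    """Open (or create) the SQLite file for the single writer function."""
    return sqlite3.connect(path)


def demo():
    """Write with one connection, then show a read-only one rejects writes."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    writer = open_read_write(path)
    writer.execute("create table notes (body text)")
    writer.execute("insert into notes values ('hello')")
    writer.commit()
    writer.close()

    reader = open_read_only(path)
    rows = reader.execute("select body from notes").fetchall()
    try:
        reader.execute("insert into notes values ('nope')")
        write_blocked = False
    except sqlite3.OperationalError:
        # SQLite refuses the insert on a read-only connection.
        write_blocked = True
    reader.close()
    return rows, write_blocked
```

Limiting the writer to a single running instance would still have to be configured on the Lambda side; the read-only mode only guards against accidental writes from the scaled-out readers.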

simonw commented Jun 16, 2020

https://docs.aws.amazon.com/efs/latest/ug/wt1-getting-started.html is an EFS walk-through using the AWS CLI tool instead of clicking around in their web interface.

simonw commented Jun 16, 2020

https://github.com/jordaneremieff/mangum looks like the best way to run an ASGI app on Lambda at the moment.

from mangum import Mangum

async def app(scope, receive, send):
    await send(
        {
            "type": "http.response.start",
            "status": 200,
            "headers": [[b"content-type", b"text/plain; charset=utf-8"]],
        }
    )
    await send({"type": "http.response.body", "body": b"Hello, world!"})


handler = Mangum(app)

simonw commented Jun 16, 2020

From https://mangum.io/adapter/

The AWS Lambda handler event and context arguments are made available to an ASGI application in the ASGI connection scope.

scope['aws.event']
scope['aws.context']

I can use https://github.com/simonw/datasette-debug-asgi to see that.
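
A rough sketch of an ASGI app that inspects those scope keys, with no dependency on Mangum itself. The `call_app` helper is hypothetical and just drives the app once without a server; the example event passed to it is made up, not a real API Gateway payload.

```python
import asyncio
import json


async def app(scope, receive, send):
    """Tiny ASGI app that reports which Lambda details are in the scope."""
    # Under Mangum this key holds the raw Lambda event; under any other
    # server it is simply absent, hence .get().
    event = scope.get("aws.event") or {}
    body = json.dumps({
        "has_aws_event": "aws.event" in scope,
        "event_keys": sorted(event),
    }).encode("utf-8")
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [[b"content-type", b"application/json"]],
    })
    await send({"type": "http.response.body", "body": body})


def call_app(scope):
    """Drive the app once without a server and return the decoded JSON body."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    asyncio.run(app(scope, receive, send))
    return json.loads(sent[-1]["body"])
```

Calling `call_app({"type": "http", "aws.event": {"path": "/", "httpMethod": "GET"}})` reports the event keys, while a scope without `aws.event` reports that the key is missing.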

simonw commented Jun 16, 2020

It looks like SAM - AWS Serverless Application Model - is the currently recommended way to deploy Python apps to Lambda from the command-line: https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-getting-started-hello-world.html

simonw commented Jun 16, 2020

Installed SAM:

brew tap aws/tap
brew install aws-sam-cli
sam --version
SAM CLI, version 0.52.0

simonw commented Jun 16, 2020

simon@Simons-MacBook-Pro /tmp % sam init

	SAM CLI now collects telemetry to better understand customer needs.

	You can OPT OUT and disable telemetry collection by setting the
	environment variable SAM_CLI_TELEMETRY=0 in your shell.
	Thanks for your help!

	Learn More: https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-telemetry.html

Which template source would you like to use?
	1 - AWS Quick Start Templates
	2 - Custom Template Location
Choice: 1

Which runtime would you like to use?
	1 - nodejs12.x
	2 - python3.8
	3 - ruby2.7
	4 - go1.x
	5 - java11
	6 - dotnetcore3.1
	7 - nodejs10.x
	8 - python3.7
	9 - python3.6
	10 - python2.7
	11 - ruby2.5
	12 - java8
	13 - dotnetcore2.1
Runtime: 2

Project name [sam-app]: datasette-proof-of-concept

Cloning app templates from https://github.com/awslabs/aws-sam-cli-app-templates.git

AWS quick start application templates:
	1 - Hello World Example
	2 - EventBridge Hello World
	3 - EventBridge App from scratch (100+ Event Schemas)
	4 - Step Functions Sample App (Stock Trader)
Template selection: 1

-----------------------
Generating application:
-----------------------
Name: datasette-proof-of-concept
Runtime: python3.8
Dependency Manager: pip
Application Template: hello-world
Output Directory: .

Next steps can be found in the README file at ./datasette-proof-of-concept/README.md

simonw commented Jun 16, 2020

simon@Simons-MacBook-Pro datasette-proof-of-concept % sam build --use-container
Starting Build inside a container
Building function 'HelloWorldFunction'

Fetching lambci/lambda:build-python3.8 Docker container image......
Mounting /private/tmp/datasette-proof-of-concept/hello_world as /tmp/samcli/source:ro,delegated inside runtime container

Build Succeeded

Built Artifacts  : .aws-sam/build
Built Template   : .aws-sam/build/template.yaml

Commands you can use next
=========================
[*] Invoke Function: sam local invoke
[*] Deploy: sam deploy --guided
    
Running PythonPipBuilder:ResolveDependencies
Running PythonPipBuilder:CopySource

simonw commented Jun 16, 2020

simon@Simons-MacBook-Pro datasette-proof-of-concept % sam local invoke
Invoking app.lambda_handler (python3.8)

Fetching lambci/lambda:python3.8 Docker container image......
Mounting /private/tmp/datasette-proof-of-concept/.aws-sam/build/HelloWorldFunction as /var/task:ro,delegated inside runtime container
START RequestId: 4616ab43-6882-1627-e5e3-5a29730d52f9 Version: $LATEST
END RequestId: 4616ab43-6882-1627-e5e3-5a29730d52f9
REPORT RequestId: 4616ab43-6882-1627-e5e3-5a29730d52f9	Init Duration: 140.84 ms	Duration: 2.49 ms	Billed Duration: 100 ms	Memory Size: 128 MB	Max Memory Used: 25 MB	

{"statusCode":200,"body":"{\"message\": \"hello world\"}"}
simon@Simons-MacBook-Pro datasette-proof-of-concept % sam local invoke
Invoking app.lambda_handler (python3.8)

Fetching lambci/lambda:python3.8 Docker container image......
Mounting /private/tmp/datasette-proof-of-concept/.aws-sam/build/HelloWorldFunction as /var/task:ro,delegated inside runtime container
START RequestId: 3189df2f-e9c0-1be4-b9ac-f329c5fcd067 Version: $LATEST
END RequestId: 3189df2f-e9c0-1be4-b9ac-f329c5fcd067
REPORT RequestId: 3189df2f-e9c0-1be4-b9ac-f329c5fcd067	Init Duration: 87.22 ms	Duration: 2.34 ms	Billed Duration: 100 ms	Memory Size: 128 MB	Max Memory Used: 25 MB	

{"statusCode":200,"body":"{\"message\": \"hello world\"}"}

simonw commented Jun 16, 2020

simon@Simons-MacBook-Pro datasette-proof-of-concept % sam deploy --guided

Configuring SAM deploy
======================

	Looking for samconfig.toml :  Not found

	Setting default arguments for 'sam deploy'
	=========================================
	Stack Name [sam-app]: datasette-proof-of-concept
	AWS Region [us-east-1]: 
	#Shows you resources changes to be deployed and require a 'Y' to initiate deploy
	Confirm changes before deploy [y/N]: y
	#SAM needs permission to be able to create roles to connect to the resources in your template
	Allow SAM CLI IAM role creation [Y/n]: y
	HelloWorldFunction may not have authorization defined, Is this okay? [y/N]: y
	Save arguments to samconfig.toml [Y/n]: y
Error: Failed to create managed resources: Unable to locate credentials

I need to get my AWS credentials sorted. I'm going to follow https://docs.aws.amazon.com/IAM/latest/UserGuide/getting-started_create-admin-group.html and https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-getting-started-set-up-credentials.html

simonw commented Jun 16, 2020

I used https://console.aws.amazon.com/billing/home?#/account and activated "IAM user/role access to billing information" - what a puzzling first step!

I created a new user with AWS console access (which means access to the web UI) called simon-administrator and set a password. I created an Administrators group with AdministratorAccess.

[Screenshot: IAM Management Console]

simonw commented Jun 16, 2020

I think I need to sign in to the AWS console with this new simon-administrator account and create IAM credentials for it.

... for which I needed my root "account ID" - a 12 digit number - to use on the IAM login form.

simonw commented Jun 16, 2020

Logged in as simon-administrator, I'm using https://console.aws.amazon.com/iam/home?region=us-east-2#/security_credentials to create credentials:

[Screenshot: IAM Management Console]

simonw commented Jun 16, 2020

Clicking that button generated an access key ID and secret access key pair. I dropped those into ~/.aws/credentials using this format:

[default]
aws_access_key_id = your_access_key_id
aws_secret_access_key = your_secret_access_key
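
Since the earlier failure was "Unable to locate credentials", a quick check that the file parses and has both keys can save a failed deploy. This is a sketch using the standard library `configparser`; the function name `missing_credential_keys` is hypothetical, and it only checks that the keys are present, not that AWS accepts them.

```python
import configparser


def missing_credential_keys(text, profile="default"):
    """Return the required keys that are absent or empty in the given profile."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    required = ("aws_access_key_id", "aws_secret_access_key")
    if not parser.has_section(profile):
        # No such profile at all, so everything is missing.
        return list(required)
    return [
        key
        for key in required
        if not parser.get(profile, key, fallback="").strip()
    ]
```

To check the real file, pass it the contents of `~/.aws/credentials`, for example `missing_credential_keys(open(os.path.expanduser("~/.aws/credentials")).read())` after importing `os`; an empty list means both keys are set.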

simonw commented Jun 16, 2020

OK, sam deploy --guided now works!

simon@Simons-MacBook-Pro datasette-proof-of-concept % sam deploy --guided

Configuring SAM deploy
======================

	Looking for samconfig.toml :  Not found

	Setting default arguments for 'sam deploy'
	=========================================
	Stack Name [sam-app]: datasette-proof-of-concept
	AWS Region [us-east-1]: 
	#Shows you resources changes to be deployed and require a 'Y' to initiate deploy
	Confirm changes before deploy [y/N]: 
	#SAM needs permission to be able to create roles to connect to the resources in your template
	Allow SAM CLI IAM role creation [Y/n]: 
	HelloWorldFunction may not have authorization defined, Is this okay? [y/N]: y
	Save arguments to samconfig.toml [Y/n]: 

	Looking for resources needed for deployment: Not found.
	Creating the required resources...
	Successfully created!

		Managed S3 bucket: aws-sam-cli-managed-default-samclisourcebucket-1ksajo4h62s07
		A different default S3 bucket can be set in samconfig.toml

	Saved arguments to config file
	Running 'sam deploy' for future deployments will use the parameters saved above.
	The above parameters can be changed by modifying samconfig.toml
	Learn more about samconfig.toml syntax at 
	https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-config.html

	Deploying with following values
	===============================
	Stack name                 : datasette-proof-of-concept
	Region                     : us-east-1
	Confirm changeset          : False
	Deployment s3 bucket       : aws-sam-cli-managed-default-samclisourcebucket-1ksajo4h62s07
	Capabilities               : ["CAPABILITY_IAM"]
	Parameter overrides        : {}

Initiating deployment
=====================
Uploading to datasette-proof-of-concept/0c208b5656a7aeb6186d49bebc595237  535344 / 535344.0  (100.00%)
HelloWorldFunction may not have authorization defined.
Uploading to datasette-proof-of-concept/14bd9ce3e21f9c88634d13c0c9b377e4.template  1147 / 1147.0  (100.00%)

Waiting for changeset to be created..

CloudFormation stack changeset
---------------------------------------------------------------------------------------------------------------------------------------------------------
Operation                                           LogicalResourceId                                   ResourceType                                      
---------------------------------------------------------------------------------------------------------------------------------------------------------
+ Add                                               HelloWorldFunctionHelloWorldPermissionProd          AWS::Lambda::Permission                           
+ Add                                               HelloWorldFunctionRole                              AWS::IAM::Role                                    
+ Add                                               HelloWorldFunction                                  AWS::Lambda::Function                             
+ Add                                               ServerlessRestApiDeployment47fc2d5f9d               AWS::ApiGateway::Deployment                       
+ Add                                               ServerlessRestApiProdStage                          AWS::ApiGateway::Stage                            
+ Add                                               ServerlessRestApi                                   AWS::ApiGateway::RestApi                          
---------------------------------------------------------------------------------------------------------------------------------------------------------

Changeset created successfully. arn:aws:cloudformation:us-east-1:462092780466:changeSet/samcli-deploy1592349262/d685f2de-87c1-4b8e-b13a-67b94f8fc928


2020-06-16 16:14:29 - Waiting for stack create/update to complete

CloudFormation events from changeset
---------------------------------------------------------------------------------------------------------------------------------------------------------
ResourceStatus                         ResourceType                           LogicalResourceId                      ResourceStatusReason                 
---------------------------------------------------------------------------------------------------------------------------------------------------------
CREATE_IN_PROGRESS                     AWS::IAM::Role                         HelloWorldFunctionRole                 -                                    
CREATE_IN_PROGRESS                     AWS::IAM::Role                         HelloWorldFunctionRole                 Resource creation Initiated          
CREATE_COMPLETE                        AWS::IAM::Role                         HelloWorldFunctionRole                 -                                    
CREATE_IN_PROGRESS                     AWS::Lambda::Function                  HelloWorldFunction                     Resource creation Initiated          
CREATE_IN_PROGRESS                     AWS::Lambda::Function                  HelloWorldFunction                     -                                    
CREATE_COMPLETE                        AWS::Lambda::Function                  HelloWorldFunction                     -                                    
CREATE_IN_PROGRESS                     AWS::ApiGateway::RestApi               ServerlessRestApi                      Resource creation Initiated          
CREATE_IN_PROGRESS                     AWS::ApiGateway::RestApi               ServerlessRestApi                      -                                    
CREATE_COMPLETE                        AWS::ApiGateway::RestApi               ServerlessRestApi                      -                                    
CREATE_IN_PROGRESS                     AWS::Lambda::Permission                HelloWorldFunctionHelloWorldPermissi   -                                    
                                                                              onProd                                                                      
CREATE_IN_PROGRESS                     AWS::ApiGateway::Deployment            ServerlessRestApiDeployment47fc2d5f9   -                                    
                                                                              d                                                                           
CREATE_COMPLETE                        AWS::ApiGateway::Deployment            ServerlessRestApiDeployment47fc2d5f9   -                                    
                                                                              d                                                                           
CREATE_IN_PROGRESS                     AWS::ApiGateway::Deployment            ServerlessRestApiDeployment47fc2d5f9   Resource creation Initiated          
                                                                              d                                                                           
CREATE_IN_PROGRESS                     AWS::Lambda::Permission                HelloWorldFunctionHelloWorldPermissi   Resource creation Initiated          
                                                                              onProd                                                                      
CREATE_IN_PROGRESS                     AWS::ApiGateway::Stage                 ServerlessRestApiProdStage             -                                    
CREATE_COMPLETE                        AWS::ApiGateway::Stage                 ServerlessRestApiProdStage             -                                    
CREATE_IN_PROGRESS                     AWS::ApiGateway::Stage                 ServerlessRestApiProdStage             Resource creation Initiated          
CREATE_COMPLETE                        AWS::Lambda::Permission                HelloWorldFunctionHelloWorldPermissi   -                                    
                                                                              onProd                                                                      
CREATE_COMPLETE                        AWS::CloudFormation::Stack             datasette-proof-of-concept             -                                    
---------------------------------------------------------------------------------------------------------------------------------------------------------

CloudFormation outputs from deployed stack
---------------------------------------------------------------------------------------------------------------------------------------------------------
Outputs                                                                                                                                                 
---------------------------------------------------------------------------------------------------------------------------------------------------------
Key                 HelloWorldFunctionIamRole                                                                                                           
Description         Implicit IAM Role created for Hello World function                                                                                  
Value               arn:aws:iam::462092780466:role/datasette-proof-of-concept-HelloWorldFunctionRole-8MIDNIV5ECA6                                       

Key                 HelloWorldApi                                                                                                                       
Description         API Gateway endpoint URL for Prod stage for Hello World function                                                                    
Value               https://q7lymja3sj.execute-api.us-east-1.amazonaws.com/Prod/hello/                                                                  

Key                 HelloWorldFunction                                                                                                                  
Description         Hello World Lambda Function ARN                                                                                                     
Value               arn:aws:lambda:us-east-1:462092780466:function:datasette-proof-of-concept-HelloWorldFunction-QTF78ZEUDCB                            
---------------------------------------------------------------------------------------------------------------------------------------------------------

Successfully created/updated stack - datasette-proof-of-concept in us-east-1

simonw commented Jun 16, 2020

https://q7lymja3sj.execute-api.us-east-1.amazonaws.com/Prod/hello/

That's a pretty ugly URL. I'm not sure how to get rid of the /Prod/ prefix on it. Might have to use the base_url setting to get something working: https://datasette.readthedocs.io/en/stable/config.html#base-url

simonw commented Jun 16, 2020

I added an exclamation mark to hello world and ran sam deploy again. https://q7lymja3sj.execute-api.us-east-1.amazonaws.com/Prod/hello/ still shows the old message.

Running sam build --use-container first and then sam deploy did the right thing.

simonw commented Jun 16, 2020

I changed requirements.txt to this:

datasette
mangum

And app.py to this:

from datasette.app import Datasette
from mangum import Mangum


datasette = Datasette([], memory=True)
lambda_handler = Mangum(datasette.app())

But then when I ran sam build --use-container I got this:

simon@Simons-MacBook-Pro datasette-proof-of-concept % sam build --use-container
Starting Build inside a container
Building function 'HelloWorldFunction'

Fetching lambci/lambda:build-python3.8 Docker container image......
Mounting /private/tmp/datasette-proof-of-concept/hello_world as /tmp/samcli/source:ro,delegated inside runtime container

Build Failed
Running PythonPipBuilder:ResolveDependencies
Error: PythonPipBuilder:ResolveDependencies - {uvloop==0.14.0(wheel)}

uvloop isn't actually necessary for this project, since it's used by uvicorn which isn't needed if Lambda is serving ASGI traffic directly.

simonw commented Jun 16, 2020

Someone else ran into this problem: iwpnd/fastapi-aws-lambda-example#1

So I need to be able to pip install MOST of Datasette, but skip uvicorn. Tricky. I'll try installing a custom fork?

simonw added a commit that referenced this issue Jun 16, 2020
simonw added a commit that referenced this issue Jun 16, 2020

simonw commented Jun 16, 2020

OK, changed requirements.txt to this:

https://github.com/simonw/datasette/archive/no-uvicorn.zip
mangum

Now sam build --use-container runs without errors. Ran sam deploy too.

simonw commented Jun 16, 2020

https://q7lymja3sj.execute-api.us-east-1.amazonaws.com/Prod/hello/ is now giving me a 500 internal server error.

simonw commented Jun 16, 2020

Tried sam local invoke:

simon@Simons-MacBook-Pro datasette-proof-of-concept % sam local invoke
Invoking app.lambda_handler (python3.8)

Fetching lambci/lambda:python3.8 Docker container image......
Mounting /private/tmp/datasette-proof-of-concept/.aws-sam/build/HelloWorldFunction as /var/task:ro,delegated inside runtime container
START RequestId: 7c04480b-5d42-168e-dec0-4e8bf34fa596 Version: $LATEST
[INFO]	2020-06-16T23:33:27.24Z	7c04480b-5d42-168e-dec0-4e8bf34fa596	Waiting for application startup.
[INFO]	2020-06-16T23:33:27.24Z	7c04480b-5d42-168e-dec0-4e8bf34fa596	LifespanCycleState.STARTUP:  'lifespan.startup.complete' event received from application.
[INFO]	2020-06-16T23:33:27.24Z	7c04480b-5d42-168e-dec0-4e8bf34fa596	Application startup complete.
[INFO]	2020-06-16T23:33:27.24Z	7c04480b-5d42-168e-dec0-4e8bf34fa596	Waiting for application shutdown.
[INFO]	2020-06-16T23:33:27.24Z	7c04480b-5d42-168e-dec0-4e8bf34fa596	LifespanCycleState.SHUTDOWN:  'lifespan.shutdown.complete' event received from application.
[ERROR] KeyError: 'requestContext'
Traceback (most recent call last):
  File "/var/task/mangum/adapter.py", line 110, in __call__
    return self.handler(event, context)
  File "/var/task/mangum/adapter.py", line 130, in handler
    if "eventType" in event["requestContext"]:
END RequestId: 7c04480b-5d42-168e-dec0-4e8bf34fa596
REPORT RequestId: 7c04480b-5d42-168e-dec0-4e8bf34fa596	Init Duration: 1120.76 ms	Duration: 7.08 ms	Billed Duration: 100 ms	Memory Size: 128 MB	Max Memory Used: 47 MB	

{"errorType":"KeyError","errorMessage":"'requestContext'","stackTrace":["  File \"/var/task/mangum/adapter.py\", line 110, in __call__\n    return self.handler(event, context)\n","  File \"/var/task/mangum/adapter.py\", line 130, in handler\n    if \"eventType\" in event[\"requestContext\"]:\n"]}
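
The traceback shows Mangum reading `event["requestContext"]`, so a likely cause (an assumption, not confirmed in this thread) is that `sam local invoke` was run without an API Gateway style event. A sketch of generating a minimal one follows; the field set is modelled on the REST API proxy event format, the values are made up, and which fields this Mangum version actually needs beyond `requestContext` is not verified here.

```python
import json


def make_api_gateway_event(path="/hello/", method="GET"):
    """Build a minimal REST-API-style proxy event for local testing."""
    return {
        "resource": path,
        "path": path,
        "httpMethod": method,
        "headers": {"Host": "localhost", "Accept": "*/*"},
        "queryStringParameters": None,
        "pathParameters": None,
        "stageVariables": None,
        # The key the traceback above complains about.
        "requestContext": {
            "httpMethod": method,
            "path": path,
            "stage": "Prod",
            "identity": {"sourceIp": "127.0.0.1"},
        },
        "body": None,
        "isBase64Encoded": False,
    }


def write_event(filename="event.json"):
    """Write the event to disk so it can be passed to the SAM CLI."""
    with open(filename, "w") as f:
        json.dump(make_api_gateway_event(), f, indent=2)
    return filename
```

After calling `write_event()`, the file can be passed in with `sam local invoke -e event.json`.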

simonw commented Jun 16, 2020

Just realized Colin Dellow reported an issue with Datasette and Mangum back in April - #719 - and has in fact been working on https://github.com/code402/datasette-lambda for a while!

simonw commented Jun 16, 2020

https://aws.amazon.com/blogs/compute/announcing-http-apis-for-amazon-api-gateway/ looks very important here: HTTP APIs for API Gateway were introduced in December 2019 and appear to be about a third of the price of API Gateway REST APIs.
