Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add AWS DynamoDB Local hosting component #975

Closed
wants to merge 5 commits into from

Conversation

normj
Copy link
Contributor

@normj normj commented Nov 21, 2023

AWS DynamoDB Local is an AWS provided emulator for DynamoDB that is available as a container from AWS Elastic Container Registry (ECR). This is rather specific scenario instead of solving general AWS workflows. I wanted to start with this small PR to get better acquainted with the codebase.

The user flow of is users add the Aspire.Hosting.AWS.DynamoDBLocal package and then in their AppHost add the resource and add references to the project like the following.

var builder = DistributedApplication.CreateBuilder(args);

// Startup the DynamoDB Local container
var dynamoDB = builder
                    .AddAWSDynamoDBLocal("default")
                    // Add a callback to seed the DynamoDB local with sample data.
                    .WithSeedDynamoDBCallback(DynamoDBLocalLoader.Configure);

builder.AddProject<Projects.Frontend>("frontend")
        .WithAWSDynamoDBLocalReference(dynamoDB);

builder.Build().Run();

The AddAWSDynamoDBLocal sets up the container similar to AddRedisContainer. The WithAWSDynamoDBLocalReference is different then the Redis example because we are not working with connection strings. The AWS SDK will search the environment for configurations for which region to send request to. The WithAWSDynamoDBLocalReference sets the AWS_ENDPOINT_URL_DYNAMODB environment variable for the project so that the SDK's service clients pick that up as part of its environment search.

Because DynamoDB local should only be used for local development I'm forcing that nothing is written to the manifest for it including the fact that a container was added and referenced.

Because an empty database is useless for testing I added the concept of providing a callback to inject seed data into the database. This is similar to what I would imagine Entity Framework would need for data seeding. I think this is an important feature but is really hacky right now because there isn't a callback that I can see for when a container has been launched and in a steady state. I am repurposing the WithEnvironment where I can get the endpoints and then polling the container till I starts responding successfully.

I also need a way customize the ENTRYPOINT for a container. DynamoDB local has important command line arguments I need to set but that needs to be done via the ENTRYPOINT. Without it can only have DynamoDB local as an in memory database. To get the right F5 debug experience I need to default the database used by DynamoDB local to be shared persisting the database in the volume mount.

I also need the AWSSDK.DynamoDBv2 package added to the NuGet feed used for the repository.

The sample using DynamoDB local should work out of the box for users that don't have AWS credentials or regions configured for their development environment.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-integrations Issues pertaining to Aspire Integrations packages label Nov 21, 2023
@normj normj marked this pull request as draft November 21, 2023 03:32
@RussKie
Copy link
Member

RussKie commented Nov 21, 2023

I also need the AWSSDK.DynamoDBv2 package added to the NuGet feed used for the repository.

The package is already available on dotnet-public feed: https://dev.azure.com/dnceng/public/_artifacts/feed/dotnet-public/NuGet/AWSSDK.DynamoDBv2/overview/3.7.3.30

@davidfowl
Copy link
Member

I also need a way customize the ENTRYPOINT for a container. DynamoDB local has important command line arguments I need to set but that needs to be done via the ENTRYPOINT. Without it can only have DynamoDB local as an in memory database. To get the right F5 debug experience I need to default the database used by DynamoDB local to be shared persisting the database in the volume mount.

#177

I'll try to get to this. There are other people trying to use it as well.

@normj
Copy link
Contributor Author

normj commented Nov 21, 2023

I also need the AWSSDK.DynamoDBv2 package added to the NuGet feed used for the repository.

The package is already available on dotnet-public feed: https://dev.azure.com/dnceng/public/_artifacts/feed/dotnet-public/NuGet/AWSSDK.DynamoDBv2/overview/3.7.3.30

Can we get a newer version. The 3.7.300 version is where .NET 8 support was added. I'm also relying on some environment variable support that has been added since the 3.7.3.30 version. So I would suggest all AWSSDK references should be at least 3.7.300.

@RussKie
Copy link
Member

RussKie commented Nov 21, 2023

I scheduled the package mirroring job for https://www.nuget.org/packages/AWSSDK.DynamoDBv2/3.7.300.6. If all goes well, it will be available in about 30min

@RussKie
Copy link
Member

RussKie commented Nov 21, 2023

@normj
Copy link
Contributor Author

normj commented Nov 21, 2023

There you go: https://dev.azure.com/dnceng/public/_artifacts/feed/dotnet-public/NuGet/AWSSDK.DynamoDBv2/overview/3.7.300.6

Awesome thanks. Updated Directory.Packages.props to that version and reverted any local NuGet.config hacks and build is successful.

@mitchdenny
Copy link
Member

@normj do you think it is worth having a separate package just for DynamoDB vs. having one for all AWS resource types?

/// <returns></returns>
/// <exception cref="DistributedApplicationException"></exception>
public static IResourceBuilder<IDynamoDBLocalResource> AddAWSDynamoDBLocal(this IDistributedApplicationBuilder builder,
string name, string image = "public.ecr.aws/aws-dynamodb-local/aws-dynamodb-local", string tag = "latest", bool disableDynamoDBLocalTelemetry = false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Realistically how likely is it that folks will want to change this image? Originally I used to want to put all kinds of options on these extension methods but I realized that less is more.

If someone really wants to change this they can replace the image annotation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect a very 9X% percent use case. But if a user is using this in an air gapped region or private VPC then it is reasonable to override these values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidfowl This does raise the question I have had.

For hosting packages is it going to be considered a breaking change if we add new optional parameters to these methods in the future? In our SDK we consider that a binary breaking change and we avoid that by having methods always take in an options class instead of using parameters so we can always add new properties to the options class. So I get nervous relying on optional parameters but I see that is the convention being used. Is it not a breaking concern because the hosting packages are so top level that any changes would just get picked up in the recompile?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/// <summary>
/// Utilities that can be used for helping to seed DynamoDB local with tables and data.
/// </summary>
public class SeedDynamoDBUtilities(IAmazonDynamoDB dynamoDBClient)
Copy link
Member

@mitchdenny mitchdenny Nov 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The convention we used in the eShop sample was that we had a dedicated container used for seeding/initializing the database. You might want to do the same thing here rather than introducing seeding logic into the AppHost.

It also helps reduce the # of dependencies for resource types in the AppHost project.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't notice that in the eShop sample but looking at it now I'm not really a fan of it. I'm probably missing some important details but that is a lot more work to setup a whole other project. Why do I need to spin up an indefinite running webserver for something that has one startup purpose and then it is done. Also since this is a project in the AppHost it will be added to the manifest which means it would be something deployed if we went from the manifest to deployment. I know in my case this is just adding seed data to an emulator but even for EF would you really want deploy a webserver to apply EF migrations?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know in my case this is just adding seed data to an emulator but even for EF would you really want deploy a webserver to apply EF migrations?

#398

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a pattern recommended by the EF team for cases when databases have complicated initialization and/or seeding logic, e.g. seeding data with relationships. It's also a pattern we've seen customers and internal teams use with success. Having an app dedicated to the initialization and management of the database is simple to reason about and very powerful WRT what can be done there. Also, it's all C# code which is nice 😄.

We did talk a bit about patterns for running EF migrations from the AppHost project but ultimately decided on the dedicated app approach as it ended up being "cleaner" and solved the added issue of what runs migrations when deploying the app?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a key difference here is I only expect this seeding logic to be used when using the emulator. I only have a rudimentary knowledge of EF but I expect in the EF case that same project would still be used during deployment to handle DB migrations for a DB provisioned by the deployment tooling. In a DynamoDB case the tooling doing the deployment would most likely take care of the provisioning of DynamoDB creating the "schemaless" tables, there is no concept of migrations to run on the table.

In EF's case I can see the value of having a project that stores the migrations and optional seed data. Still find it weird that it is a web app left running. Seems like that is leaving unnecessary process running that is wasting memory and its health checks could affect the rest of the system.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even in a schemaless system there is an implied schema captured in the app tier. Regardless of how strictly the data store enforces the schema I've always found that there needs to be some kind of job that runs in the background that fills in data as the data shapes evolve.

Keep in mind that it might not necessarily be a process just for doing migrations. Most systems have some kind of background scheduler doing stuff, and this is just one of the things it does.

credentials = new BasicAWSCredentials("AKIAIOSFODNN7EXAMPLE", "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY");
}

builder.Services.AddSingleton<IAmazonDynamoDB>(new AmazonDynamoDBClient(credentials));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The convention we are using for Aspire components is that we have a set of top-level extension methods for registering components. So, you might have AddAmazonDynamoDB(...) for example.

You would also have a keyed variant. Probably the best analogue that we have now is the CosmosDB component. We also try pretty hard to avoid having any credentials be dealt with in Program.cs already. All of that information should come via the connection string.

For local development we might have a connection string that includes a randomly generated password, but for development we try to have the code use the workload-based identity solutions of the hosting platform. In Azure that is managed identity. I'm assuming that is a AWS IAM role?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we need components for adding the service clients like DynamoDB. That is a general problem for all AWS service clients that I would like to solve in a separate PR.

The FallbackCredentialsFactory.GetCredentials() line above this is doing what you are saying about workload-based identity. I put this fallback of using the dummy credentials so people who didn't have an AWS setup could still test the sample. I'm guessing few on the .NET team have AWS credential setups 😄. But I do admit this will not work once we add more AWS components and those require real credentials.

So my question back to the team, how much do you want to have the samples be F5 debuggable even though people might not have all of the normal prerequisites.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think when we are talking about cloud specific components like we are here, we don't have to be as strict with the F5 debuggable without doing some local prep work first. But we should explain what people need to do (maybe in a README) to get the sample running assuming they can meet those requirements.

For the eShopLite example we have a dependency on service bus, but its a soft dependency ... the sample will still run without it. But eShopLite is our main sandpit when we are trying things out in the repo before we build the necessary test harnesses around it :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't fleshed out the idea but wondered if provisioners needs to be a more first class concept. When they are registered they they could show up in the dashboard and the user could decide when to invoke the provisioner should be invoked instead of being something that is always done as part F5. I'm worried about disrupting the F5 experience especially after the initial F5 session checking to see if anything needs to be done. Being in the dashboard would allow us to have the provisioner interface have a tear down method that could be invoked from the dashboard making it easier to tear things down. If you all think this idea warrants further discussion I can open a top level discussion on the topic.


namespace AWS.AppHost;

internal sealed class DynamoDBLocalLoader
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move into a separate project (similar to the CatalogDb project in eShop).

/// <summary>
/// Represents a DynamoDB local resource. This is a dev only resources and will not be written to the project's manifest.
/// </summary>
public interface IDynamoDBLocalResource : IResourceWithEnvironment, IResourceWithBindings
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably implement IResourceWithConnectionString.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand how this would work because I'm not setting a connection string. The AWS SDK looks for a specific environment variable that does not go into the connection string grouping.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you provide some more detail on how the environment variables work, especially in instances where you might be connecting to multiple databases.

There are two parts to consider here. How we inject the environment variables on the AppModel side, and how they are consumed on the service side.

@davidfowl this could challenge some of our assumptions about how WithReference works with connection strings. We might need another primative for things that can't use connection strings where the resource can be an environment producer and be given a context to populate variables.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are our docs on service specific endpoints. https://docs.aws.amazon.com/sdkref/latest/guide/feature-ss-endpoints.html. This is a feature implemented across all AWS SDKs not just the .NET SDK.

There are two parts to consider here. How we inject the environment variables on the AppModel side, and how they are consumed on the service side.

Using this feature of the SDK allows us to not have to implement the second part and users don't have to change any existing code to take advantage of it.

In most cases in AWS you are not "connecting" to a resource you sending request to a specific AWS region/account and the resource being acted on is a parameter to on the request. In DynamoDB's case there is no concept of a "database". An account/region just has a collection of tables making account and region the partitioning point.

// on the project pointing the local DynamoDB so the AWS .NET SDK picks that up as the
// endpoint.
builder.AddProject<Projects.Frontend>("frontend")
.WithAWSDynamoDBLocalReference(dynamoDB)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you implement IResourceWithConnectionString on your resource type then you can get rid of this custom extension and just use WithReference.

@normj
Copy link
Contributor Author

normj commented Nov 22, 2023

@normj do you think it is worth having a separate package just for DynamoDB vs. having one for all AWS resource types?

My instinct was to have things modularized to avoid unnecessary dependencies but since this for the AppHost not the application projects that probably isn't a concern.

Also puts more pressure to get more fundamental features in the package 😄. I started with this smaller scope feature to get familiar with the codebase and experiences before jumping into more fundamental features.

@davidfowl
Copy link
Member

My instinct was to have things modularized to avoid unnecessary dependencies but since this for the AppHost not the application projects that probably isn't a concern.

I would say yes before we ship 1.0 sometime next year, but not in this first set of PRs. A single AWS package would make it easier to iterate.

@danmoseley danmoseley added this to the preview 3 (Jan) milestone Nov 29, 2023
@birojnayak
Copy link

@normj the PR status is in draft mode.. is it ready to review or primarily a directional PR before you publish another PR ?

@normj
Copy link
Contributor Author

normj commented Dec 6, 2023

@birojnayak I put this PR out to kick the tires and get some feedback on whether I understand the codebase and the direction of Aspire. This also identified some feature gaps in Aspire like the ability to customize the ENTRYPOINT of a container which sounds like might be address in preview2.

This is not intended to be the first merged PR for AWS support although I do think it will eventually make using DynamoDB local a lot easier for .NET developers to use without having to get into DynamoDB local's Java funness.

@joperezr joperezr added the community-contribution Indicates that the PR has been added by a community member label Dec 11, 2023
@eerhardt eerhardt added the area-app-model Issues pertaining to the APIs in Aspire.Hosting, e.g. DistributedApplication label Dec 14, 2023
@mitchdenny mitchdenny self-assigned this Jan 10, 2024
@normj
Copy link
Contributor Author

normj commented Jan 27, 2024

Closing this because I want to focus on this PR which is more low level first. Will revisit this later. #1905

@normj normj closed this Jan 27, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Apr 28, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-app-model Issues pertaining to the APIs in Aspire.Hosting, e.g. DistributedApplication area-integrations Issues pertaining to Aspire Integrations packages community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants