Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make it easier to use external volumes for database containers #837

Closed
Kralizek opened this issue Nov 15, 2023 · 25 comments · Fixed by #1317
Closed

Make it easier to use external volumes for database containers #837

Kralizek opened this issue Nov 15, 2023 · 25 comments · Fixed by #1317
Assignees
Labels
area-app-model Issues pertaining to the APIs in Aspire.Hosting, e.g. DistributedApplication

Comments

@Kralizek
Copy link

I found my way around creating a database container with a volume mount

var database = builder.AddPostgresContainer("database")
    .WithVolumeMount("../../data/database/data", "/var/lib/postgresql/data");

It would be cool to have .WithVolumeMountForData("../../data/database/data") offered as an extension method to IResourceBuilder<PostgresContainerResource>

It would be even cooler if the libraries would push us towards the pit of success by assuming that we want to use a volume mount for the database data.

@danmoseley danmoseley added the area-app-model Issues pertaining to the APIs in Aspire.Hosting, e.g. DistributedApplication label Nov 15, 2023
@davidfowl davidfowl added help wanted Issue that is a good candidate for community contribution. good first issue Good for newcomers labels Nov 24, 2023
@davidfowl davidfowl added this to the backlog milestone Nov 24, 2023
@stbau04
Copy link
Contributor

stbau04 commented Dec 6, 2023

It would be even cooler if the libraries would push us towards the pit of success by assuming that we want to use a volume mount for the database data.

How could the library push you towards using the mount? Do you mean that the mount should be required when adding a postgres container and it would need to be specifically deactivated if not wanted?

Which of the optional parameters should be available for the extension method? Setting the VolumeMountType might make sense, but are there any cases where the data directory should be mounted readonly?

@josephaw1022
Copy link

I like this idea. I think a holistic approach to this and having some sort of approach to do this for every supported component would be necessary for consistency (postgres, sql server, rabbitmq, etc..., and any other upcoming components). While some users may prefer not to use volumes by default, providing it by default could enhance usability for those who do.

If adding volumes by default is not feasible or not the approach the maintainers want to take, then introducing a more streamlined extension method would be a decent alternative. The current challenge I find when working with volumes is determining the appropriate volume path. An ideal solution would allow simply naming the volume without having to specify the path, significantly simplifying the process. Since aspire is opinionated with its container version for each component, it's not like the path is unknown or varying in any way.

Here's a comparison of the current method versus the proposed streamlined approach in code:

// Current method
var queue = builder.AddRabbitMQContainer("queue")
                   .WithVolumeMount("VolumeQueue", "/var/lib/rabbitmq", VolumeMountType.Named);

var accountDb = builder.AddPostgresContainer("accountDatabase")
                       .WithVolumeMount("VolumeMount.accountdb.data", "/var/lib/postgresql/data", VolumeMountType.Named)
                       .AddDatabase("account");

// Proposed streamlined method
var queue = builder.AddRabbitMQContainer("queue")
                   .WithNamedVolume("VolumeQueue");

var accountDb = builder.AddPostgresContainer("accountDatabase")
                       .WithNamedVolume("VolumeMount.accountdb.data")
                       .AddDatabase("account");

The proposed method would make the development experience more efficient and user-friendly, especially for those who frequently work with volumes.

@stbau04
Copy link
Contributor

stbau04 commented Dec 11, 2023

If adding volumes by default is not feasible or not the approach the maintainers want to take, then introducing a more streamlined extension method would be a decent alternative. The current challenge I find when working with volumes is determining the appropriate volume path. An ideal solution would allow simply naming the volume without having to specify the path, significantly simplifying the process. Since aspire is opinionated with its container version for each component, it's not like the path is unknown or varying in any way.

I think adding the volumes by default is feasable by mounting the volume in the extension method for the IDistributedApplicationBuilder. To keep the possability of not mounting the default volumes we could add a optional parameter to the extensions which is either a bool addDefaultVolumeMounts: true or a enum (something like ContainerConfigurationLevel) providing the following values: None, VolumesMounted. Default would be VolumesMounted.

If ContainerConfigurationLevel.VolumesMounted is passed to the extension method the volumes are mounted as named with a auto-generated name.

The drawback of that approach would be that you are not forced to think of whether mounts might be necessary during implementing it (as it would be with an abstraction/inheritance approach, but that would probably be a lot more complicated to implement and use).

// Proposed streamlined method
var queue = builder.AddRabbitMQContainer("queue")
.WithNamedVolume("VolumeQueue");

var accountDb = builder.AddPostgresContainer("accountDatabase")
.WithNamedVolume("VolumeMount.accountdb.data")
.AddDatabase("account");


The proposed method would make the development experience more efficient and user-friendly, especially for those who frequently work with volumes.

This specific approach would probably not work very well as not all containers are going to have exactly one mount. There might be containers with no mounts or with more than one default volume required. In those cases an exception or some other soltuion would be required.

@davidfowl @danmoseley Do you think the default volume mounts are the right way to go? Or should we stick to the WithVolumeMountForData-approach @Kralizek mentioned?

@DamianEdwards
Copy link
Member

I don't think we should mount volumes by default. I'd prefer we consider methods that expose the underlying container's supported paths and environment variables for the various built-in behaviors, including the data directories. There would need to be specific overloads of these for each container resource type of course, but they'd be the same name and shape for consistency, e.g.:

var accountDb = builder.AddPostgresContainer("accountDatabase")
    .WithDataVolume() // Automatically named volume mapped to Postgres-specific path for data, accepts optional volume name
    .WithInitDbBindMount("./data/accountDatabase") // Path to dir on host to mount to /docker-entrypoint-initdb.d
    .AddDatabase("account");

@stbau04
Copy link
Contributor

stbau04 commented Dec 11, 2023

There are the following container resources (they inherit from ContainerResource):

  • MongoDBContainerResource
  • MySqlContainerResource
  • PostrgresContainerResourcce
  • PostgresServerResource
  • RabbitMQContainerResource
  • RedisContainerResource
  • SqlServerContainerResource
  • SqlServerDatabaseResource

Are PostgresServerResource and SqlServerDatabaseResource actually container resources?
PostgresServerResource uses the type "postgres.server.v0" in the manifest while there is a method WriteContainer(ContainerResource container) in the ManifestPublishingContext which uses the type "container.v0". Maybe I'm just not getting it, but it seems a little bit off to me?
For the SqlServerDatabaseResource it is pretty much the same (further e.g. the MongoDBDatabaseResource is not a container resource). Did I miss something? Why is SqlServerDatabaseResource a container resource while the mostly equal MongoDBDatabaseResource isnt?

Apart from that I would suggest the following extensions methods (i assume that the signature of different volume mounts may vary as they have completely different requirements but the signature should be consitant between the same method for different containers):

...
.WithDataVolumeMount() // Volume name is auto-generated, VolumeMountType is Named and it is not readonly
.WithDataVolumeMount(string source, VolumeMountType mountType =VolumeMountType .Named , bool isReadonly = false) // Configure the details yourself
.WithLogVolumeMount() // Volume name is auto-generated, VolumeMountType is Named and it is not readonly
.WithLogVolumeMount(string source, VolumeMountType mountType =VolumeMountType .Named) // Configure the details yourself, readonly setting is not available as it does not make any sense to make the log folder readonly
.WithInitDbBindVolumeMount(string source, VolumeMountType mountType =VolumeMountType .Bind, bool isReadonly = true) // No version without the source as it probably doesn't have any common use cases
...

I would implement these four function wherever applicable. Are there any further directories that could be commonly used? Did i miss any container resources?

Edit:
Use With...VolumeMount instead of With...Volume/With...Mount to align with the existing WithVolumeMount

@DamianEdwards
Copy link
Member

@stbau04 I think the existing WithVolumeMount is conflating different container concepts (bind-mount vs. volumes) so we should feel free to suggest changing it too.

@DamianEdwards
Copy link
Member

For the SqlServerDatabaseResource it is pretty much the same (further e.g. the MongoDBDatabaseResource is not a container resource). Did I miss something? Why is SqlServerDatabaseResource a container resource while the mostly equal MongoDBDatabaseResource isnt?

This is a bit of a quirk of the latest changes to those resources and I expect this will change again. The non-container named resources now run as containers locally by default, but publish to the manifest as non-container resources.

@stbau04
Copy link
Contributor

stbau04 commented Dec 11, 2023

@DamianEdwards

@stbau04 I think the existing WithVolumeMount is conflating different container concepts (bind-mount vs. volumes) so we should feel free to suggest changing it too.

Like two methods WithBindMount/WithVolume instead of the VolumeMountType enum? That would all in all add three additional methods if we want to support all possabilites i suggested in my prior comment

@DamianEdwards
Copy link
Member

I don't think we'd need volume-flavored overloads for initdb support, that really only makes sense as a bind-mount if I understand correctly. For data and logs I'd assumed volumes were the primary scenario, but I can imagine some folks might want to bind to existing host directories for data, but seems a bit unlikely for logs?

All that said, having first-class methods for both volumes and bind-mounts for data, logs, and initdb would be fine I think.

@stbau04
Copy link
Contributor

stbau04 commented Dec 11, 2023

This is a bit of a quirk of the latest changes to those resources and I expect this will change again. The non-container named resources now run as containers locally by default, but publish to the manifest as non-container resources.

Ok, so it probably does not make any sense to implement the extension methods for them as they are not deployed as container resources (would it even work?)

@DamianEdwards
Copy link
Member

@stbau04 we're going to revisit it again as part of preview.3/preview.4 anyway, that's just how they're going to be in preview.2

@stbau04
Copy link
Contributor

stbau04 commented Dec 11, 2023

Should i open a seperate issue for the WithBindMount/WithVolume stuff?

@mitchdenny
Copy link
Member

Are PostgresServerResource and SqlServerDatabaseResource actually container resources?
PostgresServerResource uses the type "postgres.server.v0" in the manifest while there is a method WriteContainer(ContainerResource container) in the ManifestPublishingContext which uses the type "container.v0". Maybe I'm just not getting it, but it seems a little bit off to me?
For the SqlServerDatabaseResource it is pretty much the same (further e.g. the MongoDBDatabaseResource is not a container resource). Did I miss something? Why is SqlServerDatabaseResource a container resource while the mostly equal MongoDBDatabaseResource isnt?

Thanks for reporting these issues. Working on a fix now. PostgresServerResource should not derive from ContainerResource, and neither should SqlServerDatabaseResource.

@mitchdenny
Copy link
Member

Just to expand a little bit on the differences between methods like AddPostgres(...) and AddPostgresContainer(...). The reason we have two methods is that when it comes to deployment, sometimes people will want to just defer to the deployment tool to create a resource using that target environments ideal implementation of that resource. For example in Azure, Postgres is probably best handled by the Azure Postgres Flexible Server resource type vs spinning up your own Postgres container and dealing with all the operational issues around doing that well.

Sometimes however, a developer wants to explicitly say that I want Postgres, and I want it implemented as a container that sits alongside the rest of my application code in ACA (continuing with the Azure example). You always need to think through whether that container is appropriate for production workloads though.

@Kralizek
Copy link
Author

Just to expand a little bit on the differences between methods like AddPostgres(...) and AddPostgresContainer(...). The reason we have two methods is that when it comes to deployment, sometimes people will want to just defer to the deployment tool to create a resource using that target environments ideal implementation of that resource. For example in Azure, Postgres is probably best handled by the Azure Postgres Flexible Server resource type vs spinning up your own Postgres container and dealing with all the operational issues around doing that well.

Sometimes however, a developer wants to explicitly say that I want Postgres, and I want it implemented as a container that sits alongside the rest of my application code in ACA (continuing with the Azure example). You always need to think through whether that container is appropriate for production workloads though.

Maybe a bit off topic here, but Is this the right approach?

It feels like development and deployment concerns are clashing here: If I wear the developer hat, I care little how it will be deployed.

In my humble opinion, rather than two conflicting resource types or methods, we should have one resource (a development concept) that can be specialized to be deployed in one way or the other.

To me it feels like the Deployment strategy should be an (optional) metadata of the resource. It definitely should not be a decision point on which identical resource I use at dev time.

@stbau04
Copy link
Contributor

stbau04 commented Dec 12, 2023

But there are some functions you can only use on container resources (e.g. WithVolumeMount). How would you handle them? Either they can be applied always and just fail if it is not possible or they can only be applied after setting the metadata e.g. through an .AsContainer extension method, which might result in a pretty similar structure as it is now

Further as @mitchdenny mentioned, AddPostgres and AddPostgresContainer have some differences (e.g. you need to think whether the container is fine to be deployed to production). ). If this is dependent on the metadata of the resource, it would mean that just a small method call might change the effective type of the resource completely.

@mitchdenny
Copy link
Member

mitchdenny commented Dec 12, 2023

@Kralizek we used to just have AddPostgresContainer (as an example). And when it was written to the manifest it would be with a resource type of postgres.server.v0 - with very little other information because we assume that the deployment tool is going to provision it. The change we've made here is to allow people to exert a little bit of control over the deployment tool to say actually I do want this to be a container.

The alternative that we considered was not having AddPostgresContainer(...) at all, and just having AddPostgres(...) and if someone wanted a Postgres container that they could manipulate with volume mounts etc, then they would just use AddContainer(...) and build it up themselves.

The complication with that is that Aspire does some magic around connection string handling when you pass in the type via WithReference(...) to extract the connection string. A handle rolled container using AddContainer(...) wouldn't be able to participate in that mechanic because it doesn't have the right interfaces (because not all containers provide connection strings).

The API surface that we have isn't perfect, but we've tried it a few different ways and this is the one that felt the most ergonomic once all factors were considered. It is all still subject to change however and your feedback is noted.

@stbau04
Copy link
Contributor

stbau04 commented Dec 16, 2023

I created a seperate issure for splitting the current WithVolumeMount into two methods (one for bind mounts and one for named volumes): #1437

@stbau04
Copy link
Contributor

stbau04 commented Dec 17, 2023

What should the name for automatically generated volumes be? In my first approach i used a guid (as @josephaw1022 did in his PR; there are two PRs now?!). But the random guid would change each time the method is executed (at least while developing this means a new volume on each start, i am not sure what happens when the project is deployed).

We could either persist the guids somewhere if a container way already created, but i don't think that this is actually the way to go? Another approach could be to generate the volume dependent on some sort of assembly id and the resource name, but then we could only run the application once per host (otherwise they would share volumes). I don't actually like any of these approaches, but i couldn't figure out anything else.

How could we solve this one?

@josephaw1022
Copy link

josephaw1022 commented Dec 18, 2023

Wrote a long-winded explanation explaining giving my thought process on this #1447 (comment)

TLDR of comment

In my opinion, the simplest approach to this would have the developers write a name for the volumes like this

var db1 = builder.AddPostgresContainer("db1")
                       .WithNamedVolume("db1.volume");

var db2 = builder.AddPostgresContainer("db2")
                       .WithNamedVolume("db2.volume");

The developers would need to name the volume, but they wouldn't need to memorize or look up any volume paths with this approach.

However, the case where two volumes have the same name needs to be considered and thought through for this to work.

Currently if you were to run this

var volumeName = "VolumeMount.sqlserver.data";

var database = builder.AddSqlServerContainer("sqlserver", sqlpassword)
    .WithVolumeMount(volumeName, "/var/opt/mssql", VolumeMountType.Named)
    .AddDatabase("appdb");


var anotherDatabase = builder.AddSqlServerContainer("sqlserver2", sqlpassword)
    .WithVolumeMount(volumeName, "/var/opt/mssql", VolumeMountType.Named)
    .AddDatabase("anotherdb");


builder.Build().Run();

You should see this error

image

This error seems to be more of a symptom exception rather than an exception being thrown at the root cause (duplicate volume names).

So, I think having some custom error being thrown for two container resources using the same volume name would need to be implemented so that if someone were to write

var db1 = builder.AddSqlServerContainer("sqlserver", sqlpassword)
    .WithNamedVolume("newway")
    .AddDatabase("appdb");


var db2 = builder.AddSqlServerContainer("sqlserver2", sqlpassword)
    .WithNamedVolume("newway")
    .AddDatabase("anotherdb");


// or 


var db3 = builder.AddSqlServerContainer("sqlserver3", sqlpassword)
    .WithVolumeMount("oldway", "/var/opt/mssql", VolumeMountType.Named)
    .AddDatabase("appdb");


var db4 = builder.AddSqlServerContainer("sqlserver4", sqlpassword)
    .WithVolumeMount("oldway", "/var/opt/mssql", VolumeMountType.Named)
    .AddDatabase("anotherdb");


builder.Build().Run();

they would then immediately know they have a duplicate volume name.

And now that I think about it, adding more specific errors for duplicate volume names doesn't necessarily need to be done for this new syntactic sugar to be added as it's still an issue without the syntactic sugar. But, I am curious on if we wanted just the syntactic sugar done now and do something later for the duplicate volume names exception issue or do we want that solved in the same pr or done before the syntactic sugar pr?

@stbau04
Copy link
Contributor

stbau04 commented Dec 18, 2023

@josephaw1022
In my opinion we should probably create a seperate issue/PR for that as this is a somewhat different topic. Further it is not approved by a team member that this error is something we want

@stbau04
Copy link
Contributor

stbau04 commented Jan 7, 2024

Probably we should wait for #1521 and then see how it affects this issue (especially the volume/bind mount stuff)

@davidfowl
Copy link
Member

cc @JamesNK

@davidfowl davidfowl removed good first issue Good for newcomers help wanted Issue that is a good candidate for community contribution. labels Jan 27, 2024
@davidfowl
Copy link
Member

Moving to next preview.

@davidfowl
Copy link
Member

@eerhardt We're going to go with @stbau04's PR on this.

@github-actions github-actions bot locked and limited conversation to collaborators Apr 29, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-app-model Issues pertaining to the APIs in Aspire.Hosting, e.g. DistributedApplication
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants