Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify the scope of DataSourceInitializer #9528

Closed
snicoll opened this issue Jun 15, 2017 · 10 comments
Closed

Clarify the scope of DataSourceInitializer #9528

snicoll opened this issue Jun 15, 2017 · 10 comments
Assignees
Labels
status: superseded An issue that has been superseded by another theme: datasource Issues relating to data sources type: enhancement A general enhancement

Comments

@snicoll
Copy link
Member

snicoll commented Jun 15, 2017

Right now DataSourceInitializer runs regardless of the fact that we've auto-configured a DataSource or a custom one is available. If more than one is available, this currently throws a low-level exception because we expect a primary.

We have plans to change the last bit in 2.0 but I think it is confusing that we apply something datasource related on something that we haven't auto-configured. After all things defined in spring.datasource (in DataSourceProperties) should only be taken into account if we configure the DataSource.

The side effect is that if we do this explicitly, we have the guarantee the DataSource is initialized and ready to be used once we expose it in the context. Right now we use a BPP for that and it causes hard to track issues such as #9394

The purpose of this issue is to:

  1. Create a public pojo-based API that initializes a DataSource. It should be super easy to grab an instance of such object based on DataSourceProperties
  2. Replace the auto-configuration to create that object and migrate the database. We have DataSourceConfiguration#createDataSource that has all that we need to share that logic. We'd also need to update EmbeddedDataSourceConfiguration so it would be nice if we could easily migrate a DataSource based on a DataSource and a DataSourceProperties instances

If someone is willing to initialize a custom DataSource then they'll have to do the same thing but at least it will be explicit.

@snicoll snicoll added for: team-attention An issue we'd like other members of the team to review status: waiting-for-triage An issue we've not yet triaged labels Jun 15, 2017
@snicoll snicoll changed the title Run DataSourceInitializer on the auto-configured DataSource Run DataSourceInitializer only on the auto-configured DataSource Jun 15, 2017
@snicoll snicoll self-assigned this Jun 21, 2017
@snicoll snicoll added priority: normal type: enhancement A general enhancement and removed for: team-attention An issue we'd like other members of the team to review status: waiting-for-triage An issue we've not yet triaged labels Jun 21, 2017
@snicoll
Copy link
Member Author

snicoll commented Jun 22, 2017

I did a bit of hacking on this which makes me think that the original statement has to be revisited a bit.

Public API

The current code is highly dependant of DataSourceProperties and breaking that link basically means that we would have to duplicate the concept. If a user name and a password is provided for the upgrade, we recreate a datasource based on the settings defined in DataSourceProperties. So to be a public API, we should accept a hard link between the DataSource and the DataSourceProperties instances.

This should be mitigated with the fact that our doc invites users to create their custom DataSource using a custom DataSourceProperties object. There isn't a clean way to validate that link though.

Also, the invocation of the migration is purely programmatic so any advanced feature where the intialization should happen only once Hibernate has started will have to be managed by the user somehow.

Event

The DataSourceInitializedEvent should probably go away as it is JPA specific and is not very useful outside the internal implementation.

Ordering

Right now, the initializer does two separate things:

  • Create the schema: has to happen before Hibernate starts (in case you've set DDL validation for instance)
  • Initialize the schema: has to happen after in case you're using Hibernate to create the schema.

While our implementation is considered a simple solution (compared to our Liquibase/Flyway support), it's actually much harder to implement. The first link can be easily done by adding a ref to the entity manager (The Liquibase and Flyway auto-configs already do that).The second link is harder though since the same bean already required the opposite order to be able to create the schema.

Technicall we should wait for Flyway, Liquibase or Hibernate DDL auto to complete to initialize the schema. And that's where the event we have may have more sense.

I am wondering if SmartInitializingSingleton couldn't be a cheap solution for this.

@snicoll
Copy link
Member Author

snicoll commented Jun 22, 2017

#1115 seems to be in contradiction with my initial ordering analysis. So it looks like everything should happen as soon as the DataSource is available. I am trying now to understand why DataSourceInitializedPublisher does what it does.

@snicoll
Copy link
Member Author

snicoll commented Jun 22, 2017

More investigation on the code (this one is definitely tricky!)

The following cases are supported:

  • JPA (ddl auto off) + init enabled: everything runs as soon as the DataSource is available because there is at least one schema script
  • JPA (ddl auto on) + init enabled: the script is empty (since we want hibernate to do that for us). The @PostConstruct of the initializer doesn't do anything and rely on "something else" to trigger the event. (we have a BPP that sends an event as soon as it sees an initialized EntityManagerFactory)
  • No JPA + init enabled: everything runs as soon as the DataSource is available

This allowed me to discover a weird case: if you don't have JPA and only an init script (no schema) because something custom is going to create the schema, the initializer doesn't run.

It would probably be easier if we looked at things a bit differently:

  1. Something must create the schema, it's either the initializer or hibernate
  2. The schema must be initialized if necessary once the schema has been created

Perhaps a SchemaCreatedEvent would clarify the intend but I also think that calling things explicitly makes it easier to follow. Sending an event if nobody is listening for something that must happen is a bit weird so perhaps we should retrieve the invoker in the BPP and invoke it instead? This may create a package tangle though.

@snicoll
Copy link
Member Author

snicoll commented Jun 28, 2017

We now have better test coverage for those cases that I've discovered which should facilitate the refactoring.

Having discussed with @jhoeller we won't be able to get rid of the BPP dance. But we can get rid of the event and make the intent more clear. Also, for users that do not require things to be created super early rolling their own version should be pretty easy.

@snicoll
Copy link
Member Author

snicoll commented Sep 21, 2017

The BPP dance is such that we're considering running the initializer if one DataSource is available. It doesn't matter if Spring Boot auto-configured it or if a user provided one but as long as there is one candidate we should initialize it.

To make that intend more clear, we need to move the initialization-related properties out of DataSourceProperties with a DataSourceInitializationProperties. Perhaps with a dedicated auto-configuration that runs after DataSourceAutoConfiguration.

@snicoll snicoll changed the title Run DataSourceInitializer only on the auto-configured DataSource Clarify the scope of DataSourceInitializer Sep 21, 2017
@snicoll
Copy link
Member Author

snicoll commented Oct 2, 2017

Unfortunately it is not that easy, we do have support for custom username/password for both the creation of the schema and its initialization. When that feature is used, it is assume that we're going to create the exact same DataSource as the one that the auto-configuration would have created.

I confirm we need to configure a single DataSource if present. I've tried to make it only applicable in case of auto-configuration and @JdbcTest blows up as it's creating the DataSource itself before DataSourceAutoConfiguration gets a chance to kick in (in short: such datasource is not initialized).

snicoll added a commit to snicoll/spring-boot that referenced this issue Oct 3, 2017
This commit separates the lifecycle of the datasource initialization
from DataSourceInitializer itself. It also makes sure that a @primary
data source is no longer required.

See spring-projectsgh-9528

This leads to cycle in slice tests
snicoll added a commit to snicoll/spring-boot that referenced this issue Oct 3, 2017
This commit separates the lifecycle of the datasource initialization
from DataSourceInitializer itself. It also makes sure that a @primary
data source is no longer required.

See spring-projectsgh-9528
@snicoll
Copy link
Member Author

snicoll commented Oct 3, 2017

Let's make sure to validate #9394 works now. There is a sample attached to the issue that still breaks and the only way to make it work is to stop with the BPP dance (which isn't going to happen so we may want to go the documentation route for that case).

@zachmarshall
Copy link

I updated the test repo for #9394 with minor changes for spring-boot 2: https://github.com/zachmarshall/spring-datasource-init-bug/tree/spring-boot-2. As you said, it still fails. Please let me know if there's something you'd like for me to validate as working.

@wilkinsona wilkinsona added this to the Backlog milestone May 14, 2018
@wilkinsona wilkinsona self-assigned this May 14, 2018
@philwebb philwebb added the status: blocked An issue that's blocked on an external project change label Jun 1, 2018
@rvit34
Copy link

rvit34 commented Jun 20, 2018

Please, see my comment #9394 (comment) Am I correct that now there is no way to fix it?

@wilkinsona wilkinsona removed the status: blocked An issue that's blocked on an external project change label Jul 18, 2018
@philwebb philwebb modified the milestones: 2.1.x, 2.x Sep 24, 2018
@wilkinsona wilkinsona assigned snicoll and unassigned snicoll and wilkinsona Jul 16, 2019
@philwebb philwebb modified the milestones: 2.x, 2.5.x Jan 11, 2021
@philwebb philwebb added the status: pending-design-work Needs design work before any code can be developed label Jan 11, 2021
@philwebb philwebb added the theme: datasource Issues relating to data sources label Jan 11, 2021
@wilkinsona
Copy link
Member

This has been superseded by #25756 and #25323.

@wilkinsona wilkinsona removed this from the 2.5.x milestone Mar 22, 2021
@wilkinsona wilkinsona added status: superseded An issue that has been superseded by another and removed status: pending-design-work Needs design work before any code can be developed labels Mar 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: superseded An issue that has been superseded by another theme: datasource Issues relating to data sources type: enhancement A general enhancement
Projects
None yet
Development

No branches or pull requests

5 participants