SQLException: Unable to enlist connection to existing transaction when accessing multiple persistence units in the same transaction since 3.8.2 #39283

Closed
jacopo-cavallarin opened this issue Mar 8, 2024 · 118 comments · Fixed by #40365
Labels
area/persistence OBSOLETE, DO NOT USE kind/bug Something isn't working
Comments

@jacopo-cavallarin

Describe the bug

After updating from 3.8.1 to 3.8.2, some of our tests that insert data in multiple PUs within a single transaction now fail and throw an exception.

That exception is thrown whenever we attempt to access two or more persistence units within a single @Transactional method. This worked fine in previous releases.

We suspect that the bug is due to Agroal 2.3, since we encountered the same problem weeks ago while attempting to force the 2.3 version on older Quarkus releases.

Expected behavior

The transaction commits successfully without any error.

Actual behavior

The transaction is rolled back and this exception is thrown:

io.quarkus.arc.ArcUndeclaredThrowableException: Error invoking subclass method
        at io.test.agroal.bug.reproducer.ReproducerApp_Subclass.run(Unknown Source)
        at io.test.agroal.bug.reproducer.ReproducerApp_ClientProxy.run(Unknown Source)
        at io.quarkus.runtime.ApplicationLifecycleManager.run(ApplicationLifecycleManager.java:132)
        at io.quarkus.runtime.Quarkus.run(Quarkus.java:71)
        at io.quarkus.runtime.Quarkus.run(Quarkus.java:44)
        at io.quarkus.runner.GeneratedMain.main(Unknown Source)
        at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
        at java.base/java.lang.reflect.Method.invoke(Method.java:580)
        at io.quarkus.runner.bootstrap.StartupActionImpl$1.run(StartupActionImpl.java:113)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: jakarta.transaction.RollbackException: ARJUNA016053: Could not commit transaction.
        at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1283)
        at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:104)
        at io.quarkus.narayana.jta.runtime.NotifyingTransactionManager.commit(NotifyingTransactionManager.java:70)
        at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorBase.endTransaction(TransactionalInterceptorBase.java:406)
        at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorBase.invokeInOurTx(TransactionalInterceptorBase.java:171)
        at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorBase.invokeInOurTx(TransactionalInterceptorBase.java:107)
        at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorRequired.doIntercept(TransactionalInterceptorRequired.java:38)
        at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorBase.intercept(TransactionalInterceptorBase.java:61)
        at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorRequired.intercept(TransactionalInterceptorRequired.java:32)
        at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorRequired_Bean.intercept(Unknown Source)
        at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext.perform(AroundInvokeInvocationContext.java:30)
        at io.quarkus.arc.impl.InvocationContexts.performAroundInvoke(InvocationContexts.java:27)
        ... 10 more
Caused by: org.hibernate.exception.GenericJDBCException: Unable to acquire JDBC Connection [Exception in association of connection to existing transaction] [n/a]
        at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:63)
        at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:108)
        at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:94)
        at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java:116)
        at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getPhysicalConnection(LogicalConnectionManagedImpl.java:143)
        at org.hibernate.engine.jdbc.internal.MutationStatementPreparerImpl.connection(MutationStatementPreparerImpl.java:137)
        at org.hibernate.engine.jdbc.internal.MutationStatementPreparerImpl$1.doPrepare(MutationStatementPreparerImpl.java:48)
        at org.hibernate.engine.jdbc.internal.MutationStatementPreparerImpl$StatementPreparationTemplate.prepareStatement(MutationStatementPreparerImpl.java:106)
        at org.hibernate.engine.jdbc.internal.MutationStatementPreparerImpl.prepareStatement(MutationStatementPreparerImpl.java:38)
        at org.hibernate.engine.jdbc.mutation.internal.ModelMutationHelper.standardStatementPreparation(ModelMutationHelper.java:145)
        at org.hibernate.engine.jdbc.mutation.internal.ModelMutationHelper.lambda$standardPreparation$0(ModelMutationHelper.java:118)
        at org.hibernate.engine.jdbc.mutation.internal.PreparedStatementDetailsStandard.resolveStatement(PreparedStatementDetailsStandard.java:87)
        at org.hibernate.engine.jdbc.mutation.internal.JdbcValueBindingsImpl.lambda$beforeStatement$0(JdbcValueBindingsImpl.java:88)
        at java.base/java.lang.Iterable.forEach(Iterable.java:75)
        at org.hibernate.engine.jdbc.mutation.spi.BindingGroup.forEachBinding(BindingGroup.java:51)
        at org.hibernate.engine.jdbc.mutation.internal.JdbcValueBindingsImpl.beforeStatement(JdbcValueBindingsImpl.java:85)
        at org.hibernate.engine.jdbc.mutation.internal.AbstractMutationExecutor.performNonBatchedMutation(AbstractMutationExecutor.java:104)
        at org.hibernate.engine.jdbc.mutation.internal.MutationExecutorSingleNonBatched.performNonBatchedOperations(MutationExecutorSingleNonBatched.java:40)
        at org.hibernate.engine.jdbc.mutation.internal.AbstractMutationExecutor.execute(AbstractMutationExecutor.java:52)
        at org.hibernate.persister.entity.mutation.InsertCoordinator.doStaticInserts(InsertCoordinator.java:175)
        at org.hibernate.persister.entity.mutation.InsertCoordinator.coordinateInsert(InsertCoordinator.java:113)
        at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2873)
        at org.hibernate.action.internal.EntityInsertAction.execute(EntityInsertAction.java:104)
        at org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:632)
        at org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:499)
        at org.hibernate.event.internal.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:363)
        at org.hibernate.event.internal.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:41)
        at org.hibernate.event.service.internal.EventListenerGroupImpl.fireEventOnEachListener(EventListenerGroupImpl.java:127)
        at org.hibernate.internal.SessionImpl.doFlush(SessionImpl.java:1403)
        at org.hibernate.internal.SessionImpl.managedFlush(SessionImpl.java:484)
        at org.hibernate.internal.SessionImpl.flushBeforeTransactionCompletion(SessionImpl.java:2319)
        at org.hibernate.internal.SessionImpl.beforeTransactionCompletion(SessionImpl.java:1976)
        at org.hibernate.engine.jdbc.internal.JdbcCoordinatorImpl.beforeTransactionCompletion(JdbcCoordinatorImpl.java:439)
        at org.hibernate.resource.transaction.backend.jta.internal.JtaTransactionCoordinatorImpl.beforeCompletion(JtaTransactionCoordinatorImpl.java:336)
        at org.hibernate.resource.transaction.backend.jta.internal.synchronization.SynchronizationCallbackCoordinatorNonTrackingImpl.beforeCompletion(SynchronizationCallbackCoordinatorNonTrackingImpl.java:47)
        at org.hibernate.resource.transaction.backend.jta.internal.synchronization.RegisteredSynchronization.beforeCompletion(RegisteredSynchronization.java:37)
        at com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple.beforeCompletion(SynchronizationImple.java:52)
        at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:351)
        at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:69)
        at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:138)
        at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1271)
        ... 22 more
Caused by: java.sql.SQLException: Exception in association of connection to existing transaction
        at io.agroal.narayana.NarayanaTransactionIntegration.associate(NarayanaTransactionIntegration.java:130)
        at io.agroal.pool.ConnectionPool.getConnection(ConnectionPool.java:257)
        at io.agroal.pool.DataSource.getConnection(DataSource.java:86)
        at io.quarkus.hibernate.orm.runtime.customized.QuarkusConnectionProvider.getConnection(QuarkusConnectionProvider.java:23)
        at org.hibernate.internal.NonContextualJdbcConnectionAccess.obtainConnection(NonContextualJdbcConnectionAccess.java:46)
        at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java:113)
        ... 59 more
Caused by: java.sql.SQLException: Unable to enlist connection to existing transaction
        at io.agroal.narayana.NarayanaTransactionIntegration.associate(NarayanaTransactionIntegration.java:121)
        ... 64 more

How to Reproduce?

Reproducer: https://github.com/jacopo-cavallarin/agroal-bug-reproducer

Clone the linked repo and follow the instructions in the README

Output of uname -a or ver

Darwin M0-055116363 23.2.0 Darwin Kernel Version 23.2.0: Wed Nov 15 21:55:06 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6020 arm64

Output of java -version

openjdk version "21.0.1" 2023-10-17 LTS OpenJDK Runtime Environment Temurin-21.0.1+12 (build 21.0.1+12-LTS) OpenJDK 64-Bit Server VM Temurin-21.0.1+12 (build 21.0.1+12-LTS, mixed mode)

Quarkus version or git rev

3.8.2

Build tool (ie. output of mvnw --version or gradlew --version)

Apache Maven 3.9.6 (bc0240f3c744dd6b6ec2920b3cd08dcc295161ae) Maven home: /Users/jacopocavallarin/.m2/wrapper/dists/apache-maven-3.9.6-bin/3311e1d4/apache-maven-3.9.6 Java version: 21.0.1, vendor: Eclipse Adoptium, runtime: /Users/jacopocavallarin/.sdkman/candidates/java/21.0.1-tem Default locale: en_IT, platform encoding: UTF-8 OS name: "mac os x", version: "14.2.1", arch: "aarch64", family: "mac"

Additional information

No response

@jacopo-cavallarin jacopo-cavallarin added the kind/bug Something isn't working label Mar 8, 2024
@quarkus-bot quarkus-bot bot added the area/persistence OBSOLETE, DO NOT USE label Mar 8, 2024
@maymaewa

Hello! I have the same situation: I'm trying to perform select operations on two datasources in a single transaction.
After updating Quarkus from version 3.8.1 to version 3.8.2, I get the following error:
java.sql.SQLException: Enlisted connection used without active transaction

Error stacktrace:

	at io.agroal.narayana.XAExceptionUtils.xaException(XAExceptionUtils.java:20)
	at io.agroal.narayana.XAExceptionUtils.xaException(XAExceptionUtils.java:8)
	at io.agroal.narayana.LocalXAResource.rollback(LocalXAResource.java:89)
	at com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord.topLevelAbort(XAResourceRecord.java:338)
	at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.enlistResource(TransactionImple.java:644)
	at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.enlistResource(TransactionImple.java:398)
	at io.agroal.narayana.NarayanaTransactionIntegration.associate(NarayanaTransactionIntegration.java:120)
	at io.agroal.pool.ConnectionPool.getConnection(ConnectionPool.java:257)
	at io.agroal.pool.DataSource.getConnection(DataSource.java:86)
	at com.company.cloud.core.quarkus.db.datasource.testcontainers.TransactionsTest.lambda$testConnectionIsNotSharedWithinTransaction$3(TransactionsTest.java:146)
	at io.quarkus.narayana.jta.TransactionRunnerImpl.lambda$run$0(TransactionRunnerImpl.java:27)
	at io.quarkus.narayana.jta.QuarkusTransactionImpl.callInOurTx(QuarkusTransactionImpl.java:136)
	at io.quarkus.narayana.jta.QuarkusTransactionImpl.callRequireNew(QuarkusTransactionImpl.java:106)
	at io.quarkus.narayana.jta.QuarkusTransactionImpl.call(QuarkusTransactionImpl.java:29)
	at io.quarkus.narayana.jta.TransactionRunnerImpl.run(TransactionRunnerImpl.java:26)
	at com.company.cloud.core.quarkus.db.datasource.testcontainers.TransactionsTest.testConnectionIsNotSharedWithinTransaction(TransactionsTest.java:139)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at io.quarkus.test.junit.QuarkusTestExtension.runExtensionMethod(QuarkusTestExtension.java:1013)
	at io.quarkus.test.junit.QuarkusTestExtension.interceptTestMethod(QuarkusTestExtension.java:827)
	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
	at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
	at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
	at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)
	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)
	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:218)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:214)
	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:139)
	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:69)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.submit(SameThreadHierarchicalTestExecutorService.java:35)
	at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:57)
	at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:54)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:198)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:169)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:93)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:58)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:141)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:57)
	at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:103)
	at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:85)
	at org.junit.platform.launcher.core.DelegatingLauncher.execute(DelegatingLauncher.java:47)
	at org.junit.platform.launcher.core.SessionPerRequestLauncher.execute(SessionPerRequestLauncher.java:63)
	at com.intellij.junit5.JUnit5IdeaTestRunner.startRunnerWithArgs(JUnit5IdeaTestRunner.java:57)
	at com.intellij.rt.junit.IdeaTestRunner$Repeater$1.execute(IdeaTestRunner.java:38)
	at com.intellij.rt.execution.junit.TestsRepeater.repeat(TestsRepeater.java:11)
	at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:35)
	at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:235)
	at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:54)
Caused by: java.sql.SQLException: Enlisted connection used without active transaction
	at io.agroal.pool.ConnectionHandler.verifyEnlistment(ConnectionHandler.java:381)
	at io.agroal.pool.ConnectionHandler.transactionRollback(ConnectionHandler.java:352)
	at io.agroal.narayana.LocalXAResource.rollback(LocalXAResource.java:86)
	... 85 more

The error is reproduced by running the following test:

    @Test
    void testConnectionIsNotSharedWithinTransaction() throws SQLException {

        // Create the datasources and run the Flyway scripts ...

        Assertions.assertNotEquals(firstDatasource, secondDatasource);
        firstDatasource.getConnection().createStatement().execute("INSERT into persons(first_name, last_name) VALUES ('first', 'person')");
        secondDatasource.getConnection().createStatement().execute("INSERT into persons(first_name, last_name) VALUES ('second', 'person')");

        QuarkusTransaction.requiringNew().run(() -> {
            try {
                ResultSet firstResultSet = firstDatasource.getConnection().createStatement().executeQuery("SELECT first_name from persons where last_name = 'person'");
                Assertions.assertTrue(firstResultSet.next());
                Assertions.assertEquals("first", firstResultSet.getString(1));
                Assertions.assertFalse(firstResultSet.next());

                ResultSet secondResultSet = secondDatasource.getConnection().createStatement().executeQuery("SELECT first_name from persons where last_name = 'person'");
                Assertions.assertTrue(secondResultSet.next());
                Assertions.assertEquals("second", secondResultSet.getString(1));
                Assertions.assertFalse(secondResultSet.next());
            } catch (SQLException e) {
                Assertions.fail(e);
            }
        });
    }

}

@mjiderhamn

mjiderhamn commented Mar 13, 2024

This seems caused by #39072.

However, setting quarkus.datasource.XYZ.jdbc.transactions=xa on the datasources involved, as per the discussion there, seems to solve the issue.

@mjiderhamn

Downgrading to Agroal 2.2 resolves the issue. So something in Agroal 2.3 broke this.

@luca-bassoricci
Contributor

Is it safe to downgrade Agroal to 2.2 in pom.xml, or is it better to return to 3.8.1 and wait for a fix?

@ketola
Contributor

ketola commented Mar 19, 2024

I think this is an improvement, but this is a pretty big change, meaning that if you have not paid attention to the transaction handling when using multiple data sources you might need several code changes to fix this. And that's why I think it might be a good idea to revert the Agroal upgrade and introduce it in the next minor version upgrade. Because otherwise this might block some users from upgrading to patches containing security fixes.

Also this might be worth explaining in https://quarkus.io/guides/transaction as you might come across this issue the moment you add the second datasource in the application.

In my case I have several data sources in use, and they are in separate modules. I consider them independent, which is why I have tried to keep the transaction handling separated, even though I had noticed earlier that Quarkus allowed me to run queries against separate datasources in the same transaction.
This change brought up a couple of places where my transaction handling was still sharing the same transaction. In these places I was able to fix it either using the annotation:

@Transactional(Transactional.TxType.REQUIRES_NEW)

or if I needed to have separate transactions inside the same method I could use the programmatic approach:

return QuarkusTransaction.requiringNew().call(() -> {
            // fetch stuff
        });

I hope these tips could be helpful to someone.

@geoand
Contributor

geoand commented Mar 22, 2024

@barreiro can you please have a look?

cc @gsmet @yrodiere

@yrodiere
Member

At the very least this needs an entry in the migration guide, both 3.9 and 3.8... looks like we collectively skipped that, sorry @gastaldi .

@turing85
Contributor

turing85 commented Mar 22, 2024

To be honest, I think the change should be reverted for 3.8.x. This is a breaking change and very unexpected. Maybe someone could point out exactly what changed, and why?

There is still the open question whether we can set the agroal version back to 2.2 in our projects: #39283 (comment)

@maxandersen
Member

What is the fundamental change in Agroal that caused this?

@turing85
Contributor

turing85 commented Mar 23, 2024

I may have a reproducer involving camel-quarkus. It shows the exception mentioned in #39283 (comment).

https://github.com/turing85/quarkus-camel-transactions

I am still awaiting confirmation from the camel team that the application does (as per the camel specification) what I think it should do.

Side note: camel-quarkus was not updated from 3.8.1 to 3.8.2; it remained at version 3.8.0.

@turing85
Contributor

turing85 commented Mar 23, 2024

It seems that agroal/agroal@342ee87 is the commit that caused the issue to appear. However, this alone does not seem to be the root cause. The root cause seems to be agroal/agroal@ced9e8b. It feels like this throw is missing some additional checks.

The example I gave in my previous comment (#39283 (comment)) does not use XA at all. Thus, I do not understand why createXaResource(...) should even be called.

@zhfeng
Contributor

zhfeng commented Mar 25, 2024

Thanks @turing85 for the reproducer!

I'm sorry to say that this code worked before, but it was not correct. In your case, if you want the cleanup in the source database and the write in the target database to happen in one transaction, meaning all or nothing, it definitely needs XA. I highly recommend configuring all the datasources with ….jdbc.transactions = xa and also enabling quarkus.transaction-manager.enable-recovery=true. Otherwise, there is a risk of inconsistent data across the two datasources.

The root cause in Agroal is the addition of the LastResource interface to LocalXAResource. IIRC, Narayana allows only one LastResource to enlist in a transaction, @mmusgrov? From the transaction perspective, this is the right way to make sure that a non-XA resource can be involved in an XA transaction.

So I think the change in Agroal makes sense, but unfortunately it breaks applications that involve two or more non-XA resources in an XA transaction.
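
To make the rule concrete, here is a minimal toy sketch of "only one LastResource per transaction" combined with a pool that throws when enlistment is rejected. It is not the actual Narayana or Agroal code: ToyLastResourceTransaction, ToyPool and the datasource names are hypothetical, and only the exception message is taken from the stack trace in this issue.

```java
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Toy model only: hypothetical classes, not the real Narayana or Agroal code. */
final class ToyLastResourceTransaction {
    // Names of the one-phase ("last") resources enlisted so far.
    private final List<String> lastResources = new ArrayList<>();

    /** Accepts at most one last resource per transaction. */
    boolean enlistLastResource(String resourceName) {
        if (!lastResources.isEmpty()) {
            return false; // a last resource is already enlisted, so reject this one
        }
        lastResources.add(resourceName);
        return true;
    }
}

final class ToyPool {
    private final String name;

    ToyPool(String name) {
        this.name = name;
    }

    /** Throws when the transaction refuses the enlistment instead of silently continuing. */
    String getConnection(ToyLastResourceTransaction tx) throws SQLException {
        if (!tx.enlistLastResource(name)) {
            throw new SQLException("Unable to enlist connection to existing transaction");
        }
        return "connection-to-" + name;
    }
}

final class LastResourceDemo {
    public static void main(String[] args) {
        ToyLastResourceTransaction tx = new ToyLastResourceTransaction();
        ToyPool first = new ToyPool("first");
        ToyPool second = new ToyPool("second");
        try {
            System.out.println(first.getConnection(tx));
            // Second non-XA datasource in the same transaction: rejected.
            System.out.println(second.getConnection(tx));
        } catch (SQLException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

In this sketch the first datasource enlists fine and the second one fails, which is the same shape as the failure reported at the top of this issue.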

@maxandersen I think we definitely need a document describing these changes and their impact.

@turing85
Contributor

Thank you @zhfeng for the review. I have struck through my reproducer.

@barreiro
Contributor

@zhfeng's comment is spot on. Agroal added the LastResource to LocalXAResource to ensure that a single database is enlisted. Also, there was a change to throw an exception when the enlistment fails, because enlistment is a necessary step in managing the connection.

Unfortunately, the current exception does not describe the issue and is not very meaningful. I'll try to improve that.

@turing85
Contributor

turing85 commented Mar 25, 2024

The question still is: why was this change made? And should we really include it in a patch-level update of an LTS? Was something broken (as in "transactional properties were violated", not in "actual behaviour was different from documented behaviour") before?

@maxandersen
Member

@zhfeng can you clarify that last question?

If we need to break LTS behaviour we need to have a reason.

Trying to grok what was actually happening in previous versions? Sounds like commits were potentially happening outside a tx. For some users that might be tolerable (bad, I know), so is there a way for those users to get that old behaviour back, or are you saying this is literally broken behaviour in all cases?

@zhfeng
Contributor

zhfeng commented Mar 26, 2024

can you clarify that last question ?

Yeah, in previous versions, if a transaction involves multiple non-XA datasources, the Narayana Transaction Manager CANNOT guarantee ALL OR NOTHING. I think that can violate the atomicity property of transactions. It also does not work in the crash recovery scenario. And I understand that in some cases, users don't want such strongly consistent transaction behavior.
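
As a rough illustration of the atomicity problem, here is a toy simulation of two resources that can only commit in one phase. Everything in it (OnePhaseDb, AtomicityDemo, the row strings) is hypothetical and is not how Narayana is implemented. It only shows that without a prepare phase, a failure in the second commit cannot undo the first.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: a "database" that commits in a single phase and may fail at commit time. */
final class OnePhaseDb {
    private final String name;
    private final boolean failOnCommit;
    private final List<String> pending = new ArrayList<>();
    private final List<String> committed = new ArrayList<>();

    OnePhaseDb(String name, boolean failOnCommit) {
        this.name = name;
        this.failOnCommit = failOnCommit;
    }

    void write(String row) {
        pending.add(row);
    }

    /** Either makes all pending rows durable or discards them and throws. */
    void commit() {
        if (failOnCommit) {
            pending.clear();
            throw new IllegalStateException(name + ": commit failed");
        }
        committed.addAll(pending);
        pending.clear();
    }

    List<String> committedRows() {
        return List.copyOf(committed);
    }
}

final class AtomicityDemo {
    /** Commits each resource in turn; an earlier successful commit cannot be rolled back. */
    static boolean commitAll(List<OnePhaseDb> resources) {
        boolean allCommitted = true;
        for (OnePhaseDb db : resources) {
            try {
                db.commit();
            } catch (IllegalStateException e) {
                allCommitted = false;
            }
        }
        return allCommitted;
    }

    public static void main(String[] args) {
        OnePhaseDb source = new OnePhaseDb("source", false);
        OnePhaseDb target = new OnePhaseDb("target", true); // simulated commit failure
        source.write("delete old rows");
        target.write("insert new rows");

        System.out.println("all committed: " + commitAll(List.of(source, target)));
        System.out.println("source rows: " + source.committedRows());
        System.out.println("target rows: " + target.committedRows());
    }
}
```

The source ends up committed while the target does not, so the "transaction" is neither fully applied nor fully rolled back.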

is there a way for those users to get that old behaviour back?

Yeah, I think there is a property we can set in Narayana to allow multiple LastResources, like

arjPropertyManager.getCoreEnvironmentBean().setAllowMultipleLastResources(true);

but it must be set before the TransactionManager instance is created. So I think we need to introduce a ConfigItem in quarkus-narayana-jta, something like quarkus.transaction-manager.allow-multiple-last-resources=true. Do we have any plan to release 3.8.4? If so, I can try to add it.

@mmusgrov What do you think about introducing such a property in quarkus-narayana-jta?
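
A small sketch of why the ordering matters, assuming (as stated above) that the setting is read once when the transaction manager is created. ToyCoreEnvironment and ToyTransactionManager are hypothetical stand-ins, not Narayana classes.

```java
/** Toy model only: hypothetical names, not Narayana's real classes. */
final class ToyCoreEnvironment {
    private static volatile boolean allowMultipleLastResources = false;

    static void setAllowMultipleLastResources(boolean value) {
        allowMultipleLastResources = value;
    }

    static boolean isAllowMultipleLastResources() {
        return allowMultipleLastResources;
    }
}

final class ToyTransactionManager {
    // Captured once at construction; later changes to the environment are not seen.
    private final boolean allowMultipleLastResources =
            ToyCoreEnvironment.isAllowMultipleLastResources();

    boolean allowsMultipleLastResources() {
        return allowMultipleLastResources;
    }
}

final class ConfigOrderingDemo {
    public static void main(String[] args) {
        ToyTransactionManager early = new ToyTransactionManager();
        ToyCoreEnvironment.setAllowMultipleLastResources(true);
        ToyTransactionManager late = new ToyTransactionManager();

        System.out.println("created before the setting: " + early.allowsMultipleLastResources());
        System.out.println("created after the setting: " + late.allowsMultipleLastResources());
    }
}
```

This is why a build-time or startup-time config item would be a more natural fit than application code that flips the flag after startup.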

@mmusgrov
Contributor

@zhfeng we could add that property but it is transactionally unsafe so I doubt we'd add it.

@turing85
Contributor

turing85 commented Mar 26, 2024

@zhfeng we could add that property but it is transactionally unsafe so I doubt we'd add it.

I am for adding this property. For Quarkus 3.9.0 and onwards, the property can default to false, so users have to opt in to use it. For 3.8.x, however, I think it should default to true so as not to break existing behaviour.

@mmusgrov
Contributor

Even allowing a single one-phase aware resource to join an XA transaction containing two-phase aware resources is transactionally unsafe. Beyond that, allowing two such resources is asking for trouble and we will get many users asking us why the integrity of their data is compromised. I can anticipate that it will end badly and give Quarkus a poor reputation.

@maxandersen
Member

If we really need the updated deps we should add the flag.

@mjiderhamn

I can anticipate that it will end badly and give Quarkus a poor reputation.

Worse than a patch version of an LTS containing breaking changes...?

@mmusgrov
Contributor

Adding it to the management API/property config implies that Quarkus will support users who fall foul of the consequences of using this behaviour. I would even go so far as to say they'd be better off disabling transactions and winging it, since that would be better than having naive users think that transactions are giving them protection. We already give them the option of disabling recovery, which is a bit odd, and allowing multiple last resources would only compound the problem of allowing transactionally unsafe usage of the extension.

I'd be agreeable to telling them that Quarkus supports system properties that they can use to enable this behaviour, but not to adding it directly to the extension config, i.e. it would become a workaround for a known defect.

Finally, as I mentioned above, LRCO is unsafe, but there is a safe alternative called Commit Markable Resource. However, that only allows a single 1-phase aware resource to join an XA transaction.

@turing85
Contributor

turing85 commented Apr 2, 2024

@maxandersen , @mmusgrov So... how do we proceed now? What is the process?

@mmusgrov
Contributor

mmusgrov commented Apr 3, 2024

Can't I just tell you the system property and then someone updates the docs?

@turing85
Contributor

turing85 commented Apr 3, 2024

Can't I just tell you the system property and then someone updates the docs?

If I understand the comment from @zhfeng correctly, this is not sufficient.

@mmusgrov
Contributor

mmusgrov commented Apr 3, 2024

Can't I just tell you the system property and then someone updates the docs?

If I understand the comment from @zhfeng correctly, this is not sufficient.

There is no sufficient fix: LRCO is transactionally unsafe.

@anbonifacio

Can't I just tell you the system property and then someone updates the docs?

If I understand the comment from @zhfeng correctly, this is not sufficient.

There is no sufficient fix: LRCO is transactionally unsafe.

Since there is no fix or property that can be exposed on Quarkus side to mitigate the problem, I think that at least the reproducer from @jacopo-cavallarin should be used as a starting point to write a Quarkus blog post about this issue, in order to illustrate how to restructure the code and fix the problematic transactions.

@mmusgrov
Contributor

mmusgrov commented Apr 24, 2024

@zhfeng the narayana-jta extension needs to know about the new class:

[mmusgrov@dev2 quarkus] (main) $ git diff extensions/narayana-jta/deployment/src/main/java/io/quarkus/narayana/jta/deployment/NarayanaJtaProcessor.java
diff --git a/extensions/narayana-jta/deployment/src/main/java/io/quarkus/narayana/jta/deployment/NarayanaJtaProcessor.java b/extensions/narayana-jta/deployment/src/main/java/io/quarkus/narayana/jta/deployment/NarayanaJtaProcessor.java
index 396279af870..253fa7810ac 100644
--- a/extensions/narayana-jta/deployment/src/main/java/io/quarkus/narayana/jta/deployment/NarayanaJtaProcessor.java
+++ b/extensions/narayana-jta/deployment/src/main/java/io/quarkus/narayana/jta/deployment/NarayanaJtaProcessor.java
@@ -99,6 +99,7 @@ public void build(NarayanaJtaRecorder recorder,
         additionalBeans.produce(new AdditionalBeanBuildItem(NarayanaJtaProducers.class));
         additionalBeans.produce(AdditionalBeanBuildItem.unremovableOf("io.quarkus.narayana.jta.RequestScopedTransaction"));
 
+        runtimeInit.produce(new RuntimeInitializedClassBuildItem(com.arjuna.ats.jta.resources.LastResourceCommitOptimisation.class.getName()));

Although it has been a while since I looked at extension code, so I will defer to your expertise.

@zhfeng
Contributor

zhfeng commented Apr 24, 2024

@mmusgrov as you can see, the LAST_RESOURCE_OPTIMISATION_INTERFACE is still com.arjuna.ats.jta.resources.LastResourceCommitOptimisation; it has not changed to io.agroal.narayana.LocalXAResource, which is what we set at runtime.

That's because all of the properties have been calculated at build time and stored in a static Map; see BeanPopulator. But we cannot just initialize BeanPopulator at runtime, since it is a very fundamental class that is used by many other classes.

The only way is to use @RecomputeFieldValue(kind = RecomputeFieldValue.Kind.Reset) to reset the beanInstances value to null and force it to reload the properties at runtime in native mode. This needs some changes in BeanPopulator.java to handle the null value of beanInstances, like:

    private static ConcurrentMap<String, Object> getBeanInstances() {
        if (beanInstances == null) {
            beanInstances = new ConcurrentHashMap<String, Object>();
        }
        return beanInstances;
    }

and replace all references to beanInstances with the getBeanInstances() method in BeanPopulator.java. I think with these changes, system properties could take effect at runtime.

So the question is: do we need such changes to enable system properties at runtime in native mode, or is build time enough with the fix of #40250?

@zhfeng
Contributor

zhfeng commented Apr 25, 2024

Hi @mmusgrov

I opened jbosstm/narayana#2248 to make some changes in BeanPopulator. Then we can reload the properties in native mode.

@zhfeng
Contributor

zhfeng commented Apr 26, 2024

OK, it looks good to make BeanPopulator init at runtime, so we don't need a change on the Narayana side. I opened #40310, and with this fix we can pass system properties in native mode.

@turing85 can you verify the fix in #40310 with your reproducer?

@turing85
Contributor

turing85 commented Apr 26, 2024

OK, it looks good to make BeanPopulator init at runtime, so we don't need a change on the Narayana side. I opened #40310, and with this fix we can pass system properties in native mode.

@turing85 can you verify the fix in #40310 with your reproducer?

I am on it (building quarkus takes forever on my machine...)

While we are at it... how can I set those two properties for surefire-/@QuarkusTests? Setting them through surefire's systemPropertyVariables (apache.maven.org) does not seem to work.

@turing85
Contributor

OK, it looks good to make BeanPopulator init at runtime, so we don't need a change on the Narayana side. I opened #40310, and with this fix we can pass system properties in native mode.

@turing85 can you verify the fix in #40310 with your reproducer?

I can confirm that this seems to work. Here is a patch to build with the 999-SNAPSHOT-version, built from #40310:

Subject: [PATCH] use snapshot
---
Index: pom.xml
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/pom.xml b/pom.xml
--- a/pom.xml	(revision f07d24763f1ad5cccb6a01db559a67f18ae52026)
+++ b/pom.xml	(date 1714151479713)
@@ -20,8 +20,8 @@
 
         <!-- Quarkus versions -->
         <quarkus.platform.artifact-id>quarkus-bom</quarkus.platform.artifact-id>
-        <quarkus.platform.group-id>io.quarkus.platform</quarkus.platform.group-id>
-        <quarkus.platform.version>3.8.2</quarkus.platform.version>
+        <quarkus.platform.group-id>io.quarkus</quarkus.platform.group-id>
+        <quarkus.platform.version>999-SNAPSHOT</quarkus.platform.version>
 
         <!-- Test dependency versions -->
         <truth.version>1.4.2</truth.version>
@@ -256,9 +256,9 @@
                 <scope>import</scope>
             </dependency>
             <dependency>
-                <groupId>${quarkus.platform.group-id}</groupId>
+                <groupId>io.quarkus.platform</groupId>
                 <artifactId>quarkus-camel-bom</artifactId>
-                <version>${quarkus.platform.version}</version>
+                <version>3.8.2</version>
                 <type>pom</type>
                 <scope>import</scope>
             </dependency>

@maxandersen
Member

Trying to follow here but struggling :) What's the status?

@zhfeng
Contributor

zhfeng commented Apr 29, 2024

@maxandersen

The workaround is to use
-DCoreEnvironmentBean.allowMultipleLastResources=true
-DJTAEnvironmentBean.lastResourceOptimisationInterfaceClassName=io.agroal.narayana.LocalXAResource

@maxandersen
Member

maxandersen commented Apr 30, 2024

proposal from a call with @mmusgrov, @Sanne, @zhfeng, @gsmet:

  1. Agreement that the blocking behavior is correct, given that @Transactional should not allow multiple non-XA resources to participate. It is actually a severe bug that we did not stop it. Also agreement that we need a way to allow existing users who relied on the mistaken lenient behavior to continue using 3.8+ without code changes.

  2. Add a Quarkus config property: *.allowUnsafeMultipleLastResources=true|default:false which will, at build time, initialize:
    -DCoreEnvironmentBean.allowMultipleLastResources=true -DJTAEnvironmentBean.lastResourceOptimisationInterfaceClassName=io.agroal.narayana.LocalXAResource
    This ensures we get the same behavior in JVM and native mode.
    Mark it deprecated and have it always print a warning, as it is going to lead to data loss if not fully understood.

  3. Add the consequences to the migration guide.

  4. Continue the conversation on how users work with multiple non-XA connections going forward. Looking for feedback from those who are doing it today.

@maxandersen
Member

maxandersen commented Apr 30, 2024

@jacopo-cavallarin I/we are quite curious to understand what, if anything, you had in place when using multiple PUs with no XA support. How did you handle failure? Are/were you aware that you couldn't rely on @Transactional helping you here?

Curious to hear from others too, as we would like to figure out how we best handle this critical change in behavior that ensures transactional safety, but we also understand that some systems are built in a way that works without such strict constraints.

@jacopo-cavallarin
Author

jacopo-cavallarin commented Apr 30, 2024

@jacopo-cavallarin I/we are quite curious to understand what, if anything, you had in place when using multiple PUs with no XA support. How did you handle failure? Are/were you aware that you couldn't rely on @Transactional helping you here?

Curious to hear from others too, as we would like to figure out how we best handle this critical change in behavior that ensures transactional safety, but we also understand that some systems are built in a way that works without such strict constraints.

We simply weren't aware of this problem and assumed that everything would work as expected.

Maybe we've been lucky and haven't encountered this problem because in most cases I can think of we write to only one datasource, and only read from the rest.

In addition, enabling XA on our applications is not an easy task, as we also need to enable it on all our affected database instances (and from what I gather this requires a server restart, at least for PostgreSQL). That's why it's very important for us to have a viable workaround while we solve this.

@maxandersen
Member

@jacopo-cavallarin thanks for the info. Fully understood, and as of now you can set those system properties when running in JVM mode. It is just a workaround, though; we'll want to add a config flag for it to ensure it is more explicitly documented as not recommended.

To be clear, even writing to one and reading from another can be problematic.

Note, you don't necessarily need to enable XA on your application, but if you don't, you will have to manage your transactions manually, because without it you can't just rely on @Transactional to get the right behavior.

Anyhow, your case is why it's important that we change to an error by default, since you hadn't realized you were running non-transactionally.

@gsmet
Member

gsmet commented Apr 30, 2024

I created a draft of the fix we discussed: #40365 .

I won't have the time to finalize this today so I would appreciate:

  • if people could review the change/improve the javadoc/migration guide/doc
  • someone could test it in native mode (I tested JVM mode with the two reproducers)

Thanks!

@gsmet
Member

gsmet commented Apr 30, 2024

I was able to verify that it fixes the issue in native.

@turing85
Contributor

turing85 commented May 1, 2024

I was able to verify that it fixes the issue in native.

I can also verify that it fixes the issue on my reproducer 👍

@yrodiere
Member

yrodiere commented May 2, 2024

I was able to verify that it fixes the issue in native.

I can also verify that it fixes the issue on my reproducer 👍

Thank you @turing85!

@DeMol-EE
Contributor

A bit late to join the party, but we were also impacted by this when we tried to migrate from 3.8.1 to 3.8.2. We held off on migrating for a bit. Do I understand correctly that we now have an option to go to 3.8.5 and opt-in to the same (but unsafe) behaviour that was there before the fix from agroal?
Also, I’m a little confused about what the problem really was, because we never faced anything unexpected in our setup, which includes multiple postgresql datasources and JTA: we saw that a rolled back transaction resulted in no rows in any database, and that a committed transaction resulted in rows in the different databases - so what was actually broken? I’m curious to know to be able to assess if there’s a need to start a risk assessment in our project.
Either way, we’re almost certainly going to mark the data sources for XA going forward, as I get the sense that this is the (only) correct way to do it (and which also allows us to follow the quarkus releases without opting in to the unsafe behaviour).

@turing85
Contributor

turing85 commented Jun 22, 2024

A bit late to join the party, but we were also impacted by this when we tried to migrate from 3.8.1 to 3.8.2. We held off on migrating for a bit. Do I understand correctly that we now have an option to go to 3.8.5 and opt-in to the same (but unsafe) behaviour that was there before the fix from agroal?

Yes. This is what we did in some of our applications.

Also, I’m a little confused about what the problem really was, because we never faced anything unexpected in our setup, which includes multiple postgresql datasources and JTA: we saw that a rolled back transaction resulted in no rows in any database, and that a committed transaction resulted in rows in the different databases - so what was actually broken? I’m curious to know to be able to assess if there’s a need to start a risk assessment in our project.

I am no expert on the topic of XA, so take this with a grain of salt. XA works by a two-phase commit protocol. In the first phase, the transaction manager (aka. Narayana) asks every resource "Can I commit?". If a resource responds with "yes", it "commits to commit", i.e. it says "I guarantee that you can commit". If all resources in the first phase answer with "yes", then the second phase is entered. In this phase, the transaction manager prompts all resources to commit. Importantly, as soon as one resource commits, all other resources have to commit, otherwise the XA properties are violated.

Now, to allow non-XA resources, Narayana uses a small trick: it skips those resources in the 1st phase, and commits them first thing in the 2nd phase. This creates a small gap. It could happen that all XA resources report "can commit" in the 1st phase. Then, in the 2nd phase, the first non-XA resource commits, but the second non-XA resource fails to commit. This represents a problem, since the 1st resource cannot roll back (since it is already committed), but the 2nd cannot commit. This violates the XA properties.
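To make the gap concrete, here is a minimal Java sketch of the sequence described above. It is a toy model and not Narayana's code: the Resource class, its string states, and the complete method are all hypothetical names made up for illustration, and the XA resource is assumed to always vote "yes".

```java
import java.util.ArrayList;
import java.util.List;

public class MultipleLastResourcesSketch {

    /** Toy resource (hypothetical, not a real javax.transaction.xa.XAResource). */
    static final class Resource {
        final String name;
        final boolean xa;            // true: takes part in the prepare phase
        final boolean failsOnCommit; // true: simulate a failing one-phase commit
        String state = "ACTIVE";

        Resource(String name, boolean xa, boolean failsOnCommit) {
            this.name = name;
            this.xa = xa;
            this.failsOnCommit = failsOnCommit;
        }

        /** One-phase commit: either final, or a failure that commits nothing. */
        boolean commit() {
            state = failsOnCommit ? "ROLLED_BACK" : "COMMITTED";
            return !failsOnCommit;
        }

        /** A resource that has already committed cannot be rolled back. */
        void rollback() {
            if (!state.equals("COMMITTED")) {
                state = "ROLLED_BACK";
            }
        }
    }

    /** Completes a transaction over the given resources and reports each final state. */
    static List<String> complete(List<Resource> resources) {
        // Phase 1: only XA resources are asked to prepare (assumed to always succeed here).
        for (Resource r : resources) {
            if (r.xa) {
                r.state = "PREPARED";
            }
        }
        // Non-XA resources are skipped in phase 1 and committed one after another.
        boolean abort = false;
        for (Resource r : resources) {
            if (r.xa) {
                continue;
            }
            if (abort) {
                r.rollback();
            } else if (!r.commit()) {
                abort = true; // too late for any non-XA resource that already committed
            }
        }
        // Phase 2: XA resources follow the decision.
        for (Resource r : resources) {
            if (!r.xa) {
                continue;
            }
            if (abort) {
                r.rollback();
            } else {
                r.state = "COMMITTED";
            }
        }
        List<String> outcome = new ArrayList<>();
        for (Resource r : resources) {
            outcome.add(r.name + "=" + r.state);
        }
        return outcome;
    }

    public static void main(String[] args) {
        List<Resource> resources = List.of(
                new Resource("db1", false, false),  // non-XA, commits fine
                new Resource("db2", false, true),   // non-XA, fails to commit
                new Resource("xaDb", true, false)); // XA
        // prints [db1=COMMITTED, db2=ROLLED_BACK, xaDb=ROLLED_BACK]
        System.out.println(complete(resources));
    }
}
```

With only one non-XA resource in the list, a failing commit in this model leaves everything rolled back; the mixed outcome only appears once a second non-XA resource is added.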

If I understand @mmusgrov's comment correctly, there is also a risk if only one non-XA resource is used. But I do not understand where the "gap" lies here. Maybe he can elaborate.

Either way, we’re almost certainly going to mark the data sources for XA going forward, as I get the sense that this is the (only) correct way to do it (and which also allows us to follow the quarkus releases without opting in to the unsafe behaviour).

If possible, this is probably the best thing to do.

@mmusgrov
Contributor

mmusgrov commented Jul 2, 2024

If I understand @mmusgrov's comment correctly, there is also a risk if only one non-XA resource is used. But I do not understand where the "gap" lies here. Maybe he can elaborate.

@turing85 There is a small window where the non-XA resource commits but then the transaction manager crashes just before it writes its commit decision to a transaction log and before it starts telling those XA resources to commit. When the TM restarts, it doesn't know about the transaction because it has no log, the non-XA resource doesn't know about the transaction branch because it successfully committed, and the XA resources will have timed out and rolled back, i.e. there's a heuristic outcome, meaning some resources committed while others rolled back. The failure mode is sufficiently rare that most users accept the risk. But once one starts adding more non-XA resources, the exposure to failures increases, in a non-linear fashion, because of the extra time and the extra network calls to multiple resources.
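The window can be sketched as a small Java simulation. This is only an illustration under simplifying assumptions, not how Narayana is implemented: the CrashPoint enum, the run method, and the resource names are hypothetical, and recovery is reduced to "commit prepared XA resources if a logged decision exists, otherwise they roll back".

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LrcoCrashWindowSketch {

    /** Where the simulated transaction manager crashes during commit (hypothetical). */
    enum CrashPoint { NONE, BEFORE_LOG_WRITE, AFTER_LOG_WRITE }

    /** Runs one transaction with one XA and one non-XA resource; returns final states. */
    static Map<String, String> run(CrashPoint crash) {
        Map<String, String> state = new LinkedHashMap<>();
        boolean decisionLogged = false;

        // Phase 1: the XA resource votes yes and is now prepared.
        state.put("xaDb", "PREPARED");
        // The single non-XA resource commits; this cannot be undone.
        state.put("nonXaDb", "COMMITTED");

        if (crash != CrashPoint.BEFORE_LOG_WRITE) {
            // The commit decision reaches the transaction log.
            decisionLogged = true;
            if (crash == CrashPoint.NONE) {
                // Phase 2 completes normally.
                state.put("xaDb", "COMMITTED");
            }
        }

        // After a crash: with a log entry, recovery finishes the commit;
        // without one, the prepared XA resource times out and rolls back.
        if (state.get("xaDb").equals("PREPARED")) {
            state.put("xaDb", decisionLogged ? "COMMITTED" : "ROLLED_BACK");
        }
        return state;
    }

    public static void main(String[] args) {
        for (CrashPoint point : CrashPoint.values()) {
            System.out.println(point + " -> " + run(point));
        }
    }
}
```

Only the BEFORE_LOG_WRITE case ends with the two resources disagreeing, which matches the narrow window described above.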
