Refactor Txn in RC isolation #34088

lcwangchao · 2022-04-19T06:33:35Z

We refactored the implementation of stale read in #32699 and #33812. To recap, those PRs did the following:

Added two methods, GetReadTS and GetForUpdateTS, to TxnManager and moved information such as the stale read timestamp into the StalenessTxnProvider. This makes transaction information easier to obtain and also fixes problems caused by passing that information through parameters, for example: Got error when stale read with scalar subquery #31954
Unified the timing of the txn context's initialization. In the previous implementation, some of the stale read context was not determined until the planner stage. Now all of it is initialized before planning. This also fixed bugs such as: sometimes stale read will return wrong result when execute with binary proto #33814

After #33812 is merged, we will continue by refactoring the RC isolation part. We will also provide a txn provider for RC to manage its own context. Compared with stale read, RC is somewhat more complex. For example:

RC performs a current read rather than a snapshot read for every statement. So without any optimization, the read ts differs from statement to statement.
RC supports write and for update operations. If forUpdateTS is out of date, the storage layer returns an ErrWriteConflict, and the statement should update forUpdateTS and then retry.
Some optimizations exist for RC, such as parallel TSO and RCCheck: txn: add document for read-consistency read tso optimization #32806
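
To make the first point concrete, here is a minimal sketch of how the read ts evolves across statements. The `oracle` type and `readTSPerStmt` function are hypothetical and exist only for illustration; they are not TiDB code. The non-RC branch models a transaction that reuses the ts taken at transaction start for all reads.

```go
package main

import "fmt"

// oracle is a hypothetical stand-in for the timestamp oracle (TSO).
type oracle struct{ ts uint64 }

func (o *oracle) next() uint64 { o.ts++; return o.ts }

// readTSPerStmt returns the read ts that each of n statements would use.
// With perStmt=true (RC-like) every statement fetches a fresh ts;
// otherwise all statements share the ts taken when the txn started.
func readTSPerStmt(o *oracle, n int, perStmt bool) []uint64 {
	startTS := o.next()
	out := make([]uint64, 0, n)
	for i := 0; i < n; i++ {
		if perStmt {
			out = append(out, o.next())
		} else {
			out = append(out, startTS)
		}
	}
	return out
}

func main() {
	fmt.Println(readTSPerStmt(&oracle{}, 3, true))  // [2 3 4]
	fmt.Println(readTSPerStmt(&oracle{}, 3, false)) // [1 1 1]
}
```

This is why the provider for RC has to know where each statement begins, which the shared-ts case does not need.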

Some of the points above also apply to RR isolation and to optimistic transactions. Next, we will refactor with these points in mind.

The design

The draft code is here: #33995

Add some lifecycle hook methods to `TxnManager`

Since RC often uses a different ts for each statement, TxnManager needs to tell statements apart. So we added some methods to give TxnManager this information:

type TxnManager interface {
    // EnterNewTxn enters a new txn
    EnterNewTxn(ctx context.Context, request *NewTxnRequest) error
    // OnStmtStart is the hook that should be called when a new statement starts
    OnStmtStart(ctx context.Context) error
    // OnStmtError is the hook that should be called when a statement gets an error
    OnStmtError(err error)
    // OnStmtRetry is the hook that should be called when a statement is retried
    OnStmtRetry(ctx context.Context) error

    // other methods
    ...
}

EnterNewTxn makes TxnManager enter a new transaction. It sets a new provider for the next transaction according to the environment and the NewTxnRequest parameter. Currently this method is used in two places:

Every time a new statement starts and the statement is not in an explicit transaction:

tidb/session/session.go

Lines 3133 to 3135 in e631e65

    
           return sessiontxn.GetTxnManager(s).EnterNewTxn(ctx, &sessiontxn.NewTxnRequest{ 
        
           	TxnMode: txnMode, 
        
           })

When a transaction is started with start transaction or begin:

tidb/executor/simple.go

Lines 603 to 608 in e631e65

    
           return sessiontxn.GetTxnManager(e.ctx).EnterNewTxn(ctx, &sessiontxn.NewTxnRequest{ 
        
           	ExplictStart:          true, 
        
           	StaleReadTS:           e.staleTxnStartTS, 
        
           	TxnMode:               s.Mode, 
        
           	CausalConsistencyOnly: s.CausalConsistencyOnly, 
        
           })

OnStmtStart is called every time a new statement is entered. In the RC scenario, this method tells RC's provider to update its read timestamp.
OnStmtRetry is called when the statement is retried. A retry usually happens because forUpdateTS is out of date, and the provider should update it.
OnStmtError is called when the statement hits an error during execution. If a statement is retried multiple times and finally fails, OnStmtError is called only once.

The examples below illustrate the order in which these methods are called:

-- autocommit=1
select * from t; -- Call: EnterNewTxn -> OnStmtStart
select * from t; -- Call: EnterNewTxn -> OnStmtStart

-- autocommit=0
select * from t; -- Call: EnterNewTxn -> OnStmtStart 
select * from t; -- Call: OnStmtStart 

-- explicit begin case
begin; -- Call: EnterNewTxn
select * from t; -- Call: OnStmtStart
select * from t; -- Call: OnStmtStart
commit; -- Call: OnStmtStart
select * from t; -- Call: EnterNewTxn -> OnStmtStart

-- write conflict case, suppose an ErrWriteConflict occurs in the SQL below
select * from t for update; -- Call: EnterNewTxn -> OnStmtStart -> OnStmtRetry

-- rc check error case. OnStmtError will be called
begin; -- Call: EnterNewTxn
select * from t; -- Call: OnStmtStart
-- suppose rc check failed below
select * from t; -- Call: OnStmtStart -> OnStmtError -> OnStmtStart

If the RC check optimization fails and the storage layer returns an error, it is regarded as a statement failure and the SQL is executed again. In this case OnStmtRetry is not called and OnStmtStart is called twice. This was chosen for simplicity.
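
The RC check path can be sketched as follows. The names `hooks`, `execWithReexecute`, and `errRCCheckFailed` are hypothetical, and the assumption that the second execution runs without the RC check optimization is ours, made only to let the toy example terminate.

```go
package main

import (
	"errors"
	"fmt"
)

// errRCCheckFailed is a hypothetical stand-in for the error returned by the
// storage layer when the RC check optimization cannot be applied.
var errRCCheckFailed = errors.New("rc check failed")

// hooks holds the two lifecycle callbacks relevant to this path.
type hooks struct {
	onStmtStart func()
	onStmtError func(error)
}

// execWithReexecute runs the statement once. If the RC check fails, the attempt
// is treated as a failed statement and the SQL is executed again from the start.
func execWithReexecute(h hooks, exec func(rcCheck bool) error) error {
	h.onStmtStart()
	err := exec(true) // first try with the RC check optimization
	if err == nil {
		return nil
	}
	h.onStmtError(err)
	if !errors.Is(err, errRCCheckFailed) {
		return err
	}
	h.onStmtStart() // a fresh statement start, not OnStmtRetry
	if err = exec(false); err != nil {
		h.onStmtError(err)
	}
	return err
}

func main() {
	var trace []string
	h := hooks{
		onStmtStart: func() { trace = append(trace, "OnStmtStart") },
		onStmtError: func(error) { trace = append(trace, "OnStmtError") },
	}
	err := execWithReexecute(h, func(rcCheck bool) error {
		if rcCheck {
			return errRCCheckFailed
		}
		return nil
	})
	fmt.Println(trace, err) // [OnStmtStart OnStmtError OnStmtStart] <nil>
}
```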

Move RC's states to provider

We introduced a new type, readcomitted.txnContextProvider, to provide the RC context. See the code here: https://github.com/pingcap/tidb/pull/33995/files#diff-4aa25ab176e21d35039cf22c45777a51bd4a0a5c18eecde002750dfe76663602R137

In that implementation, readcomitted.txnContextProvider maintains the statement's TSO future internally, which decouples this state from the session. We then moved most of the RC logic into the provider, which simplifies the method executorBuilder.getReadTS.
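
As a rough illustration of a provider that owns the statement's TSO future, consider the sketch below. It is not the code from the draft PR: `rcProvider`, `tsFuture`, `chanFuture`, and `newCounterFetch` are hypothetical, and the in-memory counter only imitates a timestamp source.

```go
package main

import "fmt"

// tsFuture is a hypothetical stand-in for an asynchronous TSO request.
type tsFuture interface{ Wait() (uint64, error) }

// chanFuture resolves to the single value sent on the channel.
type chanFuture chan uint64

func (f chanFuture) Wait() (uint64, error) { return <-f, nil }

// rcProvider is a toy provider that owns the statement's TSO future,
// so callers only ask for the read ts and never touch the future itself.
type rcProvider struct {
	fetch  func() tsFuture // starts an asynchronous TSO request
	future tsFuture        // pending request for the current statement
	readTS uint64          // resolved ts for the current statement, 0 if unknown
}

// OnStmtStart drops the previous statement's state so the next read gets a fresh ts.
func (p *rcProvider) OnStmtStart() { p.future, p.readTS = nil, 0 }

// GetReadTS resolves the future at most once per statement and caches the result.
func (p *rcProvider) GetReadTS() (uint64, error) {
	if p.readTS != 0 {
		return p.readTS, nil
	}
	if p.future == nil {
		p.future = p.fetch()
	}
	ts, err := p.future.Wait()
	if err != nil {
		return 0, err
	}
	p.readTS = ts
	return ts, nil
}

// newCounterFetch returns a fetch function backed by an in-memory counter.
func newCounterFetch() func() tsFuture {
	var ts uint64
	return func() tsFuture {
		ts++
		f := make(chanFuture, 1)
		f <- ts
		return f
	}
}

func main() {
	p := &rcProvider{fetch: newCounterFetch()}
	p.OnStmtStart()
	a, _ := p.GetReadTS()
	b, _ := p.GetReadTS() // same statement, same ts
	p.OnStmtStart()
	c, _ := p.GetReadTS() // new statement, new ts
	fmt.Println(a, b, c)  // 1 1 2
}
```

With the future hidden behind the provider, a caller such as a read-ts getter reduces to a single method call.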

We also moved the Txn initialization logic into the provider and exposed the method ActiveTxn for external callers. As a result, the logic in SimpleExec.executeBegin is also greatly simplified.

Use `Advise` to do some optimizations

While refactoring, we need to keep the existing optimizations working. For some optimizations, such as RCCheck, the provider already has enough information, so they are easy to support. For others, the provider does not have sufficient information.

So we introduced a new method, Advise, to give the provider some extra information. Unlike the other methods:

It is optional. That is, whether or not users call Advise, the correctness of the provider must not be affected.
If the user gives an advice with "right" information, the provider must still behave correctly. Being "right" should be easy for users to achieve. For example, if an optimization requires extra logical plan information, the user should guarantee that the plan in the advice is "right". The provider must remain correct even when the user advises it at the wrong time, although in that case the optimization is allowed not to take effect.

We now use the WarmUp advice to prepare the parallel TSO future, see:

tidb/planner/optimize.go

Line 147 in 5193dc0

if err := sessiontxn.AdviseTxnWarmUp(sctx); err != nil {
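
The "optional hint" property can be sketched with a toy provider. `warmProvider`, `AdviseWarmup`, and `newSource` are hypothetical names chosen for this example and do not mirror the real Advise signature. The point is only that the read ts is obtained correctly with or without the advice, and that a late advice is ignored.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// warmProvider is a toy provider whose read ts can optionally be prefetched.
type warmProvider struct {
	nextTS   func() uint64 // hypothetical timestamp source
	prefetch chan uint64   // non-nil once a warm-up request is in flight
	readTS   uint64        // resolved ts, 0 if unknown
}

// AdviseWarmup starts fetching the ts in the background. It is only a hint:
// calling it too late, twice, or never must not break GetReadTS.
func (p *warmProvider) AdviseWarmup() {
	if p.readTS != 0 || p.prefetch != nil {
		return // too late or already warming up: ignore the advice
	}
	ch := make(chan uint64, 1)
	p.prefetch = ch
	go func() { ch <- p.nextTS() }()
}

// GetReadTS uses the prefetched ts when there is one and fetches synchronously otherwise.
func (p *warmProvider) GetReadTS() uint64 {
	if p.readTS == 0 {
		if p.prefetch != nil {
			p.readTS = <-p.prefetch
		} else {
			p.readTS = p.nextTS()
		}
	}
	return p.readTS
}

// newSource returns a timestamp source backed by an atomic counter.
func newSource() func() uint64 {
	var c uint64
	return func() uint64 { return atomic.AddUint64(&c, 1) }
}

func main() {
	plain := &warmProvider{nextTS: newSource()}
	warmed := &warmProvider{nextTS: newSource()}
	warmed.AdviseWarmup()
	fmt.Println(plain.GetReadTS(), warmed.GetReadTS()) // 1 1
	plain.AdviseWarmup()                               // too late: the ts is already resolved
	fmt.Println(plain.GetReadTS())                     // 1
}
```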


lcwangchao mentioned this issue Apr 19, 2022

Refactor txn context management in session #30535

Open

25 tasks

lcwangchao added the type/enhancement The issue or PR belongs to an enhancement. label Apr 19, 2022

lcwangchao closed this as completed Jun 23, 2022
