ddl: support dropping columns with composite indices #19196

bb7133 · 2020-08-14T03:39:44Z

Feature Request

Is your feature request related to a problem? Please describe:
After PR #18852, #18812 is supported in TiDB.

Now we'd like to go on improving the ALTER TABLE ... DROP COLUMN to support dropping columns with composite indices like:

tidb>  create table t(c1 int, c2 int, c3 int, index idx_c1_c2_c3(c1, c2, c3));
Query OK, 0 rows affected (0.02 sec)

tidb> alter table t drop column c3;
ERROR 8200 (HY000): can't drop column a with index covered now

Describe the feature you'd like:
See #3364

Describe alternatives you've considered:
Manually, the ALTER TABLE ... DROP COLUMN in TiDB can be archived by 2 separate steps:

Drop all indices that cover the column to be dropped.
Drop the column without any index.

However, for some third-party frameworks that relies on dropping columns directly, workarounds have to be done for this limitation. This can be disappointing.

Teachability, Documentation, Adoption, Migration Strategy:

More help is available by referring:

The internal algorithm of TiDB online DDL, more specifically AddColumn action and AddIndex action.
The key schema state change codes:
- onDropColumns()
- onDropIndex()
- applyDiff(), which updates the information schema served as a schema cache in TiDB server.

The text was updated successfully, but these errors were encountered:

bb7133 · 2020-08-14T04:00:01Z

After some discussions with @blacktear23 , we've some raw ideas about this. Considering table t in the description body as an example:

Create a temporary index named like idx_c1_c2_c3_tempxxx(c1, c2), change the state of this index to StateWriteReorgnization like what TiDB does for ADD INDEX. This is the updated index if c3 is dropped successfully. However, when the reorganization of index(the "backfill") is finished, the state of idx_c1_c2_c3_tempxxx will not be public.
Update the state of c3 as current dropping columns until the state is StateDeleteReorgnization.
When step 2 is finished, we've:

A column that is in StateDeleteReorgnization, which is close to be dropped.
An index that is in StateWriteReorgnization, which is ready for public.
It should be fine to change the column to StateNone and the index to StatePublic to finish the job.

If the DDL job is rollbacked for any reason, we can recover the column and drop the temporary index if needed.

blacktear23 · 2020-08-14T14:17:05Z

@bb7133 I just wrote a pseudo code for drop composite index covered column. We can discuss on this base code.

func (w *worker) onDropColumn(d *ddlCtx,t *meta.Meta, job *model.Job) (ver int64, _ error) {
    tblInfo, colInfo, sidxInfos, cidxInfos, err := checkDropColumn(t, job)
    
    colOriginalState := colInfo.State
    if len(cidxInfos) > 0 && job.IsRollingback() {
        // Handle the rolling back job.
        ver, err = onDropCompositeIndices(t, job, cidxInfos)
        if err != nil {
            return ver, errors.Trace(err)
        }
        return ver, nil
    }

    if len(cidxInfos) > 0 && colOriginState == model.StatePublic {
        fallThrough := false
        ctidxInfos := getOrCreateTempIndices(tblInfo, colInfo, cidxInfos)
        idxOriginalState := ctidxInfos[0].State
        switch ctidxInfos[0].State {
        case model.StateNone:
            // none -> delete only
            job.SchemaState = model.StateDeleteOnly
            setIndicesState(ctidxInfos, model.StateDeleteOnly)
            ver, err = updateVersionAndTableInfoWithCheck(t, job, tblInfo, idxOriginalState != indexInfo.State)
            metrics.AddIndexProgress.Set(0)
        case model.StateDeleteOnly:
            // delete only -> write only
            job.SchemaState = model.StateWriteOnly
            setIndicesState(ctidxInfos, model.StateWriteOnly)
            ver, err = updateVersionAndTableInfo(t, job, tblInfo, idxOriginalState != indexInfo.State)
        case model.StateWriteOnly:
            // write only -> reorganization
            job.SchemaState = model.StateWriteReorganization
            setIndicesState(ctidxInfos, model.StateWriteReorganization)
            // Initialize SnapshotVer to 0 for later reorganization check.
            job.SnapshotVer = 0
            ver, err = updateVersionAndTableInfo(t, job, tblInfo, idxOriginalState != indexInfo.State)
        case model.StateWriteReorganization:
            tbl, err := getTable(d.store, schemaID, tblInfo)
            if err != nil {
                return ver, errors.Trace(err)
            }
            for _, idxInfo := range ctidxInfos {
                // Run reorg job
                ver, err = func(indexInfo *model.IndexInfo)  (int64, error) {
                    reorgInfo, err := getReorgInfo(d, t, job, tbl)
                    if err != nil || reorgInfo.first {
                        // If we run reorg firstly, we should update the job snapshot version
                        // and then run the reorg next time.
                        return ver, errors.Trace(err)
                    }
                    err = w.runReorgJob(t, reorgInfo, tbl.Meta(), d.lease, func() (addIndexErr error) {
                        defer util.Recover(metrics.LabelDDL, "onDropColumn",
                        func() {
                            addIndexErr = errCancelledDDLJob.GenWithStack("add table `%v` index `%v` panic", tblInfo.Name, indexInfo.Name)
                        }, false)
                        return w.addTableIndex(tbl, indexInfo, reorgInfo)
                    })
                    if err != nil {
                        if errWaitReorgTimeout.Equal(err) {
                            // if timeout, we should return, check for the owner and re-wait job done.
                            return ver, nil
                        }
                        if kv.ErrKeyExists.Equal(err) || errCancelledDDLJob.Equal(err) || errCantDecodeIndex.Equal(err) {
                            logutil.BgLogger().Warn("[ddl] run add index job failed, convert job to rollback", zap.String("job", job.String()), zap.Error(err))
                            ver, err = convertDropColumnWithCompositeIdxJob2RollbackJob(t, job, tblInfo, err)
                        }
                        // Clean up the channel of notifyCancelReorgJob. Make sure it can't affect other jobs.
                        w.reorgCtx.cleanNotifyReorgCancel()
                        return ver, errors.Trace(err)
                    }
                    // Clean up the channel of notifyCancelReorgJob. Make sure it can't affect other jobs.
                    w.reorgCtx.cleanNotifyReorgCancel()
                }(idxInfo)
                if err != nil {
                        return ver, errors.Trace(err)
                }
                // Set column index flag.
                addIndexColumnFlag(tblInfo, indexInfo)
            }
            ver, err = updateVersionAndTableInfo(t, job, tblInfo, idxOriginalState != indexInfo.State)
            if err != nil {
                return ver, errors.Trace(err)
            }
            fallThrough = true
        default:
            err = ErrInvalidDDLState.GenWithStackByArgs("index", tblInfo.State)
        }
        if !fallThrough {
            return ver, errors.Trace(err)
        }
    }
    // Drop columns
    switch colInfo.State {
    case model.StatePublic:
        // public -> write only
        job.SchemaState = model.StateWriteOnly
        colInfo.State = model.StateWriteOnly
        setIndicesState(sidxInfos, model.StateWriteOnly)
        setIndicesState(cidxInfos, model.StateWriteOnly)
        err = checkDropColumnForStatePublic(tblInfo, colInfo)
        if err != nil {
            return ver, errors.Trace(err)
        }
        ver, err = updateVersionAndTableInfoWithCheck(t, job, tblInfo, colOriginalState != colInfo.State)
    case model.StateWriteOnly:
        // write only -> delete only
        job.SchemaState = model.StateDeleteOnly
        colInfo.State = model.StateDeleteOnly
        setIndicesState(sidxInfos, model.StateDeleteOnly)
        setIndicesState(cidxInfos, model.StateDeleteOnly)
        ver, err = updateVersionAndTableInfo(t, job, tblInfo, colOriginalState != colInfo.State)
    case model.StateDeleteOnly:
        // delete only -> reorganization
        job.SchemaState = model.StateDeleteReorganization
        colInfo.State = model.StateDeleteReorganization
        setIndicesState(sidxInfos, model.StateDeleteReorganization)
        setIndicesState(cidxInfos, model.StateDeleteReorganization)
        ver, err = updateVersionAndTableInfo(t, job, tblInfo, colOriginalState != colInfo.State)
    case model.StateDeleteReorganization:
        // reorganization -> absent
        // All reorganization jobs are done, drop this column.
        if len(sidxInfos) > 0 {
            newIndices := make([]*model.IndexInfo, 0, len(tblInfo.Indices))
            for _, idx := range tblInfo.Indices {
                if !indexInfoContains(idx.ID, sidxInfos) {
                    newIndices = append(newIndices, idx)
                }
            }
            tblInfo.Indices = newIndices
        }
        indexIDs := indexInfosToIDList(sidxInfos)
        if len(cidxInfos) > 0 {
            ctidxInfos := getOrCreateTempIndices(tblInfo, colInfo, cidxInfos)
            // Drop origin indices
            newIndices := make([]*model.IndexInfo, 0, len(tblInfo.Indices))
            for _, idx := range tblInfo.Indices {
                if !indexInfoContains(idx.ID, cidxInfos) {
                    newIndices = append(newIndices, idx)
                }
            }
            tblInfo.Indices = newIndices
            for _, idx := range ctidxInfos {
                // Rename temp index
                idx.Name = renameToOrigin(idx, cidxInfos)
                // Set state to public
                idx.State = model.StatePublic
            }
            indexIDs = append(indexIDs, indexInfoToIDList(cidxInfos)...)
        }
        tblInfo.Columns = tblInfo.Columns[:len(tblInfo.Columns)-1]
        colInfo.State = model.StateNone
        ver, err = updateVersionAndTableInfo(t, job, tblInfo, colOriginalState != colInfo.State)
        if err != nil {
            return ver, errors.Trace(err)
        }
        // Finish this job.
        if job.IsRollingback() {
            job.FinishTableJob(model.JobStateRollbackDone, model.StateNone, ver, tblInfo)
        } else {
            // We should set related index IDs for job
            job.FinishTableJob(model.JobStateDone, model.StateNone, ver, tblInfo)
            job.Args = append(job.Args, indexIDs, getPartitionIDs(tblInfo))
        }
    default:
        err = errInvalidDDLJob.GenWithStackByArgs("table", tblInfo.State)
    }
    return ver, errors.Trace(err)
}

blacktear23 · 2020-08-14T14:33:34Z

@bb7133 and there still has some unclear area:

How to process rollback on create index steps, there has no function called onDropIndices
How to process rollback on drop column steps and when the column state is StateDeleteReorganization

bb7133 · 2020-08-14T14:48:15Z

PTAL @zimulala @AilinKid

blacktear23 · 2020-08-15T02:19:02Z

And there has another point, we need IndexInfo add a field that track the relationship between origin composite index and new created temp composite index.
If without this field we can only use Name to calculate the relation between those indices. But if we use Name hinted method, we should tell user which pattern should not use in index naming. And this will cause some missing usage on TiDB.
So I create a PR pingcap/parser#974 for this.

blacktear23 · 2020-08-17T12:28:23Z

@bb7133 below is rollback POC

func rollingbackDropColumn(w *worker, d *ddlCtx, t *meta.Meta, job *model.Job) (ver int64, err error) {
	tblInfo, colInfo, sidxInfos, cidxInfos, err := checkDropColumn(t, job)
	if err != nil {
		return ver, errors.Trace(err)
	}
	
	if len(cidxInfos) > 0 && colInfo.State == model.StatePublic {
		ctidxInfos := getOrCreateTempIndices(tblInfo, colInfo, cidxInfos)
		switch ctidxInfos[0].State {
		case model.StateWriteReorganization:
			if job.SnapshotVer != 0 {
				// add index workers are started. need to ask them to exit.
				logutil.Logger(w.logCtx).Info("[ddl] run the cancelling DDL job", zap.String("job", job.String()))
				w.reorgCtx.notifyReorgCancel()
				return onDropColumn(d, t, job)
			} else {
				return convertDropColumnWithCompositeIdxJob2RollbackJob(t, job, tblInfo)
			}
		case model.StateNone, model.StateDeleteOnly, model.StateWriteOnly:
			return convertDropColumnWithCompositeIdxJob2RollbackJob(t, job, tblInfo)
		}
	}

	for _, indexInfo := range sidxInfos {
		switch indexInfo.State {
		case model.StateWriteOnly, model.StateDeleteOnly, model.StateDeleteReorganization, model.StateNone:
			// We can not rollback now, so just continue to drop index.
			// In function isJobRollbackable will let job rollback when state is StateNone.
			// When there is no index related to the drop column job it is OK, but when there has indices, we should
			// make sure the job is not rollback.
			job.State = model.JobStateRunning
			return ver, nil
		case model.StatePublic:
		default:
			return ver, ErrInvalidDDLState.GenWithStackByArgs("index", indexInfo.State)
		}
	}

	// StatePublic means when the job is not running yet.
	if colInfo.State == model.StatePublic {
		job.State = model.JobStateCancelled
		job.FinishTableJob(model.JobStateRollbackDone, model.StatePublic, ver, tblInfo)
		return ver, errCancelledDDLJob
	}
	// In the state of drop column `write only -> delete only -> reorganization`,
	// We can not rollback now, so just continue to drop column.
	job.State = model.JobStateRunning
	return ver, nil
}

bb7133 · 2020-08-19T01:26:53Z

If without this field we can only use Name to calculate the relation between those indices. But if we use Name hinted method, we should tell user which pattern should not use in index naming. And this will cause some missing usage on TiDB.
So I create a PR pingcap/parser#974 for this.

In fact, we've used the pattern(adding some specific prefix) already, like https://github.com/pingcap/tidb/pull/19059/files#diff-edfa54cbe2fdb418f0a77bd0ec4a56f2R64, can we follow the same style?

bb7133 · 2020-08-19T01:36:40Z

if len(cidxInfos) > 0 && job.IsRollingback() {
// Handle the rolling back job.
ver, err = onDropCompositeIndices(t, job, cidxInfos)
if err != nil {

I think we can try to reuse onDropIndex to support the onDropCompositeIndices by some adjustments: for now, it drop index from the args of the job, it may be changed to find the composite indicies from the columns from the args of 'DropColumn' job.

blacktear23 · 2020-08-19T02:06:13Z

If without this field we can only use Name to calculate the relation between those indices. But if we use Name hinted method, we should tell user which pattern should not use in index naming. And this will cause some missing usage on TiDB.
So I create a PR pingcap/parser#974 for this.

In fact, we've used the pattern(adding some specific prefix) already, like https://github.com/pingcap/tidb/pull/19059/files#diff-edfa54cbe2fdb418f0a77bd0ec4a56f2R64, can we follow the same style?

yes we can follow this style.

blacktear23 · 2020-08-19T03:35:11Z

@bb7133 I add [DNM] on pingcap/parser#974. After this feature implemented I will close it.

bb7133 added the type/feature-request Categorizes issue or PR as related to a new feature. label Aug 14, 2020

qinix mentioned this issue Aug 14, 2020

dropping column with index covered isn't fully compatible with mysql #3364

Open

2 tasks

bb7133 added the sig/sql-infra SIG: SQL Infra label Aug 14, 2020

blacktear23 mentioned this issue Aug 14, 2020

[DNM] Add reference id for drop column with composite index pingcap/parser#974

Closed

zz-jason added the feature/reviewing This feature request is reviewing by product managers label Aug 15, 2020

lance6716 mentioned this issue Aug 17, 2020

reminder: revert #801 after TiDB inplemented drop column with single indice pingcap/dm#844

Open

blacktear23 mentioned this issue Aug 19, 2020

ddl: Add support drop column with composite indices feature #19303

Closed

scsldb added the priority/P1 The issue has P1 priority. label Aug 28, 2020

zz-jason added feature/accepted This feature request is accepted by product managers and removed feature/reviewing This feature request is reviewing by product managers labels Aug 28, 2020

scsldb added this to the Requirement pool milestone Aug 28, 2020

blacktear23 mentioned this issue Feb 2, 2023

ddl: Support drop column on composite index #40972

Closed

12 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ddl: support dropping columns with composite indices #19196

ddl: support dropping columns with composite indices #19196

bb7133 commented Aug 14, 2020 •

edited

Loading

bb7133 commented Aug 14, 2020 •

edited

Loading

blacktear23 commented Aug 14, 2020 •

edited

Loading

blacktear23 commented Aug 14, 2020 •

edited

Loading

bb7133 commented Aug 14, 2020

blacktear23 commented Aug 15, 2020

blacktear23 commented Aug 17, 2020

bb7133 commented Aug 19, 2020

bb7133 commented Aug 19, 2020

blacktear23 commented Aug 19, 2020

blacktear23 commented Aug 19, 2020

ddl: support dropping columns with composite indices #19196

ddl: support dropping columns with composite indices #19196

Comments

bb7133 commented Aug 14, 2020 • edited Loading

Feature Request

bb7133 commented Aug 14, 2020 • edited Loading

blacktear23 commented Aug 14, 2020 • edited Loading

blacktear23 commented Aug 14, 2020 • edited Loading

bb7133 commented Aug 14, 2020

blacktear23 commented Aug 15, 2020

blacktear23 commented Aug 17, 2020

bb7133 commented Aug 19, 2020

bb7133 commented Aug 19, 2020

blacktear23 commented Aug 19, 2020

blacktear23 commented Aug 19, 2020

bb7133 commented Aug 14, 2020 •

edited

Loading

bb7133 commented Aug 14, 2020 •

edited

Loading

blacktear23 commented Aug 14, 2020 •

edited

Loading

blacktear23 commented Aug 14, 2020 •

edited

Loading