Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLASH-1026] Synchronization between drop table and remove region #538

Merged
merged 3 commits into from
Mar 23, 2020

Conversation

JaySon-Huang
Copy link
Contributor

@JaySon-Huang JaySon-Huang commented Mar 21, 2020

While running tests/fullstack-test/fault-inject/drop-table.test, tiflash may crash because of the synchronization between drop table and remove region.

Problem

After truncate table test.t ;, SchemaBuilder will create a InterpreterDropQuery to drop table. It call shutdown without lock.
https://github.com/pingcap/tics/blob/master/dbms/src/Interpreters/InterpreterDropQuery.cpp#L126-L129

At the same time, since the table is dropped, the region of that table is removed. RegionTable::removeRegion get a lock on DT table, and try to write a deleteRange on that table.
https://github.com/pingcap/tics/blob/master/dbms/src/Storages/Transaction/RegionTable.cpp#L247-L273

When storage->deleteRange() call background_task_handle->wake(), which has been reset by shutdown() in InterpreterDropQuery, TiFlash will crash.

Solution

In InterpreterDropQuery, I move storage->shutdown() after it get lock on storage. shutdown is supposed to be "wait till finish or cancel all background task of storage".

@JaySon-Huang JaySon-Huang added the needs-cherry-pick-release-3.1 PR which needs to be cherry-picked to release-3.1 label Mar 21, 2020
@JaySon-Huang JaySon-Huang changed the title [FLASH-1026] Synchronization between drop table and remove raft region [FLASH-1026] Synchronization between drop table and remove region Mar 21, 2020
@zanmato1984
Copy link
Contributor

I think the fundamental difference, that causes this problem, between TMT and DT, is that DT has to remove corresponding storage data (by writing a delete range(?)) when removing a region whereas TMT does not (TMT removes corresponding data during compaction/merge).

Am I understanding it right? If so, could you explain why DT has to remove storage data instantly?

@JaySon-Huang
Copy link
Contributor Author

JaySon-Huang commented Mar 21, 2020

I think the fundamental difference, that causes this problem, between TMT and DT, is that DT has to remove corresponding storage data (by writing a delete range(?)) when removing a region whereas TMT does not (TMT removes corresponding data during compaction/merge).

Am I understanding it right? If so, could you explain why DT has to remove storage data instantly?

Actually DT don't remove storage data instantly, deleteRange is a simple delete mark for the key range. deleteRange indicate that we don't need that data any more, and that range of data can be removed on its background compact(delta-merge).

For TMT, it will do final optimize on storage if all its regions are removed, or background thread to calculate its parts satisfied threshold or not.

cc @zanmato1984 @flowbehappy

@zanmato1984
Copy link
Contributor

zanmato1984 commented Mar 22, 2020

I think the fundamental difference, that causes this problem, between TMT and DT, is that DT has to remove corresponding storage data (by writing a delete range(?)) when removing a region whereas TMT does not (TMT removes corresponding data during compaction/merge).
Am I understanding it right? If so, could you explain why DT has to remove storage data instantly?

Actually DT don't remove storage data instantly, deleteRange is a simple delete mark for the key range. deleteRange indicate that we don't need that data any more, and that range of data can be removed on its background compact(delta-merge).

For TMT, it will do final optimize on storage if all its regions are removed, or background thread to calculate its parts satisfied threshold or not.

cc @zanmato1984 @flowbehappy

My actual question was, why does DT have to "mark" this range for reclaim, rather than doing it the "tmt" way, which is leaving it to compaction to detect that "this region has been removed, we don't need it anymore"?

@flowbehappy
Copy link
Contributor

I think the fundamental difference, that causes this problem, between TMT and DT, is that DT has to remove corresponding storage data (by writing a delete range(?)) when removing a region whereas TMT does not (TMT removes corresponding data during compaction/merge).
Am I understanding it right? If so, could you explain why DT has to remove storage data instantly?

Actually DT don't remove storage data instantly, deleteRange is a simple delete mark for the key range. deleteRange indicate that we don't need that data any more, and that range of data can be removed on its background compact(delta-merge).
For TMT, it will do final optimize on storage if all its regions are removed, or background thread to calculate its parts satisfied threshold or not.

cc @zanmato1984 @flowbehappy

My actual question was, why does DT have to "mark" this range for reclaim, rather than doing it the "tmt" way, which is leaving it to compaction to detect that "this region has been removed, we don't need it anymore"?

@zanmato1984
One design goal of DT is try to decouple the logic of storage engine and raft layer. And by providing a delete range api, we can easily avoid the necessary of getting the available ranges of regions from raft layer. BTW, as far as I know, TiVK do the same way.

Copy link
Contributor

@zanmato1984 zanmato1984 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@JaySon-Huang
Copy link
Contributor Author

/run-integration-tests

2 similar comments
@JaySon-Huang
Copy link
Contributor Author

/run-integration-tests

@JaySon-Huang
Copy link
Contributor Author

/run-integration-tests

@JaySon-Huang JaySon-Huang changed the title [FLASH-1026] Synchronization between drop table and remove region [DNM][FLASH-1026] Synchronization between drop table and remove region Mar 23, 2020
Signed-off-by: JaySon-Huang <[email protected]>
@JaySon-Huang JaySon-Huang changed the title [DNM][FLASH-1026] Synchronization between drop table and remove region [FLASH-1026] Synchronization between drop table and remove region Mar 23, 2020
@JaySon-Huang JaySon-Huang added the status/can-merge Indicates a PR has been approved by a committer. label Mar 23, 2020
@JaySon-Huang
Copy link
Contributor Author

/merge

@sre-bot
Copy link
Collaborator

sre-bot commented Mar 23, 2020

/run-all-tests

@sre-bot
Copy link
Collaborator

sre-bot commented Mar 23, 2020

@JaySon-Huang merge failed.

@sre-bot
Copy link
Collaborator

sre-bot commented Mar 23, 2020

/run-all-tests

@sre-bot
Copy link
Collaborator

sre-bot commented Mar 23, 2020

@JaySon-Huang merge failed.

@JaySon-Huang JaySon-Huang removed the status/can-merge Indicates a PR has been approved by a committer. label Mar 23, 2020
@JaySon-Huang
Copy link
Contributor Author

/merge

@sre-bot sre-bot added the status/can-merge Indicates a PR has been approved by a committer. label Mar 23, 2020
@sre-bot
Copy link
Collaborator

sre-bot commented Mar 23, 2020

/run-all-tests

@sre-bot
Copy link
Collaborator

sre-bot commented Mar 23, 2020

@JaySon-Huang merge failed.

@JaySon-Huang
Copy link
Contributor Author

/merge

@sre-bot
Copy link
Collaborator

sre-bot commented Mar 23, 2020

/run-all-tests

@JaySon-Huang JaySon-Huang removed the needs-cherry-pick-release-3.1 PR which needs to be cherry-picked to release-3.1 label Mar 23, 2020
@sre-bot sre-bot merged commit cdf2c9d into pingcap:master Mar 23, 2020
@JaySon-Huang JaySon-Huang deleted the FLASH-1026 branch March 23, 2020 13:02
JaySon-Huang added a commit that referenced this pull request Mar 23, 2020
…) (#539)

* In InterpreterDropQuery, call shutdown within table lock
* Temporary check

Signed-off-by: JaySon-Huang <[email protected]>
windtalker added a commit that referenced this pull request Mar 26, 2020
* fix daily test fail (#520)

* fix daily test fail

* fix

* Add fullstack test for engine DeltaTree (#524)

## Add fullstack test for engine DeltaTree.
* Refine `tests/docker/run.sh` and split `tests/docker/docker-compose.yaml` into `tests/docker/{gtest/mock-test/cluster/tiflash-dt/tiflash-tmt}.yaml`

`fullstack/ddl`,`fullstack-test/fault-inject` will be enabled in #526 

## Others
* Add column `tidb_table_id` in `system.tables`
* Add some debugging info

Signed-off-by: JaySon-Huang <[email protected]>

* Bugfix: schrodinger bank2 fail (#521)

* Flush committed data in Region after resolve locks
* Stop append into last packs after split.
* Remove last_cache in Delta to reduce code complexity.
* Add system table: dt_tables and dt_segments, for debug.

* [FLASH-1008] Support br restore & ingest sst (#529)

* [flash-1018]fix bug of datetime default value (#534)

* fix bug of datetime default value

* address comment

* Using SegmentSnapshotPtr instead of SegmentSnapshot (#532)

* [Flash-664] Enable DDL for engine DeltaTree (#526)

- [x] Enable unittest in gtest_dbms
- [x] Enable mock test in tests/delta-merge-test
- [x] Enable fullstack-test/ddl
- [x] Enable fullstack-test/inject (Imported in #443)
- [x] Refine exception while read / write to DeltaTree (FLASH-994)
  * Use `Exception::addMessage` to add more diagnostics for locate which table is wrong (commit: 716ae4a)
- [x] shutdown should cancel all background tasks (FLASH-995) (commit: 7470c2f)
- [x] Run schrodinger/sddl test

## Others 
* Add atomic-rename table test in `tests/fullstack-test/fault-inject/rename-table.test`, but did not enable this. We will fix it later.
* "dt" engine ONLY support disable_bg_flush = true.
If background flush is enabled, read will not triggle schema sync. Which means that we may not get the right result with out-dated schema.
* Found that 'zero' value of type year is not match with tikv (FLASH-1023)

Signed-off-by: JaySon-Huang <[email protected]>

* [FLASH-1027] Fix: proxy override system signal listening (#541)

* remove signal listening from proxy.

* while terminating, stop all learner read.

* [FLASH-1026] Synchronization between drop table and remove region (#538)

* Set default storage engine to DT (#547)

* Do region flush in Region::handleWriteRaftCmd (#542)

* [FLASH-1028] Region merge should not remove data (#544)

* Remove region after region merge should not remove data

Signed-off-by: JaySon-Huang <[email protected]>

* Fix region init index

Signed-off-by: JaySon-Huang <[email protected]>

* Add region_merge.test for DT

Signed-off-by: JaySon-Huang <[email protected]>

* Fix different behavior between DT and TMT

Signed-off-by: JaySon-Huang <[email protected]>

* Fix mock remove region

Signed-off-by: JaySon-Huang <[email protected]>

Co-authored-by: pingcap-github-bot <[email protected]>

* clean useless code

Co-authored-by: JaySon <[email protected]>
Co-authored-by: Flowyi <[email protected]>
Co-authored-by: Tong Zhigao <[email protected]>
Co-authored-by: Han Fei <[email protected]>
Co-authored-by: pingcap-github-bot <[email protected]>
hanfei1991 added a commit that referenced this pull request Jun 22, 2020
* implement join

* fix

* update tipb

* update tipb

* save work: refine dag interpreter, introduce dag interpreter query block

* fix bug

* comment useless code

* refine code

* fix bug

* refine code

* fix bug

* save work

* tiny fix

* save work

* support remote read

* update tipb

* refine code

* fix bug

* update client-c

* refine cop read

* update client-c to support cop reading from tiflash

* refine code

* support batch cop

* fix build error

* fix daily test fail

* some bug fix

* log dag execution time without encode to chunk

* fix bug

* fix bug

* log dag execution time without encode

* make encode multi processors

* fix

* parallel encode

* refine code of batch coprocessor

* refine code

* delete useless code

* update kvproto and client-c

* [flash 1002]refine coprocessor read (#530)

* refind coprocessor

* fix

* try fix ci

* support key ranges in batch coprocessor (#533)

* save work

* save work

* save work

* save work

* dt support key ranges in dag request

* fix bug

* fix bug

* fix bug

* add some comments

* fix bug

* address comments

* merge master branch (#556)

* fix daily test fail (#520)

* fix daily test fail

* fix

* Add fullstack test for engine DeltaTree (#524)

## Add fullstack test for engine DeltaTree.
* Refine `tests/docker/run.sh` and split `tests/docker/docker-compose.yaml` into `tests/docker/{gtest/mock-test/cluster/tiflash-dt/tiflash-tmt}.yaml`

`fullstack/ddl`,`fullstack-test/fault-inject` will be enabled in #526 

## Others
* Add column `tidb_table_id` in `system.tables`
* Add some debugging info

Signed-off-by: JaySon-Huang <[email protected]>

* Bugfix: schrodinger bank2 fail (#521)

* Flush committed data in Region after resolve locks
* Stop append into last packs after split.
* Remove last_cache in Delta to reduce code complexity.
* Add system table: dt_tables and dt_segments, for debug.

* [FLASH-1008] Support br restore & ingest sst (#529)

* [flash-1018]fix bug of datetime default value (#534)

* fix bug of datetime default value

* address comment

* Using SegmentSnapshotPtr instead of SegmentSnapshot (#532)

* [Flash-664] Enable DDL for engine DeltaTree (#526)

- [x] Enable unittest in gtest_dbms
- [x] Enable mock test in tests/delta-merge-test
- [x] Enable fullstack-test/ddl
- [x] Enable fullstack-test/inject (Imported in #443)
- [x] Refine exception while read / write to DeltaTree (FLASH-994)
  * Use `Exception::addMessage` to add more diagnostics for locate which table is wrong (commit: 716ae4a)
- [x] shutdown should cancel all background tasks (FLASH-995) (commit: 7470c2f)
- [x] Run schrodinger/sddl test

## Others 
* Add atomic-rename table test in `tests/fullstack-test/fault-inject/rename-table.test`, but did not enable this. We will fix it later.
* "dt" engine ONLY support disable_bg_flush = true.
If background flush is enabled, read will not triggle schema sync. Which means that we may not get the right result with out-dated schema.
* Found that 'zero' value of type year is not match with tikv (FLASH-1023)

Signed-off-by: JaySon-Huang <[email protected]>

* [FLASH-1027] Fix: proxy override system signal listening (#541)

* remove signal listening from proxy.

* while terminating, stop all learner read.

* [FLASH-1026] Synchronization between drop table and remove region (#538)

* Set default storage engine to DT (#547)

* Do region flush in Region::handleWriteRaftCmd (#542)

* [FLASH-1028] Region merge should not remove data (#544)

* Remove region after region merge should not remove data

Signed-off-by: JaySon-Huang <[email protected]>

* Fix region init index

Signed-off-by: JaySon-Huang <[email protected]>

* Add region_merge.test for DT

Signed-off-by: JaySon-Huang <[email protected]>

* Fix different behavior between DT and TMT

Signed-off-by: JaySon-Huang <[email protected]>

* Fix mock remove region

Signed-off-by: JaySon-Huang <[email protected]>

Co-authored-by: pingcap-github-bot <[email protected]>

* clean useless code

Co-authored-by: JaySon <[email protected]>
Co-authored-by: Flowyi <[email protected]>
Co-authored-by: Tong Zhigao <[email protected]>
Co-authored-by: Han Fei <[email protected]>
Co-authored-by: pingcap-github-bot <[email protected]>

* fix type mismatch bug in broadcast join

* broadcast join support join keys with different data type (#580)

* fix type mismatch bug in broadcast join

* refine code

* refine code

* some improvement for broadcast join (#600)

* some improvement for broadcast join

* format code

* refine code

* address comment

* fix bug

* address comments

* fix bug

* fix bug

* make TiFlash backward compatible to old tipb (#653)

* 1. re-enable exec info in dag response, 2. support old style dag request

* basic support for execute summary

* refine support of executor time for join plan

* format code

* address comments

* fix bug

* refine code

* update header

* Fix execute details regression after merge master (#678)

* refine code

* fix bug

* fix bug

* format code

* update kvproto

* fmt code

* update client-c

* update tipb

Co-authored-by: xufei <[email protected]>
Co-authored-by: xufei <[email protected]>
Co-authored-by: JaySon <[email protected]>
Co-authored-by: Flowyi <[email protected]>
Co-authored-by: Tong Zhigao <[email protected]>
Co-authored-by: pingcap-github-bot <[email protected]>
hanfei1991 added a commit that referenced this pull request Jul 2, 2020
* implement join

* fix

* update tipb

* update tipb

* save work: refine dag interpreter, introduce dag interpreter query block

* fix bug

* comment useless code

* refine code

* fix bug

* refine code

* fix bug

* save work

* tiny fix

* save work

* support remote read

* update tipb

* refine code

* fix bug

* update client-c

* refine cop read

* update client-c to support cop reading from tiflash

* refine code

* support batch cop

* fix build error

* fix daily test fail

* some bug fix

* log dag execution time without encode to chunk

* fix bug

* fix bug

* log dag execution time without encode

* make encode multi processors

* fix

* parallel encode

* refine code of batch coprocessor

* refine code

* delete useless code

* update kvproto and client-c

* [flash 1002]refine coprocessor read (#530)

* refind coprocessor

* fix

* try fix ci

* support key ranges in batch coprocessor (#533)

* save work

* save work

* save work

* save work

* dt support key ranges in dag request

* fix bug

* fix bug

* fix bug

* add some comments

* fix bug

* address comments

* merge master branch (#556)

* fix daily test fail (#520)

* fix daily test fail

* fix

* Add fullstack test for engine DeltaTree (#524)

## Add fullstack test for engine DeltaTree.
* Refine `tests/docker/run.sh` and split `tests/docker/docker-compose.yaml` into `tests/docker/{gtest/mock-test/cluster/tiflash-dt/tiflash-tmt}.yaml`

`fullstack/ddl`,`fullstack-test/fault-inject` will be enabled in #526 

## Others
* Add column `tidb_table_id` in `system.tables`
* Add some debugging info

Signed-off-by: JaySon-Huang <[email protected]>

* Bugfix: schrodinger bank2 fail (#521)

* Flush committed data in Region after resolve locks
* Stop append into last packs after split.
* Remove last_cache in Delta to reduce code complexity.
* Add system table: dt_tables and dt_segments, for debug.

* [FLASH-1008] Support br restore & ingest sst (#529)

* [flash-1018]fix bug of datetime default value (#534)

* fix bug of datetime default value

* address comment

* Using SegmentSnapshotPtr instead of SegmentSnapshot (#532)

* [Flash-664] Enable DDL for engine DeltaTree (#526)

- [x] Enable unittest in gtest_dbms
- [x] Enable mock test in tests/delta-merge-test
- [x] Enable fullstack-test/ddl
- [x] Enable fullstack-test/inject (Imported in #443)
- [x] Refine exception while read / write to DeltaTree (FLASH-994)
  * Use `Exception::addMessage` to add more diagnostics for locate which table is wrong (commit: 716ae4a)
- [x] shutdown should cancel all background tasks (FLASH-995) (commit: 7470c2f)
- [x] Run schrodinger/sddl test

## Others 
* Add atomic-rename table test in `tests/fullstack-test/fault-inject/rename-table.test`, but did not enable this. We will fix it later.
* "dt" engine ONLY support disable_bg_flush = true.
If background flush is enabled, read will not triggle schema sync. Which means that we may not get the right result with out-dated schema.
* Found that 'zero' value of type year is not match with tikv (FLASH-1023)

Signed-off-by: JaySon-Huang <[email protected]>

* [FLASH-1027] Fix: proxy override system signal listening (#541)

* remove signal listening from proxy.

* while terminating, stop all learner read.

* [FLASH-1026] Synchronization between drop table and remove region (#538)

* Set default storage engine to DT (#547)

* Do region flush in Region::handleWriteRaftCmd (#542)

* [FLASH-1028] Region merge should not remove data (#544)

* Remove region after region merge should not remove data

Signed-off-by: JaySon-Huang <[email protected]>

* Fix region init index

Signed-off-by: JaySon-Huang <[email protected]>

* Add region_merge.test for DT

Signed-off-by: JaySon-Huang <[email protected]>

* Fix different behavior between DT and TMT

Signed-off-by: JaySon-Huang <[email protected]>

* Fix mock remove region

Signed-off-by: JaySon-Huang <[email protected]>

Co-authored-by: pingcap-github-bot <[email protected]>

* clean useless code

Co-authored-by: JaySon <[email protected]>
Co-authored-by: Flowyi <[email protected]>
Co-authored-by: Tong Zhigao <[email protected]>
Co-authored-by: Han Fei <[email protected]>
Co-authored-by: pingcap-github-bot <[email protected]>

* fix type mismatch bug in broadcast join

* broadcast join support join keys with different data type (#580)

* fix type mismatch bug in broadcast join

* refine code

* refine code

* some improvement for broadcast join (#600)

* some improvement for broadcast join

* format code

* refine code

* address comment

* fix bug

* address comments

* fix bug

* fix bug

* make TiFlash backward compatible to old tipb (#653)

* 1. re-enable exec info in dag response, 2. support old style dag request

* basic support for execute summary

* refine support of executor time for join plan

* format code

* address comments

* fix bug

* refine code

* update header

* Fix execute details regression after merge master (#678)

* refine code

* fix bug

* fix bug

* format code

* update kvproto

* fmt code

* update client-c

* update tipb

Co-authored-by: xufei <[email protected]>
Co-authored-by: xufei <[email protected]>
Co-authored-by: JaySon <[email protected]>
Co-authored-by: Flowyi <[email protected]>
Co-authored-by: Tong Zhigao <[email protected]>
Co-authored-by: pingcap-github-bot <[email protected]>

Co-authored-by: xufei <[email protected]>
Co-authored-by: xufei <[email protected]>
Co-authored-by: JaySon <[email protected]>
Co-authored-by: Flowyi <[email protected]>
Co-authored-by: Tong Zhigao <[email protected]>
Co-authored-by: pingcap-github-bot <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status/can-merge Indicates a PR has been approved by a committer.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants