Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor TMTTableFlusher #2

Merged
merged 1 commit into from
Feb 22, 2019
Merged

Refactor TMTTableFlusher #2

merged 1 commit into from
Feb 22, 2019

Conversation

flowbehappy
Copy link
Contributor

@flowbehappy flowbehappy commented Feb 18, 2019

This PR asynchronize the flush process from raft log apply process.

@flowbehappy flowbehappy force-pushed the region-flush-refactor branch from 242f51e to 46b8d25 Compare February 20, 2019 07:22
@@ -94,7 +94,7 @@ void dbgFuncDropTiDBTable(Context & context, const ASTs & args, DBGInvoker::Prin

TMTContext & tmt = context.getTMTContext();
tmt.region_partition.dropRegionsInTable(table_id);
tmt.table_flushers.dropRegionsInTable(table_id);
tmt.region_partition.dropRegionsInTable(table_id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

{
auto & partition = getOrInsertRegion(table_id, region_id).second;
partition.updated = true;
partition.cache_bytes += delta;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Say for a specific region r1 containing two tables t1 and t2, a raft command put 10 bytes of t1 and 10 bytes of t2 into r1. Then delta here would be 20 bytes, and would be added to relative tables (both t1 and t2). That is not accurate. Am I correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is correct. But actually we don't need to be 100% percent accurate. And a region belonging to more than one table is very rare.

Copy link
Contributor

@zanmato1984 zanmato1984 Feb 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about let Region::onCommand return table id and delta bytes pair and pass into updateRegion? So that in updateRegion we could know how many bytes each table changes. This doesn't require much code I assume, but I could be wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can theoretically, but I still don't think it is necessary to make the logic more complicated. It doesn't loss anything even if we flush a partition by mistake. And again this situation is very rare.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just thought this wouldn't take much code, and the logic could be quite less confusing, so worth doing. But it's your call after all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I should add some documents to help reducing confuses.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good too.

// drop table and create another with same name, but the previous one will still flush
if (storage == nullptr)
{
LOG_ERROR(log, "table " << table_id << " flush partition " << partition_id << " , but storage is not found");
Copy link
Contributor

@zanmato1984 zanmato1984 Feb 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what situation will the storage be null? Does it imply some potential risks that we should do more than simply log and ignore (return), like throwing an exception?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed we should throw an exception. Some potential bugs may cause this I guess. Although haven't found it happen yet.

}
}

void RegionPartition::updateRegion(const RegionPtr & region, size_t before_cache_bytes, TableIDSet relative_table_ids)
Copy link
Contributor

@zanmato1984 zanmato1984 Feb 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw updateRegion was after Region::onCommand. What if background flush happens in between? More specifically: 1. some data was inserted into a region; 2. background flush grabbed the mutex in RegionPartition and flushed the data of the region (time limit reached for example); 3. updateRegion then executed, it saw an empty region with wrong update arguments (before_cache_bytes for example).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The risk does exist. And it is one of the many multi-thread issues in our system. I would like to file an issue and fix it until we have a better global design. https://internal.pingcap.net/jira/browse/FLASH-132

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

@zanmato1984
Copy link
Contributor

LGTM as long as comments addressed.

TableID end_table_id = end_key.empty() ? std::numeric_limits<TableID>::max() : RecordKVFormat::getTableId(end_key);

// This region definitely does not contain any data in this table.
if (start_table_id > table_id || end_table_id < table_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

conflict with 42c6a91

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants