[colocation] Corruption when dropping indexed table with backfill #4986

Closed
jaki opened this issue Jul 7, 2020 · 2 comments
Labels
area/ysql Yugabyte SQL (YSQL) kind/bug This issue is a bug

jaki commented Jul 7, 2020

I see `ERROR: Corruption: Unable to decode key prefixes from: 21` when dropping a colocated table that has or had indexes with backfill enabled.

```sql
CREATE TABLE optin (i int UNIQUE);
DROP TABLE optin; -- corruption

CREATE TABLE optout (i int UNIQUE) WITH (colocated = false);
DROP TABLE optout; -- ok

CREATE TABLE optin (i int UNIQUE);
ALTER TABLE optin DROP CONSTRAINT optin_i_key;
DROP TABLE optin; -- corruption

CREATE TABLE t (i int);
CREATE INDEX ON t (i);
DROP TABLE t; -- corruption

CREATE TABLE t (i int);
CREATE INDEX ON t (i);
DROP INDEX t_i_idx;
DROP TABLE t; -- corruption

CREATE TABLE t (i int);
DROP TABLE t; -- ok
```

Useful logs:

W0707 12:26:24.091925 28744 tablet.cc:1761] writereq: tablet_id: "5c6766d5d0ca4145b85c2051088cc146"
propagated_hybrid_time: 6529638334838734848
include_trace: false
write_batch {
  transaction {
    transaction_id: "\3002|\256\321PE\030\206\017Y\352C\003\002\013"
    isolation: SNAPSHOT_ISOLATION
    status_tablet: "959954825480491793ea1af253a77adb"
    priority: 6865133788327544481
    start_hybrid_time: 6529638334359724032
  }
  DEPRECATED_may_have_metadata: true
}
read_time {
  read_ht: 6529638334383398912
  local_limit_ht: 6529638334588198912
  global_limit_ht: 6529638334588198912
  in_txn_limit_ht: 6529638334838632448
}
pgsql_write_batch {
  client: YQL_CLIENT_PGSQL
  stmt_id: 44901456
  stmt_type: PGSQL_TRUNCATE_COLOCATED
  table_id: "00004000000030008000000000004001"
  schema_version: 5
  column_refs {
  }
}
client_id1: 235144058365939949
client_id2: 14292720657967982002
request_id: 3
min_running_request_id: 3
rejection_score: 0
batch_idx: 0
W0707 12:26:24.095120 28744 docdb.cc:202] doc_path: 21    @     0x7f730add1af6  DetermineKeysToLock (src/yb/docdb/docdb.cc:202)
    @     0x7f730add1af6  yb::docdb::PrepareDocWriteOperation(std::vector<unique_ptr<yb::docdb::DocOperation, std::default_delete<yb::docdb::DocOperation> >, std::allocator<unique_ptr<yb::docdb::DocOperation, std::default_delete<yb::docdb::DocOperation> > > > const&, google::protobuf::RepeatedPtrField<yb::docdb::KeyValuePairPB> const&, scoped_refptr<yb::Histogram> const&, yb::IsolationLevel, yb::docdb::OperationKind, yb::RowMarkType, bool, std::chrono::time_point<yb::CoarseMonoClock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >, yb::StronglyTypedBool<yb::docdb::PartialRangeKeyIntents_Tag>, yb::docdb::SharedLockManager*) (src/yb/docdb/docdb.cc:295)
    @     0x7f730bb6b617 
    @     0x7f730bb54fd1  yb::tablet::DocWriteOperation::Start() (src/yb/tablet/tablet.cc:2636)
    @     0x7f730bb54fd1  yb::tablet::Tablet::StartDocWriteOperation(unique_ptr<yb::tablet::WriteOperation, std::default_delete<yb::tablet::WriteOperation> >, yb::ScopedRWOperation, boost::function<void (unique_ptr<yb::tablet::WriteOperation, std::default_delete<yb::tablet::WriteOperation> >, yb::Status const&)>) (src/yb/tablet/tablet.cc:2853)
    @     0x7f730bb5686a  yb::tablet::Tablet::KeyValueBatchFromPgsqlWriteBatch(unique_ptr<yb::tablet::WriteOperation, std::default_delete<yb::tablet::WriteOperation> >) (src/yb/tablet/tablet.cc:1748)
    @     0x7f730bb56b9b  yb::tablet::Tablet::AcquireLocksAndPerformDocOperations(unique_ptr<yb::tablet::WriteOperation, std::default_delete<yb::tablet::WriteOperation> >) (src/yb/tablet/tablet.cc:1774)
    @     0x7f730bb8d891  yb::tablet::TabletPeer::WriteAsync(unique_ptr<yb::tablet::WriteOperationState, std::default_delete<yb::tablet::WriteOperationState> >, long, std::chrono::time_point<yb::CoarseMonoClock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >) (src/yb/tablet/tablet_peer.cc:597)
    @     0x7f730c432c26  yb::tserver::TabletServiceImpl::Write(yb::tserver::WriteRequestPB const*, yb::tserver::WriteResponsePB*, yb::rpc::RpcContext) (src/yb/tserver/tablet_service.cc:1406)
    @     0x7f73092e5ac7  yb::tserver::TabletServerServiceIf::Handle(shared_ptr<yb::rpc::InboundCall>) (src/yb/tserver/tserver_service.service.cc:148)
    @     0x7f7304db62d6  yb::rpc::ServicePoolImpl::Handle(shared_ptr<yb::rpc::InboundCall>) (src/yb/rpc/service_pool.cc:262)
    @     0x7f7304d5164a  yb::rpc::InboundCall::InboundCallTask::Run() (src/yb/rpc/inbound_call.cc:212)
    @     0x7f7304dc4196  Execute (src/yb/rpc/thread_pool.cc:100)
    @     0x7f7304dc1a7d  operator()<, void> (gcc/5.5.0_4/include/c++/5.5.0/functional:600)
    @     0x7f7304dc1a7d  __call<void, 0ul> (gcc/5.5.0_4/include/c++/5.5.0/functional:1074)
    @     0x7f7304dc1a7d  operator()<, void> (gcc/5.5.0_4/include/c++/5.5.0/functional:1133)
    @     0x7f7304dc1a7d  _M_invoke (gcc/5.5.0_4/include/c++/5.5.0/functional:1871)
    @     0x7f73036beded  std::function<void ()>::operator()() const (gcc/5.5.0_4/include/c++/5.5.0/functional:2267)
    @     0x7f73036beded  yb::Thread::SuperviseThread(void*) (src/yb/util/thread.cc:759)
    @     0x7f72fe091693  start_thread (/tmp/glibc-20181130-26094-cs1x60/glibc-2.23/nptl/pthread_create.c:333)
    @     0x7f72fddd341c  (unknown) (sysdeps/unix/sysv/linux/x86_64/clone.S:109)
    @ 0xffffffffffffffff 
I0707 12:26:24.095206 28744 tablet_service.cc:400] Write failed: Corruption (yb/docdb/docdb.cc:205): Unable to decode key prefixes from: 21

This causes PgLibPqTest.TableColocation to fail when backfill is enabled by default.

jaki added the kind/bug and area/ysql labels Jul 7, 2020
jaki assigned ndeodhar and jaki Jul 7, 2020

jaki commented Jul 7, 2020

This appears to be because `pgtable_id` is not set for the `doc_key_` of the `PgsqlWriteOperation`. I need to look into why it is not being set in this particular case.
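
One plausible reading of the `21` in the error is that the key handed to the lock manager carried no table prefix at all. The toy model below illustrates that idea only. It is not the real DocDB key encoding: the tag bytes and the names `EncodeTableKey` and `DecodeTablePrefix` are hypothetical, and the terminator value `0x21` is chosen merely to mirror the byte printed in the log.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical tag bytes for this toy model; not the real DocDB encoding.
constexpr char kTableTag = 'T';
constexpr char kGroupEnd = 0x21;

// Encodes a whole-table key: an optional table prefix (tag byte plus a
// 4-byte big-endian id) followed by the terminator byte.
std::string EncodeTableKey(std::optional<uint32_t> pgtable_id) {
  std::string key;
  if (pgtable_id) {
    key.push_back(kTableTag);
    for (int shift = 24; shift >= 0; shift -= 8) {
      key.push_back(static_cast<char>((*pgtable_id >> shift) & 0xff));
    }
  }
  key.push_back(kGroupEnd);
  return key;
}

// Returns the table id when the key starts with a table prefix, or
// std::nullopt when no prefix can be decoded.
std::optional<uint32_t> DecodeTablePrefix(const std::string& key) {
  if (key.size() < 6 || key[0] != kTableTag) return std::nullopt;
  uint32_t id = 0;
  for (int i = 1; i <= 4; ++i) {
    id = (id << 8) | static_cast<uint8_t>(key[i]);
  }
  return id;
}
```

In this model, `EncodeTableKey(std::nullopt)` yields a one-byte key holding only the terminator, and `DecodeTablePrefix` rejects it, which is the shape of failure the log reports.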

jaki added a commit that referenced this issue Jul 8, 2020
Summary:

`pgtable_id` for schemas is not set consistently between master metadata
and tserver metadata.

- Create table request from postgres does not set it
- `CatalogManager::CreateTable` does not set it
- `RaftGroupMetadata::AddTable` does set it
- `CatalogManager::AlterTable` does set it
- `CatalogManager::AddIndexInfoToTable` does not set it
- `CatalogManager::MarkIndexInfoFromTableForDeletion` does not set it
- `RaftGroupMetadata::SetSchema` does not set it

Ideally, `pgtable_id` would be set into the master schema on create,
either on the create table request from postgres or
`CatalogManager::CreateTable`.  However, this may cause backwards
compatibility issues.

For now, tackle this problem at `RaftGroupMetadata::SetSchema` by
setting `pgtable_id` there.  This is just a copy-paste from
`RaftGroupMetadata::AddTable`.  Ideally, both shouldn't have to do this
because the master schema would already have it.

Test Plan:

```sh
./bin/yb-ctl create \
  --master_flags "ysql_disable_index_backfill=false" \
  --tserver_flags "ysql_disable_index_backfill=false"
./bin/ysqlsh
```

```sql
CREATE TABLE t (i int);
CREATE INDEX ON t (i);
DROP INDEX t_i_idx;
DROP TABLE t; -- no corruption from docdb
```

```sh
# Set `ysql_disable_index_backfill` default to false in the code, then
./yb_build.sh \
  --cxx-test pgwrapper_pg_libpq-test \
  --gtest_filter PgLibPqTest.TableColocation
```

Reviewers: neha

Reviewed By: neha

Subscribers: yql

Differential Revision: https://phabricator.dev.yugabyte.com/D8834
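
The approach in the commit summary above (have `SetSchema` fill in `pgtable_id` the same way `AddTable` does) can be sketched with a toy model. This is a minimal sketch under stated assumptions, not the actual patch: `ToyRaftGroupMetadata`, `Schema`, `FillPgTableId`, the `pg_oid` parameter, and the colocated-only guard are all hypothetical stand-ins, and the real commit copies the logic rather than sharing a helper.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Toy stand-in for a table schema; the real class carries far more state.
struct Schema {
  std::optional<uint32_t> pgtable_id;
  uint32_t version = 0;
};

// Toy stand-in for the tserver-side metadata of one tablet.
class ToyRaftGroupMetadata {
 public:
  explicit ToyRaftGroupMetadata(bool colocated) : colocated_(colocated) {}

  // Already filled in pgtable_id before the fix.
  void AddTable(const std::string& table_id, uint32_t pg_oid, Schema schema) {
    FillPgTableId(pg_oid, &schema);
    tables_[table_id] = schema;
  }

  // The fix: also fill in pgtable_id here, so a schema change (for example
  // adding or dropping an index) does not store a schema without it.
  void SetSchema(const std::string& table_id, uint32_t pg_oid, Schema schema) {
    FillPgTableId(pg_oid, &schema);
    tables_[table_id] = schema;
  }

  const Schema& schema(const std::string& table_id) const {
    return tables_.at(table_id);
  }

 private:
  // Assumption: only colocated tablets need the id, and an id already
  // present on the incoming schema is left untouched.
  void FillPgTableId(uint32_t pg_oid, Schema* schema) const {
    if (colocated_ && !schema->pgtable_id) schema->pgtable_id = pg_oid;
  }

  bool colocated_;
  std::map<std::string, Schema> tables_;
};
```

With this shape, a schema that arrives from the master without `pgtable_id` still ends up with one on the tserver after `SetSchema`, which matches the commit's stated stopgap until the master schema carries the id itself.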
jaki commented Jul 8, 2020

Closed by commit d92b234.

jaki closed this as completed Jul 8, 2020