Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PageStorage v3 reports Trying to add ref to non-exist page #5570

Closed
breezewish opened this issue Aug 9, 2022 · 1 comment · Fixed by #5612
Closed

PageStorage v3 reports Trying to add ref to non-exist page #5570

breezewish opened this issue Aug 9, 2022 · 1 comment · Fixed by #5612
Assignees
Labels
affects-6.0 affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.2 component/storage severity/moderate type/bug The issue is confirmed as a bug.

Comments

@breezewish
Copy link
Member

breezewish commented Aug 9, 2022

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

Operation sequence:

// Flush
Put 951

// Segment Merge
Ref 954->951
Del 951

// Segment Logical Split
Ref 972->954
Ref 985->954
Del 954

// Segment Logical Split
Ref 998->985   <-- Crash
Ref ...
Del ...

Unit test code to reproduce:

TEST_F(PageStorageTest, MyTest)
try
{
    {
        PageStorage::Config config;
        config.blob_heavy_gc_valid_rate = 1.0; /// always run full gc
        page_storage = reopenWithConfig(config);
    }

    const size_t buf_sz = 1024;
    char c_buff[buf_sz];
    for (size_t i = 0; i < buf_sz; ++i)
    {
        c_buff[i] = i % 0xff;
    }

    const UInt64 tag = 12345;
    {
        // Page 951
        WriteBatch log;
        log.putPage(951, tag, std::make_shared<ReadBufferFromMemory>(c_buff, buf_sz), buf_sz, {});
        page_storage->write(std::move(log));
    }

    {
        // Page 954->951
        WriteBatch log;
        log.putRefPage(954, 951);
        page_storage->write(std::move(log));
    }
    {
        // DelPage 951
        WriteBatch removed_log;
        removed_log.delPage(951);
        page_storage->write(std::move(removed_log));
    }

    {
        // Page 972->954
        // Page 985->954
        WriteBatch log;
        log.putRefPage(972, 954);
        log.putRefPage(985, 954);
        page_storage->write(std::move(log));
    }
    {
        // DelPage 954
        WriteBatch removed_log;
        removed_log.delPage(954);
        page_storage->write(std::move(removed_log));
    }

    {
        // Page 998->985
        // Page 1011->985
        WriteBatch log;
        log.putRefPage(998, 985);
        log.putRefPage(1011, 985);
        page_storage->write(std::move(log));
    }

    {
        auto page = page_storage->read(998);
        for (size_t i = 0; i < buf_sz; ++i)
        {
            EXPECT_EQ(*(page.data.begin() + i), static_cast<char>(i % 0xff));
        }
    }
}
CATCH

2. What did you expect to see? (Required)

Test pass

3. What did you see instead (Required)

Code: 0. DB::Exception: Trying to add ref to non-exist page [page_id=1000.998] [ori_id=1000.985] [ver=<6,0>] [resolve_id=1000.951] [resolve_ver=<4,0>]:  [type=2] [page_id=1000.998] [ver=<6,0>] [edit_size=2]

Stack trace:

     0x10db56d34	StackTrace::StackTrace() [gtests_dbms+4510706996]
                	dbms/src/Common/StackTrace.cpp:23
     0x10db56d70	StackTrace::StackTrace() [gtests_dbms+4510707056]
                	dbms/src/Common/StackTrace.cpp:22
     0x100da2530	DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) [gtests_dbms+4295009584]
                	dbms/src/Common/Exception.h:32
     0x100da1638	DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) [gtests_dbms+4295005752]
                	dbms/src/Common/Exception.h:34
     0x10c576bfc	DB::PS::V3::PageDirectory::applyRefEditRecord(std::__1::map<DB::UInt128, std::__1::shared_ptr<DB::PS::V3::VersionedPageEntries>, std::__1::less<DB::UInt128>, std::__1::allocator<std::__1::pair<DB::UInt128 const, std::__1::shared_ptr<DB::PS::V3::VersionedPageEntries> > > >&, std::__1::shared_ptr<DB::PS::V3::VersionedPageEntries> const&, DB::PS::V3::PageEntriesEdit::EditRecord const&, DB::PS::V3::PageVersion const&) [gtests_dbms+4487769084]
                	dbms/src/Storages/Page/V3/PageDirectory.cpp:1027
     0x10c5775d4	DB::PS::V3::PageDirectory::apply(DB::PS::V3::PageEntriesEdit&&, std::__1::shared_ptr<DB::WriteLimiter> const&) [gtests_dbms+4487771604]
                	dbms/src/Storages/Page/V3/PageDirectory.cpp:1106
     0x10c5aa920	DB::PS::V3::PageStorageImpl::writeImpl(DB::WriteBatch&&, std::__1::shared_ptr<DB::WriteLimiter> const&) [gtests_dbms+4487981344]
                	dbms/src/Storages/Page/V3/PageStorageImpl.cpp:110
     0x102d1d258	DB::PageStorage::write(DB::WriteBatch&&, std::__1::shared_ptr<DB::WriteLimiter> const&) [gtests_dbms+4328018520]
                	dbms/src/Storages/Page/PageStorage.h:276
     0x102d40960	DB::PS::V3::tests::PageStorageTest_MyTest_Test::TestBody() [gtests_dbms+4328163680]
                	dbms/src/Storages/Page/V3/tests/gtest_page_storage.cpp:1697
     0x10941e1d8	void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) [gtests_dbms+4436025816]
                	contrib/googletest/googletest/src/gtest.cc:2443
     0x1093f0d30	void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) [gtests_dbms+4435840304]
                	contrib/googletest/googletest/src/gtest.cc:2479
     0x1093f0c80	testing::Test::Run() [gtests_dbms+4435840128]
                	contrib/googletest/googletest/src/gtest.cc:2517
     0x1093f1c7c	testing::TestInfo::Run() [gtests_dbms+4435844220]
                	contrib/googletest/googletest/src/gtest.cc:2693
     0x1093f2a88	testing::TestCase::Run() [gtests_dbms+4435847816]
                	contrib/googletest/googletest/src/gtest.cc:2811
     0x1093fcf08	testing::internal::UnitTestImpl::RunAllTests() [gtests_dbms+4435889928]
                	contrib/googletest/googletest/src/gtest.cc:5177
     0x109423084	bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) [gtests_dbms+4436045956]
                	contrib/googletest/googletest/src/gtest.cc:2443
     0x1093fc9f4	bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) [gtests_dbms+4435888628]
                	contrib/googletest/googletest/src/gtest.cc:2479
     0x1093fc8e4	testing::UnitTest::Run() [gtests_dbms+4435888356]
                	contrib/googletest/googletest/src/gtest.cc:4786
     0x10304ceec	RUN_ALL_TESTS() [gtests_dbms+4331359980]
                	contrib/googletest/googletest/include/gtest/gtest.h:2341
     0x10304cbf0	main [gtests_dbms+4331359216]
                	dbms/src/TestUtils/gtests_dbms_main.cpp:77
     0x151fe1088	<unknown symbol>

4. What is your TiFlash version? (Required)

master

@breezewish breezewish added the type/bug The issue is confirmed as a bug. label Aug 9, 2022
@JaySon-Huang JaySon-Huang self-assigned this Aug 11, 2022
@JaySon-Huang
Copy link
Contributor

JaySon-Huang commented Aug 11, 2022

// Flush, writebatch with version=1
Put 951

// Segment Merge, version=2
Ref 954->951
// del after merge, version=3
Del 951

// Segment Logical Split, version=4
Ref 972->954
Ref 985->954
// del after logical split, version=5
Del 954

=====
After applying the WriteBatches above, we can describe the status in PageStorage as JSON format:

{
    "951": [
        {
            "type": "entry",
            "create_ver": 1,
            "being_ref_count": 3, // being ref by id 972 and 985, and itself
            "entry": "..some offset to blob file"
        },
        {
            "type": "delete",
            "delete_ver": 3,
        }
    ],
    "954": {
        "type": "ref",
        "ori_page_id": 951,
        "create_ver": 2,
        "delete_ver": 5,
    },
    "972": {
        "type": "ref",
        "ori_page_id": 951, // directly point to 951
        "create_ver": 4,
    },
    "985": {
        "type": "ref",
        "ori_page_id": 951, // directly point to 951
        "create_ver": 4,
    },
}

=====

// Segment Logical Split, version=6
Ref 998->985 <-- Crash

=====
When we try to apply this WriteBatch to PageStorage
First we will see the type of id=985 is ref, and will try to find the original entry in ori_page_id=951 with version=985.create_ver => 4,
but there is a "delete" with delete_ver=3, so an exception is thrown with message "Trying to add ref to non-exist page ..."

Actually, we should find the original entry in ori_page_id=951 with version=2, which is the create_version of the entry to be ref. But we missed to save the field.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-6.0 affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.2 component/storage severity/moderate type/bug The issue is confirmed as a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants