Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

power loss reboot triggered internal assert #343

Open
eastmoutain opened this issue Dec 2, 2019 · 6 comments
Open

power loss reboot triggered internal assert #343

eastmoutain opened this issue Dec 2, 2019 · 6 comments

Comments

@eastmoutain
Copy link

After random power loss and reboot, littlefs triggered internal assert, the function call trace is as below:

littlefs/littlefs/lfs.c:75 --> LFS_ASSERT(block < lfs->cfg->block_count);

littlefs/littlefs/lfs.c:2301 -> lfs_bd_read()

littlefs/littlefs/lfs.c:3786 -> lfs_ctz_traverse()

littlefs/littlefs/lfs.c:4093 -> lfs_fs_traverse()

littlefs/littlefs/lfs.c:1491 -> lfs_fs_size()

littlefs/littlefs/lfs.c:1812 -> lfs_dir_compact()

littlefs/littlefs/lfs.c:3145 -> lfs_dir_commit()

I added log, and found that block is 0x98fb7d37, it's an invalid block.

Littfs configuration:
block size: 4096 Bytes
block count: 4096
lookhead_size: 16 Bytes
rcache size: 256 Bytes
pcache size : 256 Bytes
cache size 256 Bytes

@geky
Copy link
Member

geky commented Dec 5, 2019

Hi @eastmoutain, thanks for creating an issue. This looks like the same as #295

@geky geky added needs investigation no idea what is wrong needs test all fixes need test coverage to prevent regression labels Dec 19, 2019
@eastmoutain
Copy link
Author

@geky Recently am focusing on this issue, and I have reproduced the case more than two times. May i know your email to provide you some clue. if it's not convenient, you can send email to me. My email is [email protected]

@eastmoutain
Copy link
Author

@geky is it right if one meta block contains two same tag id, one is file tag and the other is directory tag? as to my case, it's to read directory meta pair, however it gets 8 bytes followed by inline struct.

@eastmoutain
Copy link
Author

eastmoutain commented Dec 24, 2019

@geky
when attempted to open update directory, it met lfs assert, lfs_bd_read: Assertion `block < lfs->cfg->block_count' on lfs.c:75.

I parsed the corrupted image with lfs-tool. I had updated littlefs souce code of lfs-tool to littlefs v2.1.3 as it used one old version, and added some debug log to lfs-tool/src/lfs/lfs.c.

The first picture shows that lfs_dir_find() -> lfs_dir_fetchmatch() firstly found the name tag of "update" directory is 0x201c06, the tag id is 0x7, however after involving LFS_TYPE_CREATE tag type, finally the returned tag for "update" directory is 0x202006, the tag id was changed to 0x8.

Then lfs_dir_get() tried to get the directory "update" meta block pair, the argument ftag of lfs_dir_get() is generated by LFS_MKTAG(LFS_TYPE_STRUCT, lfs_tag_id(tag), 8). lfs_dir_get()->lfs_dir_getslice() to match the directory "update" DIRSTRUCT tag backwards, unfortunately it mismatched tag 0x2010202a whose type is INLINESTRUCT. experiment shows that force filterring out tag 0x2010202a littlefs can open and read directory "update".

image
image
parse.log

  • Here I noticed there are two identical tag id in the same dir, tag id 0x7 and tag id 0x8, I am not sure whethre it's normal case.
  • Am not very clear about LFS_TYPE_SPLICE, why should it add the chunk of LFS_TYPE_CREATE tag if it meets LFS_TYPE_CREATE to and sub the chunk from tag when traverse the dir blk, could you please give me some explaination of the design, thank you.

@eastmoutain
Copy link
Author

eastmoutain commented Dec 29, 2019

@geky I have found the way to reproduce the assertion on PC, maybe it caused by lfs_rename().

How to reproduce:

  1. Port the code to lfs-tool, and update littlefs source code version to v2.1.3.
  2. run cmd lfs-tool -c -d dir -i dir.disk
  3. the assertion will appear soon.
void test2()
{
    int fd;
    int upg_check_fd;

    fd = vfs_lfs->open(vfs_lfs, "/test.1", O_WRONLY |O_CREAT | O_TRUNC);
    vfs_lfs->close(vfs_lfs, fd);

    fd = vfs_lfs->open(vfs_lfs, "/test.2", O_WRONLY |O_CREAT | O_TRUNC);
    vfs_lfs->close(vfs_lfs, fd);

    vfs_lfs->mkdir(vfs_lfs, "/update");

    upg_check_fd = vfs_lfs->open(vfs_lfs, "/upg_check", O_WRONLY |O_CREAT | O_TRUNC);
    vfs_lfs->write(vfs_lfs, upg_check_fd, "hello upg_check\n", sizeof("hello upg_check\n"));

    vfs_lfs->remove(vfs_lfs, "/test.2");

    vfs_lfs->rename(vfs_lfs, "/test.1", "/test.3");

    vfs_lfs->close(vfs_lfs, upg_check_fd);

    fd = vfs_lfs->open(vfs_lfs, "/update/ppp", O_WRONLY |O_CREAT | O_TRUNC);
    vfs_lfs->close(vfs_lfs, fd);
}

if rename "test.1" to another directory, it runs okay. I noticed your comments in lfs_rename, if rename the file in the same directory and the new id is less than the old id, it's a bit messy.

The onearth reason why rename results in assertion if rename file across two different diretories, as far as I can understand, as the test case above shows, is the opened file's id is changed to wrong id during rename. In my case, rename test.1 to test.3 changed upg_check file id, closing upg_check flushes the cached data of upg_check, I checked the image and found the last commit tag of upg_check is an inline struct. Finally, open /update/ppp matched the last committed inline struct tag as /update directory DIRSTRUCT.

I analysed the source code in function lfs_dir_commit(), for renaming opertion, there are two tags are committed at the same time, the create tag for the new file and the delete tag for the old file. If rename the file in the same directory, the current logic ignored create tag, just handled delete tag.

            if (d->id == lfs_tag_id(deletetag)) {
                d->m.pair[0] = LFS_BLOCK_NULL;
                d->m.pair[1] = LFS_BLOCK_NULL;
            } else if (d->id > lfs_tag_id(deletetag)) {
                d->id -= 1;
                if (d->type == LFS_TYPE_DIR) {
                    ((lfs_dir_t*)d)->pos -= 1;
                }
            } else if (&d->m != dir && d->id >= lfs_tag_id(createtag)) {
                d->id += 1;
                if (d->type == LFS_TYPE_DIR) {
                    ((lfs_dir_t*)d)->pos += 1;
                }
            }

I modified the code above as follow, then it runs okay. But am still not very sure my modification.

            if (d->id == lfs_tag_id(deletetag)) {
                d->m.pair[0] = LFS_BLOCK_NULL;
                d->m.pair[1] = LFS_BLOCK_NULL;
            } else if (d->id > lfs_tag_id(deletetag)) {
                d->id -= 1;
                if (d->type == LFS_TYPE_DIR) {
                    ((lfs_dir_t*)d)->pos -= 1;
                }
            }

           if (&d->m != dir && d->id >= lfs_tag_id(createtag)) {
                d->id += 1;
                if (d->type == LFS_TYPE_DIR) {
                    ((lfs_dir_t*)d)->pos += 1;
                }
            }

@geky
Copy link
Member

geky commented Jan 27, 2020

Hi @eastmoutain, sorry about the absence.

Good find here. This was definitely the root of the problem. You're right that is was caused by overlapping tags in lfs_dir_commit.

Also many thanks for isolating the bug! I was able to reproduce the bug and have added the test case to this patch: f4b6a6b (in #372).

Splitting out the if statement was a good starting point, but unfortunately there were a few other issues. The order of the if statements matters, so depending on what order the create/delete tags are passed to lfs_rename, it can still break.

I ended up tearing out most of the handling for the tags and rewriting it to hopefully be more robust.

This should be fixed in #372 but let me know if I'm wrong. Thanks for finding the issue!

@geky geky added fixed? v2.2 and removed needs investigation no idea what is wrong needs test all fixes need test coverage to prevent regression labels Jan 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants