This homework introduces fsck.py
, a simple file system simulator. Some
familiarity with the vsfs.py
simulator (described in an earlier chapter on
file system implementation) would be useful, but is not necessary.
The file system the tool changes the on-disk state of the file system, and then leaves it to you, the user, to detect the problem, much like a file system consistency checker would do (e.g., fsck). The file system itself is based on a simplified VSFS, the very simple file system described in an earlier chapter. The tool first generates an on-disk file system; then, either randomly or through more specific controls, the tool changes the on-disk state of one aspect of the file system.
The challenge for you, the user, is to figure out which part of the file system's on-disk state was changed and hence has become inconsistent. In many cases, the problem should be readily detected; in others, perhaps less so.
Let's get into the details. Here is an example run of the tool:
prompt> ./fsck.py -S 1
...
Final state of file system:
inode bitmap 1100100010000010
inodes [d a:0 r:4] [f a:-1 r:1] [] [] [d a:8 r:2] [] [] []
[d a:6 r:2] [] [] [] [] [f a:15 r:1] [f a:12 r:1] []
data bitmap 1000001010001001
data [(.,0) (..,0) (g,8) (t,14) (w,4) (m,13)] [] [] [] [] []
[(.,8) (..,0)] [] [(.,4) (..,0) (p,1)] [] [] [] [z] [] [] [g]
Can you figure out how the file system was corrupted?
prompt>
In this case, we run the tool with flag -S
set to 1 (for reasons
that will be clear later). To understand the change to on-disk state,
and the possible inconsistency, first you have to understand the file
system format. It is (as promised!) very simple.
First, there is an inode bitmap, which marks whether each corresponding inode is allocated (1) or free (0); in this case, there are 16 inodes, with inodes 0, 1, 5, 9, and 14 marked as allocated.
Second, there are 16 inodes. Unallocated inodes are shown as empty brackets --
[]
-- whereas allocated inodes have contents such as [f a:15 r:1] above. Each
allocated inode has three fields. The first is the type of the file, either
'd' for a directory or 'f' for a regular file. The second is a single address
field 'a' which either points to a single data block (in this limited file
system, files/directories can only contain a single block) or has a -1
indicating the file is empty. The third is a reference count; for regular
files, this represents the number of times this file is linked into the file
system; for directories, it represents the number of directories within this
directory.
Third is the data bitmap. Much like the inode bitmap, it indicates whether each data block in the file system is allocated or free.
Finally, there are the data blocks themselves. For regular files, data block contents are arbitrary (but usually contain something simple, like a letter; in real life, these would just contain a block of file data), whereas directory contents are structured as a list of (name, inode number) pairs.
The root directory in this file system is always inode number 0; it is from this point that you can start in order to build knowledge of the entire file system and its contents.
This particular file system contains the following files and directories:
Directories: '/', '/g', '/w'
Files: '/t', '/m', '/w/p'
Can you see why? Spend some time, starting with the root directory, and see if you can figure it out. If not, well, this homework will be hard for you.
Now, let's get to the inconsistency part. This particular file system has a simple specific inconsistency within it. Can you see it?
Once you've spent some time on this, run the tool with the -c
flag to see if
you were right:
prompt> ./fsck.py -S 1 -c
Initial state of file system:
inode bitmap 1100100010000110
inodes [d a:0 r:4] [f a:-1 r:1] [] [] [d a:8 r:2] [] [] [] [d a:6 r:2]
[] [] [] [] [f a:15 r:1] [f a:12 r:1] []
data bitmap 1000001010001001
data [(.,0) (..,0) (g,8) (t,14) (w,4) (m,13)] [] [] [] [] []
[(.,8) (..,0)] [] [(.,4) (..,0) (p,1)] [] [] [] [z] [] [] [g]
CORRUPTION::INODE BITMAP corrupt bit 13
Final state of file system:
inode bitmap 1100100010000010
inodes [d a:0 r:4] [f a:-1 r:1] [] [] [d a:8 r:2] [] [] [] [d a:6 r:2]
[] [] [] [] [f a:15 r:1] [f a:12 r:1] []
data bitmap 1000001010001001
data [(.,0) (..,0) (g,8) (t,14) (w,4) (m,13)] [] [] [] [] []
[(.,8) (..,0)] [] [(.,4) (..,0) (p,1)] [] [] [] [z] [] [] [g]
prompt>
As you can see from the output, the inode bitmap was changed in bit 13,
marking that bit free when actually the inode is allocated. You can see the
inconsistency by looking at inode 13 itself, which looks like this:
[f a:15 r:1]
(a regular file, pointing to data block 15). You also know that
this file is in the root directory (/m). Thus, concluding that the inode
bitmap changed here is fairly straightforward.
Many other corruptions are possible (but the tool only introduces one state change at a time, to keep your life simple). Do the homework to find out more about them.
The other flags available for the tool are:
prompt> ./fsck.py -h
Options:
-h, --help show this help message and exit
-s SEED, --seed=SEED first random seed (for a filesystem)
-S SEEDCORRUPT, --seedCorrupt=SEEDCORRUPT
second random seed (for corruptions)
-i NUMINODES, --numInodes=NUMINODES
number of inodes in file system
-d NUMDATA, --numData=NUMDATA
number of data blocks in file system
-n NUMREQUESTS, --numRequests=NUMREQUESTS
number of requests to simulate
-p, --printFinal print the final set of files/dirs
-w WHICHCORRUPT, --whichCorrupt=WHICHCORRUPT
do a specific corruption
-c, --compute compute answers for me
-D, --dontCorrupt don't actually corrupt file system
The reason for two random seeds is worth describing. The first one, -s
,
creates different random file systems. The second one, -S
, inserts different
file system state changes (corruptions). Thus, you can generate one particular
random file system and then try a bunch of different random corruptions upon
it.
Some other flags just control the size of the file system (-i
and -d
), whereas
others determine how many file operations are run against the file system
(-n
). The -p
flag prints the contents of the (non-corrupted) file system; the
-w
flag allows you to specify a particular corruption (although this is mostly
for testing); the -D
flag turns off corruption to, allowing you to examine an
intact file system.