A test file generator that fills files with random bytes in a single directory.
File Contents
- Binary (default) from /dev/urandom. (Not compressible)
- -P Any printable characters from the ASCII set. Optional
- -D Only the digits 0 to 9. Optional
- -S Sparse file of nulls. Use only on Linux, Unix and WSL2. Do NOT use the sparse file option on MS Windows filesystems, WSL1, MSYS2, Git Bash or Cygwin - see below
Other Options
-f <file size> Mandatory, minimum 1 byte. File sizes must be given with a B, K, M or G suffix. Examples: 2KiB = 2K, 3MiB = 3M, 200GiB = 200G
-n <number of files> Number of files to create. See EXAMPLES.
-o <existing directory> Optional, create files in an existing directory. Default is your current directory.
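The size suffixes above can be illustrated with a small conversion helper. This is only a sketch under stated assumptions: size_to_bytes is a hypothetical name, not a function taken from BTFcreate, and the script's own parsing may differ.

```bash
# Hypothetical helper (not part of BTFcreate): convert a size such as 2K,
# 3M or 200G into bytes, using binary multiples (K = 1024 bytes).
size_to_bytes() {
  local spec="$1" number unit factor
  number=${spec%?}          # everything except the last character
  unit=${spec#"$number"}    # the last character (the suffix)
  case "$number" in
    # Reject empty, non-numeric, zero and leading-zero values (minimum is 1 byte).
    ''|0*|*[!0-9]*) echo "invalid size: $spec" >&2; return 1 ;;
  esac
  case "$unit" in
    B) factor=1 ;;
    K) factor=1024 ;;
    M) factor=$((1024 * 1024)) ;;
    G) factor=$((1024 * 1024 * 1024)) ;;
    *) echo "invalid size suffix in: $spec" >&2; return 1 ;;
  esac
  echo $((number * factor))
}
```

For example, size_to_bytes 2K prints 2048, while a value without a suffix (such as 5) is rejected.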
File Contents: Digits
-D n Where n is a number 1 to 10.
n selects how many digits are in the character pool:
- n=1 files will be filled with the character zero '0'
- n=2 files will be filled with 0 and 1
- n=3 files will be filled with 0,1 and 2
- n=7 files will be filled with the digits 0 to 6
- n=10 files will be filled with random digits 0 to 9
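One way such a digit pool could be drawn from /dev/urandom is sketched below. It is an assumption about the technique, not the script's actual code, and digit_fill is a hypothetical name. The sketch keeps only bytes that fall inside the pool and discards the rest.

```bash
# Hypothetical sketch (not part of BTFcreate): write <bytes> random digits
# taken from the first n digits of 0123456789 to standard output.
digit_fill() {
  local n="$1" bytes="$2" pool
  case "$n" in
    [1-9]|10) ;;
    *) echo "n must be 1 to 10" >&2; return 1 ;;
  esac
  pool=$(printf "%.${n}s" 0123456789)   # n=3 gives 012
  # Delete every random byte that is not in the pool, then stop after <bytes>.
  LC_ALL=C tr -dc "$pool" < /dev/urandom | head -c "$bytes"
}
```

In this sketch only about n out of every 256 random bytes survive the filter, so a smaller pool needs more input to fill the same file.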
File Contents: Printable ASCII characters
-P n Where n is a number 1 to 95 (the printable ASCII character set).
- n=1 file will be filled with repeats of the character 'A'
- n>=2 and n<=26 all characters will be from the lower-case Latin alphabet
- n>=27 characters will be from the 95 printable character set.
Examples:
- n = 10 will select all characters from "a" to "j"
- n = 4 will select all characters "a" to "d"
- n = 60 will select all characters from "!" to "\"
- n = 46 will select all characters from "!" to "N"
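The selection rules above can be sketched as a function that prints the pool for a given n. This is a hedged illustration: printable_pool is a hypothetical name, and the starting point "!" for n>=27 is inferred from the examples. How n=95 is handled (presumably by including the space character) is not stated, so the sketch simply stops at "~".

```bash
# Hypothetical sketch (not part of BTFcreate): print the character pool
# that -P n appears to select, based on the rules and examples above.
printable_pool() {
  local n="$1" pool="" code=33          # 33 is the ASCII code of "!"
  case "$n" in
    ''|*[!0-9]*) echo "n must be a number 1 to 95" >&2; return 1 ;;
  esac
  if [ "$n" -lt 1 ] || [ "$n" -gt 95 ]; then
    echo "n must be a number 1 to 95" >&2
    return 1
  fi
  if [ "$n" -eq 1 ]; then
    pool=A
  elif [ "$n" -le 26 ]; then
    pool=$(printf "%.${n}s" abcdefghijklmnopqrstuvwxyz)
  else
    # Assumption: n characters starting at "!", never past "~" (code 126).
    while [ "$code" -lt $((33 + n)) ] && [ "$code" -le 126 ]; do
      pool="$pool$(printf '%b' "\\0$(printf '%o' "$code")")"
      code=$((code + 1))
    done
  fi
  printf '%s\n' "$pool"
}
```

For example, printable_pool 4 prints abcd, and printable_pool 60 prints the 60 characters from "!" to "\".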
File Contents: Sparse File
-S Null-filled sparse files. Although their disk usage is minimal, their apparent size may be used to determine capacity usage. Not to be used on Windows filesystems.
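For readers unfamiliar with sparse files, one common way to create one on Linux is the truncate command from GNU coreutils. The sketch below is an assumption for illustration only: make_sparse is a hypothetical name, and BTFcreate may create its sparse files differently.

```bash
# Hypothetical sketch (not part of BTFcreate): create a null-filled sparse
# file of the given size. GNU truncate accepts K, M and G as binary suffixes.
make_sparse() {
  local path="$1" size="$2"
  # truncate would resize an existing file, so refuse to touch one.
  if [ -e "$path" ]; then
    echo "refusing to overwrite: $path" >&2
    return 1
  fi
  truncate -s "$size" "$path"
}
```

A call such as make_sparse ./sparse_test.bin 1M gives a file with an apparent size of 1MiB that occupies little or no disk space on filesystems that support sparse files.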
VALIDATE FILE CONTENTS
Validate all Files:
od -N <bytes> -Ax -t x1z <file name>
- Where <bytes> is the number of bytes to check from the beginning of the file.
- Non-printable characters will appear as dots (".")
- For sparse files omit -N <bytes> as it is not required.
Validate printable character distribution:
od -a <file name> | cut -b 9- | tr " " \\n | egrep -v "^$" | sort | uniq -c
Output
- Column 1 : Character count
- Column 2 : Character being counted. This column should only contain a single character; if not, the file contents are binary data.
Validate sparse files:
- ls -lsh Shows disk usage at start of output, apparent size is in its usual place.
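- du -B1 Shows disk usage in bytes.
- du -b --apparent-size Shows the apparent size in bytes, which is the size the user sees and the one used in filesystem capacity calculations. With GNU du, -b on its own already implies --apparent-size.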
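The two sizes can also be read side by side with GNU stat. The helper below is a sketch that assumes GNU coreutils; size_report is a hypothetical name. It prints the apparent size next to the space actually allocated on disk.

```bash
# Hypothetical sketch: report apparent size and allocated disk space in bytes.
# Assumes GNU stat (%s = size in bytes, %b = allocated blocks, %B = block size).
size_report() {
  local file="$1" apparent blocks block_size
  apparent=$(stat -c %s "$file")
  blocks=$(stat -c %b "$file")
  block_size=$(stat -c %B "$file")
  echo "apparent=$apparent allocated=$((blocks * block_size))"
}
```

For a sparse file the allocated figure should be far smaller than the apparent one. For an ordinary file it is usually equal to or slightly larger than the apparent size.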
NOTES
- Binary data may render storage and transmission data deduplication and compression ineffective.
- The smaller the range of printable characters used, the longer the script runs.
- For testing data deduplication, the most useful alphanumeric range is 1 to 20 characters, i.e. -P 1 to -P 20
- For testing compression, the most useful alphanumeric range is 1 to 40 characters, i.e. -P 1 to -P 40
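A quick way to see how compressible a generated file is, sketched with gzip as an assumed stand-in for whatever compression is under test. gzip_size is a hypothetical helper name.

```bash
# Hypothetical sketch: print the size in bytes of a file after gzip
# compression, without writing a compressed copy to disk.
gzip_size() {
  local file="$1" size
  size=$(gzip -c "$file" | wc -c)
  echo "$((size))"
}
```

Comparing this figure with the original file size for files made with a small -P value and for the default binary files shows the difference in compressibility described in the notes.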
EXAMPLES
Create fifty 5GiB files containing only the randomly distributed digits 0, 1, 2, 3 and 4
BTFcreate -D 5 -f 5G -n 50
Create a single 1MiB file containing these characters randomly distributed !"#$%&'()*+,-./0123456789:;<=>?@ABC
BTFcreate -P 35 -f 1M -n 1
Same as previous, but with files being generated in the test_files directory within the user's home directory.
BTFcreate -P 35 -f 1M -n 1 -o ~/test_files
Create 1,000 1KiB files containing these characters randomly distributed abcdefghij
BTFcreate -P 10 -f 1K -n 1000
WINDOWS SPARSE FILES - How to make:
On Windows, run the following as administrator:
fsutil File CreateNew <file name> <size in bytes>
fsutil Sparse SetFlag <file name>
fsutil Sparse SetRange <file name> 0 <size in bytes>
WSL2 Linux cannot run Windows commands and programs that require administrator privileges.