Releases: fcorbelli/zpaqfranz
Windows 32/64 binary, 64-bit HW accelerated
-home switch for add
It is possible to archive different folders into different .zpaq files; this is useful for splitting individual users (inside /home or c:\users) into separate .zpaq archives
zpaqfranz a z:\utenti.zpaq c:\users -home
zpaqfranz a /temp/test1 /home -home -not franco
zpaqfranz a /temp/test2 /home -home -only "*SARA" -only "*NERI"
Support of selections in the r (robocopy) command
Now you can select files just as in the add command
zpaqfranz r c:\d0 z:\dest -kill -minsize 10GB
zpaqfranz r c:\d0 z:\dest -kill -only *.e01 -only *.zip
Fix for Mac PowerPC
Yes, someone does compile zpaqfranz on PPC
Improved compatibility with ancient compilers on Slackware
Slack seems to run very old gccs
Workaround for gcc's buggy versions
Newer gcc is bugged... too
A bit of refactoring
Slower but cleaner
Replaced $ with %, because Linux scripts do not like $ at all
- %hour
- %min
- %sec
- %weekday
- %year
- %month
- %day
- %week
- %timestamp
- %datetime
- %date
- %time
Example: zpaqfranz r c:\d0 z:\backup%day -kill
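The placeholders can be combined; a sketch of a per-date destination (paths hypothetical):
zpaqfranz r c:\d0 z:\backup_%year-%month-%day -kill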
Examples for -orderby switch (in add)
zpaqfranz a z:\test.txt c:\dropbox -orderby ext;name
zpaqfranz a z:\test.txt c:\dropbox -orderby size -desc
filecopy with variable buffer size (-buffer)
Just for testing on different platforms
The versum command now also takes lines starting with |
versum will now process this kind of "strange" file:
|SHA-256: 8A9C2486E9E9DAC489FC5748CF400359BB6DD5F10276429EED5F3E647DA25B0D [ 522.192.336.767] |pippo.zip
|SHA-256: 000064E741776F57D3961170A3C03679B45F37BCB1DD1A63FE5D288FD5374D94 [ 110.637] |pluto43/435396f46adc648df0a5f5c13667ee3cb9ea4eca
Disclaimer after help: USE THE DOUBLE QUOTES, LUKE!
Shown after each help screen:
************ REMEMBER TO USE DOUBLE QUOTES! ************
*** -not *.cpp is bad, -not "*.cpp" is good ***
*** test_???.zpaq is bad, "test_???.zpaq" is good ***
Windows 32/64 binary, HW accelerated, ESXi, Linux, Free/Open BSD
Fix for older Windows (<10) console
"I tested this version but zpaqfranz does not copy any file or folder":
Now zpaqfranz should autodetect (in the dirtiest way) the Windows version
A bit of internal fixing for future Debian package
Updated statically-linked executables
Windows 32/64 binary, 64-bit HW accelerated
Improved robocopy command
Fixed the robocopy command not preserving the timestamps of the copied folders
Please note: the -pakka switch will enable a special mode, very good for backups over LAN (SMB/NFS servers etc)
The default behavior of robocopy is to "touch" each target folder, whether date and attributes have changed or not
This is the best setting for local 1:1 use.
Using -pakka the destination data is read and, if different from the source data, changed
It is much faster on LAN and with different destinations (using -ssd)
It depends on the individual situation, I suggest trying both the default mode and the -pakka mode to find the fastest one
Robocopy shows more information during operations (progress % on size - previously it was only % on number of files - and also ETA)
As well as folder touching speed (so you can tell if -pakka is faster or not)
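For example, a hypothetical LAN backup copy (paths made up, switches as described above):
zpaqfranz r c:\data \\nas\backup\data -kill -pakka -ssd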
Windows 32/64 binary, 64-bit HW accelerated
New command fzf
Outputs filenames in a "friendly" manner for the fzf command
Some examples here
zpaqfranz fzf 1.zpaq
zpaqfranz fzf 1.zpaq -all | fzf
For command r (robocopy)
Faster. On modern machines (Windows=>Windows) it is just as fast as robocopy in copying large files over a local network. In mixed cases (Windows=>Linux) the speed is increased too
On *nix systems, if the executable is named "robocopy", the r (robocopy) command is run
robocopy /tmp/zp /tmp/backup1 /tmp/backup2
-buffer switch
Select a different buffer size for the copy, such as -buffer 512KB or whatever (examples below)
-big switch (on Windows)
Uses the Windows API function to copy files. It is faster than the default one when writing to SMB shares and with big files. It depends on the circumstances; it is not possible to predict which version will be faster, it really depends on too many parameters
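Hypothetical examples of both switches (paths made up):
zpaqfranz r c:\d0 z:\dest -kill -buffer 4MB
zpaqfranz r c:\d0 \\server\share\dest -kill -big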
A lot of profiling info with the -verbose switch
Just nerd-stuff
In command l (list) and x (extract)
New range switch "last versions"
-range ::X Last X versions
You can extract (or list) files from the last X versions, IF the files are present in the latest X versions
Example: with 10 versions (1 to 10), asking to extract or list with -all -range ::3 (the last 3) is equivalent to -all -range 8:10
If nothing is present in versions [8-10], nothing will be listed or extracted, and vice versa
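For example (archive name hypothetical):
zpaqfranz l z:\backup.zpaq -all -range ::3
zpaqfranz x z:\backup.zpaq -all -range ::3 -to z:\restored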
Various
On *nix systems, try to heuristically get the free space of a non-existent folder
Reduces the need to use the -space switch to force execution on non-existing folders
If you try to allocate memory on non-Intel systems without using -DNOJIT, warn that you might want to do so (e.g., Mac M1/M2)
Windows 32/64 binary, 64-bit HW accelerated
Bug fixing
fwrite() on 58.5 broke some commands
More development on -fasttxt (aka: automagically computing the CRC-32 of the archive)
Works on multipart / indexed multipart archive
Not yet 100% tested
The -verify switch will run a test-against-the-filesystem [good for debugging]
Encrypted-indexed
zpaqfranz a z:\test_??? c:\zpaqfranz\*.exe -index z:\indez.zpaq -key pippo -fasttxt
zpaqfranz a z:\test_??? c:\zpaqfranz\*.cpp -index z:\indez.zpaq -key pippo -fasttxt -verify
zpaqfranz a z:\test_??? c:\zpaqfranz\*.txt -index z:\indez.zpaq -key pippo -fasttxt -verify
zpaqfranz versum z:\test*.zpaq -fasttxt
Backup
zpaqfranz backup z:\baz c:\zpaqfranz\*.cpp -fasttxt
zpaqfranz backup z:\baz c:\zpaqfranz\*.exe -fasttxt
zpaqfranz backup z:\baz c:\zpaqfranz\*.txt -fasttxt
zpaqfranz versum z:\baz*.zpaq -fasttxt
With a couple more releases I should be ready to start the actual implementation of zpaqfranz-over-TCP.
Basically only the index will be stored locally, not the data, which will be sent to the zpaqfranz server in the cloud.
Really complicated, with all the special cases provided by zpaq, but I am starting to see light at the end of the tunnel
The system will be 100% ransomware-insensitive [of course, if the server is not compromised!], allowing recovery (at least in intention) in any situation, even the most catastrophic
Basically I am operating a bottom-up plus divide-et-impera approach. Work in progress...
Windows 32/64 binary, 64-bit HW accelerated
Fixed a small but nasty bug in t for big files
Example in this thread
During refactoring, sorting was accidentally limited to 10 chars instead of 40. It doesn't actually invalidate anything, but it is still unpleasant
Automagically add files
Every promise is a debt
zpaqfranz a z:\58_5 -key pippo => if a ./58_5 file|folder does exist, automagically add it to the archive
New fasttxt switch / format
zpaqfranz can now automagically calculate the CRC-32 of the archive (without, of course, re-reading it from the filesystem), writing it down in the archivename_crc32.txt file
C:\zpaqfranz>zpaqfranz a z:\1.zpaq *.cpp -fasttxt
zpaqfranz v58.5o-JIT-GUI-L,HW SHA1/2,SFX64 v55.1,(2023-07-12)
franz:-fasttxt -hw
Creating z:/1.zpaq at offset 0 + 0
Add 2023-07-12 14:00:13 27 89.286.021 ( 85.15 MB) 32T (0 dirs)
27 +added, 0 -removed.
0 + (89.286.021 -> 16.670.812 -> 2.069.455) = 2.069.455 @ 57.38 MB/s
62655: CRC-32 EXPECTED E948770C
62682: Updating fasttxt z:/1_crc32.txt :OK
1.500 seconds (000:00:01) (all OK)
Getting something like this
C:\zpaqfranz>type z:\1_crc32.txt
$zpaqfranz fasttxt|1|2023-07-12 14:00:14|z:/1.zpaq
E948770C 8293084830611972 0 [2.069.455] (0)
In this example the first field (E948770C) is the (expected) CRC-32 of the archive.
The second, 8293084830611972, is the computed "quick" hash; the third (0) is, in this case, the initial CRC-32; then the file sizes
"Quick hash" is the heuristic hash introduced some release earlier
Using the versum command, with -fasttxt, it is possible to check very quickly
C:\zpaqfranz>zpaqfranz versum z:\1.zpaq -fasttxt
zpaqfranz v58.5o-JIT-GUI-L,HW SHA1/2,SFX64 v55.1,(2023-07-12)
franz:versum | - command
franz:-fasttxt -hw
66764: Test CRC-32 of .zpaq against _crc32.txt
87163: Bytes to be checked 2.069.455 (1.97 MB) in files 1
66323: OK CRC-32: z:/1.zpaq
====================================================================
66356: TOTAL 1
66357: OK 1
66358: WARN 0
66359: ERROR 0
0.016 seconds (00:00:00) (all OK)
with -quick in (almost) no time
C:\zpaqfranz>zpaqfranz versum z:\1.zpaq -fasttxt -quick
zpaqfranz v58.5o-JIT-GUI-L,HW SHA1/2,SFX64 v55.1,(2023-07-12)
franz:versum | - command
franz:-quick -fasttxt -hw
66764: Test QUICK of .zpaq against _crc32.txt
87163: Bytes to be checked 2.069.455 (1.97 MB) in files 1
66323: OK QUICK: z:/1.zpaq
====================================================================
66356: TOTAL 1
66357: OK 1
66358: WARN 0
66359: ERROR 0
0.031 seconds (00:00:00) (all OK)
You can even run with *.zpaq (on Linux "*.zpaq")
zpaqfranz versum *.zpaq -fasttxt
Why this "thing", so much like -checktxt?
Because the CRC-32 calculation is performed during the writing phase to the disk, so it has minimal impact in terms of time and CPU, and is ONLY performed on the added part
Let's take a concrete example, otherwise it is difficult to understand the incredible usefulness (in certain scenarios, of course)
Suppose you make a backup with a certain tool (e.g. 7z, rar, tar) of a certain folder.
Suppose the archive is 500GB in size and resides (as normal) on a slow device, e.g. a NAS with magnetic disks, used by many others
Suppose you want to transfer it to another device (as normal), e.g. with rsync.
This will require reading all 500GB (locally, maybe painfully slow), calculating the relevant checksums (for rsync they are basically md5, high CPU usage), remotely sending all 500GB (=saturating all bandwidth), remotely calculating 500GB (=high I/O and CPU) of md5 hashes, and comparing them.
Now you are paranoid: your archive is full of precious data, therefore you launch a local CRC-32 (for the .7z, rar, tar...) AND a remote CRC-32, just to be sure
So far, so good: zpaqfranz pays the same "cost" (for the FIRST run)
Backups, however, are typically repeated, say daily or even more often (at night, as a typical case)
On the 2nd run, with tar, 7z, rar etc, you will be in the exact same situation
Suppose the new archive is 501GB (in the source folder 2GB changed)
Creating (aka: writing) a 501GB giant file, reading everything back, calculating md5, calculating (remotely, by rsync) 500GB and and and... hours locally, hours remotely, a LOT of local I/O, a LOT of CPU
With tar, 7z, rar and rsync...
- local: Read 2GB
- local: Write 1GB
- local: MD5 of 501GB
- local: Send ~1GB
- remote: MD5 of 500GB
- remote: Write of 1GB
With zpaqfranz 58.4 and checktxt...
- local: Read 2GB
- local: Write 1GB
- local: MD5 of 501GB
- local: Send 1GB
- remote: Write of 1GB
- remote: MD5 of 501GB
With zpaqfranz 58.5 and fasttxt...
- local: Read 2GB
- local: Write 1GB
- local: Send 1GB
- remote: Write of 1GB
- remote: CRC-32 of 501GB
In a future release the last step will become "CRC-32 of 1GB"
Real-world Windows example
Therefore, here is a little (!) Windows batch file
Suppose you want to backup to a remote server (a Linux box) "something", some Windows' data, using a local encryption password
Since you are lazy, you want not only the local copy to be verified, but also the remote one (its CRC-32 compared with the local one), and you want a different e-mail depending on the verification result (error or not), BUT you DO NOT WANT TO SEND THE PASSWORD TO THE REMOTE SERVER
Since you use an FTTH connection you really want to send the minimum amount of information changed, and you do NOT want to run rsync on huge files (hundreds of GB) that can take hours
We have a key-based authentication (for ssh, then rsync-over-ssh)
First step: make the archive, in this example into k:\franco\test\zpaqfranz_pippo.zpaq
Of the two folders c:\zpaqfranz c:\stor
with password (key) pippo
support for paths longer than 255 chars (-longpath)
using CRC-32 for late cloud test (-fasttxt)
no ETA (this is a batch file after all, who cares: -noeta)
and we want a BIG confirmation (-big) easier to spot on e-mails
@echo off
date /t >c:\stor\result.txt
time /t >>c:\stor\result.txt
c:\stor\bin\zpaqfranz a k:\franco\test\zpaqfranz_pippo.zpaq c:\zpaqfranz c:\stor -longpath -key pippo -fasttxt -noeta -big >>c:\stor\result.txt
Now we want to list all the versions, just to make sure the update is done (few things are worse than a backup update that does not update anything)
c:\stor\bin\zpaqfranz i k:\franco\test\zpaqfranz_pippo.zpaq -key pippo -noeta >>c:\stor\result.txt
Now we want to (locally) test the archive.
Please note: locally. The password "pippo" is NOT sent over internet
c:\stor\bin\zpaqfranz t k:\franco\test\zpaqfranz_pippo.zpaq -key pippo -noeta -big >>c:\stor\result.txt
OK, we do the same thing for a second archive file (just an example), k:\franco\test\nz_pippo.zpaq
c:\stor\bin\zpaqfranz a k:\franco\test\nz_pippo.zpaq c:\nz -longpath -key pippo -fasttxt -big >>c:\stor\result.txt
c:\stor\bin\zpaqfranz i k:\franco\test\nz_pippo.zpaq -key pippo -noeta >>c:\stor\result.txt
c:\stor\bin\zpaqfranz t k:\franco\test\nz_pippo.zpaq -key pippo -noeta -big >>c:\stor\result.txt
Now we upload everything with --append
Only the data changed since the last run will be sent over rsync (on ssh) to the remote Linux box
This will usually take minutes
c:\stor\bin\rsync -e "c:\stor\bin\ssh.exe -p 22 -i c:\stor\bin\thekey" -I -r --append --partial --progress --chmod=a=rwx,Da+x /k/franco/test/ [email protected]:/home/theuser/copie/test/ >>c:\stor\result.txt
Now we enforce the upload of the *.txt files (forcing to "refresh" the *_crc32.txt) with --checksum
c:\stor\bin\rsync -e "c:\stor\bin\ssh.exe -p 22 -i c:\stor\bin\thekey" -I -r --include="*.txt" --exclude="*" --checksum --chmod=a=rwx,Da+x /k/franco/test/ [email protected]:/home/theuser/copie/test/ >>c:\stor\result.txt
Now we get the size of the /home/theuser folder, and the free space, with the command s
BEWARE: you may need something like /usr/local/bin/zpaqfranz, it depends on PATH
c:\stor\bin\ssh -p22 -i c:\stor\bin\thekey [email protected] zpaqfranz s /home/theuser >>c:\stor\result.txt
Run some other remote commands, for example ls everything (zpool status, df -h, whatever, just an example)
echo --------- >>c:\stor\result.txt
c:\stor\bin\ssh -p22 -i c:\stor\bin\thekey [email protected] ls -l '/home/theuser/copie/test/*' >>c:\stor\result.txt
And now remotely test (by CRC-32) the uploaded *.zpaq, with the _crc32.txt, NO PASSWORD sent
c:\stor\bin\ssh -p22 -i c:\stor\bin\thekey [email protected] zpaqfranz versum '/home/theuser/copie/test/*.zpaq' -fasttxt -noeta -big >>c:\stor\result.txt
Now we'll do a very dirty trick, counting the OKs in the output log with grep
In this example it should be 5
Beware: you need the very latest zpaqfranz here (58.5m+)
We make two of them, one for the body, one for the attachment of the email
echo ==================================== >>c:\stor\result.txt
echo ============ COUNT OK =========== >>c:\stor\result.txt
echo ==================================== >>c:\stor\result.txt
echo 5 >c:\stor\countok.txt
echo 5 >c:\stor\countbody.txt
c:\stor\bin\egrep "# # ###!" c:\stor\result.txt -c >>c:\stor\countok.txt
c:\stor\bin\egrep "# # ###!" c:\stor\result.txt -c >>c:\stor\countbody.txt
c:\stor\bin\zpaqfranz last2 c:\stor\countok.txt -big >>c:\stor\result.txt
Pack the report with 7z (reports can become very BIG in case of errors)
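Something like this (a sketch using the standard 7z CLI, paths following the example above):
c:\stor\bin\7z.exe a c:\stor\report.7z c:\stor\result.txt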
...
Windows 32/64 binary, HW accelerated, ESXi, Linux, Free/Open BSD
New command consolidatebackup
Convert multiple .zpaq chunks into one backup, or convert a single .zpaq to the new backup format
Convert archive to backup:
zpaqfranz consolidatebackup z:\foo.zpaq -to k:\newbackup -key pippo
New switch -checktxt for command versum
Compare the MD5s with the .zpaq(s), taken from
- filename.md5
- filename_md5.txt
zpaqfranz a prova.zpaq c:\dropbox -checktxt
zpaqfranz versum "*.zpaq" -checktxt
Cross-check of rsync-transferred archives
H:\backup\abc\abc>zpaqfranz versum *.zpaq -checktxt
zpaqfranz v58.4s-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-06-23)
franz:versum | - command
franz:-checktxt -hw
66265: Test MD5 hashes of .zpaq against _md5.txt
66136: Searching for jolly archive(s) in <<*.zpaq>> for extension <<zpaq>>
66288: Bytes to be checked 72.114.708.571 (67.16 GB) in files 4
66323: OK: nas_email.zpaq
66323: OK: nas_gestione.zpaq
66323: OK: nas_nextcloud.zpaq
66323: OK: nextvm.zpaq
===========================================
66356: Total couples 4
66357: OK 4
66358: WARN 0
66359: ERROR 0
320.969 seconds (000:05:20) (all OK)
No more "access denied" error for System Volume Information (on Windows)
Do not allow multiple instances of a running backup
Tries to prevent corruption of backups launched, for example, from a crontab
New switch --backupxxh3
Use XXH3 instead of MD5 in backups. MD5 is good, but XXH3 is faster
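For example (archive and folder hypothetical):
zpaqfranz backup z:\prova.zpaq c:\data --backupxxh3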
Bug fix
Various
Different update output (projected size)
When adding data, new info is shown
zpaqfranz a z:\pizza c:\zpaqfranz
zpaqfranz v58.4s-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-06-23)
franz:-hw
Creating z:/pizza.zpaq at offset 0 + 0
Add 2023-06-23 17:51:31 3.192 1.603.682.848 ( 1.49 GB) 32T (234 dirs)
Long filenames (>255) 1 *** WARNING *** (-fix255)
55.40% 00:00:03 ( 847.31 MB)->( 84.06 MB)=>( 151.73 MB) 211.83 MB/sec
Look carefully: the 1.49GB will be stored (linear projection) in 151.73MB
Support for wildcards (ex. *.zpaq) on *nix
The handling of wildcards differs between Windows and *nix. Basically, in the second case the expansion is almost always done at the shell level. Now there are specific functions that, even on *nix, enumerate files of the type *.zpaq. This is used, clearly, for commands such as multiple tests
zpaqfranz t "./*.zpaq"
BEWARE OF DOUBLE QUOTES!
Windows 32/64 binary, 64-bit HW accelerated
Some bug fixing
Catching Control-C is not so easy or painless
Some kludges in -longpath
The mighty -longpath switch, on Windows, is for... paths. Therefore it should not be used with... files, or wildcards. I realized I didn't spell it out explicitly: by "path" I meant... a "path"
Now this should be OK...
C:\Users\utente>zpaqfranz a z:\ok.zpaq * -longpath
C:\Users\utente>zpaqfranz a z:\ok.zpaq *.* -longpath
C:\Users\utente>zpaqfranz a z:\ok.zpaq c: -longpath
C:\Users\utente>zpaqfranz a z:\ok.zpaq c:* -longpath
C:\Users\utente>zpaqfranz a z:\ok.zpaq c:*.* -longpath
C:\Users\utente>zpaqfranz a z:\ok.zpaq c:\users\utente -longpath
C:\Users\utente>zpaqfranz a z:\ok.zpaq c:\users\utente\ -longpath
C:\Users\utente>zpaqfranz a z:\ok.zpaq c:\users\utente\* -longpath
C:\Users\utente>zpaqfranz a z:\ok.zpaq c:\users\utente\*.* -longpath
With an explicit fail otherwise
C:\Users\utente>zpaqfranz a z:\ok.zpaq *.txt -longpath
zpaqfranz v58.3c-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-05-08)
franz:-longpath
38992: INFO: getting Windows' long filenames
59854: -longpath does not work with *, select A PATH!
0.015 seconds (00:00:00) (with warnings)
Work-in-progress: "smarter" zfsproxbackup
This version includes a "smarter" (so to speak, of course) parser for searching the paths of zfs-stored virtual machines (in files, NOT on zfs block devices) on Proxmox. Far from perfection, in fact. Just an improvement.
As someone might guess I'm increasing my cloud server fleet :)
Next week I should get a rather big one (~16TB) for further development, stay tuned if you are a "proxxymoxxy"
Windows 32/64 binary, 64-bit HW accelerated
zpaqfranz now...
- "hopefully" intercept control-c to delete empty 0 bytes long chunks
- "hopefully" automagically delete 0 bytes long chunks before run
- "hopefully" intercept control-c to delete 0 bytes long archives
- get a better scanning... update (every 1 sec)
New hasher QUICK (just a fake hash!)
zpaqfranz sum j:\ -quick -summary -ssd
This is a "fake" hash, or rather a similarity estimator.
For files smaller than 64KB it computes a full xxhash64; for larger ones it takes the xxhash64 of 16KB (head), 16KB (middle), 16KB (tail).
The use, as can be understood (!), is twofold
1) Rapid estimation of file-level duplication of very large amounts of data
Using "exact" systems, i.e., calculating the hashes of each individual file to search for duplicates, is (still) very slow and expensive
"Quick" hashing, of course, does not guarantee against "wrong collision" at all (this happens even for small amounts of data)
The effect is to depend more on the number of files than on their size, running @ 50GB/s or even more, much more.
Sometimes you want to quickly "understand" if a new file server can benefit from de-duplication
2) fast check for backups
New command backup
As everyone knows (or maybe not) my very first contribution to zpaq was a rudimentary implementation of multipart archives, then merged by Mahoney (with his usual high skill).
Unfortunately, however, zpaq is more an archiver rather than a backup system: there are no realistic ways to check the integrity of multipart archives.
There are critical cases where you want to do cloud backups on systems that do NOT allow the --append of rsync (OK, rclone and robocopy, I'm talking about you)
Similarly, computing hashes on inexpensive VPS cloud systems, usually with very slow disks, is difficult, already at sizes around ~50GB
This new release creates a text-based index file that keeps the list of multiparts, their size, their MD5 and their quick hash
Multipart backup with zpaqfranz
zpaqfranz backup z:\prova.zpaq *.cpp
Will automagically create
- a multipart archive starting from prova_00000001.zpaq, prova_00000002.zpaq, prova_00000003.zpaq...
- a textfile index prova_00000000_backup.txt
- a binary index prova_00000000_backup.index
Why? A big explanation ("spiegone") is coming...
When you use "?" inside the filename, you will get a multipart archive
zpaq a z:\pippo_???????.zpaq *.txt
Every new version, in zpaq, is just appended to the archive, but in this case the file is "split" into "pieces".
This is almost perfect for a rclone / rsync (without --append) / robocopy, whatever, to send the minimum amount of data.
So far, so good.
BUT
zpaq does not handle very well
- the zero length: if you press control-C during compression, a 0-byte long pippo_00000XX.zpaq is (can be) made
- the "hole" (a missing "piece": pippo001, pippo002, pippo007, pippo008...)
- mixing different backups. You can replace one piece of a zpaq multipart archive with another, and zpaq will joyfully consider it, without noticing the error (!). Since each session is "self-sufficient", zpaq not only does not warn the user, but in the case of encryption (i.e., with -key) nasty things happen.
- cannot really (quickly) check the archive for missing parts: if a "piece" is lost, it is possible that everything (from that version to the last) is lost too. Even more, if you hold data from third-party clients, for testing an encrypted archive you need the password, which you simply don't have. And 99.9 percent of backups are encrypted, even those on LAN-connected NASes.
- speed. If you have a backup split into 10.000 "pieces", with zpaq you really cannot say if everything is OK unless you run a (lengthy) full-scale test, which can take hours (e.g. virtual machine disks)
Therefore...
New command testbackup
zpaqfranz testbackup z:\prova.zpaq
This command does a lot of different things, with the optional switches
- -verify enforce a full MD5 check
- -ssd for multithreaded run (on solid state)
- -verbose show infos
- -range from:to to check only "some" pieces
- -to where-the-zpaq-pieces-are
- -paranoid
WHAT?
The answer is: how to quickly test remote "cloud" backups. Usually you will
- zpaqfranz to a local drive
- robocopy / rsync / zpaqfranz r to a "remote" location
- run a remote script (to check locally, on the cloud server) || download the remote file locally, then check back
The last point is the key: getting a smaller file (the last multipart) makes everything much faster.
You can md5sum the "remote" file, comparing against the stored MD5, that's it
Currently (before 58.2) you need to do a full hash of the entire archive (which can become quite big). Not a big deal for a full-scale Debian or FreeBSD server.
I hope this is clear (?), I'll post a full real-world wiki example here
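Meanwhile, a sketch of the idea (part name, key and host hypothetical; the stored MD5s live in the _backup.txt index):
zpaqfranz last z:\prova_????????
c:\stor\bin\ssh -i c:\stor\bin\thekey [email protected] md5sum /home/theuser/copie/test/prova_00000042.zpaq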
A few examples, better than a thousand words
zpaqfranz testbackup z:\prova
Use the "quick hash" to check if all the pieces are the exact size, and "seems" to be filled with the right data. Almost instantaneous
zpaqfranz testbackup z:\prova -verify
Check all pieces with MD5. Now if everything is OK you are almost sure. In this case the files are expected in the same position as at creation
zpaqfranz testbackup z:\prova -paranoid
Compare the binary index vs the zpaq parts. If the data match perfectly you can be confident. For encrypted volumes the password is needed via -key
zpaqfranz testbackup z:\prova -verify -ssd -to z:\restored
Test MD5 (in multithread mode), searching the .zpaqs inside z:\restored
zpaqfranz testbackup z:\prova -range 10: -verify
Check from chunk 10 until the last (range examples: -range 3:5, -range :3, -range 10:)
New command last
This will return the last part name, usually for scripting
zpaqfranz last z:\prova_????????
New command last2
Compare the last 2 rows of a textfile, assumed to contain hashes. As you can guess, it facilitates, in scripted backup processing, the comparison of remote hashes with local ones. Refer to the example wiki, I will put up some working scripts.
zpaqfranz last2 c:\stor\confronto.txt -big
New sum switches
To get md5sum-like output, you can use a barrage of switches
zpaqfranz sum *.txt -md5 -pakka -noeta -stdout -nosort
Do not forget -ssd for non-spinning drives
FAQ
Is this a panacea?
Of course NOT
Personally, I don't like splitting backups into many different parts at all; the risk of one being lost, garbled or corrupted is high
However, in certain cases, there is no alternative. I do not mention one of the most well-known Internet service providers, to avoid publicity (after all... they do not pay me :)
Better "a" or "backup"?
I use both of them
I am thinking of an evolution of multipart with error correction (not detect, correction), but the level of priority is modest
Why the ancient MD5?
Until now zpaqfranz used XXH3 for this kind of detection (-checktxt)
But, sometimes, you must choose the fastest among "usual" ones (spoiler: some cheap cloud vendors)
Windows 32/64 binary, HW accelerated, Linux, FreeBSD
This is a brand new branch, full of bugs, ehm "features" :)
HW accelerated SHA1/SHA2
Up to version 57 the hardware acceleration was only available for the Windows version (zpaqfranzhw.exe)
From version 58 (obviously still to be tested) it also becomes available on different systems (newer Linux/BSD-based AMD/Intel), via the compilation switch -DHWSHA2
zpaqfranz should then autodetect the availability of those CPU extensions; nothing is needed from the user
It is possible to enforce it with the -hw switch
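For example, to force it during an add (archive and folder hypothetical):
zpaqfranz a z:\test.zpaq c:\data -hw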
To see more "things" use b -debug
TRANSLATION
If you compile with -DHWSHA2 you will get something like this
zpaqfranz v58.1e-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-03-21)
In this example this is an INTEL (JIT) executable, with a (kind of) GUI (on Windows), with HW BLAKE3 acceleration, SHA1/2 HW acceleration, and the Win SFX64-bit module (build 55.1)
So far, so good
Then run
zpaqfranz b -debug
If you are lucky you will get something like
(...)
zpaqfranz v58.1e-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-03-21)
FULL exename <<C:/zpaqfranz/release/58_1/zpaqfranz.exe>>
42993: The chosen algo 3 SHA-1
1838: new ecx 2130194955
1843: new ebx 563910569
SSSE3 :OK
SSE41 :OK
SHA :OK
DETECTED SHA1/2 HW INSTRUCTIONS
(...)
zpaqfranz will "automagically" run HW acceleration, because your CPU has the SSSE3, SSE4.1, and SHA extensions
Of course if you get a "NO"... bye bye
This kind of CPUs should be AMD Zen family (Ryzen, Threadripper, etc), Intel mobile 10th+, Intel desktop 11th+ generation
BTW the old zpaqfranzhw.exe (Win64) is
zpaqfranz v58.1e-JIT-GUI-L,HW BLAKE3,SHA1,SFX64 v55.1,(2023-03-21)
Beware: this is SHA1 acceleration, NOT SHA1/2. Therefore you will need to enter the -hw switch manually (to enable it)
RECAP
- With -DHWSHA2 enabled, zpaqfranz will detect and use the HW acceleration, if it thinks your CPU supports it
- If, for some reason, you want to force its use, even on CPUs that do not officially have these extensions, use the switch -hw; usually you will get a segmentation fault or something like that (depending on the operating system), not my fault
- If you want to know if zpaqfranz "thinks" that your CPU is enabled, use zpaqfranz b -debug and look at the output
- Will you get a huge improvement in compression times? No, not really. You will have the biggest difference if you use SHA256 hashing functions, which benefit so much from the acceleration. SHA1 much less (the software version is already very fast)
- Is -DHWSHA2 faster than -DHWSHA1 ? In fact, no. SHA1 is "just a tiny bit" faster. Why? Too long to explain.
- Why does my even relatively modern Intel CPU not seem to support it? Who knows; the short version: not my fault. Even relatively recent CPUs have not been equipped with these extensions by Intel
- Does it work on SPARC-ARM-PowerPC-whatever-strange-thing? Of course NO
- Is it production-safe? Of course NOT. As this is the very first release, some nasty things can happen
Luke, remember. The more feedback, the more bug-fixing. Luke, report bugs, use the Force...
And don't forget the github star and sourceforge review! (I am becoming like a youtuber who invites people to subscribe to channels LOL)