Backup command index path #109

Closed
sheckandar opened this issue Jun 20, 2024 · 17 comments
Labels
enhancement New feature or request

Comments

@sheckandar

sheckandar commented Jun 20, 2024

First of all, I would like to thank you for this project. I've been using original zpaq for many years now and was glad to find your upgraded version with many useful features.

As I started testing zpaqfranz, I noticed that the backup command creates an index file and a hash file in the same directory as the archive file, whereas the add command allows me to specify a path to an index file. Does that limitation have a technical explanation? Or did I overlook a portion of the wiki on how to do that?

Our backup archives are stored in a B2 bucket and are locked for a period of time for protection and compliance purposes. This makes it impossible to append any data to any file in the bucket, only create new files.

So as you can see the current backup command cannot be used with such a setup.

I was wondering if you could add the ability to set a path for the index and checksum files.

@fcorbelli
Owner

fcorbelli commented Jun 21, 2024

Sure, it is a suggestion I can implement
I'll change the .pid file too

@fcorbelli
Owner

You can try the attached pre-release, using -index to specify "where" to write the data

BEWARE: putting the index files in another folder will weaken the test!

zpaqfranz backup z:\ugo\prova *.cpp
zpaqfranz backup z:\ugo\prova *.txt -index c:\temp

this seems good, but it is broken

zpaqfranz testbackup z:\ugo\prova

you need something like this

zpaqfranz testbackup z:\ugo\prova -index c:\temp -paranoid

You should do

zpaqfranz backup z:\ugo\prova *.cpp -index c:\temp
zpaqfranz backup z:\ugo\prova *.txt -index c:\temp
zpaqfranz testbackup z:\ugo\prova -index c:\temp


59_9a.zip

=>Take care to pair .zpaq files with the correct indexes

Maybe I will add more heuristic checks in the future

@fcorbelli fcorbelli added the enhancement New feature or request label Jun 21, 2024
@fcorbelli
Owner

OK, this is really interesting

The same source code, compiled on two different versions of gcc, runs in a different way
Digging underway...

@sheckandar
Author

Just tested the new feature and it works as expected. Thank you.


I'm not sure about the issue with gcc compiler that you mentioned above.

I'm compiling on RedHat with gcc v8.5 and everything seems to work for me.

Let me know if you would like me to test something for you.

@graphixillusion

For me this command doesn't work.

zpaqfranz backup backup\ *.mkv -index c:\Temp\
zpaqfranz v59.9b-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-06-21)
franz:-index                             c:/Temp/
*** WARNING: It's YOUR job to preserve _backup.index and _backup.txt! ***
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
part0 backup/_00000000.zpaq i_filename backup/_????????.zpaq
Multipart backup seems OK
part0 backup/_00000000.zpaq i_filename backup/_????????.zpaq

QUIT: total size,file/folder count == zero. Already archived/wrong/inaccessible source?
0.110 seconds (00:00:00) (with warnings)

@sheckandar
Author

sheckandar commented Jun 22, 2024

I think you have a syntax error. Assuming you want to back up the "backup" folder in the current directory and name the zpaq archive backup_00000001.zpaq, the following syntax would be appropriate:

zpaqfranz backup "backup" "backup\" *.mkv -index "c:\Temp\"

Double quotes are required for all paths as far as I know.

Edit:

After I took a look at the log file you posted again, I think this is what would work for you:

zpaqfranz backup "backup" *.mkv -index "c:\Temp\"

@graphixillusion

Nope. Doesn't work.

zpaqfranz backup "backup" *.mkv -index "c:\Temp"
zpaqfranz v59.9b-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-06-21)
franz:-index                              c:/Temp
*** WARNING: It's YOUR job to preserve _backup.index and _backup.txt! ***
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
part0 ./backup_00000000.zpaq i_filename ./backup_????????.zpaq
Multipart backup seems OK
part0 ./backup_00000000.zpaq i_filename ./backup_????????.zpaq

QUIT: total size,file/folder count == zero. Already archived/wrong/inaccessible source?
0.109 seconds (00:00:00) (with warnings)
zpaqfranz backup "backup" *.mkv -index "c:\Temp\"
zpaqfranz v59.9b-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-06-21)
franz:-index                             c:/Temp"
the folder (of -index) need to (already) exists c:/Temp"/
0.031 seconds (00:00:00) (with errors)
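
A possible explanation for that last error, offered as a sketch and not as a statement about zpaqfranz internals: under the Microsoft C runtime's command-line parsing rules, a backslash immediately before a double quote escapes the quote, so `"c:\Temp\"` reaches the program as `c:\Temp"`, which is consistent with the `c:/Temp"` shown in the log. The Python helper below imitates that rule for a single argument; `parse_quoted_arg` is a hypothetical name, and the function deliberately ignores whitespace splitting.

```python
def parse_quoted_arg(raw: str) -> str:
    """Imitate MS C runtime quote/backslash handling for ONE argument.

    Assumption: simplified sketch, no whitespace splitting, no "" handling.
    """
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\":
            # Count the run of backslashes starting here.
            j = i
            while j < len(raw) and raw[j] == "\\":
                j += 1
            n = j - i
            if j < len(raw) and raw[j] == '"':
                # 2n backslashes + quote -> n backslashes, quote is a delimiter.
                # 2n+1 backslashes + quote -> n backslashes and a literal quote.
                out.append("\\" * (n // 2))
                if n % 2 == 1:
                    out.append('"')
                    j += 1  # consume the escaped quote
            else:
                # Backslashes not followed by a quote are literal.
                out.append("\\" * n)
            i = j
        elif ch == '"':
            # An unescaped quote only delimits; it is not part of the value.
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(parse_quoted_arg(r'"c:\Temp"'))    # c:\Temp
print(parse_quoted_arg(r'"c:\Temp\"'))   # c:\Temp"
print(parse_quoted_arg(r'"c:\Temp\\"'))  # c:\Temp\
```

Under these rules, dropping the trailing backslash or doubling it before the closing quote avoids the stray quote character.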

@sheckandar
Author

Do you have any mkv files in the directory from which zpaqfranz is being executed?

If you execute zpaqfranz find *.mkv, does it list any mkv files?

@graphixillusion

graphixillusion commented Jun 22, 2024

Ah ok, now I understand the logic. I thought that the command was structured like this:

zpaqfranz backup (command) backup\ (source folder to back up) *.mkv (filter for the type of files to back up in the source folder) -index c:\Temp

Now I get that backup\ is the target folder where the backup will be stored, and zpaqfranz looks for the files to back up in the current folder. Now it works as expected.

@sheckandar
Author

Almost. zpaqfranz syntax is quite flexible. You can simply specify full paths and back up any folder.

For example, let's assume I have my mkv files in C:\Videos, then the following syntax can be used:

zpaqfranz backup (the command) "C:\Backups\matroska_files" (path to directory where to save the archive and its name) "C:\Videos" (path to the folder to backup) *.mkv (filter) -index "C:\Temp" (path to directory where to save the index file)

The result will be an archive with path C:\Backups\matroska_files_00000001.zpaq
which will only contain matroska files from C:\Videos,
index file with path C:\Temp\matroska_files_00000000_backup.index
and the hash file with path C:\Temp\matroska_files_00000000_backup.txt

Hope that makes sense.

@fcorbelli
Owner

In fact, it might be better to specify

zpaqfranz <the command> <sequence of files/folders/wildcards to be added> <the switches>
zpaqfranz a z:\1.zpaq c:\myfirstfolder d:\thesecond *.cpp e:\thethird

This will add (a) to z:\1.zpaq the 3 folders AND every .cpp file in the current directory

zpaqfranz a z:\1.zpaq c:\zpaqfranz -only *.exe

This will add only .exe files (from the c:\zpaqfranz folder) inside the z:\1.zpaq archive

zpaqfranz a z:\1.zpaq c:\zpaqfranz -only *.exe -only *.cpp

This will add only .exe and .cpp files (from the c:\zpaqfranz folder) inside the z:\1.zpaq archive

zpaqfranz a z:\1.zpaq c:\zpaqfranz -not *.exe -not *.zip

This will add everything EXCEPT .exe and .zip files
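
The effect of -only and -not described above can be pictured with a small Python sketch. It is only an illustration of the selection logic as stated in this thread, not zpaqfranz's code: `is_selected` is a hypothetical helper, matching on bare file names and ignoring case are assumptions, and how zpaqfranz combines -only with -not in the same command is not covered here.

```python
from fnmatch import fnmatchcase


def is_selected(name, only=(), exclude=()):
    """Return True if `name` passes the -only / -not style filters.

    Assumptions: patterns are matched case-insensitively against the file
    name; with no `only` patterns, every file is a candidate.
    """
    lowered = name.lower()
    if only and not any(fnmatchcase(lowered, p.lower()) for p in only):
        return False
    return not any(fnmatchcase(lowered, p.lower()) for p in exclude)


files = ["zpaqfranz.exe", "zpaqfranz.cpp", "notes.txt", "old.zip"]

# Like: -only *.exe -only *.cpp
print([f for f in files if is_selected(f, only=["*.exe", "*.cpp"])])
# ['zpaqfranz.exe', 'zpaqfranz.cpp']

# Like: -not *.exe -not *.zip
print([f for f in files if is_selected(f, exclude=["*.exe", "*.zip"])])
# ['zpaqfranz.cpp', 'notes.txt']
```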

@fcorbelli fcorbelli reopened this Jun 23, 2024
@fcorbelli
Owner

Therefore

zpaqfranz backup z:\thebackup.zpaq *.cpp

will back up all *.cpp files (in the current folder)

zpaqfranz backup z:\thebackup.zpaq c:\nz

will take everything inside c:\nz

zpaqfranz backup z:\thebackup.zpaq d:\pluto e:\paperino -longpath

will store the d:\pluto and e:\paperino folders, with support for paths longer than 255 characters

@fcorbelli
Owner

BTW the backup command, with -index, only makes sense in limited cases (if you know what you are doing). The best choice is backup and that's it.
The backup command creates a multipart archive 'reinforced' with additional controls.
Multipart means that each execution creates an additional file, numbered progressively

I leave a few examples

zpaqfranz a z:\1.zpaq *.cpp
zpaqfranz a z:\1.zpaq *.bat
zpaqfranz a z:\1.zpaq *.txt

This will make ONE archive (1.zpaq) with 3 versions inside

zpaqfranz a z:\2_????.zpaq *.cpp
zpaqfranz a z:\2_????.zpaq *.bat
zpaqfranz a z:\2_????.zpaq *.txt

This will make THREE files (2_0001.zpaq, 2_0002.zpaq, 2_0003.zpaq) each with one version.
BTW: using 4 ? => will create _0001, 0002... You can use more (for example 8 ???????? => 00000001, 00000002...)
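
As a rough illustration of the numbering described above, the Python sketch below replaces the run of `?` in a pattern with a zero-padded part number of the same width. `part_name` is a hypothetical helper written for this note, not something zpaqfranz exposes.

```python
import re


def part_name(pattern: str, number: int) -> str:
    """Replace the first run of '?' with `number`, zero-padded to its width."""
    match = re.search(r"\?+", pattern)
    if match is None:
        raise ValueError("pattern has no run of '?' characters")
    width = len(match.group())
    return pattern[:match.start()] + str(number).zfill(width) + pattern[match.end():]


print(part_name(r"z:\2_????.zpaq", 1))      # z:\2_0001.zpaq
print(part_name(r"z:\2_????.zpaq", 3))      # z:\2_0003.zpaq
print(part_name(r"z:\2_????????.zpaq", 2))  # z:\2_00000002.zpaq
```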

zpaqfranz backup z:\3.zpaq *.cpp
zpaqfranz backup z:\3.zpaq *.bat
zpaqfranz backup z:\3.zpaq *.txt

This will make FIVE files

  • 3_00000000_backup.index: zpaq's index file
  • 3_00000000_backup.txt: zpaqfranz's index file
  • 3_00000001.zpaq, 3_00000002.zpaq, 3_00000003.zpaq
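
To make the naming concrete, here is a Python sketch that lists the files one would expect after a number of backup runs, based only on the names quoted in this thread. `expected_backup_files` and its parameters are hypothetical, and the optional `index_dir` mirrors the -index switch as described earlier in the discussion.

```python
from pathlib import PureWindowsPath


def expected_backup_files(archive, runs, index_dir=None):
    """List the file names a series of `backup` runs is expected to produce.

    Assumption: naming follows the examples in this thread; this does not
    call zpaqfranz and may not match every version.
    """
    target = PureWindowsPath(archive)
    # "z:\3.zpaq" and "z:\3" are both treated as the base name "3".
    stem = target.stem if target.suffix.lower() == ".zpaq" else target.name
    data_dir = target.parent
    meta_dir = PureWindowsPath(index_dir) if index_dir else data_dir
    files = [
        meta_dir / f"{stem}_00000000_backup.index",
        meta_dir / f"{stem}_00000000_backup.txt",
    ]
    files += [data_dir / f"{stem}_{n:08d}.zpaq" for n in range(1, runs + 1)]
    return [str(p) for p in files]


for name in expected_backup_files(r"z:\3.zpaq", runs=3):
    print(name)
```

Calling it with `index_dir=r"C:\Temp"` places only the two index files in that folder, while the numbered .zpaq parts stay next to the archive path.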

@fcorbelli
Owner

With archives (single or multipart) AND backups you can use the t (test) command
Beware of wildcards length (4 ?, or 8 ? in this example)

zpaqfranz t z:\1.zpaq
zpaqfranz t z:\2_????.zpaq
zpaqfranz t z:\3_????????.zpaq

With backups you can use the command testbackup

zpaqfranz testbackup z:\3.zpaq
zpaqfranz testbackup z:\3.zpaq -verify
zpaqfranz testbackup z:\3.zpaq -verify -ssd
zpaqfranz testbackup z:\3.zpaq -paranoid
zpaqfranz testbackup z:\3.zpaq -paranoid -verify -ssd

  • very quick, very dirty (not really sure)
  • slower, more robust
  • slower, more robust, multithreaded (use on SSD or NVMe disks, not on HDD!)
  • compare zpaq and zpaqfranz's indexes (more in-depth control)
  • more reliable test battery

@fcorbelli
Owner

Why the backup command?
Because multipart archives are more fragile than single-file archives.

Suppose you have 5 different parts, the sequence
foo_0001.zpaq,
foo_0002.zpaq,
foo_0003.zpaq,
foo_0004.zpaq,
foo_0005.zpaq
then you delete/change/corrupt (for example) foo_0004.zpaq

Using the command t (test) will check parts 1, 2 and 3, and stop (as 4 is missing), saying that everything is OK.

If you created the archive with backup, you can check it with testbackup: you won't restore lost data, but you will know there is a problem. Otherwise you will think your backup is perfect, but it is not, and you cannot know.
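
The difference can be sketched in Python. This is a simplified model of the reasoning above, not of zpaqfranz itself: `sequential_walk` imitates a test that follows the parts in order and stops at the first missing one, while `missing_parts` relies on an `expected_count` that stands in for what a separately stored index would record. All names are hypothetical.

```python
import re


def part_numbers(names):
    """Extract the numeric suffix from names like foo_0004.zpaq."""
    numbers = set()
    for name in names:
        match = re.search(r"_(\d+)\.zpaq$", name)
        if match:
            numbers.add(int(match.group(1)))
    return numbers


def sequential_walk(names):
    """Parts reached when walking 1, 2, 3... until one is missing."""
    present = part_numbers(names)
    walked = []
    n = 1
    while n in present:
        walked.append(n)
        n += 1
    return walked


def missing_parts(names, expected_count):
    """Parts that an index listing `expected_count` parts would flag."""
    present = part_numbers(names)
    return [n for n in range(1, expected_count + 1) if n not in present]


on_disk = ["foo_0001.zpaq", "foo_0002.zpaq", "foo_0003.zpaq", "foo_0005.zpaq"]
print(sequential_walk(on_disk))   # [1, 2, 3]  (no hint that part 4 is gone)
print(missing_parts(on_disk, 5))  # [4]
```

Without the recorded count, a missing final part would be indistinguishable from a shorter backup, which is the case the extra index is meant to catch.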

@fcorbelli
Owner

Finally, the best way to familiarise yourself is... to read the manual (!) or, even better, look at the examples

Use

zpaqfranz h h

to get a list of commands (yes, it is h (help) on h (help))

If you want to see the help for the backup command, run


zpaqfranz h backup


If you want to see testbackup

zpaqfranz h testbackup


The examples cover all common cases and are added in new releases
For example, if you want to see what has changed for the command a (add) you will do

zpaqfranz h a

If you see examples that you are not familiar with, it means that some functions have been added

Sometimes switches are cryptic, I realise, such as -backupxxh3 in the backup command
If in doubt... you can always ask 😄

PS -backupxxh3 means: use a faster hash algorithm, XXH3, instead of the default one (MD5). The default one is here because it is compatible with Unix systems such as hetzner's storagebox. If you use Windows machines (with SSD or NVMe) -backupxxh3 is much faster.

@sheckandar
Author

The main issue is resolved. Closing.
