Machine-parseable output (-terse ?) and failure to restore a file #120

Closed
luckman212 opened this issue Jul 24, 2024 · 31 comments

Labels
documentation Improvements or additions to documentation enhancement New feature or request

Comments

@luckman212

luckman212 commented Jul 24, 2024

Related to #63: I am trying to adapt my script to extract a specific version of a file from my archive. I'm using zpaqfranz v60.5e-NOJIT-L(2024-07-20) from Homebrew on an M1 Mac running macOS 14.5.

For example, I want to restore version 2178 of IMG-20240516101824054.png below:

/tmp $ zpaqfranz l backup.zpaq -find "20240516101824054.png" -all -terse
2024-05-16 14:18:23  0644             207.434   87% 2178|+  /Users/luke/Sync/Obsidian/Main/attach/zzzzzzzzzzzzzzzzzzzzzzzzzz/IMG-20240516101824054.png
deleted/inacessible                         0   del 2187|-  /Users/luke/Sync/Obsidian/Main/attach/zzzzzzzzzzzzzzzzzzzzzzzzzz/IMG-20240516101824054.png

What is the most efficient command to do this based on the output? I noticed the fields are space-delimited (not tab-delimited), which makes them harder to parse (what if a file name contains spaces?). I believe -terse should use a NUL byte or TAB as the field delimiter.

Also, if I don't specify -all then I get zero output. Is that correct? (Yes, the original file was deleted.) Is there any way to find or list only recoverable files? That is why I am trying to parse the output of -all, by the way: I need to filter out anything that says deleted/inacessible.

@luckman212
Author

luckman212 commented Jul 24, 2024

By the way this is the crazy insane pipeline I am using now to choose the file for restore (I will pass the version and filename to zpaqfranz x ...)

zpaqfranz l backup.zpaq -find "20240516101824054.png" -all -terse |
  grep -v ^deleted |
  cut -c53- |
  awk 'BEGIN { FS="|"; OFS="\t" } { print $1, substr($2,4) }' |
  fzf --exact --multi --no-select-1 --header $'Ver\tFilename'

😐
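
A slightly less position-dependent variant of the pipeline above, as a rough sketch only. Instead of cutting at a fixed column, it splits each line at the first | character, which in the sample output always sits right after the version number. The name parse_terse is made up for illustration, and the line layout is assumed from the two sample lines shown earlier; file names that start with spaces would lose them.

```shell
# parse_terse: hypothetical helper, not part of zpaqfranz.
# Reads `zpaqfranz l ... -all -terse` lines on stdin and prints
# "version<TAB>filename" for every non-deleted entry.
parse_terse() {
  awk '
    /^deleted/ { next }                      # skip deleted/inaccessible rows
    {
      p = index($0, "|")                     # first pipe follows the version
      if (p == 0) next                       # not a file row
      n = split(substr($0, 1, p - 1), f, " ")
      name = substr($0, p + 2)               # text after the |+ marker
      sub(/^ +/, "", name)                   # drop padding before the path
      printf "%s\t%s\n", f[n], name
    }'
}
```

It would be used as: zpaqfranz l backup.zpaq -find "20240516101824054.png" -all -terse | parse_terse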

@luckman212
Author

And yet even with all that hand-waving, I still can't figure out how to extract the file I need...

FAIL 1

$ zpaqfranz x backup.zpaq /Users/luke/Sync/Obsidian/Main/attach/zzzzzzzzzzzzzzzzzzzzzzzzzz/IMG-20240516101824054.png -range 2178 -to /tmp/foo.png
zpaqfranz v60.5e-NOJIT-L(2024-07-20)
franz:-range                                 2178
franz:rangefrom (version)                   2.178
franz:rangeto   (version)                   2.178
franz:-to                   <</tmp/foo.png>>

backup.zpaq:
2718 versions, 86.547 files, 1.230.674.231 bytes (1.15 GB)
Extract 0 bytes (0.00  B) in 0 files (0 folders) / 8 T
Path does not exists   /tmp/foo.png
Getting free space for /tmp/

1.493 seconds (00:00:01) (all OK)

FAIL 2

$ zpaqfranz x backup.zpaq /Users/luke/Sync/Obsidian/Main/attach/zzzzzzzzzzzzzzzzzzzzzzzzzz/IMG-20240516101824054.png -range 2178 -to /tmp/
zpaqfranz v60.5e-NOJIT-L(2024-07-20)
franz:-range                                 2178
franz:rangefrom (version)                   2.178
franz:rangeto   (version)                   2.178
franz:-to                   <</tmp/>>
MAGIC: selected 1 file extracting to a folder => merge to /tmp/IMG-20240516101824054.png

backup.zpaq:
2718 versions, 86.547 files, 1.230.674.231 bytes (1.15 GB)
Extract 0 bytes (0.00  B) in 0 files (0 folders) / 8 T
Path does not exists   /tmp/IMG-20240516101824054.png
Getting free space for /tmp/

1.483 seconds (00:00:01) (all OK)

FAIL 3

$ zpaqfranz x backup.zpaq /Users/luke/Sync/Obsidian/Main/attach/zzzzzzzzzzzzzzzzzzzzzzzzzz/IMG-20240516101824054.png -range 2178 -to /tmp
zpaqfranz v60.5e-NOJIT-L(2024-07-20)
franz:-range                                 2178
franz:rangefrom (version)                   2.178
franz:rangeto   (version)                   2.178
franz:-to                   <</tmp>>
Cannot write on <<-to /tmp>>
519910: Aborting. Use -space to bypass and enforcing.
0.001 seconds (00:00:00) (with errors)

The .zpaq archive seems ok...

$ zpaqfranz t backup.zpaq
zpaqfranz v60.5e-NOJIT-L(2024-07-20)

backup.zpaq:
2718 versions, 86.547 files, 1.230.674.231 bytes (1.15 GB)
To be checked 1.352.276.056 in 22.853 files (8 threads)
7.15 stage time      33.19 no error detected (RAM ~128.52 MB), try CRC-32 (if any)
Checking            23.266 blocks with CRC-32 (1.352.276.056 not-0 bytes)
Block 00022K          1.22 GB
CRC-32 time           0.39s
Blocks       1.352.276.056 (      23.266)
Zeros                    0 (           0) 0.000000 s
Total        1.352.276.056 speed 3.494.253.374/s (3.25 GB/s)
GOOD            : 00022853 of 00022853 (stored=decompressed)
VERDICT         : OK                   (CRC-32 stored vs decompressed)
33.572 seconds (00:00:33) (all OK)

@luckman212 luckman212 changed the title Machine-parseable output (-terse ?) Machine-parseable output (-terse ?) and failure to restore a file Jul 24, 2024
@fcorbelli
Owner

I am a bit confused: there are several topics here (already answered before, but I will write the answers again)

1) Extracting a single file to a SINGLE FILE

Please note: this is A FILE, not A FILE TO A FOLDER

There is more than one way to do it (the choice usually depends on the "complexity" of the filename)

The "standard" way: use the fullname and -to A FILE (not a folder)

zpaqfranz x thearchivename THEFULLFILENAME -to THEFULLEXTRACTEDFILENAME -until THEVERSIONYOUWANT

Let's suppose you want to extract version 659 of the file f:/zarc/inctrl/readme.txt

zpaqfranz x copia_zarc.zpaq f:/zarc/inctrl/readme.txt -to z:/the_restored_file.txt -until 659

@fcorbelli
Owner

2) A file TO A FOLDER

In this example the file goes into the z:\ugo folder. Please note the -only

zpaqfranz x copia_zarc.zpaq -only f:/zarc/inctrl/readme.txt -to z:\ugo -until 659

If the filename is unique you can use *filename to extract to A FOLDER (in this example the z:\allread)

zpaqfranz x copia_zarc.zpaq -only *readme_123.txt -to z:\allread -until 659

@fcorbelli
Owner

Also, if I don't specify -all then I get zero output. Is that correct?
Yes, it is
If you list an archive, without anything (aka: zpaqfranz l thearchive.zpaq) you will get the current content

If you use "something" (-all, -range or whatever) then you will go to "show-everything-in-the-archive"

@luckman212
Author

Ok, this worked:

zpaqfranz x backup.zpaq -only /Users/luke/Sync/Obsidian/Main/attach/zzzzzzzzzzzzzzzzzzzzzzzzzz/IMG-20240516101824054.png -to /private/tmp/zzz -until 2178

I feel very dumb when trying to figure out zpaqfranz syntax.

Thank you for the working command. 🙏

@fcorbelli
Owner

When running on *nix, be aware that -space can come in handy when extracting to non-existent paths

TRANSLATION

When you extract something to /my/good/path, zpaqfranz will try to figure out whether /my/good/path exists, is writable, and has enough free space.

This is easy on Windows, virtually impossible on *nix.
It can therefore happen that you get a resounding failure not because of some mistake, but because (for some reason, even permissions) zpaqfranz cannot figure out whether a certain path is “good”.
In that case -space bypasses everything: zpaqfranz tries to write, and good night

If you wonder why the check exists: it is about the risk of filling up a path, that is, running out of free space. This is a very frequent nightmare for people making batch copies. The execution ends, but the written file is incomplete, and therefore unusable

Short version: if in doubt, put -space

@fcorbelli
Owner

The syntax of zpaqfranz is indeed strange, but it was introduced by its originator (Dr. Mahoney) and I have retained it for backward compatibility, with some mitigation

Remember that you can extract a PATH TO A PATH and a FILE TO A FILE, but you cannot extract a FILE TO A PATH
Except by using the “trick” of -only
Remember that you can have multiple -only, and multiple -not as well.
And that on *nix machines it is good to wrap wildcards in double quotes (")
-only *foo.jpg is no good on Macs.
-only "*foo.jpg" on the other hand is fine
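
Why the quotes matter, as a small illustration that does not involve zpaqfranz at all: the shell expands an unquoted wildcard against the current directory before the program ever sees it. The function name demo_glob and the file name holiday_foo.jpg are invented for this sketch. In zsh (the default shell on recent macOS) an unquoted pattern that matches nothing is normally rejected with an error before the command runs, which is probably what bites on Macs.

```shell
# demo_glob: illustration only; creates one empty file in a temp directory.
demo_glob() (
  dir=$(mktemp -d) || exit 1
  cd "$dir" || exit 1
  : > holiday_foo.jpg            # a file that happens to match the pattern
  echo unquoted: *foo.jpg        # the shell expands the pattern first
  echo quoted: "*foo.jpg"        # the pattern is passed through literally
  cd / && rm -rf "$dir"
)
```

Running demo_glob shows that only the quoted form reaches the command as *foo.jpg, which is what -only needs in order to match names inside the archive.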

Last but not least (!) zpaq uses -to, and zpaqfranz added -find and -replace to manipulate paths

@fcorbelli
Owner

This should be OK (assuming only a single file with this name)

zpaqfranz x backup.zpaq -only "*IMG-20240516101824054.png" -to /private/tmp/zzz -until 2178 -space
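
For scripting, the command above could be wrapped roughly like this. It is only a sketch: restore_one and the ZPAQFRANZ variable are invented names (the variable exists only so the call can be dry-run with echo), and the switches are exactly the ones already used in this thread.

```shell
# restore_one: hypothetical wrapper, not part of zpaqfranz.
# usage: restore_one ARCHIVE PATTERN VERSION DESTFOLDER
# ZPAQFRANZ is an assumed override so the call can be dry-run with echo.
restore_one() {
  [ "$#" -eq 4 ] || { echo "usage: restore_one ARCHIVE PATTERN VERSION DESTFOLDER" >&2; return 2; }
  # File-to-folder restore: -only selects the file, -until picks the version,
  # -space skips the free-space check as suggested for *nix.
  "${ZPAQFRANZ:-zpaqfranz}" x "$1" -only "$2" -to "$4" -until "$3" -space
}
```

A dry run such as ZPAQFRANZ=echo restore_one backup.zpaq "*IMG-20240516101824054.png" 2178 /private/tmp/zzz prints the arguments instead of extracting anything.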

@fcorbelli
Owner

On -terse
This is a fixed-width output
Should (??) be easy to parse
The latest zpaqfranz's output is variable-sized (the size columns grow if needed)

Is there any way to find or list only recoverable files?
Every file with size>0 is recoverable
Yes, ... but... WHEN?

Suppose you have a $$$$$$$$$$$$$$$$$$.cpp file
WHEN was it seen... for the last time?

C:\zpaqfranz>zpaqfranz l z:\pippo.zpaq -all -only "*$$$$*"
zpaqfranz v60.6c-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2024-07-24)
franz:-all                                      4
franz:-only                                *$$$$*
----------------------------------------------------------------------------------------------------
franz:-hw

z:/pippo.zpaq:
4 versions, 19 files, 1.365.228 bytes (1.30 MB)


   Date      Time   Size Ratio  Ver Name/Info
---------- -------- ---- ----- ---- ----------
2024-07-24 12:52:31 3.697.163   13% 0001|+ $$$$$$$$$$$$$$$$$$.cpp
deleted/inacessible    0   del 0002|- $$$$$$$$$$$$$$$$$$.cpp
2024-07-24 19:03:24 3.695.705   13% 0003|+ $$$$$$$$$$$$$$$$$$.cpp
deleted/inacessible    0   del 0004|- $$$$$$$$$$$$$$$$$$.cpp

            7.392.868 (7.05 MB) of 7.392.868 (7.05 MB) in 4 files shown
            1.365.228 compressed  Ratio 0.185 <<z:/pippo.zpaq>>
0.015 seconds (00:00:00) (all OK)

Version 3 is the very last recoverable version of this file
Of course you can get version 1 too (if you want)
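
Picking that "last recoverable version" out of an -all listing could be automated along these lines. This is a sketch under assumptions: last_recoverable is an invented name, the listing is expected to contain a single file (as in the example above), and rows are recognised purely by the |+ marker seen in the sample output.

```shell
# last_recoverable: hypothetical helper, not part of zpaqfranz.
# Reads an -all listing for ONE file on stdin and prints the highest
# version number whose row carries the |+ marker (i.e. not deleted).
last_recoverable() {
  awk '
    /^deleted/ { next }                          # deleted rows are not recoverable
    {
      p = index($0, "|")                         # marker follows the version
      if (p == 0 || substr($0, p + 1, 1) != "+") next
      n = split(substr($0, 1, p - 1), f, " ")    # version is the last field before |
      if (f[n] + 0 > best) best = f[n] + 0
    }
    END { if (best > 0) print best }'
}
```

The printed number can then be passed to -until when extracting.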

@fcorbelli
Owner

You want ALL the versions of this file
This will create one folder per version (default padded to 4)

zpaqfranz x z:\pippo.zpaq -only "*$$$.cpp" -all -to z:\allz

This is padded to 8

zpaqfranz x z:\pippo.zpaq -only "*$$$.cpp" -all 8 -to z:\allz8

@fcorbelli
Owner

BTW

zpaqfranz h x


@fcorbelli fcorbelli added the documentation Improvements or additions to documentation label Jul 24, 2024
@fcorbelli
Owner

The -find is more or less a built-in | grep, for systems where grep does not exist
double

@luckman212
Author

Thank you for all of this. I didn't mean to accuse you of creating this arcane syntax. I understand it was inherited.

Back to what is actually my original question. About -terse

Yes I see it is fixed-width, hence I am "parsing" it using simple tools like cut and awk. Ok if that is the suggested method.

But, would you consider adding a flag to convert the space-delimited output to tabs instead? It would make the parsing more reliable in my opinion.

@fcorbelli
Owner

But, would you consider adding a flag to convert the space-delimited output to tabs instead? It would make the parsing more reliable in my opinion.

Of course I can
Do you want CSV-delimited output, or simply a TAB between columns?
BTW
you got me thinking about the lack of a switch (one of a thousand!) to enumerate the versions in which there is a file (without those in which it is deleted).
Like -enumerate “$$$$”

@luckman212
Author

Yes TAB between each column would be ideal. I always prefer TAB because filenames can contain commas.

👍 of course ... -enumerate or -extractable would be a great addition.

@fcorbelli
Owner

Yes TAB between each column would be ideal. I always prefer TAB because filenames can contain commas.

👍 of course ... -enumerate or -extractable would be a great addition.

The very best is | (cannot be in filename) but it is hard to parse
Added -nodel (do not show deleted files), working on -tab...

@luckman212
Author

The very best is | (cannot be in filename)

You sure about that? 😉

image


@fcorbelli
Owner

I do not use Mac 😄

OK, a programmable one...

@luckman212
Author

Best is probably the NUL byte (\0) but TAB is a close second I think.

If you would like a Mac to test with I can ship you my old MacBook Air 2015 (only runs up to macOS 12.x Monterey but otherwise works fine) for free.

@fcorbelli
Owner

Houston we have a problem with \t 😄
No big deal, it requires a bit of spaghetti code...

PS thank you, but I have a PowerPC Minimac (!!!!!!!) for hard-core tests

@fcorbelli
Owner

60_6e.zip
Please check the attached pre-release

zpaqfranz l z:\pippo.zpaq -terse -csv "\t"
zpaqfranz l z:\pippo.zpaq -terse -csv "|"
zpaqfranz l z:\pippo.zpaq -terse -csv ","
zpaqfranz l z:\pippo.zpaq -terse -csv "\",\""

Do you want a string AFTER the file name?

@luckman212
Author

luckman212 commented Jul 24, 2024

Thank you! Almost, but not quite (spaces should not be there, and delimiter should not be repeated, just 1 per field):

$ zpaqfranz l backup.zpaq -find "20240516101824054.png" -terse -csv "|"
2024-05-16 14:18:23 | 0644 |            207.434 |  87% |+  /Users/luke/Sync/Obsidian/Main/attach/IMG-20240516101824054.png

should be

2024-05-16 14:18:23|0644|207.434|87%|+|/Users/luke/Sync/Obsidian/Main/attach/IMG-20240516101824054.png

Do you want a string AFTER the file name?

No.
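
With the layout from the "should be" line above (one delimiter per field, no padding), parsing becomes a plain shell read loop. The sketch below assumes exactly that six-field layout, which may not match what any given pre-release actually prints; csv_names is an invented name. Because read hands the whole remainder of the line to the last variable, a | inside the file name survives. The size column uses dots as thousands separators, so they are stripped to get a byte count.

```shell
# csv_names: hypothetical helper, not part of zpaqfranz.
# Reads lines shaped like the requested layout
#   date time|attr|size|ratio|marker|name
# and prints "bytes<TAB>name" for entries marked with +.
csv_names() {
  while IFS='|' read -r stamp attr size ratio mark name; do
    if [ "$mark" = "+" ]; then
      bytes=$(printf '%s' "$size" | tr -d '. ')   # 207.434 -> 207434
      printf '%s\t%s\n' "$bytes" "$name"
    fi
  done
}
```

It would be fed with something like: zpaqfranz l backup.zpaq -find "20240516101824054.png" -terse -csv "|" | csv_names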

@fcorbelli
Owner

mmmhhh... I will finish tomorrow
60_6f.zip

@fcorbelli
Owner

60_6g.zip

@luckman212
Author

luckman212 commented Jul 24, 2024

60_6g looks pretty good! Only thing I see is an extra space before the file mode. Not sure if that's intentional

image

One other thing, the version and +/- column are printed joined together instead of as separate columns when using -csv:

$ zpaqfranz l backup.zpaq -find "20240516101824054.png" -all -nodel -terse -csv '|'
2024-05-16 14:18:23| 0644|207.434|87%|2178+|/Users/luke/Sync/Obsidian/Main/attach/zzzzzzzzzzzzzzzzzzzzzzzzzz/IMG-20240516101824054.png
2024-05-16 14:18:23| 0644|207.434|87%|2719+|/Users/luke/Sync/Obsidian/Main/attach/IMG-20240516101824054.png
$ ./zpaqfranz l backup.zpaq -find "20240516101824054.png" -all -nodel -terse
2024-05-16 14:18:23  0644             207.434   87% 2178|+ /Users/luke/Sync/Obsidian/Main/attach/zzzzzzzzzzzzzzzzzzzzzzzzzz/IMG-20240516101824054.png
2024-05-16 14:18:23  0644             207.434   87% 2719|+ /Users/luke/Sync/Obsidian/Main/attach/IMG-20240516101824054.png

Is that intentional?

@fcorbelli
Owner

  1. no, it is a different attr printout (instead of windows)
  2. do not know, I'll dig now

@fcorbelli
Owner

60_6g.zip

With -terse -all I cut off the version infos. Much easier parsing

@fcorbelli fcorbelli added the enhancement New feature or request label Jul 25, 2024
@luckman212
Author

60_6g working perfectly! A thing of beauty!

@luckman212
Author

Update: still working well. I would consider this solved @fcorbelli

Thank you very much again for the wonderful tool.

@Lennart00

Just to chip in - this looks great adding more things to programmatically control and build on top of zpaqfranz :D
