cksum has some really weird and funky implicit tags going on, see #6256
So let's figure out what exactly cksum is doing.
$ ../gnu/src/cksum -a md5 --tag README.md # This is the tagged format:
MD5 (README.md) = add2d697731ef0facc3a56207aa03a9b
$ ../gnu/src/cksum -a md5 README.md # tagged by default:
MD5 (README.md) = add2d697731ef0facc3a56207aa03a9b
$ ../gnu/src/cksum -a md5 --text README.md # tagged+text is a problem:
../gnu/src/cksum: --text mode is only supported with --untagged
Try '../gnu/src/cksum --help' for more information.
[$? = 1]
$ ../gnu/src/cksum -a md5 --text --tag README.md # tagged+text is not a problem?!
MD5 (README.md) = add2d697731ef0facc3a56207aa03a9b
So yes, something funny is going on. Let's just brute-force all 1024 + 256 + 64 + 16 + 4 + 1 = 1365 possible sequences of zero to five arguments (each one of --binary, --text, --tag, --untagged), and visualize the behavior as a graph:
(legend: edges are marked b/t/T/U for binary/text/Tag/Untagged, and vertices are the observed behavior: E/T/A/S for Error/Tagged/UntaggedAsterisk/UntaggedSpace)
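The brute-force search can be sketched roughly as follows. This is a minimal illustration, not the script that was actually used: the names `all_sequences`, `classify`, and `observe` are hypothetical, and the classifier assumes the usual untagged line layout of digest, one space, a mode character (`*` for binary, space for text), then the file name.

```python
import subprocess
from itertools import product

# Edge labels from the legend, mapped to the real command-line flags.
FLAGS = {"b": "--binary", "t": "--text", "T": "--tag", "U": "--untagged"}


def all_sequences(max_len=5):
    """Yield every flag sequence of length 0..max_len, e.g. "", "b", "TbU"."""
    for length in range(max_len + 1):
        for combo in product(FLAGS, repeat=length):
            yield "".join(combo)


def classify(returncode, stdout):
    """Map one run to E/T/A/S (Error/Tagged/UntaggedAsterisk/UntaggedSpace)."""
    if returncode != 0:
        return "E"
    if stdout.startswith("MD5 ("):
        return "T"
    # Assumed untagged layout: "<digest> <mode char><file name>".
    _digest, _sep, rest = stdout.partition(" ")
    return "A" if rest.startswith("*") else "S"


def observe(seq, cksum="../gnu/src/cksum", filename="README.md"):
    """Run cksum with the flags named by seq and classify the result."""
    args = [cksum, "-a", "md5", *(FLAGS[c] for c in seq), filename]
    proc = subprocess.run(args, capture_output=True, text=True)
    return classify(proc.returncode, proc.stdout)
```

Calling `observe(seq)` for every `seq` from `all_sequences()` would give the vertex label for each path through the graph.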
First, observe that -b/-t seem to be doing precisely what we would hope for: toggle between binary/text mode. Good!
Next, observe that --tag/--untagged seem to be the flags that have the weird behavior attached to them. In particular, the T state seems to be more than one actual state, probably differentiated along the "text-binary-axis".
Removing --untagged from the brute-force search reveals that --tag always pulls the state in the binary direction:
Removing --binary from the brute-force search reveals that --untagged always pulls the state away from E (so in a binary-ish direction), but A is unreachable ("Asterisk", which indicates a binary file in the untagged format):
Hypothesis: There are three steps along the "text-binary-axis": always-binary, always-text, and binary-ish. For simplicity, let's assume the same thing along the tagged-ness-axis.
By the previous observations, --tag implies either always-binary or binary-ish. (Probably "binary-ish".)
Ending in bU does not determine the result:
bU outputs A
TbU outputs S
UbU outputs A
bUbU outputs A
TUbU outputs A
UTbU outputs S
Therefore, U does not set the binary-ness to a constant, but rather depends on the tagged-ness. Huh?
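To make that conclusion concrete, here is a small sketch over the six observations listed above. The helper names are made up for illustration, and the rule in `predict_after_bU` is only checked against these six data points, not against the full brute-force result.

```python
# Observations copied from the list above: flag sequence -> observed output.
OBSERVED = {
    "bU": "A",
    "TbU": "S",
    "UbU": "A",
    "bUbU": "A",
    "TUbU": "A",
    "UTbU": "S",
}


def distinct_outputs(observed, suffix):
    """Collect the distinct outputs of all sequences ending in suffix."""
    return {out for seq, out in observed.items() if seq.endswith(suffix)}


def last_tag_flag(prefix):
    """Return the last T or U in prefix, or None if neither occurs."""
    for flag in reversed(prefix):
        if flag in "TU":
            return flag
    return None


def predict_after_bU(seq):
    """Guess the output of a sequence ending in "bU" from its tagged-ness."""
    prefix = seq[:-2]  # everything before the final "bU"
    return "S" if last_tag_flag(prefix) == "T" else "A"


# The same suffix leads to two different outputs ...
print(distinct_outputs(OBSERVED, "bU"))
# ... but the last T/U before the suffix separates them.
print(all(predict_after_bU(seq) == out for seq, out in OBSERVED.items()))
```

So a rule that only looks at the final flags cannot be right, while a rule that also looks at the tagged-ness just before the final U fits all six cases.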
Assuming that we start with "tagged-ish" and T/U set "always-tagged/always-untagged", this means that U does not interfere with the binary-ness when the state is "tagged-ish" or "always-untagged", but when the state is "always-tagged", U resets the binary-ness to "binary-ish". What a surprising decision! (It probably made sense at the time it was written, and is probably also why it is no longer listed in --help.)
… and that finally predicts the correct behavior without any exceptions, hooray!
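Written out as code, the model described above might look like the sketch below. This is a reconstruction from the prose, not the linked check_model.py: the function name `predict` and the state strings are illustrative, and it assumes that T resets the binary-ness to "binary-ish" (the text only says "probably").

```python
def predict(seq):
    """Predict E/T/A/S for a flag sequence such as "TbU"."""
    binaryness = "binary-ish"  # or "always-binary" / "always-text"
    taggedness = "tagged-ish"  # or "always-tagged" / "always-untagged"
    for flag in seq:
        if flag == "b":
            binaryness = "always-binary"
        elif flag == "t":
            binaryness = "always-text"
        elif flag == "T":
            # Assumption: --tag pulls the binary-ness back to binary-ish.
            binaryness = "binary-ish"
            taggedness = "always-tagged"
        elif flag == "U":
            # Only an explicit earlier --tag makes --untagged touch binary-ness.
            if taggedness == "always-tagged":
                binaryness = "binary-ish"
            taggedness = "always-untagged"
        else:
            raise ValueError(f"unknown flag {flag!r}")
    if taggedness != "always-untagged":
        # Tagged output cannot express text mode, so that case is an error.
        return "E" if binaryness == "always-text" else "T"
    # Untagged output: asterisk only for an explicit --binary.
    return "A" if binaryness == "always-binary" else "S"
```

For example, `predict("tT")` returns "T" while `predict("t")` returns "E", matching the four command-line runs at the top, and the six sequences ending in bU come out as listed.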
A simple piece of logic, but so much pain.
End result: https://github.com/BenWiederhake/worsethanfailure_cksum/blob/master/check_model.py#L19