You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
In the case where binary files have a large text header, the current
checksum routine will treat said files as text files and normalize
line-endings before performing the checksum. Not only is it dangerous to
manipulate binary files like this, it also doubles the runtime of the
checksum routine, as every block of data must be read twice.
As noted in #3264, at the minimum, DVC should probably match
Git's text file detection routine, which interrogates the first 8 kilobytes
(and doesn't do heuristics on ratio of printable characters,
as DVC currently does).
The text was updated successfully, but these errors were encountered:
In the case where binary files have a large text header, the current
checksum routine will treat said files as text files and normalize
line-endings before performing the checksum. Not only is it dangerous to
manipulate binary files like this, it also doubles the runtime of the
checksum routine, as every block of data must be read twice.
As noted in #3264, at the minimum, DVC should probably match
Git's text file detection routine, which interrogates the first 8 kilobytes
(and doesn't do heuristics on ratio of printable characters,
as DVC currently does).
The text was updated successfully, but these errors were encountered: