Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

diff between two relative path files still shows up as deleted+added #599

Closed
tstromberg opened this issue Nov 7, 2024 · 6 comments
Closed

Comments

@tstromberg
Copy link
Collaborator

I think we still have our logic a bit off if you are passing two filenames as arguments:

go run ./cmd/mal diff ../malcontent-samples/linux/clean/ls.x86_64 ../malcontent-samples/macOS/clean/ls
├─ 🟡 Deleted: ls.x86_64 [MEDIUM]
│     ≡ data [LOW]
│       🟢 compression/lzma — works with lzma files
│     ≡ discovery [LOW]
│       🟢 system/hostname_get — get computer host name: gethostname
│     ≡ execution [LOW]
│       🟢 shell/TERM — Look up or override terminal settings: TERM
│     ≡ filesystem [LOW]
│       🟢 link_read — read value of a symbolic link: readlink
│     ≡ networking [LOW]
│       🟢 url/embedded — contains embedded HTTPS URLs:
│            https://gnu.org/licenses/gpl.html, https://translationproject.org/team/, https://wiki.xiph.org/MIME_Types_and_Fil…
│     ≡ process [MEDIUM]
│       🟡 name_set — get or set the current process name: __progname
│
├─ 🟢 Added: ls [LOW]
│     ≡ execution [LOW]
│       🟢 shell/TERM — Look up or override terminal settings: TERM
│     ≡ filesystem [LOW]
│       🟢 directory/traverse — traverse filesystem hierarchy: _fts_children, _fts_close, _fts_open, _fts_read, _fts_set
│       🟢 link_read — read value of a symbolic link: readlink

similar to the diff(1) command, this should show up as a single changed file.

@tstromberg tstromberg changed the title diff between two files still shows up as deleted+added diff between two relative path files still shows up as deleted+added Nov 13, 2024
@egibs
Copy link
Member

egibs commented Nov 13, 2024

I think this comes down to the added/removed logic. The first file will be stored in the map with the key ls and we'll use that to do the lookup for ls.x86_64 and it won't match so it will be treated as an add/remove rather than a modify.

Using the diff example above:

relPath: ls
fromMap: map[ls.x86_64:0xc0001a4140]

Here's where we store the relative path in the map:

fromRelPath[rel] = files.Value

The lookup in processSrc:

tr, exists := dest[relPath]
if !exists {
d.Removed.Set(relPath, fr)
continue
}

The lookup in processDest:

fr, exists := from[relPath]
if !exists {
d.Added.Set(relPath, tr)
continue
}

@egibs
Copy link
Member

egibs commented Nov 13, 2024

If I store the value of rel with any extension trimmed, this works as described above:

$ go run cmd/mal/mal.go diff ../malcontent-samples/linux/clean/ls.x86_64 ../malcontent-samples/macOS/clean/ls
├─ 🔵 Changed: ../malcontent-samples/macOS/clean/ls [MEDIUM → LOW]
│     X data [LOW → NONE]
---     🔵 compression/lzma — works with lzma files
│     X discovery [LOW → NONE]
---     🔵 system/hostname — get computer host name
│     ≡ execution [LOW]
│       🔵 shell/TERM — Look up or override terminal settings: TERM
│     ≡ filesystem [LOW]
│       🔵 link_read — read value of a symbolic link: readlink
+++     🔵 directory/traverse — traverse filesystem hierarchy: _fts_children, _fts_close, _fts_open, _fts_read, _fts_set
│     X networking [LOW → NONE]
---     🔵 url/embedded — contains embedded HTTPS URLs
│     X process [MEDIUM → NONE]
---     🟡 name_set — get or set the current process name
│

Do we want to go this route? Two files of different types with the same name (sans extension) would show odd behavior drift but maybe we want that? Conversely, two files with different names may be the same underlying file and the diff would show as deleted/added when it would be more appropriate to show as changed.

@tstromberg
Copy link
Collaborator Author

FWIW, like diff(1), the file name shouldn't matter at all if two filenames are being passed into it. Filename logic only happens when directories are passed in.

@tstromberg
Copy link
Collaborator Author

I mentioned relative paths in the initial bug report, but this happens for absolute paths too:

go run ./cmd/mal/ diff ../malcontent-samples/linux/clean/ls.x86_64 /bin/ls

@egibs
Copy link
Member

egibs commented Nov 15, 2024

FWIW, like diff(1), the file name shouldn't matter at all if two filenames are being passed into it. Filename logic only happens when directories are passed in.

Ah okay, this clarifies things. I'll hack on that tomorrow.

@egibs
Copy link
Member

egibs commented Nov 15, 2024

Addressed via #628.

@egibs egibs closed this as completed Nov 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants