viv: insn: string: handle viv bug around substrings #1273

williballenthin · 2023-01-09T13:02:44Z

try to detect the case in which viv incorrectly returns a substring length, and truncate the returned data at the first NULL byte. i think this will work better for ASCII than for UTF16 because we can better guess about embedded NULL bytes.

Checklist

No CHANGELOG update needed

No new tests needed

No documentation update needed

github-actions

Please add bug fixes, new features, breaking changes and anything else you think is worthwhile mentioning to the master (unreleased) section of CHANGELOG.md. If no CHANGELOG update is needed add the following to the PR description: [x] No CHANGELOG update needed

williballenthin · 2023-01-09T13:04:09Z

capa/features/extractors/viv/insn.py

+            if b"\x00" in buf:
+                # account for bug #1271.
+                # remove when vivisect is fixed.
+                buf = buf.partition(b"\x00")[0]


this should really never be the case, but viv is broken
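a minimal standalone sketch of the truncation in the diff above, for illustration only. the helper name `truncate_at_first_nul` and the sample buffer are hypothetical; the real change operates inline on `buf` inside capa/features/extractors/viv/insn.py.

```python
def truncate_at_first_nul(buf: bytes) -> bytes:
    """Return the bytes before the first NUL, or buf unchanged if it has none.

    Hypothetical helper: mirrors the `buf.partition(b"\x00")[0]` workaround.
    """
    if b"\x00" in buf:
        # partition() splits on the first occurrence only, so index 0 is the
        # prefix that precedes the first embedded NUL byte.
        buf = buf.partition(b"\x00")[0]
    return buf


# Made-up over-read: a valid ASCII string, its terminator, then trailing data.
overread = b"GET /\x00leftover"
print(truncate_at_first_nul(overread))  # b'GET /'
```

a buffer without any NUL byte passes through unchanged, so the workaround should be a no-op whenever viv reports the length correctly.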

williballenthin · 2023-01-09T13:05:39Z

capa/features/extractors/viv/insn.py

+            # partition to account for bug #1271.
+            # remove when vivisect is fixed.
+            return read_memory(vw, offset, ulen).decode("utf-16").partition("\x00")[0]


we're not really able to guess at the correct string length. this has an edge case: given UTF16 | NULL | junk, decoding the whole buffer as UTF16 fails because of the junk. we'll fail to extract some valid strings in this case, which is unfortunate.
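the edge case can be reproduced in isolation. this is a sketch under assumptions: the helper `decode_utf16_prefix` is hypothetical, it pins the codec to "utf-16-le" so the result does not depend on platform byte order (the diff uses "utf-16"), and it returns None on failure where the real extractor may handle the error differently.

```python
from typing import Optional


def decode_utf16_prefix(buf: bytes) -> Optional[str]:
    """Decode the whole buffer, then keep the text before the first NUL.

    Hypothetical helper. Returns None when the buffer does not decode,
    which is the edge case: junk after the terminator breaks the decode
    even though the leading string is valid.
    """
    try:
        return buf.decode("utf-16-le").partition("\x00")[0]
    except UnicodeDecodeError:
        return None


terminator = b"\x00\x00"
# UTF16 | NULL | more valid UTF16: decodes fine, prefix is recovered.
clean = "hi".encode("utf-16-le") + terminator + "next".encode("utf-16-le")
# UTF16 | NULL | junk: a lone low surrogate (0xDC00) is not valid UTF-16.
junk = "hi".encode("utf-16-le") + terminator + b"\x00\xdc"

print(decode_utf16_prefix(clean))  # hi
print(decode_utf16_prefix(junk))   # None
```

decoding happens before the partition, so a single invalid code unit anywhere in the over-read region is enough to lose the string.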

mr-tz · 2023-01-09T13:24:58Z

tests/fixtures.py

@@ -627,6 +629,8 @@ def parametrize(params, values, **kwargs):
        ("mimikatz", "function=0x40105D", capa.features.common.String("ACR  > "), True),
        ("mimikatz", "function=0x40105D", capa.features.common.String("nope"), False),
        ("773290...", "function=0x140001140", capa.features.common.String(r"%s:\\OfficePackagesForWDAG"), True),
+        # overlapping string, see #1271
+        ("294b8d...", "function=0x404970,bb=0x404970,insn=0x40499F", capa.features.common.String("\r\n"), True),


so we expect this string?

correct. the sample contains the string "\r\n\r\n\x00HTTP" and this string is fetched using &string[2]. this test demonstrates that the substring handling works as expected (and that the result has no embedded NULL bytes).

capa supports whitespace in strings, including newlines. filtering that out would be another discussion.
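to make the test case concrete, here is a rough model of it. the bytes come from the comment above; everything else is an assumption: the function `read_substring` is hypothetical, and the over-long length of 4 is only a guess at how the viv bug (#1271) shows up for a pointer into the middle of a string.

```python
def read_substring(data: bytes, offset: int, reported_len: int) -> str:
    """Model reading a string through a pointer into the middle of it.

    Hypothetical helper: `reported_len` stands in for the length viv
    reports, which may run past the terminator of the substring.
    """
    raw = data[offset:offset + reported_len]
    # Workaround from this PR: cut at the first NUL byte.
    return raw.partition(b"\x00")[0].decode("ascii")


sample = b"\r\n\r\n\x00HTTP"
# &string[2] with an assumed over-long length of 4 reads b"\r\n\x00H".
extracted = read_substring(sample, offset=2, reported_len=4)
print(repr(extracted))  # '\r\n'
```

without the partition step the raw read would include the NUL and the start of the following "HTTP" string, which is the wrong string the test guards against.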

Thanks for the clarification.

In capa/features/extractors/strings.py we do not extract \r\n so this seems inconsistent between file and instruction scope?

We have a lint to warn on strings with length smaller than 4, "capa only extracts strings with length >= 4"?

Should we make this uniform across the extractors or keep as is (since we have more knowledge at the instruction scope, for example)?

ah, great points!

let's create two new issues, one for each of those points.

this is still the only test case i know of for this issue, but maybe we can find another that more clearly shows the problem.

Maybe test here instead that the wrong string is not extracted?

CHANGELOG updated or no update needed, thanks! 😄

closes #1271

mr-tz · 2023-01-11T12:55:42Z

ca9c940 closes #1278

mr-tz · 2023-01-11T13:37:20Z

@williballenthin, please review

williballenthin

looks good to me (can't review because i'm an author)

williballenthin added the bug label Jan 9, 2023

williballenthin added this to the 5.0.0 milestone Jan 9, 2023

github-actions bot previously requested changes Jan 9, 2023


williballenthin commented Jan 9, 2023


mr-tz reviewed Jan 9, 2023


This was referenced Jan 10, 2023

Strings include or don't include \r\n #1277

Closed

Minimum string length across scopes/extractors #1278

Open

williballenthin and others added 4 commits January 11, 2023 13:52

viv: insn: string: handle viv bug around substrings

ffef4a2

closes #1271

use minimum string length 4

ca9c940

update changelog

3e83a66

update overlapping string test

dcf5a0f

mr-tz force-pushed the fix/issue-1271 branch from 1e0aa4d to dcf5a0f Compare January 11, 2023 12:53

fixup vivisect elf analysis missing function

25a1575

williballenthin commented Jan 19, 2023


Merge branch 'master' into fix/issue-1271

4c1828d

mr-tz merged commit 5513d4c into master Jan 19, 2023

mr-tz deleted the fix/issue-1271 branch January 19, 2023 12:02
