Describe the bug
If one uses the Mach-O parser on a file that has invalid library ordinals in its dyld exports trie, it ends up being accepted and validated with meaningless export info.
To Reproduce
Steps to reproduce the behavior:
Apply the following patch on LIEF's sources in order to have more precise runtime information (easier for me than starting a debugger from Python):
diff --git a/src/MachO/BinaryParser.cpp b/src/MachO/BinaryParser.cpp
index 4c63f22d..5a806a00 100644
--- a/src/MachO/BinaryParser.cpp
+++ b/src/MachO/BinaryParser.cpp
@@ -138,11 +138,14 @@ ok_error_t BinaryParser::init_and_parse() {
ok_error_t BinaryParser::parse_export_trie(exports_list_t& exports, uint64_t start,
uint64_t end, const std::string& prefix)
{
+ LIEF_DEBUG("start={}, end={}, prefix={}", start, end, prefix);
if (stream_->pos() >= end) {
+ LIEF_DEBUG("1");
return make_error_code(lief_errors::read_error);
}
if (start > stream_->pos()) {
+ LIEF_DEBUG("2");
return make_error_code(lief_errors::read_error);
}
@@ -154,10 +157,12 @@ ok_error_t BinaryParser::parse_export_trie(exports_list_t& exports, uint64_t sta
uint64_t children_offset = stream_->pos() + *terminal_size;
if (*terminal_size != 0) {
+ LIEF_DEBUG("3");
uint64_t offset = stream_->pos() - start;
auto res_flags = stream_->read_uleb128();
if (!res_flags) {
+ LIEF_DEBUG("4");
return make_error_code(lief_errors::read_error);
}
uint64_t flags = *res_flags;
@@ -168,14 +173,18 @@ ok_error_t BinaryParser::parse_export_trie(exports_list_t& exports, uint64_t sta
Symbol* symbol = nullptr;
auto search = memoized_symbols_.find(symbol_name);
if (search != memoized_symbols_.end()) {
+ LIEF_DEBUG("5");
symbol = search->second;
} else {
+ LIEF_DEBUG("6");
symbol = binary_->get_symbol(symbol_name);
}
if (symbol != nullptr) {
+ LIEF_DEBUG("7");
export_info->symbol_ = symbol;
symbol->export_info_ = export_info.get();
} else { // Register it into the symbol table
+ LIEF_DEBUG("8");
auto symbol = std::make_unique<Symbol>();
symbol->origin_ = SYMBOL_ORIGINS::SYM_ORIGIN_DYLD_EXPORT;
@@ -194,6 +203,7 @@ ok_error_t BinaryParser::parse_export_trie(exports_list_t& exports, uint64_t sta
// REEXPORT
// ========
if (export_info->has(EXPORT_SYMBOL_FLAGS::EXPORT_SYMBOL_FLAGS_REEXPORT)) {
+ LIEF_DEBUG("9");
auto res_ordinal = stream_->read_uleb128();
if (!res_ordinal) {
LIEF_ERR("Can't read uleb128 to determine the ordinal value");
@@ -208,21 +218,26 @@ ok_error_t BinaryParser::parse_export_trie(exports_list_t& exports, uint64_t sta
return make_error_code(lief_errors::parsing_error);
}
if (imported_name->empty() && export_info->has_symbol()) {
+ LIEF_DEBUG("10");
imported_name = export_info->symbol()->name();
}
Symbol* symbol = nullptr;
auto search = memoized_symbols_.find(*imported_name);
if (search != memoized_symbols_.end()) {
+ LIEF_DEBUG("11");
symbol = search->second;
} else {
+ LIEF_DEBUG("12");
symbol = binary_->get_symbol(*imported_name);
}
if (symbol != nullptr) {
+ LIEF_DEBUG("13");
export_info->alias_ = symbol;
symbol->export_info_ = export_info.get();
symbol->value_ = export_info->address();
} else {
+ LIEF_DEBUG("14");
auto symbol = std::make_unique<Symbol>();
symbol->origin_ = SYMBOL_ORIGINS::SYM_ORIGIN_DYLD_EXPORT;
symbol->value_ = export_info->address();
@@ -239,12 +254,15 @@ ok_error_t BinaryParser::parse_export_trie(exports_list_t& exports, uint64_t sta
if (ordinal < binary_->libraries().size()) {
+ LIEF_DEBUG("15");
DylibCommand& lib = binary_->libraries()[ordinal];
export_info->alias_location_ = &lib;
} else {
+ LIEF_DEBUG("16");
// TODO: Corrupted library name
}
} else {
+ LIEF_DEBUG("17");
auto address = stream_->read_uleb128();
if (!address) {
LIEF_ERR("Can't read export address");
@@ -256,6 +274,7 @@ ok_error_t BinaryParser::parse_export_trie(exports_list_t& exports, uint64_t sta
// STUB_AND_RESOLVER
// =================
if (export_info->has(EXPORT_SYMBOL_FLAGS::EXPORT_SYMBOL_FLAGS_STUB_AND_RESOLVER)) {
+ LIEF_DEBUG("18");
auto other = stream_->read_uleb128();
if (!other) {
LIEF_ERR("Can't read 'other' value for the export info");
@@ -274,6 +293,7 @@ ok_error_t BinaryParser::parse_export_trie(exports_list_t& exports, uint64_t sta
return make_error_code(lief_errors::parsing_error);
}
for (size_t i = 0; i < *nb_children; ++i) {
+ LIEF_DEBUG("19, i={}", i);
auto suffix = stream_->read_string();
if (!suffix) {
LIEF_ERR("Can't read suffix");
@@ -289,10 +309,12 @@ ok_error_t BinaryParser::parse_export_trie(exports_list_t& exports, uint64_t sta
auto child_node_offet = static_cast<uint32_t>(*res_child_node_offet);
if (child_node_offet == 0) {
+ LIEF_DEBUG("20");
break;
}
if (visited_.count(start + child_node_offet) > 0) {
+ LIEF_DEBUG("21");
break;
}
visited_.insert(start + child_node_offet);
@@ -301,6 +323,7 @@ ok_error_t BinaryParser::parse_export_trie(exports_list_t& exports, uint64_t sta
parse_export_trie(exports, start, end, name);
stream_->setpos(current_pos);
}
+ LIEF_DEBUG("22");
return ok();
}
@@ -353,6 +376,7 @@ ok_error_t BinaryParser::parse_dyldinfo_export() {
uint32_t size = std::get<1>(dyldinfo->export_info());
if (offset == 0 || size == 0) {
+ LIEF_DEBUG("empty: exit");
return ok();
}
@@ -374,6 +398,7 @@ ok_error_t BinaryParser::parse_dyldinfo_export() {
dyldinfo->export_trie_ = content.subspan(rel_offset, size);
stream_->setpos(offset);
+ LIEF_DEBUG("0");
parse_export_trie(dyldinfo->export_info_, offset, end_offset, "");
return ok();
}
Enable all debug output in LIEF: compile with -DCMAKE_BUILD_TYPE=Debug -DLIEF_LOGGING=ON -DLIEF_LOGGING_DEBUG=ON and set lief.logging.debug = true in the Python API config file.
Run the Mach-O parser on such a file, for example the Yontoo malware that you can obtain from the following attachment: yontoo.tar.gz. Obviously, one should avoid running the sample itself, although I am not sure how it could even run, as it is heavily broken. Still, just don't.
(Concerning the Python code, I haven't actually run it in this exact form, but something close to it should work if it does not work as-is.)
The following output can be observed:
Parsing MachO
Arch: x86_64
[+] Building Load commands
[+] Parsing dyld information
[+] Parsing symbols
nlist[0].str_idx seems corrupted (0x41358d31)
nlist[1].str_idx seems corrupted (0x358d3148)
nlist[2].str_idx seems corrupted (0x8d314800)
[...]
nlist[3952].str_idx seems corrupted (0x3815ffe7)
nlist[3953].str_idx seems corrupted (0x15ffe789)
nlist[3954].str_idx seems corrupted (0xffe7894c)
[+] Parsing dynamic symbols
[+] Building UUID
[+] Parsing LC_THREAD
FLAVOR: 4 | COUNT: 42
[^] Post processing LC_DYSYMTAB
Indirect symbol index is out of range (3884534784 vs max sym: 3955)
indirect_symbols_.size(): 0 (nb_indirect_symbols: 211)
No relocations in __text
No relocations in __symbol_stub1
No relocations in __cstring
No relocations in __ustring
No relocations in __const
No relocations in __stub_helper
No relocations in __gcc_except_tab
No relocations in __unwind_info
No relocations in __eh_frame
No relocations in __program_vars
No relocations in __nl_symbol_ptr
No relocations in __la_symbol_ptr
No relocations in __const
No relocations in __cfstring
No relocations in __objc_data
No relocations in __objc_msgrefs
No relocations in __objc_selrefs
No relocations in __objc_classrefs
No relocations in __objc_superrefs
No relocations in __objc_const
No relocations in __objc_classlist
No relocations in __objc_catlist
No relocations in __objc_protolist
No relocations in __objc_imageinfo
No relocations in __data
No relocations in __bss
[+] LC_DYLD_INFO.exports
0
start=239544, end=239656, prefix=
3
6
7
9
12
14
16
19, i=0
start=239544, end=239656, prefix=1�5A`
19, i=0
21
22
19, i=1
start=239544, end=239656, prefix=���8`
3
6
8
9
12
14
16
18
19, i=0
21
22
19, i=2
21
22
[+] LC_DYLD_INFO.bindings
Unsupported opcode: 0xf0
[+] LC_DYLD_INFO.rebases
[^] Post processing LC_SYMTAB
Heap could be executable
\x89\xe7\xff8`
\x89\xe7\xff8`
which indicates that the following code path is taken:
if (ordinal < binary_->libraries().size()) {
LIEF_DEBUG("15");
DylibCommand& lib = binary_->libraries()[ordinal];
export_info->alias_location_ = &lib;
} else {
LIEF_DEBUG("16"); // <-- this one
// TODO: Corrupted library name
}
in BinaryParser::parse_export_trie from src/MachO/BinaryParser.cpp, meaning that ordinal >= binary_->libraries().size(), which in turn means the number is invalid.
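To make the condition concrete, here is a minimal standalone Python sketch of the two steps involved: decoding the ULEB128-encoded ordinal, then applying the same comparison as the quoted C++ branch. The helper names read_uleb128 and ordinal_in_range are hypothetical stand-ins written for this illustration; they are not LIEF functions, and the sketch assumes the ordinal is used as a 0-based index, as the quoted code does.

```python
from typing import Tuple


def read_uleb128(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode one unsigned LEB128 value from `data`, starting at `pos`.

    Returns (value, next_pos). Raises ValueError if the data ends
    before the final byte of the encoding.
    """
    value = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated uleb128")
        byte = data[pos]
        pos += 1
        # The low 7 bits carry payload, the high bit means "more bytes follow".
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, pos
        shift += 7


def ordinal_in_range(ordinal: int, nb_libraries: int) -> bool:
    """Mirror of the quoted check: ordinal < binary_->libraries().size()."""
    return 0 <= ordinal < nb_libraries
```

With a binary that lists, say, three libraries, a decoded ordinal of 2 takes the "15" branch, while any value of 3 or more falls into the "16" branch seen in the log.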
Expected behavior
The Mach-O executable file inside the FAT archive should be rejected, or at least the exports trie parsing should fail in this case and avoid including broken symbols in the resulting object. If I understand things correctly, // TODO: Corrupted library name should be replaced with a LIEF_ERR(...); followed by a return make_error_code(lief_errors::read_error);, or something like that. After that, there should not be any symbol in the symbols table.
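The behaviour I have in mind could look roughly like the following Python sketch. It is only a toy model of the idea, not LIEF code: ExportTrieError and resolve_reexports are hypothetical names, and the re-exports are assumed to be already decoded into (symbol name, ordinal) pairs.

```python
from typing import Dict, List, Sequence, Tuple


class ExportTrieError(Exception):
    """Raised when the exports trie contains data that cannot be valid."""


def resolve_reexports(
    reexports: Sequence[Tuple[str, int]],
    libraries: List[str],
) -> Dict[str, str]:
    """Map each re-exported symbol name to the library its ordinal points at.

    Fails on the first out-of-range ordinal instead of silently
    continuing, so the caller never receives a partially trusted result.
    """
    resolved: Dict[str, str] = {}
    for name, ordinal in reexports:
        if not 0 <= ordinal < len(libraries):
            raise ExportTrieError(
                f"library ordinal {ordinal} out of range "
                f"(only {len(libraries)} libraries) for symbol {name!r}"
            )
        resolved[name] = libraries[ordinal]
    return resolved
```

Because the function raises before returning anything, a single corrupted ordinal leaves the caller with no export entries at all, which is the all-or-nothing outcome described above.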
Environment:
System and Version: Debian testing.
Target format: Mach-O.
LIEF commit version: 0.13.2-.
As in the reply to #994, the same logic applies here:
LIEF does not aim at following the same logic as the loader. It should cover
all the cases that are supported by the different loaders, but if it covers more,
it should not be a big deal.
In this case, if you need to enforce some checks for your use case, you can add
an additional check on the unicode status of the symbols.
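A user-side check of that kind could be sketched as below. This is only an illustration under stated assumptions: is_plausible_symbol_name is a hypothetical helper, the symbol names are assumed to be available as Python str objects, and the set of rejected Unicode categories is one possible choice rather than a rule taken from LIEF or from any loader.

```python
import unicodedata

# General categories treated as implausible in a symbol name:
# control characters, surrogates, private-use and unassigned code points.
_REJECTED_CATEGORIES = {"Cc", "Cs", "Co", "Cn"}


def is_plausible_symbol_name(name: str) -> bool:
    """Return False for empty names or names containing junk characters."""
    if not name:
        return False
    for ch in name:
        # U+FFFD is what a lossy decoder emits for bytes it cannot decode.
        if ch == "\ufffd":
            return False
        if unicodedata.category(ch) in _REJECTED_CATEGORIES:
            return False
    return True
```

Applied to the log in this issue, a name such as "_main" passes, while the prefixes printed for the corrupted trie nodes are rejected. One could then filter the symbols of the parsed binary with this predicate before using them.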
Thanks for the detailed answer. Indeed, I would have expected the LIEF parser to comply with the "specification". From my point of view, the parsing just ends up being lost in random bytes and the reported data does not make much sense as the original bytes did not either.
However, I understand your design choices and can stand by them. If your decision is final, then this issue can be closed and I'll try to add some additional checks on my end.