Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial commit for dissect.executable.pe #10

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

sud0woodo
Copy link

First version for the dissect PE format parser. Currently the parser supports standard PE files, as well as some patching of executables and building an executable from scratch. It is expected there will be some issues that will be found during usage and it will need quite some refactoring. The API as it currently exists might need some revising too :)

First version for the dissect PE format parser. Currently the parser
supports standard PE files, as well as some patching of executables and
building an executable from scratch. It is expected there will be some
issues that will be found during usage and it will need quite some
refactoring. The API as it currently exists might need some revising too
:)
Copy link

codecov bot commented Mar 9, 2024

Codecov Report

Attention: Patch coverage is 87.37864% with 143 lines in your changes are missing coverage. Please review.

Project coverage is 88.60%. Comparing base (4f08029) to head (137d17b).

Files Patch % Lines
dissect/executable/pe/helpers/resources.py 73.65% 44 Missing ⚠️
dissect/executable/pe/pe.py 82.88% 38 Missing ⚠️
dissect/executable/pe/helpers/sections.py 85.08% 17 Missing ⚠️
dissect/executable/pe/helpers/tls.py 68.62% 16 Missing ⚠️
dissect/executable/pe/helpers/imports.py 94.02% 8 Missing ⚠️
dissect/executable/pe/helpers/patcher.py 95.69% 8 Missing ⚠️
dissect/executable/pe/helpers/exports.py 87.23% 6 Missing ⚠️
dissect/executable/pe/helpers/builder.py 98.03% 3 Missing ⚠️
dissect/executable/pe/helpers/relocations.py 89.65% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##            main      #10       +/-   ##
==========================================
+ Coverage   0.00%   88.60%   +88.60%     
==========================================
  Files          5       17       +12     
  Lines        306     1439     +1133     
==========================================
+ Hits           0     1275     +1275     
+ Misses       306      164      -142     
Flag Coverage Δ
unittests 88.60% <87.37%> (+88.60%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@Miauwkeru Miauwkeru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have done an initial review of the code, mostly just commented on some typehint weirdness and the like. I'll go deeper into it at a later time.

dissect/executable/pe/helpers/c_pe.py Outdated Show resolved Hide resolved
dissect/executable/pe/helpers/c_pe.py Outdated Show resolved Hide resolved
dissect/executable/pe/helpers/c_pe.py Outdated Show resolved Hide resolved
dissect/executable/pe/helpers/c_pe.py Outdated Show resolved Hide resolved
dissect/executable/pe/helpers/c_pe.py Outdated Show resolved Hide resolved
tests/test_pe_modifications.py Outdated Show resolved Hide resolved
tests/test_pe_modifications.py Outdated Show resolved Hide resolved
tests/test_pe_modifications.py Outdated Show resolved Hide resolved
tests/test_pe_modifications.py Outdated Show resolved Hide resolved
dissect/executable/pe/helpers/builder.py Outdated Show resolved Hide resolved
dissect/executable/pe/helpers/c_pe.py Outdated Show resolved Hide resolved
dissect/executable/pe/helpers/c_pe.py Outdated Show resolved Hide resolved
self.patched_pe = BytesIO()
self.functions = []

@property
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This being a property/attribute doesn't make much sense. I wouldn't expect assigning the attribute to some other variable would build the whole patched binary.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's just a design preference, happy to discuss other options for this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

depends on what you expect of a property, most properies just return a value of some internal state. This goes a bit beyond that IMHO, where it produces the whole binary. Feels like it should be the responsibility of a method instead.

dissect/executable/pe/pe.py Outdated Show resolved Hide resolved
dissect/executable/pe/pe.py Outdated Show resolved Hide resolved
dissect/executable/pe/helpers/resources.py Outdated Show resolved Hide resolved
Comment on lines +85 to +93
self.raw_resources.append(
{
"offset": entry.OffsetToDirectory,
"entry": data_entry,
"data": data,
"data_offset": raw_offset,
"resource": rsrc,
}
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having an additional raw_resources list, why not add these offset and information to Resource itself? Where you can have the same functionality as a dict if you add def raw() to Resource

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kept for patching resources, adding them as a dict like proposed would need some more extensive refactoring of the code. We might opt for that indeed as it would be cleaner

Comment on lines 376 to 405
for rsrc_entry in sorted(self.pe.raw_resources, key=lambda rsrc: rsrc["data_offset"]):
entry_offset = rsrc_entry["offset"]
entry = rsrc_entry["entry"]

if entry._type.name == "IMAGE_RESOURCE_DATA_ENTRY":
rsrc_obj = rsrc_entry["resource"]
data_offset = rsrc_entry["data_offset"]

# Normally the data is separated by a null byte, increment the new offset by 1
new_data_offset = prev_offset + prev_size
# if new_data_offset and (new_data_offset > data_offset or new_data_offset < data_offset):
if new_data_offset and new_data_offset != data_offset:
data_offset = new_data_offset
rsrc_entry["data_offset"] = data_offset
rsrc_obj.offset = self.section.virtual_address + data_offset

data = rsrc_obj.data

# Write the resource entry data into the section
section_data.seek(data_offset)
section_data.write(data)

# Take note of the offset and size so we can update any of these values after changing the data within
# the resource
prev_offset = data_offset
prev_size = rsrc_obj.size

# Write the resource entry into the section
section_data.seek(entry_offset)
section_data.write(entry.dumps())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels out of place in the Resource class. I wouldn't imagine that everything gets immediately changed when assigning data to a resource. I would expect either the manager to take care of that (which holds the raw resources) or a seperate utility that does so during patching.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to move those things around if it doesn't make sense in the place where it is at right now!

from dissect.executable.pe.pe import PE


class PESection:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also a random thought, in line with the other managers and reduce a bit of the complexity of this class.
Is a SectionManager an idea that does the patching and keeping account of the patched and unpatched sections?

aka, something like:

class PESectionManager(Manager):
    def __init__(self, pe, section, ...):
       self.patched_items = {}

    def patch(self, name, data):
       # get data from self.items
       # Copy data
       # insert it in self.patched_items

    def __getitem__(self, name: str):
        if section := self.patched_items.get(name):
            return section
        else:
            return super().__getitem__(name)

    def parse():
        ... # Do something

this is just a rough draft of such a manager class

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I like the idea, initially that is kind of what I had in mind as well but I lacked (still do) some understanding of the language to do it in a nice manner.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

want me to look into that a bit?

Comment on lines 186 to 204
for name, section in self.pe.patched_sections.items():
if section.virtual_address == prev_va:
continue

pointer_to_raw_data = utils.align_int(integer=prev_ptr + prev_size, blocksize=self.pe.file_alignment)
virtual_address = utils.align_int(integer=prev_va + prev_vsize, blocksize=self.pe.section_alignment)

if section.virtual_address < virtual_address:
"""Set the virtual address and raw pointer of the section to the new values, but only do so if the
section virtual address is lower than the previous section. We want to prevent messing up RVA's as
much as possible, this could lead to binaries that are a bit larger than they need to be but that
doesn't really matter."""
self.pe.patched_sections[name].virtual_address = virtual_address
self.pe.patched_sections[name].pointer_to_raw_data = pointer_to_raw_data

prev_ptr = pointer_to_raw_data
prev_size = section.size_of_raw_data
prev_va = virtual_address
prev_vsize = section.virtual_size
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as the Resource class, it doesn't feel like this class should be responsible for patching all the other sections.

@Miauwkeru
Copy link
Contributor

to help with a lot of these style issues, you could check out the new linting/formatting we plan to use. It will probably also warn you about stuff I missed: fox-it/dissect.util#52

* Refactored based on comments given
* Ran `ruff` for formatting
* Applied style and formatting on `dissect.executable.elf` as well (hope you don't mind)
Copy link
Contributor

@Miauwkeru Miauwkeru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I answered your comments a bit, if you need help with any of the changes I requested do not hesitate to ask!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

(4) Initial commit for Dissect PE format parser PR #10
2 participants