fix #82: correctly handle filenames reported by git.ls_files()
when they contain unicode escapes
#83
Summary
This PR addresses an issue (#82) with the handling of filenames containing non-ASCII characters in the `get_tracked_files` method of the `Coder` class. The previous implementation did not decode the Unicode escape sequences in the filenames (which is how `git.ls_files()` returns them), leading to a `FileNotFoundError` when trying to open these files.
Changes
The change is in the `get_tracked_files` method. When a filename is enclosed in quotes, indicating that it contains special characters, we now strip those quotes and then decode the filename using a combination of the 'latin1', 'unicode_escape', and 'utf-8' encodings. This ensures that the Unicode escape sequences are decoded to their corresponding non-ASCII characters.
Here's the updated `get_tracked_files` method:
Testing
The changes have been tested with filenames containing non-ASCII characters on the https://github.com/cgrothaus/sample-repo-demonstrate-aider-bug-special-filenames repo, and the `get_tracked_files` method now correctly decodes these filenames. As a result, the `FileNotFoundError` no longer occurs when trying to open these files.
The changes have been tested on macOS with Python 3.10.11.final.0. I did not test them on Windows.
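For reviewers who want to try the decoding step in isolation, here is a minimal sketch of the quote-stripping and 'latin1' / 'unicode_escape' / 'utf-8' round trip described under Changes. The helper name `decode_git_quoted_path` and the sample filename are hypothetical and only for illustration; this is not the exact code in `get_tracked_files`. It assumes the quoted name contains only ASCII characters and backslash escapes, which is how git prints such names by default.

```python
def decode_git_quoted_path(name: str) -> str:
    """Decode one line of `git ls-files` output (hypothetical helper).

    Names that git considers unusual are wrapped in double quotes, and
    each non-ASCII byte is written as an octal escape such as \\303\\244.
    """
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        inner = name[1:-1]  # strip the surrounding quotes
        # 1. 'latin1' turns the ASCII text into bytes unchanged.
        # 2. 'unicode_escape' resolves \303 style escapes, one character per byte.
        # 3. 'latin1' again maps those characters back to the raw bytes.
        raw = inner.encode("latin1").decode("unicode_escape").encode("latin1")
        # 4. The raw bytes are the UTF-8 encoding of the real filename.
        return raw.decode("utf-8")
    # Unquoted names are returned as they are.
    return name


if __name__ == "__main__":
    quoted = '"\\303\\244.txt"'  # what git prints for a file named ä.txt
    print(decode_git_quoted_path(quoted))
    print(decode_git_quoted_path("README.md"))
```

The double pass through 'latin1' is what makes this work: 'unicode_escape' alone would yield one character per escaped byte (mojibake), so the bytes have to be recovered and then decoded as UTF-8. If git is configured to leave non-ASCII characters unescaped inside a quoted name, the first 'latin1' step may fail or mis-decode, so this sketch should be read as covering the default output only.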
Impact
This fix improves the robustness of `aider` when dealing with repositories containing files with non-ASCII characters in their names. It should not affect the functionality of `aider` in other respects.