Encoding issues with ANSI files. #243

AlexFielder · 2017-03-01T16:05:31Z

Hi all,

We're now using Hound to index a fairly large codebase (>1.5GB), and I just noticed that thousands of files are being ignored because they have ANSI encoding instead of UTF-8. They appear in the excluded_files.json file.

Is this something that can be fixed within Hound or will I likely have to figure out how to change the encoding of all these files?

Thanks,

Alex.
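
To gauge how many files are affected before changing anything, a small script can list the files that do not decode as UTF-8. This is only a sketch: the function names `is_valid_utf8` and `find_non_utf8` are hypothetical, and the check (strict UTF-8 decoding) is an assumption about what "not UTF-8" means here, not a description of how Hound itself decides which files to exclude.

```python
import os


def is_valid_utf8(path):
    """Return True if the file's bytes decode cleanly as UTF-8."""
    with open(path, "rb") as f:
        data = f.read()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def find_non_utf8(root):
    """Walk root and return the sorted paths of files that are not valid UTF-8."""
    bad = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if not is_valid_utf8(path):
                bad.append(path)
    return sorted(bad)
```

Calling `find_non_utf8("C:\\CM")` would return the candidate paths, which can then be compared against the entries in excluded_files.json. Pure-ASCII files pass the check, since ASCII is a subset of UTF-8.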

AlexFielder · 2017-03-22T11:17:56Z

For anyone else having the same issue, here is the (quickest) fix I could come up with:

1. Install Notepad++ (NP++) 32 bit (important because there's currently no plugin manager for the default x64 installation from chocolatey.org!)
2. Install the Python Script plugin for NP++
3. Create new script:
4. Inside of the new script add the following:

import os

# Root folder to scan; change to suit.
filePathSrc = "C:\\CM"
for root, dirs, files in os.walk(filePathSrc):
    for fn in files:
        # Only convert files with the .cm extension.
        if fn.endswith('.cm'):
            path = os.path.join(root, fn)
            notepad.open(path)
            console.write(path + "\r\n")
            notepad.runMenuCommand("Encoding", "Convert to UTF-8")
            notepad.save()
            notepad.close()

(You can set up the script to only work on specific file extensions like I have or change it to suit.)
5. Save and close the script file, then NP++ itself. (This is necessary to get Python Script to pick up the newly saved script file)
6. Reopen NP++ and select your script file (whatever it's called).

7. Go and grab a coffee, because if (as I did) you have 20,000+ files to check it will take some time. Whilst the script is running the system will fight with NP++ for focus as it cycles through the folder structure you pointed it at.

FWIW: I am using an NVMe Samsung SSD and this script barely taxes it; if you have a regular spinny HDD then you might need to be selective where you point it.
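
If driving the NP++ GUI is impractical, the same conversion can be sketched as a standalone Python script. This is an alternative to the steps above, not part of them. It assumes that "ANSI" means Windows-1252 (`cp1252`), which depends on the system locale, so adjust `fallback` if your files use another code page. The names `convert_to_utf8` and `convert_tree` are hypothetical. Files that already decode as UTF-8 are left untouched, and files are rewritten in place, so run it on a copy or a clean checkout first.

```python
import os


def convert_to_utf8(path, fallback="cp1252"):
    """Rewrite path as UTF-8 if it is not already valid UTF-8.

    Returns True if the file was converted, False if it was left alone.
    Assumption: non-UTF-8 files are encoded in `fallback`.
    """
    with open(path, "rb") as f:
        raw = f.read()
    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8 (includes pure ASCII)
    except UnicodeDecodeError:
        pass
    # Raises UnicodeDecodeError if the bytes are not valid in `fallback`.
    text = raw.decode(fallback)
    # Binary mode keeps the original line endings intact.
    with open(path, "wb") as f:
        f.write(text.encode("utf-8"))
    return True


def convert_tree(root, suffix=".cm", fallback="cp1252"):
    """Convert every file under root whose name ends with suffix.

    Returns the number of files that were actually converted.
    """
    converted = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(suffix):
                if convert_to_utf8(os.path.join(dirpath, name), fallback):
                    converted += 1
    return converted
```

For example, `convert_tree("C:\\CM")` would process the same folder and extension as the NP++ script. Note that Python's `cp1252` codec rejects a handful of undefined byte values, so a file containing them raises an error and is not rewritten.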

dschott68 added the question label Jun 28, 2020

tgulacsi mentioned this issue Mar 11, 2021

Add fallback-encoding per-repo option for non-utf8 text files #388

Open
