Updated to include a GPT version with significantly better performance and accuracy than the original. You can see an example here: https://github.com/ip-rw/yakit_english/ (a large, mature Electron GUI translated from Chinese into English without manual intervention).
A rough and ready way to translate source code into English. It extracts blocks of non-ASCII text, translates them, and writes the results back to the file. It expects a list of file paths to translate, either piped in or passed as arguments. It uses GoogleTranslator from the deep_translator library and works best via a pool of rotating proxies. There's no reason the other deep_translator backends wouldn't work; they're just untested.
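As a rough illustration of the extract-and-write-back idea, here is a minimal sketch. It is not the code in main.py: the regex, the function name translate_blocks, and the translate callable are all assumptions made for the example, and the real tool will differ in how it decides what counts as a block.

```python
import re

# Assumption: a "block" is a run of non-ASCII characters, optionally joined
# by plain spaces or tabs. The real tool may segment text differently.
NON_ASCII_RUN = re.compile(r"[^\x00-\x7f]+(?:[ \t]+[^\x00-\x7f]+)*")


def translate_blocks(source, translate):
    """Replace every non-ASCII block in `source` with translate(block).

    `translate` is a hypothetical callable (str -> str); in practice it
    would wrap whichever translation backend is in use.
    """
    return NON_ASCII_RUN.sub(lambda match: translate(match.group(0)), source)


if __name__ == "__main__":
    # Stand-in "translator" so the sketch runs without network access.
    table = {"枚举域名": "Enumerate the domain name"}
    line = 'Usage: "枚举域名",'
    print(translate_blocks(line, lambda text: table.get(text, text)))
```

Everything that is plain ASCII (code, punctuation, identifiers) passes through untouched, which is what keeps the surrounding source intact.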
It's been tested with Russian and Chinese source and does as good a job as one could hope. YMMV but it seems to be okay at not mangling files.
It batches things up and handles Google's antics as best it can. There's a fair bit of juggling, but it should go as quickly as it can without filling the files with nonsense.
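The batching and juggling could look something like the sketch below. This is an assumption about the general shape, not the tool's actual logic: translate_all, batched, the batch size, the retry count, and the backoff delays are all hypothetical, and translate_batch stands in for whatever call sends a list of strings to the backend.

```python
import time


def batched(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate_all(blocks, translate_batch, batch_size=20, retries=3,
                  delay=1.0, sleep=time.sleep):
    """Translate `blocks` batch by batch, retrying failed batches.

    `translate_batch` is a hypothetical callable (list[str] -> list[str]).
    If a batch keeps failing, or comes back with the wrong number of
    items, the original text is kept rather than writing nonsense.
    """
    results = []
    for batch in batched(blocks, batch_size):
        translated = list(batch)  # fallback: leave the text untranslated
        for attempt in range(retries):
            try:
                candidate = translate_batch(batch)
            except Exception:
                # Back off exponentially before the next attempt.
                if attempt < retries - 1:
                    sleep(delay * 2 ** attempt)
                continue
            if len(candidate) == len(batch):
                translated = candidate
                break
        results.extend(translated)
    return results
```

Falling back to the original text on failure is a deliberate choice in this sketch: an untranslated string is easy to spot and rerun, while a misaligned batch would silently corrupt the file.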
I made it because Chinese developers in particular release a lot of interesting code now, and unfortunately it's just squiggles to me.
I use it like this:
find ~/ksubdomain -type f |grep -v 'git\|svg'| python3 main.py
If you want to supercharge things (and have a rotating proxy handy), use xargs, but beware of unescaped file paths (I avoid spaces):
find ~/ksubdomain -type f |grep -v 'git\|svg'| xargs -n 30 -P5 python3 main.py
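The two invocations above feed paths in differently: the first pipes them on stdin, the second passes them as arguments via xargs. A minimal sketch of accepting both follows; collect_paths is a hypothetical helper written for this example, and main.py may well handle its input differently.

```python
import sys


def collect_paths(argv, stdin):
    """Return file paths from arguments if given, otherwise from stdin.

    Hypothetical helper: arguments win when present (the xargs case);
    otherwise one path is read per line of stdin (the plain pipe case).
    """
    if argv:
        return list(argv)
    return [line.rstrip("\n") for line in stdin if line.strip()]


if __name__ == "__main__":
    for path in collect_paths(sys.argv[1:], sys.stdin):
        print(path)
```

Reading one path per line keeps paths containing spaces intact, whereas xargs splits its input on whitespace by default, which is where the warning about unescaped paths comes from.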
You'll need Python 3 along with the following packages:
- argparse (bundled with Python 3, no separate install needed)
- cypunct
- charset-normalizer
- deep_translator
- thefuzz
Before:
user@flex:~/ksubdomain$ go run cmd/ksubdomain/*.go e
NAME:
cmd enum - 枚举域名
USAGE:
cmd enum [command options] [arguments...]
OPTIONS:
--domain value, -d value 域名
--band value, -b value 宽带的下行速度,可以5M,5K,5G (default: "2m")
--resolvers value, -r value dns服务器文件路径,一行一个dns地址,默认会使用内置dns
--output value, -o value 输出文件名
--silent 使用后屏幕将仅输出域名 (default: false)
--retry value 重试次数,当为-1时将一直重试 (default: 3)
--timeout value 超时时间 (default: 6)
--stdin 接受stdin输入 (default: false)
--only-domain, --od 只打印域名,不显示ip (default: false)
--not-print, --np 不打印域名结果 (default: false)
--dns-type value dns类型 可以是a,aaaa,ns,cname,txt (default: "a")
--domainList value, --dl value 从文件中指定域名
--filename value, -f value 字典路径
--skip-wild 跳过泛解析域名 (default: false)
--ns 读取域名ns记录并加入到ns解析器中 (default: false)
--level value, -l value 枚举几级域名,默认为2,二级域名 (default: 2)
--level-dict value, --ld value 枚举多级域名的字典文件,当level大于2时候使用,不填则会默认
--help, -h show help (default: false)
After:
user@flex:~/ksubdomain$ go run cmd/ksubdomain/*.go e
NAME:
cmd enum - Enumerate the domain name
USAGE:
cmd enum [command options] [arguments...]
OPTIONS:
--domain value, -d value Domain name
--band value, -b value broadband downlink speed, can be 5M, 5K, 5G (default: "2m")
--resolvers value, -r value dns server file path, one dns address per line, the default will use the built-in dns
--output value, -o value output file name
--silent After using it, the screen will only output the domain name (default: false)
--retry value retry times, when it is -1, (default: 3)
--timeout value timeout Time (default: 6)
--stdin accepts stdin input (default: false)
--only-domain, --od only prints the domain name, does not display the ip (default: false)
--not-print, --np does not print the domain name result (default: false)
--dns-type value dns type can be a, aaaa, ns, cname, txt (default: "a")
--domainList value, --dl value Specify the domain name
--filename value, -f value dictionary path
--skip-wild skip pan-analysis domain name (default: false)
--ns Read the ns record of the domain name and add it to the ns parser (default: false)
--level value, -l value Enumerate several levels of domain names, the default is 2, the second-level domain name (default: 2)
--level-dict value, --ld value Enumerate the dictionary file of multi-level domain names, used when the level is greater than 2, if not filled, it will default to
--help, -h show help (default: false)