Added chardet to detect the encoding of the content #6
Conversation
ping @xnl-h4ck3r
Hey @Nishantbhagat57, thanks for the pull request! This looks great! I just need to try it out locally first to make sure.
Hey there, thank you for reviewing this! My bad, the last commit regarding the URL change was intended for my personal fork and ended up in this PR inadvertently. Didn't think it would sneak into this PR :|
Can you remove that commit from the pull request?
Thanks!
Also, I have a question... you mentioned that character
Hi @xnl-h4ck3r I have implemented the changes. Now, about the symbol Even though the endpoint isn't correct, which could be categorized as a false positive, I believe That's why I suggest continuing with the chardet modification despite the input error - it maintains our ability to handle encoding/decoding issues effectively :)
Hi @Nishantbhagat57, I agree 100% it shouldn't crash with an error, even if it's not a valid URL.
@xnl-h4ck3r I don't know why it's saying But you can try these things:
I tested the modified script again and it works without issue in my environment.
Can you run these and let me know what versions you have?
Thank you. Sorry this is taking so long to sort out.
I get exactly the same, which, although very strange, doesn't help me figure out why I can't get mine to run with chardet :(
Hey @Nishantbhagat57. Sorry to go quiet on this... I am still trying to figure out what the problem is. It's definitely something on my setup rather than your changes. I still need to fix it on mine so I can test properly first. Thanks for your patience.
When running urless on a txt file I encountered this error:
ERROR processInput 2 'utf-8' codec can't decode byte 0xac in position 5030: invalid start byte
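This class of error is easy to reproduce in isolation. The snippet below is a minimal sketch, not code from urless: the byte string is made up for illustration, with a lone `0xac` byte placed where UTF-8 expects the start of a character.

```python
# Illustrative bytes only: 0xAC is a UTF-8 continuation byte, so it cannot
# begin a character on its own.
data = b"path" + bytes([0xAC]) + b"rest"

try:
    data.decode("utf-8")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)
# 'utf-8' codec can't decode byte 0xac in position 4: invalid start byte
```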
The error message indicates that Python is trying to decode the file as UTF-8 but encounters a byte (in this case, `0xac`) that is not a valid UTF-8 start byte.
To fix this issue, I have modified the `processInput()` function to use `chardet` to handle encoding/decoding issues without modifying any content. `chardet` can help detect the encoding of the content. We can use that encoding to open the file, rather than blindly assuming `utf-8`.
Before modifications:
After modifications:
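The PR's actual before/after code is not reproduced on this page. As a rough illustration of the approach described above, and not the change made to `processInput()`, the following sketch reads a file as bytes, asks `chardet` for a likely encoding, and decodes with it. The helper name `read_text_detecting_encoding`, the `fallback` parameter, and the use of `errors="replace"` are all assumptions made for this example.

```python
try:
    import chardet  # third-party package: pip install chardet
except ImportError:
    chardet = None  # the sketch then falls back to UTF-8


def read_text_detecting_encoding(path, fallback="utf-8"):
    """Hypothetical helper: read a file as text using a detected encoding."""
    with open(path, "rb") as f:
        raw = f.read()

    encoding = None
    if chardet is not None:
        # chardet.detect returns a dict; "encoding" may be None
        encoding = chardet.detect(raw).get("encoding")

    try:
        # errors="replace" is a choice made for this sketch so that an
        # imperfect guess never raises; undecodable bytes become U+FFFD
        return raw.decode(encoding or fallback, errors="replace")
    except LookupError:
        # Detected name is not a codec Python knows; use the fallback
        return raw.decode(fallback, errors="replace")
```

Note that `errors="replace"` substitutes undecodable bytes, so a caller that must preserve content exactly would need a different error strategy.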