Getting Error: binascii.Error: Incorrect padding #78

Open
s4ksh1 opened this issue Feb 3, 2024 · 1 comment

s4ksh1 commented Feb 3, 2024

My IOC is https://example[.]com/k265/aHR0cHM6Ly91NzAwNy5zY29y

iocextract.extract_urls(IOC, refang=True)

Getting error:
  File "/usr/local/lib/python3.11/dist-packages/iocextract.py", line 522, in extract_encoded_urls
    url = base64.b64decode(url).decode("utf-8", "replace")
          ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/base64.py", line 88, in b64decode
    return binascii.a2b_base64(s, strict_mode=validate)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
binascii.Error: Incorrect padding

How can I ignore base64 strings while extracting URLs from content with iocextract.extract_urls?

How can I skip extraction of encoded strings present in the URI? In the example above, I want to ignore 'aHR0cHM6Ly91NzAwNy5zY29y'.

InQuest deleted a comment from DragonistYJ on Feb 6, 2024

Synse (Contributor) commented Jul 30, 2024

@s4ksh1 iocextract.extract_urls() calls iocextract.extract_unencoded_urls() and iocextract.extract_encoded_urls(). In this case you can just use iocextract.extract_unencoded_urls() directly:

>>> import iocextract
>>> # this extracts all urls, encoded and unencoded
>>> list(iocextract.extract_urls("ioc is hxxps://u7007.scor1[.]com/k265/aHR0cHM6Ly91NzAwNy5zY29y", refang=True))
['https://u7007.scor1.com/k265/aHR0cHM6Ly91NzAwNy5zY29y', 'https://u7007.scor1.com/k265/aHR0cHM6Ly91NzAwNy5zY29y', 'https://u7007.scor']
>>> # this extracts just unencoded urls
>>> list(iocextract.extract_unencoded_urls("ioc is hxxps://u7007.scor1[.]com/k265/aHR0cHM6Ly91NzAwNy5zY29y", refang=True))
['https://u7007.scor1.com/k265/aHR0cHM6Ly91NzAwNy5zY29y', 'https://u7007.scor1.com/k265/aHR0cHM6Ly91NzAwNy5zY29y']
>>> # this extracts just encoded urls
>>> list(iocextract.extract_encoded_urls("ioc is hxxps://u7007.scor1[.]com/k265/aHR0cHM6Ly91NzAwNy5zY29y", refang=True))
['https://u7007.scor']
>>> 
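
For context on the traceback itself, here is a minimal stdlib-only sketch of how base64.b64decode raises binascii.Error on malformed input and how a caller might guard against it. This is not iocextract's code; safe_b64decode is a hypothetical helper name introduced only for illustration, and the truncated 23-character string is an assumed example of bad input.

```python
import base64
import binascii


def safe_b64decode(candidate):
    """Hypothetical helper: decode base64 text, or return None if it is malformed."""
    try:
        raw = base64.b64decode(candidate)
    except binascii.Error:
        # Raised for "Incorrect padding" and other malformed base64 input.
        return None
    return raw.decode("utf-8", "replace")


# Well-formed: 24 characters, a multiple of 4.
print(safe_b64decode("aHR0cHM6Ly91NzAwNy5zY29y"))
# Truncated: 23 characters with no "=" padding, so decoding fails.
print(safe_b64decode("aHR0cHM6Ly91NzAwNy5zY29"))
```

The same try/except pattern could in principle be wrapped around a call that consumes iocextract.extract_urls(), but calling iocextract.extract_unencoded_urls() directly, as shown above, avoids the base64 decoding step altogether.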
