>>> import iocextract
>>> # this extracts all urls, encoded and unencoded
>>> list(iocextract.extract_urls("ioc is hxxps://u7007.scor1[.]com/k265/aHR0cHM6Ly91NzAwNy5zY29y", refang=True))
['https://u7007.scor1.com/k265/aHR0cHM6Ly91NzAwNy5zY29y', 'https://u7007.scor1.com/k265/aHR0cHM6Ly91NzAwNy5zY29y', 'https://u7007.scor']
>>> # this extracts just unencoded urls
>>> list(iocextract.extract_unencoded_urls("ioc is hxxps://u7007.scor1[.]com/k265/aHR0cHM6Ly91NzAwNy5zY29y", refang=True))
['https://u7007.scor1.com/k265/aHR0cHM6Ly91NzAwNy5zY29y', 'https://u7007.scor1.com/k265/aHR0cHM6Ly91NzAwNy5zY29y']
>>> # this extracts just encoded urls
>>> list(iocextract.extract_encoded_urls("ioc is hxxps://u7007.scor1[.]com/k265/aHR0cHM6Ly91NzAwNy5zY29y", refang=True))
['https://u7007.scor']
>>>
My IOC is https://example[.]com/k265/aHR0cHM6Ly91NzAwNy5zY29y
iocextract.extract_urls(IOC, refang=True)
This raises the following error:
  File "/usr/local/lib/python3.11/dist-packages/iocextract.py", line 522, in extract_encoded_urls
    url = base64.b64decode(url).decode("utf-8", "replace")
          ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/base64.py", line 88, in b64decode
    return binascii.a2b_base64(s, strict_mode=validate)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
binascii.Error: Incorrect padding
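For context, `base64.b64decode` raises `binascii.Error` when the number of Base64 characters it receives cannot form complete 4-character groups. The sketch below reproduces that with the standard library only. The helper `try_b64decode` is hypothetical and not part of iocextract, and the truncated input is just one way to trigger the error; whether iocextract passes a similarly cut-off substring for this IOC is an assumption.

```python
import base64
import binascii
from typing import Optional


def try_b64decode(candidate: str) -> Optional[str]:
    """Return the decoded text, or None if candidate is not valid Base64.

    Hypothetical helper for illustration; not part of iocextract.
    """
    try:
        return base64.b64decode(candidate).decode("utf-8", "replace")
    except binascii.Error:
        # Raised for missing padding or an impossible character count.
        return None


full = "aHR0cHM6Ly91NzAwNy5zY29y"  # 24 characters, a multiple of 4
truncated = full[:-1]              # 23 characters, so padding is now missing

decoded_full = try_b64decode(full)            # decodes cleanly
decoded_truncated = try_b64decode(truncated)  # None instead of an exception
```

Catching `binascii.Error` at the decode call is what lets a caller skip a bad candidate instead of aborting the whole extraction.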
How can I ignore Base64 strings when extracting URLs from content with iocextract.extract_urls?
How can I skip extraction of encoded strings that appear inside the URI?
For instance, in the example above I want to skip extraction of 'aHR0cHM6Ly91NzAwNy5zY29y'.
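One possible workaround, sketched under assumptions: the session at the top shows `iocextract.extract_unencoded_urls`, which returns only the refanged URL and may avoid the Base64 decoding step entirely. If `extract_urls` must be used, its generator can be consumed defensively so that URLs yielded before the failure are kept. The example uses a hypothetical stand-in generator, `stand_in_extractor`, so that it runs without iocextract installed. With the real library one would pass `iocextract.extract_urls(text, refang=True)` instead, and the assumption that unencoded URLs are yielded before the failing Base64 step has not been checked against the library source.

```python
import binascii
from typing import Iterable, Iterator, List


def collect_until_decode_error(urls: Iterable[str]) -> List[str]:
    """Collect URLs from an extractor, stopping quietly on a Base64 error."""
    found: List[str] = []
    try:
        for url in urls:
            found.append(url)
    except binascii.Error:
        # A generator cannot resume after raising, so keep what was yielded.
        pass
    return found


def stand_in_extractor(text: str) -> Iterator[str]:
    """Hypothetical stand-in that fails midway, like the traceback above."""
    yield "https://example.com/k265/aHR0cHM6Ly91NzAwNy5zY29y"
    raise binascii.Error("Incorrect padding")


text = "My IOC is https://example[.]com/k265/aHR0cHM6Ly91NzAwNy5zY29y"
results = collect_until_decode_error(stand_in_extractor(text))
```

This does not stop the library from attempting to decode the path segment. It only prevents the exception from discarding results that were already produced.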