This repository has been archived by the owner on Mar 30, 2023. It is now read-only.

HOW TO FIX - Traceback (most recent call last): File "/usr/local/bin/twint", line 33, in <module> && File "/home/rangersmyth_74/.local/lib/python3.7/site-packages/twint/token.py", line 69, in refresh raise RefreshTokenException('Could not find the Guest token in HTML') twint.token.RefreshTokenException: Could not find the Guest token in HTML #1320

Open
rangersmyth74 opened this issue Dec 30, 2021 · 24 comments · Fixed by jarle/twint#2 · May be fixed by #1322


rangersmyth74 commented Dec 30, 2021

I really love this app, even though I have only used it once. Now I want to use it again and I can't. I have searched so much that my eyes are sore. Why am I getting these errors, and why can't I find a fix for them?! hehe...

Linux Kali-ROG 5.14.0-kali4-amd64 #1 SMP Debian 5.14.16-1kali1 (2021-11-05) x86_64 GNU/Linux

Please help, not just me but anyone who searches the 400 pages of information to find an answer. I have spent 9 hours on this, time I will never get back, and I just want to chuck in Linux altogether and do anything else.

  • [] Python version is 3.6;
  • [] Updated Twint with pip3 install --user --upgrade -e git+https://github.com/twintproject/twint.git@origin/master#egg=twint;

These are the steps I took to install twint and try to get it to work on 2 versions of Linux and 1 Deb. I have spent 9 hours trying my best, but found no fix anywhere, none at all, not one.

The errors I am getting are all listed below.

1: Install on Kali. This did not work for me.

Linux Kali-ROG 5.14.0-kali4-amd64 #1 SMP Debian 5.14.16-1kali1 (2021-11-05) x86_64 GNU/Linux

Instructions for install : https://github.com/twintproject/twint

    git clone --depth=1 https://github.com/twintproject/twint.git
    cd twint
    pip3 install . -r requirements.txt        

Try and run these commands to see if it works.

    $ twint -s pineapple
    $ twint -u networkchuck -s "raspberry pi"

2: Install on Kali AWS. This did not work for me.

Linux kali 5.15.0-kali2-cloud-amd64 #1 SMP Debian 5.15.5-2kali2 (2021-12-22) x86_64 GNU/Linux

It seems that twint and AWS do not work due to

Instructions for install : https://www.geeksforgeeks.org/how-to-use-twint-osint-tool-on-google-cloud-console/

    sudo git clone --depth=1 https://github.com/twintproject/twint.git
    sudo cd twint
    sudo pip3 install . -r requirements.txt
    sudo pip3 install twint

Try and run these commands to see if it works.

    $ twint -s pineapple
    $ twint -u networkchuck -s "raspberry pi" 

3: Install on Google Cloud Shell. This worked for me.

Lightbox Linux cs-899333161534-default-boost-h8jtm 5.10.68+ #1 SMP Wed Dec 1 10:07:21 UTC 2021 x86_64 GNU/Linux

Instructions for install: https://www.geeksforgeeks.org/how-to-use-twint-osint-tool-on-google-cloud-console/

FYI 1 - Google didn't like the command git clone --depth=1

So I removed it and it installed.

FYI 2 - The pip3 install twint command said everything was already installed, but I added sudo anyway.

$ sudo git clone https://github.com/twintproject/twint.git
$ ls
$ cd twint
$ ls
$ sudo pip3 install . -r requirements.txt
$ sudo pip3 install twint

Ran commands to see if it works.

    $ twint -s pineapple
    $ twint -u networkchuck -s "raspberry pi"

I was able to run twint until I logged out. After I logged back in, it didn't work anymore.

The Kali twint install journey.

My Install on Kali Linux.

┌──(kali㉿kali)-[~]
└─$ git clone --depth=1 https://github.com/twintproject/twint.git
Cloning into 'twint'...
remote: Enumerating objects: 47, done.
remote: Counting objects: 100% (47/47), done.
remote: Compressing objects: 100% (44/44), done.
remote: Total 47 (delta 3), reused 14 (delta 0), pack-reused 0
Receiving objects: 100% (47/47), 42.95 KiB | 1.59 MiB/s, done.
Resolving deltas: 100% (3/3), done.

┌──(kali㉿kali)-[~]
└─$ ls
Desktop Documents Downloads Music Pictures Public Templates Videos bash.txt nmap twint

┌──(kali㉿kali)-[~]
└─$ cd twint

┌──(kali㉿kali)-[~/twint]
└─$cd twint

┌──(kali㉿kali)-[~/twint]
└─$ pip3 install . -r requirements.txt
Command 'pip3' not found, but can be installed with:
sudo apt install python3-pip
Do you want to install it? (N/y)y
sudo apt install python3-pip

┌──(kali㉿kali)-[~/twint]
└─$ pip3 install . -r requirements.txt

WARNING: The script twint is installed in '/home/kali/.local/bin' which is not on PATH.

Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed aiohttp-socks-0.4.1 cchardet-2.1.7 dataclasses-0.6 elasticsearch-7.16.2 fake-useragent-0.1.11 geographiclib-1.52 geopy-2.2.0 googletransx-2.4.2 pandas-1.3.5 schedule-1.1.0 twint-2.1.21

(I added two commands for the path later on in the install, see far below).

┌──(kali㉿kali)-[~/.local]
└─$ cd bin

┌──(kali㉿kali)-[~/.local/bin]
└─$ ls
cchardetect translate twint

┌──(kali㉿kali)-[~/.local/bin]
└─$ twint -u networkchuck
Traceback (most recent call last):
  File "/usr/local/bin/twint", line 33, in <module>
    sys.exit(load_entry_point('twint==2.1.21', 'console_scripts', 'twint')())
  File "/usr/local/bin/twint", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "/usr/lib/python3.9/importlib/metadata.py", line 77, in load
    module = import_module(match.group('module'))
  File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 972, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/home/kali/.local/lib/python3.9/site-packages/twint/__init__.py", line 14, in <module>
    from . import run
  File "/home/kali/.local/lib/python3.9/site-packages/twint/run.py", line 4, in <module>
    from . import datelock, feed, get, output, verbose, storage
  File "/home/kali/.local/lib/python3.9/site-packages/twint/get.py", line 12, in <module>
    from aiohttp_socks import ProxyConnector, ProxyType
  File "/home/kali/.local/lib/python3.9/site-packages/aiohttp_socks/__init__.py", line 5, in <module>
    from .connector import (
  File "/home/kali/.local/lib/python3.9/site-packages/aiohttp_socks/connector.py", line 8, in <module>
    from aiohttp.helpers import CeilTimeout # noqa
ImportError: cannot import name 'CeilTimeout' from 'aiohttp.helpers' (/usr/lib/python3/dist-packages/aiohttp/helpers.py)
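
The last frames show aiohttp_socks (installed under ~/.local) asking for CeilTimeout from the aiohttp that lives under /usr/lib/python3/dist-packages. A minimal sketch for checking whether a module still exposes a given name, assuming a mismatch between those two packages is what triggers the error; module_exposes is a hypothetical helper, not part of twint or aiohttp:

```python
import importlib


def module_exposes(module_name, attr):
    """Return True/False if the module imports, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, attr)


# A result of False here would match the ImportError in the traceback;
# None means aiohttp itself could not be imported.
print(module_exposes("aiohttp.helpers", "CeilTimeout"))
```

If this prints False, the installed aiohttp and aiohttp_socks versions do not fit together, and changing PATH would not be expected to help.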

┌──(kali㉿kali)-[~/.local/bin]
└─$ echo $SHELL

/bin/zsh

┌──(kali㉿kali)-[~/.local/bin]
└─$ bash

┌──(kali㉿kali)-[~/.local/bin]
└─$ myip
Command 'myip' not found, did you mean:
command 'mzip' from deb mtools
Try: sudo apt install

┌──(kali㉿kali)-[~/.local/bin]
└─$ twint -u networkchuck -s "raspberry pi"
Traceback (most recent call last):
[... identical to the traceback above ...]
ImportError: cannot import name 'CeilTimeout' from 'aiohttp.helpers' (/usr/lib/python3/dist-packages/aiohttp/helpers.py)

┌──(kali㉿kali)-[~/.local/bin]
└─$ cd ..

┌──(kali㉿kali)-[~/.local]
└─$ cd..
cd..: command not found

┌──(kali㉿kali)-[~/.local]
└─$ cd ..

┌──(kali㉿kali)-[~]
└─$

┌──(kali㉿kali)-[~]
└─$ twint -u networkchuck -s "raspberry pi"
Command 'twint' not found, did you mean:
command 'twine' from deb twine
Try: sudo apt install

┌──(kali㉿kali)-[~/twint]
└─$ python3
Python 3.9.9 (main, Dec 16 2021, 23:13:29)
[GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

┌──(kali㉿kali)-[~/twint]
└─$ twint -u networkchuck -s "raspberry pi"
Command 'twint' not found, did you mean:
command 'twine' from deb twine
Try: sudo apt install

┌──(kali㉿kali)-[~]
└─$ sudo apt update && sudo apt upgrade

┌──(kali㉿kali)-[~]
└─$ systemctl reboot -i

┌──(kali㉿kali)-[~/twint]
└─$cd twint

┌──(kali㉿kali)-[~/twint]
└─$ chmod +x setup.py

┌──(kali㉿kali)-[~/twint]
└─$ python3 setup.py
usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: setup.py --help [cmd1 cmd2 ...]
or: setup.py --help-commands
or: setup.py cmd --help

error: no commands supplied

┌──(kali㉿kali)-[~/twint]
└─$ python3 setup.py --help
Common commands: (see '--help-commands' for more)

setup.py build will build the package underneath 'build/'
setup.py install will install the package

┌──(kali㉿kali)-[~/twint]
└─$ sudo python3 setup.py install

running install

┌──(kali㉿kali)-[~/twint]
└─$ sudo python3 setup.py build
running build
running build_py

┌──(kali㉿kali)-[~/twint]
└─$ twint -u networkchuck -s "raspberry pi"
Traceback (most recent call last):
[... identical to the first traceback above ...]
ImportError: cannot import name 'CeilTimeout' from 'aiohttp.helpers' (/usr/lib/python3/dist-packages/aiohttp/helpers.py)

Research Issue.

Fix 1: I searched Google for the error and found:

#1071

That issue asks you to run the command below to check whether the .local/bin folder shows up in the PATH.

$ echo $PATH | tr -s ":" "\n" | sort

┌──(kali㉿kali)-[~/Downloads/twint]
└─$ echo $PATH | tr -s ":" "\n" | sort
/bin
/home/kali/.dotnet/tools
/home/kali/.local/bin
/sbin
/usr/bin
/usr/games
/usr/local/bin
/usr/local/games
/usr/local/sbin
/usr/sbin

The advice was to add these two lines to the .zshrc script as I am using zsh shell.
I am having the same issue in bash, so I don't think it matters at all for the moment.

$ export PYTHON_BIN_PATH="$(python3 -m site --user-base)/bin"

$ export PATH="$PATH:$PYTHON_BIN_PATH"
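
The same check can be done from Python. This is a small sketch, assuming a Linux layout where pip's per-user scripts go into the bin folder under the user base directory; user_bin_on_path is a hypothetical helper name:

```python
import os
import site


def user_bin_on_path(path_value=None):
    """Return (user_bin, present): the per-user script dir and whether PATH lists it."""
    user_bin = os.path.join(site.getuserbase(), "bin")
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    entries = [os.path.normpath(p) for p in path_value.split(os.pathsep) if p]
    return user_bin, os.path.normpath(user_bin) in entries


user_bin, present = user_bin_on_path()
print(f"{user_bin} on PATH: {present}")
```

In the PATH listing above, /home/kali/.local/bin is already present, which suggests the PATH was not what kept twint from starting on that machine.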

This method didn't work for me.
Still have the error.

Fix 2: $ pip3 install yarl --force-reinstall --no-cache-dir

https://stackoverflow.com/questions/64747304/twint-python-library-is-causing-exception-for-search-query?rq=1

A fix was mentioned by a user - "I had the same issue with python3.7 and resolve it by reinstalling yarl, like this way:"

$ pip3 install yarl --force-reinstall --no-cache-dir

This method didn't work for me.

Fix 3: Uninstall and reinstall twint with the upgrade.

#915

    pip3 uninstall twint
    pip3 install --user --upgrade git+https://github.com/twintproject/twint.git@origin/master#egg=twint
    sudo pip3 install twint

This method didn't work for me.

Fix 4: Install on Google Cloud Shell.

Google Cloud New build

    sudo git clone --depth=1 https://github.com/twintproject/twint.git
    cd twint
    sudo pip3 install . -r requirements.txt
    pip3 install twint

This worked for me but once you leave the session, twint seems to be causing the same issue as I had on the Kali machine.

twint -u networkchuck --limit 20 (Did not recognise --limit 20)

twint -u networkchuck -s "raspberry pi"

New error message.

$ twint -u networkchuck -s crypto -o rightnow.json --json

Traceback (most recent call last):
  File "/home/rangersmyth_74/.local/bin/twint", line 10, in <module>
    sys.exit(run_as_command())
  File "/home/rangersmyth_74/.local/lib/python3.7/site-packages/twint/cli.py", line 339, in run_as_command
    main()
  File "/home/rangersmyth_74/.local/lib/python3.7/site-packages/twint/cli.py", line 330, in main
    run.Search(c)
  File "/home/rangersmyth_74/.local/lib/python3.7/site-packages/twint/run.py", line 410, in Search
    run(config, callback)
  File "/home/rangersmyth_74/.local/lib/python3.7/site-packages/twint/run.py", line 329, in run
    get_event_loop().run_until_complete(Twint(config).main(callback))
  File "/home/rangersmyth_74/.local/lib/python3.7/site-packages/twint/run.py", line 36, in __init__
    self.token.refresh()
  File "/home/rangersmyth_74/.local/lib/python3.7/site-packages/twint/token.py", line 69, in refresh
    raise RefreshTokenException('Could not find the Guest token in HTML')
twint.token.RefreshTokenException: Could not find the Guest token in HTML

Searching for twint.token

#1146

Fix 5: Add to PATH.
https://issueexplorer.com/issue/twintproject/twint/1189

"" MikeTheScriptKid wrote this answer on 2021-05-08 ""
So I finally got the twint command working in Kali.
I added the following to /etc/environment
/your_user/.local/bin

"" HarryWestFord wrote this answer on 2021-05-10 ""
"Sorry for being such a newbie, but how exactly do you add this on the command line?"

"" hackingbutlegal wrote this answer on 2021-05-15 ""

nano /etc/environment

This didn't work for me.

Found a fix to sort out the token problem

SCRIPT from >> #1146

import re
import time
import logging as logme

import requests


class TokenExpiryException(Exception):
    def __init__(self, msg):
        super().__init__(msg)


class RefreshTokenException(Exception):
    def __init__(self, msg):
        super().__init__(msg)


class Token:
    def __init__(self, config):
        self._session = requests.Session()
        self._session.headers.update(
            {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0'})
        self.config = config
        self._proxies = self._set_proxies()
        self._retries = 5
        self._timeout = 10
        self.url = 'https://twitter.com'

    def _set_proxies(self) -> dict:
        # A proxy is only used when type, host and port are all configured.
        settings = [self.config.Proxy_type, self.config.Proxy_host, self.config.Proxy_port]
        if not all(settings):
            logme.debug("No proxy in config")
            return {}

        proxy_type = self.config.Proxy_type.lower()
        proxy_val = f"{self.config.Proxy_host}:{self.config.Proxy_port}"
        proxies = {proxy_type: proxy_val}
        if proxy_type == 'http':
            proxies['https'] = proxy_val
        return proxies

    def _request(self):
        for attempt in range(self._retries + 1):
            # The request is newly prepared on each retry because of potential cookie updates.
            req = self._session.prepare_request(requests.Request('GET', self.url))
            logme.debug(f'Retrieving {req.url}')
            try:
                if self._proxies:
                    # Note: certificate verification is disabled when going through a proxy.
                    r = self._session.send(
                        req,
                        allow_redirects=True,
                        timeout=self._timeout,
                        proxies=self._proxies,
                        verify=False
                    )
                else:
                    r = self._session.send(req, allow_redirects=True, timeout=self._timeout)
            except requests.exceptions.RequestException as exc:
                if attempt < self._retries:
                    retrying = ', retrying'
                    level = logme.WARNING
                else:
                    retrying = ''
                    level = logme.ERROR
                logme.log(level, f'Error retrieving {req.url}: {exc!r}{retrying}')
            else:
                # Any response that arrives without an exception counts as success.
                success, msg = (True, None)
                msg = f': {msg}' if msg else ''

                if success:
                    logme.debug(f'{req.url} retrieved successfully{msg}')
                    return r
            if attempt < self._retries:
                # TODO : might wanna tweak this back-off timer
                sleep_time = 2.0 * 2 ** attempt
                logme.info(f'Waiting {sleep_time:.0f} seconds')
                time.sleep(sleep_time)
        else:
            # Reached only when every attempt failed (the loop never returned).
            msg = f'{self._retries + 1} requests to {self.url} failed, giving up.'
            logme.fatal(msg)
            self.config.Guest_token = None
            raise RefreshTokenException(msg)

    def refresh(self):
        logme.debug('Retrieving guest token')
        res = self._request()
        match = re.search(r'\("gt=(\d+);', res.text)
        if match:
            logme.debug('Found guest token in HTML')
            self.config.Guest_token = str(match.group(1))
        else:
            self.config.Guest_token = None
            raise RefreshTokenException('Could not find the Guest token in HTML')

I ran this script and got back nothing, no error, but when I did a search it didn't work.
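
The retry loop in _request above sleeps 2.0 * 2 ** attempt seconds after each failed attempt except the last one. As a quick worked example with the script's own value of 5 retries, the waits add up as follows. It counts only the sleeps and assumes every request fails with a RequestException:

```python
retries = 5  # same value as self._retries in the script above

# One sleep after each failed attempt except the final one.
delays = [2.0 * 2 ** attempt for attempt in range(retries)]

print(delays)       # [2.0, 4.0, 8.0, 16.0, 32.0]
print(sum(delays))  # 62.0
```

So a complete run of failures waits about a minute in total, on top of the time spent on the six requests themselves.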

Back to the Drawing Board.

References:

https://github.com/twintproject/twint/wiki/Setup

https://github.com/twintproject/twint/wiki/Configuration

#468

#915

#917

#944

#980

#1114 - Has a fix to upgrade twint && I also reported my issue here.

#1146 - Token Fix

https://github.com/Altimis/Scweet

http://saka.docsio.net/66892325/scrape-join-dates-user-info-from-a-list-csv-of-twitter-users

http://saka.docsio.net/64747304/twint-python-library-is-causing-exception-for-search-qeruy

https://www.geeksforgeeks.org/how-to-use-twint-osint-tool-on-google-cloud-console/

https://github.com/himanshudabas/twint/tree/twint-fixes

https://www.kaggle.com/zelinngilo/punya-nadia?scriptVersionId=57423357

https://stackoverflow.com/questions/64747304/twint-python-library-is-causing-exception-for-search-query?rq=1

https://www.bountysource.com/teams/tweep/issues?tracker_ids=65007358

https://issueexplorer.com/issue/twintproject/twint/1266


minamotorin commented Dec 30, 2021

Solution

The following patch works for me.

# This patch is WTFPL (http://www.wtfpl.net/txt/copying/) and no warranty.
diff --git a/twint/token.py b/twint/token.py
index ae66a24..2eedcee 100644
--- a/twint/token.py
+++ b/twint/token.py
@@ -65,5 +65,30 @@ class Token:
             logme.debug('Found guest token in HTML')
             self.config.Guest_token = str(match.group(1))
         else:
-            self.config.Guest_token = None
-            raise RefreshTokenException('Could not find the Guest token in HTML')
+            headers = {
+                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0',
+                'authority': 'api.twitter.com',
+                'content-length': '0',
+                'authorization': self.config.Bearer_token,
+                'x-twitter-client-language': 'en',
+                'x-csrf-token': res.cookies.get("ct0"),
+                'x-twitter-active-user': 'yes',
+                'content-type': 'application/x-www-form-urlencoded',
+                'accept': '*/*',
+                'sec-gpc': '1',
+                'origin': 'https://twitter.com',
+                'sec-fetch-site': 'same-site',
+                'sec-fetch-mode': 'cors',
+                'sec-fetch-dest': 'empty',
+                'referer': 'https://twitter.com/',
+                'accept-language': 'en-US',
+            }
+            self._session.headers.update(headers)
+            req = self._session.prepare_request(requests.Request('POST', 'https://api.twitter.com/1.1/guest/activate.json'))
+            res = self._session.send(req, allow_redirects=True, timeout=self._timeout)
+            match = re.search(r'{"guest_token":"(\d+)"}', res.text)
+            if match:
+                logme.debug('Found guest token in JSON')
+                self.config.Guest_token = str(match.group(1))
+            else:
+                self.config.Guest_token = None
+                raise RefreshTokenException('Could not find the Guest token in JSON')

I don't really understand requests sessions, so the code may not be good.
I hope someone will rewrite the patch properly and create a pull request.

About the problem

In my environment, this problem started occurring recently.
It doesn't happen every time, so if you are lucky, you don't get the error.

The cause is literally that twint could not find the Guest token in the HTML.
Recently, the token is sometimes simply not included in the HTML.

#!/usr/bin/env python3
# This program is WTFPL.
import requests

res = requests.get('https://twitter.com')
print(res.text.split('\n')[-1])

twint needs the output of the code above to be })();</script><script nonce="VALUE">document.cookie = decodeURIComponent("gt=VALUE; Max-Age=VALUE; Domain=.twitter.com; Path=/; Secure");</script>.
However, sometimes the result is only })();</script>, with the Guest token missing.
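
To make the difference concrete, here is a small sketch that runs the same regular expression twint uses in token.py against both kinds of output. The sample strings are made up for illustration (1234567890 is a placeholder, not a real token), and find_guest_token is a hypothetical helper name:

```python
import re

# Same pattern as in twint's token.py.
GUEST_TOKEN_RE = re.compile(r'\("gt=(\d+);')

# Made-up sample tails of the page; the digits are a placeholder value.
with_token = ('})();</script><script nonce="abc">document.cookie = '
              'decodeURIComponent("gt=1234567890; Max-Age=10800; '
              'Domain=.twitter.com; Path=/; Secure");</script>')
without_token = '})();</script>'


def find_guest_token(html):
    """Return the guest token as a string, or None if the cookie script is absent."""
    match = GUEST_TOKEN_RE.search(html)
    return match.group(1) if match else None


print(find_guest_token(with_token))     # 1234567890
print(find_guest_token(without_token))  # None
```

The None case is the one in which twint raises RefreshTokenException.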

About the solution

In my patch, twint gets the Guest token from https://api.twitter.com/1.1/guest/activate.json if it could not find one in the HTML.
I referred to the code of gallery-dl.
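
The fallback order can be sketched without any network access. This is only an illustration of the idea, not the patch itself: it swaps the patch's second regular expression for json.loads, it assumes the response body looks like {"guest_token":"..."} (the shape the patch's regex expects), and fetch_json_body is a hypothetical callable standing in for the POST to activate.json:

```python
import json
import re


def guest_token_from_html(html):
    """First choice: the gt= cookie script embedded in the page."""
    match = re.search(r'\("gt=(\d+);', html)
    return match.group(1) if match else None


def guest_token_from_json(body):
    """Fallback: parse a body such as {"guest_token":"123"}."""
    try:
        data = json.loads(body)
    except ValueError:
        return None
    token = data.get("guest_token") if isinstance(data, dict) else None
    return str(token) if token is not None else None


def pick_guest_token(html, fetch_json_body):
    """Try the HTML first; call fetch_json_body() only if that fails."""
    token = guest_token_from_html(html)
    if token is None:
        token = guest_token_from_json(fetch_json_body())
    return token


# The HTML has no token here, so the stand-in JSON body is consulted.
print(pick_guest_token('})();</script>', lambda: '{"guest_token":"42"}'))  # 42
```

Passing the fetch step in as a callable keeps the second request from being made when the HTML already contains a token.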

> These are the steps I took to install twint and get try and get it to work on 2 versions of Linux and 1 Deb. I have spent 9 hours trying my best, but no fix anywhere, none at all, not one.

Unfortunately, twint hasn't been updated for a long time.
If the problem is new, updating will not solve it.

In this case, the problem is too new for a solution to be found anywhere yet.
However, it may not happen in other environments, so I would like to hear reports.

gabrielruoff added a commit to gabrielruoff/twint that referenced this issue Dec 31, 2021
@yvessaintlaureint

tbh, I've got the same problem. My OS is Windows and I'm trying to scrape tweets in Google Colab. In the end, RefreshTokenException: Could not find the Guest token in HTML appears every time I run twint.run.Search(c). I thought it was just Google Colab, but after I tried it in Jupyter the result was still the same: it couldn't find the guest token. Any solution for this?

LinqLover pushed a commit to LinqLover/twint that referenced this issue Jan 2, 2022
ABOUT THE PROBLEM

This problem has recently begun to occur on some environments.
This doesn't happen every time, so if you are lucky, you don't get the
error.

The cause is literally that twint could not find the Guest
token in HTML.
Actually, sometimes token isn't included in HTML recently.

    #!/usr/bin/env python3
    # This program is WTFPL.
    import requests

    res = requests.get('https://twitter.com')
    print(res.text.split('\n')[-1])

twint require the result of running the above code is })();</script><script nonce="VALUE">document.cookie = decodeURIComponent("gt=VALUE; Max-Age=VALUE; Domain=.twitter.com; Path=/; Secure");</script>.
However, sometimes the result is only })();</script> and missing the
Guest token.

ABOUT THE SOLUTION

In this patch, twint get the Guest token from
https://api.twitter.com/1.1/guest/activate.json if could not find the
one.
The author referred to the code of gallery-dl:
https://github.com/mikf/gallery-dl/blob/47eae4c393f09937a5dbcc2cb978702fb173e747/gallery_dl/extractor/twitter.py#L780-L783

Author's note:

> I don't understand session of requests, so the code may be not good.
> I hope someone rewrite the patch better and create a pull request.

This commit was adopted from:
twintproject#1320 (comment)

Closes twintproject#1320.
@LinqLover
Contributor

Thank you for the patch, @minamotorin, which worked for me too! I have created PR #1322; hopefully a maintainer will be able to merge it soon.

@minamotorin

@LinqLover @gabrielruoff

I forgot to update the log message, so I have updated my patch.
Please check it.

The changes are the following two points.

  • logme.debug('Found guest token in JSON') was added.
  • Error message of RefreshTokenException was changed.

LinqLover pushed a commit to Museum-Barberini/twint that referenced this issue Jan 2, 2022
stoep added a commit to stoep/twint that referenced this issue Jan 2, 2022
fortunto2 added a commit to fortunto2/twint that referenced this issue Jan 6, 2022
djvdorp added a commit to djvdorp/twint that referenced this issue Jan 7, 2022
prhbrt commented Jan 7, 2022

In addition to @minamotorin's response, I use this hotfix in my ipynbs. It also adds a response field to the exception to simplify debugging.

import nest_asyncio
nest_asyncio.apply()
import re
import requests  # needed for requests.Request in refresh() below
import twint
from twint.token import Token as OriginalToken, RefreshTokenException

class Token(OriginalToken):
  def __init__(self, config):
    super().__init__(config)
    self._session.headers.update({
      'User-Agent': 'Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36'
    })
  
  def refresh(self):
    res = self._request()
    match = re.search(r'\("gt=(\d+);', res.text)
    if match:
      self.config.Guest_token = str(match.group(1))
    else:
      headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0',
        'authority': 'api.twitter.com',
        'content-length': '0',
        'authorization': self.config.Bearer_token,
        'x-twitter-client-language': 'en',
        'x-csrf-token': res.cookies.get("ct0"),
        'x-twitter-active-user': 'yes',
        'content-type': 'application/x-www-form-urlencoded',
        'accept': '*/*',
        'sec-gpc': '1',
        'origin': 'https://twitter.com',
        'sec-fetch-site': 'same-site',
        'sec-fetch-mode': 'cors',
        'sec-fetch-dest': 'empty',
        'referer': 'https://twitter.com/',
        'accept-language': 'en-US',
      }
      self._session.headers.update(headers)
      req = self._session.prepare_request(requests.Request('POST', 'https://api.twitter.com/1.1/guest/activate.json'))
      res = self._session.send(req, allow_redirects=True, timeout=self._timeout)
      match = re.search(r'{"guest_token":"(\d+)"}', res.text)
      if match:
        self.config.Guest_token = str(match.group(1))
      else:
        self.config.Guest_token = None
        exception = RefreshTokenException('Could not find the Guest token in JSON')
        exception.response = res
        raise exception

twint.token.Token = Token

query = "#kaag" #@param {"type": "string"}
since = "2021-01-01" #@param {"type": "string"}
until = "2022-01-08" #@param {"type": "string"}


c = twint.Config()
c.Search = query
c.Limit = 1000
c.Since, c.Until = since, until
c.Store_object =  True
c.User_full = True
c.Profile_full = True
c.Hide_output = True

try:
  twint.run.Search(c)
except Exception as e:
  # Not every exception carries a response; fall back to None.
  response = getattr(e, 'response', None)
  raise
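
The debugging trick in this hotfix, attaching the raw response to the exception, can be tried on its own. The sketch below uses a stand-in exception class and a plain dict as the response so that it runs without twint or network access; failing_refresh is a hypothetical function name:

```python
class RefreshTokenException(Exception):
    """Stand-in for twint.token.RefreshTokenException so the sketch runs without twint."""


def failing_refresh(response):
    exc = RefreshTokenException("Could not find the Guest token")
    exc.response = response  # attach the raw response for later inspection
    raise exc


try:
    failing_refresh({"status": 403})
except Exception as e:
    # getattr avoids an AttributeError for exceptions that carry no response.
    response = getattr(e, "response", None)

print(response)  # {'status': 403}
```

In a notebook, the response variable stays available in later cells, so the body that failed to yield a token can be inspected there.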




@rangersmyth74
Author

Happy New Year to you minamotorin and thank you for explaining what was happening.

Happy New Year to everyone also!

I left this problem and worked on another project only coming back now to have another go at installing.

As a complete noob, I understand that the token.py file is the one to edit. I am just not sure what to do next.

Q: Is this the code I have to put into the token.py file?

#!/usr/bin/env python3
# This program is WTFPL.
import requests

res = requests.get('https://twitter.com')
print(res.text.split('\n')[-1])

Q: Or do I copy and paste the long code above into token.py?

I will report back with my findings.

This post will help many I believe and hope.

Thanks in advance.

Ranger.

@rangersmyth74
Author

Additionally to @minamotorin's response, I use this hotfix in my ipynbs. It also adds a response field to the exception, to simplify debugging.

import nest_asyncio
nest_asyncio.apply()
import re
import requests  # needed for requests.Request in refresh() below
import twint
from twint.token import Token as OriginalToken, RefreshTokenException

class Token(OriginalToken):
  def __init__(self, config):
    super().__init__(config)
    self._session.headers.update({
      'User-Agent': 'Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36'
    })
  
  def refresh(self):
    res = self._request()
    match = re.search(r'\("gt=(\d+);', res.text)
    if match:
      self.config.Guest_token = str(match.group(1))
    else:
      headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0',
        'authority': 'api.twitter.com',
        'content-length': '0',
        'authorization': self.config.Bearer_token,
        'x-twitter-client-language': 'en',
        'x-csrf-token': res.cookies.get("ct0"),
        'x-twitter-active-user': 'yes',
        'content-type': 'application/x-www-form-urlencoded',
        'accept': '*/*',
        'sec-gpc': '1',
        'origin': 'https://twitter.com',
        'sec-fetch-site': 'same-site',
        'sec-fetch-mode': 'cors',
        'sec-fetch-dest': 'empty',
        'referer': 'https://twitter.com/',
        'accept-language': 'en-US',
      }
      self._session.headers.update(headers)
      req = self._session.prepare_request(requests.Request('POST', 'https://api.twitter.com/1.1/guest/activate.json'))
      res = self._session.send(req, allow_redirects=True, timeout=self._timeout)
      match = re.search(r'{"guest_token":"(\d+)"}', res.text)
      if match:
        self.config.Guest_token = str(match.group(1))
      else:
        self.config.Guest_token = None
        exception = RefreshTokenException('Could not find the Guest token in HTML')
        exception.response = res
        raise exception

twint.token.Token = Token

query = "#kaag" #@param {"type": "string"}
since = "2021-01-01" #@param {"type": "string"}
until = "2022-01-08" #@param {"type": "string"}


c = twint.Config()
query = query
c.Search = query
c.Limit = 1000
c.Since, c.Until = since, until
c.Store_object =  True
c.User_full = True
c.Profile_full = True
c.Hide_output = True

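try:
  twint.run.Search(c)
except Exception as e:
  # Keep the attached HTTP response (if any) for debugging; other exceptions have none.
  response = getattr(e, 'response', None)
  raise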

Hey, thanks for this!

What do I need to do so I can do what you suggested please? I am a Noob! lolz.

I have no idea what to do!! lolz, But willing to learn,

I copy and paste all of your code into a file that exists? Then to delete all the code in that file and paste in your code?

Which file and what is the location please?

And am I missing any steps?

@Silvthril

Silvthril commented Jan 7, 2022

I did this to solve the problem

Overwrite your existing token.py file with the file contents contained here: https://github.com/Museum-Barberini/twint/blob/fix/RefreshTokenException/twint/token.py

See #1075 for a full review of how this worked out for me.
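For anyone unsure where the installed token.py lives, the following sketch asks Python where it would load a module from. The helper name `locate_module_file` is made up; `importlib.util.find_spec` is standard library. Note that looking up `twint.token` imports the parent `twint` package, so if that import itself fails, the helper returns None.

```python
import importlib.util
from typing import Optional


def locate_module_file(module_name: str) -> Optional[str]:
    """Return the file a module would be loaded from, or None if unavailable."""
    try:
        # For a dotted name, find_spec imports the parent package first.
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


# Prints the path of the token.py to overwrite, or None if twint is not importable.
print(locate_module_file("twint.token"))
```

If several copies of twint are installed (for example one under ~/.local and one system-wide), this shows the one the current interpreter actually uses.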

@mlkorra

mlkorra commented Jan 8, 2022

I did this to solve the problem

Overwrite your existing token.py file with the file contents contained here: https://github.com/Museum-Barberini/twint/blob/fix/RefreshTokenException/twint/token.py

See #1075 for a full review of how this worked out for me.

Thanks, it's working for me as well.

@jonathanmc

jonathanmc commented Jan 8, 2022

Worked but just noting for me it only retrieves about a week plus a day of tweets.

SiD3W4y added a commit to SiD3W4y/twint that referenced this issue Jan 8, 2022
Adds the workaround highlighted in issue twintproject#1320 by hardcoding the
guest token.
@Hoefnix

Hoefnix commented Jan 9, 2022

I did this to solve the problem

Overwrite your existing token.py file with the file contents contained here: https://github.com/Museum-Barberini/twint/blob/fix/RefreshTokenException/twint/token.py

See #1075 for a full review of how this worked out for me.

This works perfectly for me. Thank you

@rangersmyth74
Author

Hey all,

I copied the code from https://github.com/Museum-Barberini/twint/blob/fix/RefreshTokenException/twint/token.py and replaced the contents of my token.py file with it.

I am still getting the error.

I added this code to the top of token.py

#!/usr/bin/env python3
# This program is WTFPL.
import requests

res = requests.get('https://twitter.com')
print(res.text.split('\n')[-1])

I am still getting the error.

I then added this bit of code to the token.py at the bottom.

import nest_asyncio
nest_asyncio.apply()
import re
import twint
from twint.token import Token as OriginalToken, RefreshTokenException

class Token(OriginalToken):
  def __init__(self, config):
    super().__init__(config)
    self._session.headers.update({
      'User-Agent': 'Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36'
    })

  def refresh(self):
    res = self._request()
    match = re.search(r'\("gt=(\d+);', res.text)
    if match:
      self.config.Guest_token = str(match.group(1))
    else:
      headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0',
        'authority': 'api.twitter.com',
        'content-length': '0',
        'authorization': self.config.Bearer_token,
        'x-twitter-client-language': 'en',
        'x-csrf-token': res.cookies.get("ct0"),
        'x-twitter-active-user': 'yes',
        'content-type': 'application/x-www-form-urlencoded',
        'accept': '*/*',
        'sec-gpc': '1',
        'origin': 'https://twitter.com',
        'sec-fetch-site': 'same-site',
        'sec-fetch-mode': 'cors',
        'sec-fetch-dest': 'empty',
        'referer': 'https://twitter.com/',
        'accept-language': 'en-US',
      }
      self._session.headers.update(headers)
      req = self._session.prepare_request(requests.Request('POST', 'https://api.twitter.com/1.1/guest/activate.json'))
      res = self._session.send(req, allow_redirects=True, timeout=self._timeout)
      match = re.search(r'{"guest_token":"(\d+)"}', res.text)
      if match:
        self.config.Guest_token = str(match.group(1))
      else:
        self.config.Guest_token = None
        exception = RefreshTokenException('Could not find the Guest token in HTML')
        exception.response = res
        raise exception

twint.token.Token = Token

query = "#kaag" #@param {"type": "string"}
since = "2021-01-01" #@param {"type": "string"}
until = "2022-01-08" #@param {"type": "string"}

c = twint.Config()
query = query
c.Search = query
c.Limit = 1000
c.Since, c.Until = since, until
c.Store_object = True
c.User_full = True
c.Profile_full = True
c.Hide_output = True

try:
  twint.run.Search(c)
except Exception as e:
  response = e.response
  raise

Q: Does what I pasted (the long code) have to be in a different file?

Can someone point me in the right direction please, as I am very close to fixing this!!

@minamotorin

Happy new year too!

@rangersmyth74

If you are not familiar with diff syntax, looking at this difference between the file before and after patching should help you understand the patch.

# This patch is WTFPL (http://www.wtfpl.net/txt/copying/) and no warranty.
diff --git a/twint/token.py b/twint/token.py
index ae66a24..2eedcee 100644
--- a/twint/token.py
+++ b/twint/token.py
@@ -65,5 +65,30 @@ class Token:
             logme.debug('Found guest token in HTML')
             self.config.Guest_token = str(match.group(1))
         else:
-            self.config.Guest_token = None
-            raise RefreshTokenException('Could not find the Guest token in HTML')
+            headers = {
+                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0',
+                'authority': 'api.twitter.com',
+                'content-length': '0',
+                'authorization': self.config.Bearer_token,
+                'x-twitter-client-language': 'en',
+                'x-csrf-token': res.cookies.get("ct0"),
+                'x-twitter-active-user': 'yes',
+                'content-type': 'application/x-www-form-urlencoded',
+                'accept': '*/*',
+                'sec-gpc': '1',
+                'origin': 'https://twitter.com',
+                'sec-fetch-site': 'same-site',
+                'sec-fetch-mode': 'cors',
+                'sec-fetch-dest': 'empty',
+                'referer': 'https://twitter.com/',
+                'accept-language': 'en-US',
+            }
+            self._session.headers.update(headers)
+            req = self._session.prepare_request(requests.Request('POST', 'https://api.twitter.com/1.1/guest/activate.json'))
+            res = self._session.send(req, allow_redirects=True, timeout=self._timeout)
+            match = re.search(r'{"guest_token":"(\d+)"}', res.text)
+            if match:
+                logme.debug('Found guest token in JSON')
+                self.config.Guest_token = str(match.group(1))
+            else:
+                self.config.Guest_token = None
+                raise RefreshTokenException('Could not find the Guest token in JSON')
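For readers new to the diff format used above, here is a small sketch that generates a unified diff from two made-up versions of a file with Python's standard difflib module. The file names and contents are invented for illustration. In the output, lines starting with "-" exist only in the old version, lines starting with "+" exist only in the new version, lines starting with a space are unchanged context, and the "@@" line says where in the file the change applies.

```python
import difflib

# Two invented versions of the same snippet: before and after a change.
before = [
    "else:\n",
    "    token = None\n",
    "    raise RuntimeError('not found in HTML')\n",
]
after = [
    "else:\n",
    "    token = fetch_fallback()\n",
]

diff = list(
    difflib.unified_diff(before, after, fromfile="a/example.py", tofile="b/example.py")
)
print("".join(diff), end="")
```

Applying a patch by hand means doing the reverse: delete the "-" lines from your file and type in the "+" lines (without the leading "+") at the same place, keeping the indentation.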

You do not need the following code to solve the problem; it only demonstrates the cause.

#!/usr/bin/env python3
# This program is WTFPL.
import requests

res = requests.get('https://twitter.com')
print(res.text.split('\n')[-1])

@minamotorin

minamotorin commented Jan 10, 2022

@jonathanmc

Have you tried #1266 (comment)?

@jonathanmc

jonathanmc commented Jan 10, 2022

Worked thank you!

I just tried it and retrieved almost a year's worth before name resolution failure. That might be unrelated - my Internet seemed to go down. I may need to find and increase the timer on twint. EDIT: This does appear to be unrelated, as it has been downloading successfully for a while now.

For anyone following after, I completely uninstalled twint, then reinstalled it via pip3 install twint, then made 1. change to token.py, and 2. uncommented the line in url.py above.

@minamotorin

@jonathanmc

You are welcome!

reinstalled it via pip3 install twint

I think you should install twint via pip3 install git+https://github.com/twintproject/twint.git@origin/master#egg=twint (not pip3 install twint) so that you also get the bug fixes made after the latest PyPI release.

In addition, you can skip step 1 by installing twint via pip3 install git+https://github.com/Museum-Barberini/twint.git@fix/RefreshTokenException#egg=twint (see #1322 (comment)).

@ElizabethCappon

New version of token.py fixed the guest token error, but now I'm back to only being able to get tweets from the last 10 days. I uncommented the line in url.py. This worked before the token.py guest token fix, but now it stopped working.

Anyone else with this problem? Does anyone have an idea how to fix this?

I'm on Ubuntu 20.04.3 , Twint 2.1.21, Python 3.8.8, aiohttp-socks 0.7.1

@irisdemented

@minamotorin worked for me, thank you so much!

@codeghees

Ran into a profile banner issue key error. #1329 PR raised, please test if it works for you all!

@jonathanmc

To note,

  1. pip3 install git+https://github.com/Museum-Barberini/twint.git@fix/RefreshTokenException#egg=twint

  2. uncommented url.py:
    ('query_source', 'typed_query'),

Fetches 20 days or so of history on some accounts. Doesn't seem to do the same for all. Same issue as someone above.

minamotorin added a commit to minamotorin/twint that referenced this issue Feb 6, 2022
Reference: twintproject#1328, twintproject#1322

This problem doesn't happen recently, but too big is better than too small.
minamotorin added a commit to minamotorin/twint that referenced this issue Feb 6, 2022
@Florettt

Solution

The following patch works for me.

# This patch is WTFPL (http://www.wtfpl.net/txt/copying/) and no warranty.
diff --git a/twint/token.py b/twint/token.py
index ae66a24..2eedcee 100644
--- a/twint/token.py
+++ b/twint/token.py
@@ -65,5 +65,30 @@ class Token:
             logme.debug('Found guest token in HTML')
             self.config.Guest_token = str(match.group(1))
         else:
-            self.config.Guest_token = None
-            raise RefreshTokenException('Could not find the Guest token in HTML')
+            headers = {
+                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0',
+                'authority': 'api.twitter.com',
+                'content-length': '0',
+                'authorization': self.config.Bearer_token,
+                'x-twitter-client-language': 'en',
+                'x-csrf-token': res.cookies.get("ct0"),
+                'x-twitter-active-user': 'yes',
+                'content-type': 'application/x-www-form-urlencoded',
+                'accept': '*/*',
+                'sec-gpc': '1',
+                'origin': 'https://twitter.com',
+                'sec-fetch-site': 'same-site',
+                'sec-fetch-mode': 'cors',
+                'sec-fetch-dest': 'empty',
+                'referer': 'https://twitter.com/',
+                'accept-language': 'en-US',
+            }
+            self._session.headers.update(headers)
+            req = self._session.prepare_request(requests.Request('POST', 'https://api.twitter.com/1.1/guest/activate.json'))
+            res = self._session.send(req, allow_redirects=True, timeout=self._timeout)
+            match = re.search(r'{"guest_token":"(\d+)"}', res.text)
+            if match:
+                logme.debug('Found guest token in JSON')
+                self.config.Guest_token = str(match.group(1))
+            else:
+                self.config.Guest_token = None
+                raise RefreshTokenException('Could not find the Guest token in JSON')

I don't understand requests sessions well, so the code may not be good. I hope someone will rewrite the patch properly and create a pull request.

About the problem

In my environment, this problem has recently begun to occur. This doesn't happen every time, so if you are lucky, you don't get the error.

The cause is literally that twint could not find the Guest token in the HTML. Recently, the token is sometimes not included in the HTML at all.

#!/usr/bin/env python3
# This program is WTFPL.
import requests

res = requests.get('https://twitter.com')
print(res.text.split('\n')[-1])

twint requires the output of the above code to be })();</script><script nonce="VALUE">document.cookie = decodeURIComponent("gt=VALUE; Max-Age=VALUE; Domain=.twitter.com; Path=/; Secure");</script>. However, sometimes the output is only })();</script>, with the Guest token missing.

About the solution

In my patch, twint gets the Guest token from https://api.twitter.com/1.1/guest/activate.json if it cannot find one in the HTML. I referred to the code of gallery-dl.

These are the steps I took to install twint and try to get it to work on 2 versions of Linux and 1 Deb. I have spent 9 hours trying my best, but no fix anywhere, none at all, not one.

Unfortunately, twint hasn't been updated for a long time. If the problem is new, updating will not solve the problem.

In this case, the problem is too new to find a solution. However, the problem may not happen in other environments, so I want reports.

I tried this but I still get the error :/

@OrionLi545

@minamotorin and everyone in this thread, it seems that these hotfixes are not working well if I am scraping a large amount of users simultaneously. Earlier tonight, twint.run.Profile was working normally, but after I began to multiprocess scrape tweets out of 8000 accounts, twint.token.RefreshTokenException: Could not find the Guest token in HTML popped up, and all future attempts will always result in this.

I am using the twint-2.1.21 version from @minamotorin's own repo. Does this mean I am subject to some anti-scraping mechanism of twitter?
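If the failures under heavy load are caused by some form of rate limiting (an assumption; nothing in this thread confirms it), one common mitigation is to retry with growing pauses instead of hammering the endpoint from many processes at once. Below is a generic sketch; `call_with_backoff` is a made-up helper, and the delay formula mirrors the `2.0 * 2 ** attempt` back-off that twint's own token.py uses for its page request.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def call_with_backoff(
    func: Callable[[], T],
    retries: int = 5,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying up to `retries` times with exponentially growing pauses."""
    for attempt in range(retries + 1):
        try:
            return func()
        except Exception:
            if attempt == retries:
                raise  # out of retries: let the last error propagate
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable when retries >= 0")


# Demo with a fake sleep so the example finishes instantly.
recorded_delays: List[float] = []
attempts = {"count": 0}


def flaky_job() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("temporary failure")
    return "done"


print(call_with_backoff(flaky_job, sleep=recorded_delays.append), recorded_delays)
```

In real use, `func` could be a small function that runs one twint query for one account. Whether this helps at the scale of 8000 accounts is untested here.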

@universityofkalilinux

I really love this app and I have only used it once. But now I want to use it and I can't. I have searched so much that my eyes are sore, why am I getting these errors and why can't I find a fix for it!! hehe...

Linux Kali-ROG 5.14.0-kali4-amd64 #1 SMP Debian 5.14.16-1kali1 (2021-11-05) x86_64 GNU/Linux

Please help not just me but anyone who searches the 400 pages of information to find an answer. 9 hours of work to try this, the time I will never get back and I just want to chuck in Linux altogether and do anything else.

  • [] Python version is 3.6;
  • [] Updated Twint with pip3 install --user --upgrade -e git+https://github.com/twintproject/twint.git@origin/master#egg=twint;

These are the steps I took to install twint and try to get it to work on 2 versions of Linux and 1 Deb. I have spent 9 hours trying my best, but no fix anywhere, none at all, not one.

What errors am I getting, all listed below

1: Install on Kali. This did not work for me.

Linux Kali-ROG 5.14.0-kali4-amd64 #1 SMP Debian 5.14.16-1kali1 (2021-11-05) x86_64 GNU/Linux

Instructions for install : https://github.com/twintproject/twint

    git clone --depth=1 https://github.com/twintproject/twint.git
    cd twint
    pip3 install . -r requirements.txt        

Try and run these commands to see if it works.

   $ twint -s pineapple
    $ twint -u networkchuck -s "raspberry pi" 

2: Install on Kali AWS. This did not work for me.

Linux kali 5.15.0-kali2-cloud-amd64 #1 SMP Debian 5.15.5-2kali2 (2021-12-22) x86_64 GNU/Linux

It seems that twint and AWS do not work due to

Instructions for install : https://www.geeksforgeeks.org/how-to-use-twint-osint-tool-on-google-cloud-console/

    sudo git clone --depth=1 https://github.com/twintproject/twint.git
    sudo cd twint
    sudo pip3 install . -r requirements.txt
    sudo pip3 install twint

Try and run these commands to see if it works.

    $ twint -s pineapple
    $ twint -u networkchuck -s "raspberry pi" 

3: Install on Google Cloud Shell and This worked for me.

Lightbox Linux cs-899333161534-default-boost-h8jtm 5.10.68+ #1 SMP Wed Dec 1 10:07:21 UTC 2021 x86_64 GNU/Linux

Instructions for install: https://www.geeksforgeeks.org/how-to-use-twint-osint-tool-on-google-cloud-console/

FYI 1 - Google didn't like the command git clone --depth=1

So I removed it and it installed.

FYI 2 - The pip3 install twint command said everything was installed, but I added sudo

    $ sudo git clone https://github.com/twintproject/twint.git
    $ ls
    $ cd twint
    $ ls
    $ sudo pip3 install . -r requirements.txt
    $ sudo pip3 install twint

Ran commands to see if it works.

    $ twint -s pineapple
    $ twint -u networkchuck -s "raspberry pi"

I was able to run twint until I logged out; after I logged back in, it no longer worked.

The Kali twint install journey.

My Install on Kali Linux.

┌──(kali㉿kali)-[~]
└─$ git clone --depth=1 https://github.com/twintproject/twint.git
Cloning into 'twint'...
remote: Enumerating objects: 47, done.
remote: Counting objects: 100% (47/47), done.
remote: Compressing objects: 100% (44/44), done.
remote: Total 47 (delta 3), reused 14 (delta 0), pack-reused 0
Receiving objects: 100% (47/47), 42.95 KiB | 1.59 MiB/s, done.
Resolving deltas: 100% (3/3), done.

┌──(kali㉿kali)-[~]
└─$ ls
Desktop Documents Downloads Music Pictures Public Templates Videos bash.txt nmap twint

┌──(kali㉿kali)-[~]
└─$ cd twint

┌──(kali㉿kali)-[~/twint]
└─$ cd twint

┌──(kali㉿kali)-[~/twint]
└─$ pip3 install . -r requirements.txt
Command 'pip3' not found, but can be installed with:
sudo apt install python3-pip
Do you want to install it? (N/y)y
sudo apt install python3-pip

┌──(kali㉿kali)-[~/twint]
└─$ pip3 install . -r requirements.txt

WARNING: The script twint is installed in '/home/kali/.local/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed aiohttp-socks-0.4.1 cchardet-2.1.7 dataclasses-0.6 elasticsearch-7.16.2 fake-useragent-0.1.11 geographiclib-1.52 geopy-2.2.0 googletransx-2.4.2 pandas-1.3.5 schedule-1.1.0 twint-2.1.21

(I added two commands for the path later on in the install, see far below).

┌──(kali㉿kali)-[~/.local]
└─$ cd bin

┌──(kali㉿kali)-[~/.local/bin]
└─$ ls
cchardetect translate twint

┌──(kali㉿kali)-[~/.local/bin]
└─$ twint -u networkchuck
Traceback (most recent call last):
  File "/usr/local/bin/twint", line 33, in <module>
    sys.exit(load_entry_point('twint==2.1.21', 'console_scripts', 'twint')())
  File "/usr/local/bin/twint", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "/usr/lib/python3.9/importlib/metadata.py", line 77, in load
    module = import_module(match.group('module'))
  File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 972, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/home/kali/.local/lib/python3.9/site-packages/twint/__init__.py", line 14, in <module>
    from . import run
  File "/home/kali/.local/lib/python3.9/site-packages/twint/run.py", line 4, in <module>
    from . import datelock, feed, get, output, verbose, storage
  File "/home/kali/.local/lib/python3.9/site-packages/twint/get.py", line 12, in <module>
    from aiohttp_socks import ProxyConnector, ProxyType
  File "/home/kali/.local/lib/python3.9/site-packages/aiohttp_socks/__init__.py", line 5, in <module>
    from .connector import (
  File "/home/kali/.local/lib/python3.9/site-packages/aiohttp_socks/connector.py", line 8, in <module>
    from aiohttp.helpers import CeilTimeout  # noqa
ImportError: cannot import name 'CeilTimeout' from 'aiohttp.helpers' (/usr/lib/python3/dist-packages/aiohttp/helpers.py)

┌──(kali㉿kali)-[~/.local/bin]
└─$ echo $SHELL
/bin/zsh

┌──(kali㉿kali)-[~/.local/bin]
└─$ bash

┌──(kali㉿kali)-[~/.local/bin]
└─$ myip
Command 'myip' not found, did you mean:
  command 'mzip' from deb mtools
Try: sudo apt install <deb name>

┌──(kali㉿kali)-[~/.local/bin]
└─$ twint -u networkchuck -s "raspberry pi"
Traceback (most recent call last):
  File "/usr/local/bin/twint", line 33, in <module>
    sys.exit(load_entry_point('twint==2.1.21', 'console_scripts', 'twint')())
  File "/usr/local/bin/twint", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "/usr/lib/python3.9/importlib/metadata.py", line 77, in load
    module = import_module(match.group('module'))
  File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 972, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/home/kali/.local/lib/python3.9/site-packages/twint/__init__.py", line 14, in <module>
    from . import run
  File "/home/kali/.local/lib/python3.9/site-packages/twint/run.py", line 4, in <module>
    from . import datelock, feed, get, output, verbose, storage
  File "/home/kali/.local/lib/python3.9/site-packages/twint/get.py", line 12, in <module>
    from aiohttp_socks import ProxyConnector, ProxyType
  File "/home/kali/.local/lib/python3.9/site-packages/aiohttp_socks/__init__.py", line 5, in <module>
    from .connector import (
  File "/home/kali/.local/lib/python3.9/site-packages/aiohttp_socks/connector.py", line 8, in <module>
    from aiohttp.helpers import CeilTimeout  # noqa
ImportError: cannot import name 'CeilTimeout' from 'aiohttp.helpers' (/usr/lib/python3/dist-packages/aiohttp/helpers.py)

┌──(kali㉿kali)-[~/.local/bin]
└─$ cd ..

┌──(kali㉿kali)-[~/.local]
└─$ cd..
cd..: command not found

┌──(kali㉿kali)-[~/.local]
└─$ cd ..

┌──(kali㉿kali)-[~]
└─$

┌──(kali㉿kali)-[~]
└─$ twint -u networkchuck -s "raspberry pi"
Command 'twint' not found, did you mean:
  command 'twine' from deb twine
Try: sudo apt install <deb name>

┌──(kali㉿kali)-[~/twint]
└─$ python3
Python 3.9.9 (main, Dec 16 2021, 23:13:29)
[GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

┌──(kali㉿kali)-[~/twint]
└─$ twint -u networkchuck -s "raspberry pi"
Command 'twint' not found, did you mean:
  command 'twine' from deb twine
Try: sudo apt install <deb name>

┌──(kali㉿kali)-[~]
└─$ sudo apt update && sudo apt upgrade

┌──(kali㉿kali)-[~]
└─$ systemctl reboot -i

┌──(kali㉿kali)-[~/twint]
└─$ cd twint

┌──(kali㉿kali)-[~/twint]
└─$ chmod +x setup.py

┌──(kali㉿kali)-[~/twint]
└─$ python3 setup.py
usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
   or: setup.py --help [cmd1 cmd2 ...]
   or: setup.py --help-commands
   or: setup.py cmd --help

error: no commands supplied

┌──(kali㉿kali)-[~/twint]
└─$ python3 setup.py --help
Common commands: (see '--help-commands' for more)

  setup.py build      will build the package underneath 'build/'
  setup.py install    will install the package

┌──(kali㉿kali)-[~/twint]
└─$ sudo python3 setup.py install
running install

┌──(kali㉿kali)-[~/twint]
└─$ sudo python3 setup.py build
running build
running build_py

┌──(kali㉿kali)-[~/twint]
└─$ twint -u networkchuck -s "raspberry pi"
Traceback (most recent call last):
  File "/usr/local/bin/twint", line 33, in <module>
    sys.exit(load_entry_point('twint==2.1.21', 'console_scripts', 'twint')())
  File "/usr/local/bin/twint", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "/usr/lib/python3.9/importlib/metadata.py", line 77, in load
    module = import_module(match.group('module'))
  File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 972, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/home/kali/.local/lib/python3.9/site-packages/twint/__init__.py", line 14, in <module>
    from . import run
  File "/home/kali/.local/lib/python3.9/site-packages/twint/run.py", line 4, in <module>
    from . import datelock, feed, get, output, verbose, storage
  File "/home/kali/.local/lib/python3.9/site-packages/twint/get.py", line 12, in <module>
    from aiohttp_socks import ProxyConnector, ProxyType
  File "/home/kali/.local/lib/python3.9/site-packages/aiohttp_socks/__init__.py", line 5, in <module>
    from .connector import (
  File "/home/kali/.local/lib/python3.9/site-packages/aiohttp_socks/connector.py", line 8, in <module>
    from aiohttp.helpers import CeilTimeout  # noqa
ImportError: cannot import name 'CeilTimeout' from 'aiohttp.helpers' (/usr/lib/python3/dist-packages/aiohttp/helpers.py)
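The traceback above shows aiohttp_socks being loaded from ~/.local while the aiohttp it imports from lives in /usr/lib/python3/dist-packages, so packages from two different install locations are being mixed. One plausible reading (an assumption, not confirmed in this thread) is that the system aiohttp no longer provides CeilTimeout, which aiohttp-socks 0.4.1 still expects. A quick way to inspect this is to print the installed versions; the helper names below are made up, and `importlib.metadata` requires Python 3.8 or newer.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def aiohttp_has_ceil_timeout() -> Optional[bool]:
    """Check whether aiohttp.helpers still exposes CeilTimeout (None if aiohttp is absent)."""
    try:
        import aiohttp.helpers as helpers
    except ImportError:
        return None
    return hasattr(helpers, "CeilTimeout")


for name in ("aiohttp", "aiohttp-socks", "twint"):
    print(name, installed_version(name))
print("CeilTimeout available:", aiohttp_has_ceil_timeout())
```

If the last line prints False, the two packages do not fit together, and installing a matching pair of versions in the same location would be the thing to try.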

Research Issue.

Fix 1: I searched on Google with the error and found,

#1071

Asked to run the command below to check if there was a .local/bin folder created.

$ echo $PATH | tr -s ":" "\n" | sort

┌──(kali㉿kali)-[~/Downloads/twint]
└─$ echo $PATH | tr -s ":" "\n" | sort
/bin
/home/kali/.dotnet/tools
/home/kali/.local/bin
/sbin
/usr/bin
/usr/games
/usr/local/bin
/usr/local/games
/usr/local/sbin
/usr/sbin

The advice was to add these two lines to the .zshrc script as I am using zsh shell. I am having the same issue in bash, so I don't think it matters at all for the moment.

$ export PYTHON_BIN_PATH="$(python3 -m site --user-base)/bin"

$ export PATH="$PATH:$PYTHON_BIN_PATH"

This method didn't work for me. Still have the error.
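The same PATH check can be done from Python, which avoids differences between zsh and bash. This is a minimal sketch; `bin_dir_on_path` is a made-up helper, while `site.getuserbase()` is the standard-library counterpart of `python3 -m site --user-base`.

```python
import os
import site


def bin_dir_on_path(bin_dir: str, path_value: str) -> bool:
    """Return True if bin_dir is one of the entries in a PATH-style string."""
    entries = [os.path.normpath(p) for p in path_value.split(os.pathsep) if p]
    return os.path.normpath(bin_dir) in entries


# Directory where `pip3 install --user` places scripts such as twint.
user_bin = os.path.join(site.getuserbase(), "bin")
print(user_bin, bin_dir_on_path(user_bin, os.environ.get("PATH", "")))
```

Note that having the directory on PATH only decides which twint script is started; it does not change which Python packages that script imports.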

Fix 2: $ pip3 install yarl --force-reinstall --no-cache-dir

https://stackoverflow.com/questions/64747304/twint-python-library-is-causing-exception-for-search-query?rq=1

A fix was mentioned by a user - "I had the same issue with python3.7 and resolve it by reinstalling yarl, like this way:"

$ pip3 install yarl --force-reinstall --no-cache-dir

This method didn't work for me.

Fix 3: To uninstall and reinstall twint with the upgrade.

#915

    pip3 uninstall twint
    pip3 install --user --upgrade git+https://github.com/twintproject/twint.git@origin/master#egg=twint
    sudo pip3 install twint

This method didn't work for me.

Fix 4: Install on Google Cloud Shell.

Google Cloud New build

    sudo git clone --depth=1 https://github.com/twintproject/twint.git
    cd twint
    sudo pip3 install . -r requirements.txt
    pip3 install twint

This worked for me but once you leave the session, twint seems to be causing the same issue as I had on the Kali machine.

twint -u networkchuck –limit 20 (Did not recognise -limit 20)

twint -u networkchuck -s “raspberry pi”
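The commands above contain an en dash (–) and curly quotes (“ ”) where a shell expects two hyphens and straight quotes. This typically happens when commands are copied from a web page that applies typographic substitutions, and it may be why the option was not recognised (an assumption; the thread does not confirm it). The sketch below normalises such characters; `normalize_pasted_command` is a made-up helper, and treating a leading en or em dash as `--` is itself a guess about the intended option.

```python
import re

# An en or em dash at the start of a word is assumed to stand for "--".
_DASH_BEFORE_OPTION = re.compile(r"(?<!\S)[\u2013\u2014](?=\w)")
# Curly quotes mapped to their straight equivalents.
_QUOTES = str.maketrans({"\u201c": '"', "\u201d": '"', "\u2018": "'", "\u2019": "'"})


def normalize_pasted_command(command: str) -> str:
    """Replace typographic dashes and quotes that shells do not understand."""
    command = _DASH_BEFORE_OPTION.sub("--", command)
    return command.translate(_QUOTES)


print(normalize_pasted_command("twint -u networkchuck –limit 20"))
print(normalize_pasted_command("twint -u networkchuck -s “raspberry pi”"))
```

Dashes in the middle of a word are left alone on purpose, because there is no safe way to guess what was meant there.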

New Error message.

$ twint -u networkchuck -s crypto -o rightnow.json–json

Traceback (most recent call last):
  File "/home/rangersmyth_74/.local/bin/twint", line 10, in <module>
    sys.exit(run_as_command())
  File "/home/rangersmyth_74/.local/lib/python3.7/site-packages/twint/cli.py", line 339, in run_as_command
    main()
  File "/home/rangersmyth_74/.local/lib/python3.7/site-packages/twint/cli.py", line 330, in main
    run.Search(c)
  File "/home/rangersmyth_74/.local/lib/python3.7/site-packages/twint/run.py", line 410, in Search
    run(config, callback)
  File "/home/rangersmyth_74/.local/lib/python3.7/site-packages/twint/run.py", line 329, in run
    get_event_loop().run_until_complete(Twint(config).main(callback))
  File "/home/rangersmyth_74/.local/lib/python3.7/site-packages/twint/run.py", line 36, in __init__
    self.token.refresh()
  File "/home/rangersmyth_74/.local/lib/python3.7/site-packages/twint/token.py", line 69, in refresh
    raise RefreshTokenException('Could not find the Guest token in HTML')
twint.token.RefreshTokenException: Could not find the Guest token in HTML

Searching for twint.token

#1146 Fix 5. Add to PATH. https://issueexplorer.com/issue/twintproject/twint/1189

MikeTheScriptKid wrote this answer on 2021-05-08: "So I finally got the twint command working in Kali. I added the following to /etc/environment /your_user/.local/bin"

HarryWestFord wrote this answer on 2021-05-10: "sorry for being such a newbie but how exactly do you add this on the command line?"

hackingbutlegal wrote this answer on 2021-05-15:

nano /etc/environment

This didn't work for me.

Found a fix to sort out the token problem

SCRIPT from >> #1146

import re
import time
import logging as logme

import requests


class TokenExpiryException(Exception):
    def __init__(self, msg):
        super().__init__(msg)


class RefreshTokenException(Exception):
    def __init__(self, msg):
        super().__init__(msg)


class Token:
    def __init__(self, config):
        self._session = requests.Session()
        self._session.headers.update(
            {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0'})
        self.config = config
        self._proxies = self._set_proxies()
        self._retries = 5
        self._timeout = 10
        self.url = 'https://twitter.com'

    def _set_proxies(self) -> dict:
        settings = [self.config.Proxy_type, self.config.Proxy_host, self.config.Proxy_port]
        if not all(settings):
            logme.debug(f"No proxy in config")
            return {}

        proxy_type = self.config.Proxy_type.lower()
        proxy_val = f"{self.config.Proxy_host}:{self.config.Proxy_port}"
        proxies = {proxy_type: proxy_val}
        if proxy_type == 'http':
            proxies['https'] = proxy_val
        return proxies

    def _request(self):
        for attempt in range(self._retries + 1):
            # The request is newly prepared on each retry because of potential cookie updates.
            req = self._session.prepare_request(requests.Request('GET', self.url))
            logme.debug(f'Retrieving {req.url}')
            try:
                if self._proxies:
                    r = self._session.send(
                        req,
                        allow_redirects=True,
                        timeout=self._timeout,
                        proxies=self._proxies,
                        verify=False
                    )
                else:
                    r = self._session.send(req, allow_redirects=True, timeout=self._timeout)
            except requests.exceptions.RequestException as exc:
                if attempt < self._retries:
                    retrying = ', retrying'
                    level = logme.WARNING
                else:
                    retrying = ''
                    level = logme.ERROR
                logme.log(level, f'Error retrieving {req.url}: {exc!r}{retrying}')
            else:
                success, msg = (True, None)
                msg = f': {msg}' if msg else ''

                if success:
                    logme.debug(f'{req.url} retrieved successfully{msg}')
                    return r
            if attempt < self._retries:
                # TODO : might wanna tweak this back-off timer
                sleep_time = 2.0 * 2 ** attempt
                logme.info(f'Waiting {sleep_time:.0f} seconds')
                time.sleep(sleep_time)
        else:
            msg = f'{self._retries + 1} requests to {self.url} failed, giving up.'
            logme.fatal(msg)
            self.config.Guest_token = None
            raise RefreshTokenException(msg)

    def refresh(self):
        logme.debug('Retrieving guest token')
        res = self._request()
        match = re.search(r'\("gt=(\d+);', res.text)
        if match:
            logme.debug('Found guest token in HTML')
            self.config.Guest_token = str(match.group(1))
        else:
            self.config.Guest_token = None
            raise RefreshTokenException('Could not find the Guest token in HTML')

I ran this script and got back nothing, no error, but when I did a search it didn't work.
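For anyone trying to understand the error itself: the exception is raised whenever the regex in refresh() finds no match in the HTML that comes back. Here is a minimal sketch of what that regex expects, using made-up HTML strings (they are illustrations only, not real Twitter responses); the helper name `extract_guest_token` is hypothetical.

```python
import re

# Same pattern as the one used in refresh() in the script above.
GUEST_TOKEN_RE = re.compile(r'\("gt=(\d+);')


def extract_guest_token(html):
    """Return the guest token as a string, or None if the pattern is absent."""
    match = GUEST_TOKEN_RE.search(html)
    return match.group(1) if match else None


# Hypothetical page content, for illustration only.
with_token = (
    '<script>document.cookie = decodeURIComponent('
    '"gt=1234567890; Max-Age=10800; Path=/; Secure");</script>'
)
without_token = "<html><body>No cookie script here</body></html>"

print(extract_guest_token(with_token))     # 1234567890
print(extract_guest_token(without_token))  # None
```

The second case is the one that triggers "Could not find the Guest token in HTML". Running the token script by itself only defines the classes, which would explain why it printed nothing.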

Back to the Drawing Board.

References:

https://github.com/twintproject/twint/wiki/Setup

https://github.com/twintproject/twint/wiki/Configuration

#468

#915

#917

#944

#980

#1114 - Has a fix to upgrade twint && I also reported my issue here.

#1146 - Token Fix

https://github.com/Altimis/Scweet

http://saka.docsio.net/66892325/scrape-join-dates-user-info-from-a-list-csv-of-twitter-users

http://saka.docsio.net/64747304/twint-python-library-is-causing-exception-for-search-qeruy

https://www.geeksforgeeks.org/how-to-use-twint-osint-tool-on-google-cloud-console/

https://github.com/himanshudabas/twint/tree/twint-fixes

https://www.kaggle.com/zelinngilo/punya-nadia?scriptVersionId=57423357

https://stackoverflow.com/questions/64747304/twint-python-library-is-causing-exception-for-search-query?rq=1

https://www.bountysource.com/teams/tweep/issues?tracker_ids=65007358

https://issueexplorer.com/issue/twintproject/twint/1266

⚙️Install & Usage:
$ sudo apt update
$ git clone --depth=1 https://github.com/twintproject/twint.git
$ ls
$ cd twint
$ ls
$ pip3 install . -r requirements.txt
$ sudo python3 setup.py install
$ sudo twint -h
$ sudo -i
$ twint -u Andela

The video link is given here: https://youtu.be/Jk5TF-yEZZ4
