r.init() failed to download TagUI_Linux.zip file due to firewall - pack() or hack #104
Hi Dave, yes this looks like your company firewall may be blocking automated downloads from GitHub. You can use the pack() function on a computer with internet access and no firewall. After that, copy the zip file to your work computer to use. See more details at API reference and the 3-step guide here #36 (comment) |
Hi Ken,
Our organization has a firewall. Your method is to first use pack() to get
the zip file, but pack() still fails on our organization's network because
of the firewall. We are not allowed to copy files from a home computer to
office computers.
How can I deal with this issue?
I can download the TagUI_Linux.zip
file on an office computer. Can you modify your init() function so that it
checks whether this zip file exists and, if so, uses it directly instead of
downloading it?
Maybe there are better ways to handle this, such as specifying an HTTP proxy.
Thanks,
Dave
|
Hi Dave, can you tell me how you download TagUI_Linux.zip when your work computer has no access to the internet and you are not allowed to copy files from a home computer to an office computer? If you are doing it through a proxy, can you try setting it using the following before you run Python and import the package? The standard urllib.request used by the download() function should use the proxy settings defined in the environment.
Windows
macOS / Linux
There is no plan to add a function within the package for defining a proxy, partly because proxy settings are not applied to the Chrome browser invoked through TagUI, and partly because a typical user with RPA use cases won't be changing proxy settings programmatically. Users can still automate the steps through the frontend UI layer to change the proxy the same way they would do it manually. |
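As a quick way to check the claim above, the sketch below sets the proxy environment variables from inside Python and prints what urllib.request would pick up. The helper name proxies_seen_by_urllib and the proxy address are placeholders for illustration, not part of the package; urllib.request.getproxies() is the standard library call that reads these settings.

```python
import os
import urllib.request


def proxies_seen_by_urllib(http_proxy, https_proxy):
    """Set the proxy environment variables and return what urllib detects.

    Hypothetical helper for illustration; the addresses passed in are
    placeholders and must be replaced with your organisation's proxy.
    """
    os.environ["http_proxy"] = http_proxy
    os.environ["https_proxy"] = https_proxy
    # getproxies() returns a dict mapping scheme names to proxy URLs
    return urllib.request.getproxies()


if __name__ == "__main__":
    detected = proxies_seen_by_urllib(
        "http://10.10.10.10:8000", "http://10.10.10.10:8000"
    )
    print(detected)
```

If the printed dictionary has no 'https' entry, urllib.request is not seeing the proxy, and download() will try to connect directly. Seeing the entry does not guarantee that the firewall accepts the request.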
Recognising a local TagUI_Linux.zip is not a good solution, because 7 other files are also downloaded during init(). These 7 files are the stable cutting-edge version of the TagUI source code. If init() recognised local files, users would also have to download these 7 files themselves, which is too much user friction. |
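For readers who want to see what the requested check would look like, here is a minimal sketch of "use the local file if it exists, otherwise download". It is not how the package behaves. The name fetch_if_missing and the downloader argument are hypothetical, and the sketch does not cover the 7 additional files mentioned above.

```python
import os


def fetch_if_missing(url, filename, downloader):
    """Return True if filename already exists locally or the download succeeds.

    downloader is any callable taking (url, filename) and returning True
    on success; it is a stand-in for a real download function.
    """
    # a non-empty local copy is taken as good enough, no integrity check
    if os.path.isfile(filename) and os.path.getsize(filename) > 0:
        return True
    return bool(downloader(url, filename))
```

A real implementation would also need some way to confirm that the local file is the expected version, which this sketch skips.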
Adding on, since you can download TagUI_Linux.zip on your work computer, there must be some way for your work computer to access the internet. If the above suggestion of using a proxy does not work and init() can't run, you can use pack() on your home Linux computer to generate the zip file, and upload it using something like Firefox Send - https://send.firefox.com or some other way that allows your work computer to download it. Let me know more about how you download TagUI_Linux.zip on your work computer, and any other details, so that we can find the best way to use the package and update it in future. |
In our company, all computers can access the internet and we have a firewall.
The Windows computers are set up so that they can access most websites and
download files like TagUI_Linux.zip without problems. Personal email
accounts and file sharing websites are not allowed (including
send.firefox.com). USB drives are not allowed either. There is no way
to transfer files from a home computer to an office computer. The policy is very
strict.
The Linux machines on our office network can access the internet from within the
firewall. We use an HTTP proxy to access the internet. I have used the requests Python
library with some proxy setup before to access some websites and download
data. I am not sure if it will work by just setting the environment
variables for the HTTP proxy. I will test it next week.
From your YouTube video demo, it seems that this is a great tool, and I really
want to use it. Thanks!
|
I see.. Thanks for sharing these details! I look forward to hearing how it turns out when you set the environment variables for the proxy. I will work with you closely to figure out a way to have the tool run on your work computer with as little user friction as possible. Letting the tool run on computers with restricted or no internet access is a primary goal of this project, as I see decentralised distribution and running of software as very important in the coming decade. In the meantime, from the above data points, it looks like your firewall may be configured to allow HTTP requests from allowed apps such as the Chrome browser but not from other apps like a Python process, because the URL you use to download TagUI_Linux.zip and the URL from which the tool automatically downloads the file are the same - https://github.com/tebelorg/Tump/releases/download/v1.0.0/TagUI_Linux.zip If that's the case, in the worst case scenario, the following may work for you. Try this only if the proxy-setting method does not work, because the steps below are many and troublesome.
|
I tried setting the environment variables http_proxy and https_proxy in Linux before running Python.
In [1]: import rpa as r
|
I can use the Python requests library to download the TagUI_Linux.zip file without issues after I set up the proxy. When I use the rpa init() function, it gives an error. I wonder if the init() function could accept a parameter for the proxy info. |
From the Python docs, it looks like the way download() uses urllib.request, it will automatically get the proxy from the environment - https://docs.python.org/3/library/urllib.request.html I've also checked the Python requests package, and it seems to also use urllib functions to get proxy settings - https://github.com/psf/requests/blob/master/requests/compat.py Can you try the below to see if that works? Maybe the quotation marks are needed, and the previous solution I found on StackOverflow without quotes is wrong.
In the worst case scenario you can use the 4 steps above to upload the zip file from pack() and use your Python requests package proxy method to do the download. |
I tried the export with quotation marks, but it did not work either. The 4 steps will not work because they would violate company policy. |
The following is how I use the proxy, and it works.
import requests
from requests.adapters import HTTPAdapter
class ProxyUAAdapter(HTTPAdapter): ...
s = requests.Session()
proxies = {'http': 'http://10.10.10.10:8000', 'https': 'https://10.10.10.10:1212'}
url = "https://www.abcd.com/a.zip"
j = s.get(url, proxies = proxies) |
I see.. I'm afraid I have no other ideas to try for now, because I have exhausted what the online documentation suggests. I will have to look out for more data points from other users with a similar setup to see if there is some way to solve this. For the 4th step to download the file, I mean that you download from the uploaded URL using the Python requests module, the same way that you managed to download TagUI_Linux.zip. |
Yes, you can do steps 1 to 3 to upload the zip file and rpa.py to a dummy release on GitHub. After that, use your requests script above to download the files from the URLs created in step 3. |
I'll avoid adding requests as a dependency just for proxy support for now, to keep the package free of dependencies, but you can hack the download() function in tagui.py to include your code above, so that it uses the proxy for every download. I'm assuming that what you care about is installing the package to use it. In that case, steps 1 to 3 combined with your requests script for the download should work. If your automation use case involves calling the download() function and needs to access URLs through the proxy, then you will have to hack download() to include your code above so that it always retrieves through your proxies. |
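As a possible alternative that avoids the requests dependency, the standard library can be given the proxy explicitly through urllib.request.ProxyHandler. The sketch below is untested against a corporate firewall. The names build_proxy_opener and download_via_proxy are hypothetical, and the proxy addresses in the usage note are the placeholders from the post above. Note that the User-Agent here is set on the request itself, which is not necessarily the same as setting it on the proxy headers the way the requests adapter does.

```python
import shutil
import urllib.request


def build_proxy_opener(proxies, user_agent=None):
    """Build a urllib opener that routes requests through explicit proxies.

    proxies maps a scheme to a proxy URL, for example
    {'https': 'http://10.10.10.10:8000'} (placeholder address).
    """
    handler = urllib.request.ProxyHandler(proxies)
    opener = urllib.request.build_opener(handler)
    if user_agent:
        # replaces the default Python-urllib user agent on outgoing requests
        opener.addheaders = [("User-Agent", user_agent)]
    return opener


def download_via_proxy(opener, url, filename, timeout=60):
    """Stream url to filename using the given opener; errors propagate."""
    with opener.open(url, timeout=timeout) as response:
        with open(filename, "wb") as out:
            # copy in chunks so a large zip is not held fully in memory
            shutil.copyfileobj(response, out)
    return True
```

A call would look like download_via_proxy(build_proxy_opener({'http': 'http://10.10.10.10:8000', 'https': 'https://10.10.10.10:1212'}, 'Lynx'), url, 'TagUI_Linux.zip'), with the proxies replaced by your own.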
For your reference and that of other users, below is a hacked version of download() that includes your requests proxy method for downloading files. I used a dummy free proxy for testing, which is not reliable; it has to be changed to your own stable proxies. Also, I set the headers to the Lynx user agent since that works in your environment. You may need to change it to some other valid value, as I think some web servers will not want to serve a request from the Lynx browser.
def download(download_url = None, filename_to_save = None):
    """function for python 2/3 compatible file download from url"""
    # relies on the os module already imported at the top of tagui.py
    if download_url is None or download_url == '':
        print('[RPA][ERROR] - download URL missing for download()')
        return False
    # if not given, use last part of url as filename to save
    if filename_to_save is None or filename_to_save == '':
        download_url_tokens = download_url.split('/')
        filename_to_save = download_url_tokens[-1]
    # delete existing file if exist to ensure freshness
    if os.path.isfile(filename_to_save):
        os.remove(filename_to_save)
    # handle case where url is invalid or has no content
    try:
        import requests
        from requests.adapters import HTTPAdapter
        class ProxyUAAdapter(HTTPAdapter):
            def proxy_headers(self, proxy):
                headers = super(ProxyUAAdapter, self).proxy_headers(proxy)
                headers['User-Agent'] = 'Lynx'
                return headers
        s = requests.Session()
        s.mount('http://', ProxyUAAdapter())
        s.mount('https://', ProxyUAAdapter())
        # dummy test proxies, change these to your own stable proxies
        proxies = {'http': '142.93.80.189:80', 'https': '148.251.200.199:1080'}
        get_response = s.get(download_url, proxies = proxies)
        downloaded_file = open(filename_to_save,'wb')
        downloaded_file.write(get_response.content)
        downloaded_file.close()
    except Exception as e:
        print('[RPA][ERROR] - failed downloading from ' + download_url + '...')
        print(str(e))
        return False
    # take the existence of downloaded file as success
    if os.path.isfile(filename_to_save):
        return True
    else:
        print('[RPA][ERROR] - failed downloading to ' + filename_to_save)
        return False
|
Is the download() function part of your rpa python library? If so, where can I find this function and then modify it to your latest hacked version? How do I use this download() function in rpa? before r.init()? |
Yes, download() is inside tagui.py. You can do the below to find the file location and modify it - import tagui as t The init() function will automatically call download() to download the files, so there is no need to hack init(). |
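One generic way to locate an installed module's source file, shown here as a sketch using only the standard library. The helper name module_path is hypothetical, and the module name to pass in ('tagui' or 'rpa', depending on the installed version) is an assumption to check against your own environment.

```python
import importlib.util


def module_path(module_name):
    """Return the file path of an importable module, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    # origin is the path of the source file for ordinary modules
    return spec.origin


if __name__ == "__main__":
    # 'tagui' is the module name used in this thread; adjust if yours differs
    print(module_path("tagui"))
```

The printed path is the file to open in an editor when modifying download().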
I made changes to the download() function and used the correct proxy. Now it can download the zip file, but there is a new error:
In [2]: r.init()
/abc/def/.tagui/src/tagui: line 304: type: google-chrome: not found
============================
Thanks! |
Oh shucks.. This package is designed to work only with Google Chrome. One last thing that you can try is to modify tagui.py to change below -
to the following -
And download Firefox v59 from there - https://ftp.mozilla.org/pub/firefox/releases/59.0 However, this has not been tested to work for this package and will most likely fail. |
I'll leave the readme unchanged for now, since it already mentions the Chrome browser. But let me know if the readme is confusing and seems to suggest that Firefox or Internet Explorer is supported, and I will update it. |
I changed
===================================================
Gecko error: it seems /usr/bin/firefox is not compatible with SlimerJS.
Out[2]: False |
Have you installed Firefox v59 from the link in my post above? |
Our Firefox version is v72.0.1 64-bit on Ubuntu Linux. Does the rpa library only work with v59? Thanks! |
This package uses the TagUI project. TagUI uses SlimerJS to control Firefox, but only for version 59 and earlier, because from v60 onwards the Firefox architecture changed totally. You can try installing Firefox v59 and making the modification above to see if it works. But I think it is 99.9% unlikely to work properly. The package is designed to work with Chrome only. |
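Since the error above comes from the shell not finding google-chrome, a small pre-check before calling init() can confirm whether a browser command is on the PATH. This is only a sketch: the helper name first_on_path is hypothetical, and the list of candidate command names other than google-chrome is an assumption, not something the package documents.

```python
import shutil


def first_on_path(candidates):
    """Return the full path of the first command found on PATH, else None."""
    for name in candidates:
        found = shutil.which(name)
        if found:
            return found
    return None


if __name__ == "__main__":
    # google-chrome is the command named in the error message above;
    # the other names are guesses for illustration only
    print(first_on_path(["google-chrome", "google-chrome-stable", "chromium"]))
```

If this prints None, the google-chrome command is not visible to the shell, which matches the "not found" message from r.init().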
Hi Sir,
Your RPA looks great. I am trying to test it in our work environment (Ubuntu Linux) but got an error. I think it has to do with our firewall. Do you have suggestions on how to get around this issue? I tried downloading this zip file on Windows and then copying it to my home directory in Linux, but the init() function still tried to download the zip file and timed out.
In [100]: r.init(visual_automation = True, chrome_browser = False)
[RPA][INFO] - setting up TagUI for use in your Python environment
[RPA][INFO] - downloading TagUI (~200MB) and unzipping to below folder...
[RPA][INFO] - /home/xxx/
[RPA][ERROR] - failed downloading from https://github.com/tebelorg/Tump/releases/download/v1.0.0/TagUI_Linux.zip...
<urlopen error [Errno 110] Connection timed out>
Thanks,
Dave