arcgis package breaks boto3? #3912

Closed
thehappycheese opened this issue Oct 26, 2023 · 8 comments

thehappycheese commented Oct 26, 2023

Describe the bug

If I import arcgis after boto3 then boto3 fails to create an s3 client.

Does boto3 do any non-standard imports or monkey patching?

I can't tell whether boto3 or arcgis is causing the problem.

Expected Behavior

It should not matter what order the packages are imported

Current Behavior

The error below was produced by the code under Reproduction Steps:

{
	"name": "RecursionError",
	"message": "maximum recursion depth exceeded",
	"stack": "---------------------------------------------------------------------------
RecursionError                            Traceback (most recent call last)
c:\\Users\\...\\LOCAL\\GIT\\pta_api\\test_rtop_archive.ipynb Cell 2 line 1
----> 1 pta_api.fetch_rtop_archive(
      2     aws_access_key_id     = keyring.get_password(\"...-key\",        \"user\"),
      3     aws_secret_access_key = keyring.get_password(\"...-access-key\", \"user\"),
      4     year                  = 2023,
      5 )

File ~\\LOCAL\\GIT\\pta_api\\src\\pta_api\\_fetch_rtop_archive.py:51, in fetch_rtop_archive(aws_access_key_id, aws_secret_access_key, year, month, day, region_name, parse_dates, parse_geometry)
     48 elif day is not None:
     49     raise ValueError('day specified without month')
---> 51 client = session.client('s3')
     53 # download all files (one by one; not ideal at all, but this is the only way)
     54 results = []

File c:\\Users\\...\\AppData\\Local\\miniconda3\\Lib\\site-packages\\boto3\\session.py:299, in Session.client(self, service_name, region_name, api_version, use_ssl, verify, endpoint_url, aws_access_key_id, aws_secret_access_key, aws_session_token, config)
    217 def client(
    218     self,
    219     service_name,
   (...)
    228     config=None,
    229 ):
    230     \"\"\"
    231     Create a low-level service client by name.
    232 
   (...)
    297 
    298     \"\"\"
--> 299     return self._session.create_client(
    300         service_name,
    301         region_name=region_name,
    302         api_version=api_version,
    303         use_ssl=use_ssl,
    304         verify=verify,
    305         endpoint_url=endpoint_url,
    306         aws_access_key_id=aws_access_key_id,
    307         aws_secret_access_key=aws_secret_access_key,
    308         aws_session_token=aws_session_token,
    309         config=config,
    310     )

File c:\\Users\\...\\AppData\\Local\\miniconda3\\Lib\\site-packages\\botocore\\session.py:997, in Session.create_client(self, service_name, region_name, api_version, use_ssl, verify, endpoint_url, aws_access_key_id, aws_secret_access_key, aws_session_token, config)
    980 self._add_configured_endpoint_provider(
    981     client_name=service_name,
    982     config_store=config_store,
    983 )
    985 client_creator = botocore.client.ClientCreator(
    986     loader,
    987     endpoint_resolver,
   (...)
    995     user_agent_creator=user_agent_creator,
    996 )
--> 997 client = client_creator.create_client(
    998     service_name=service_name,
    999     region_name=region_name,
   1000     is_secure=use_ssl,
   1001     endpoint_url=endpoint_url,
   1002     verify=verify,
   1003     credentials=credentials,
   1004     scoped_config=self.get_scoped_config(),
   1005     client_config=config,
   1006     api_version=api_version,
   1007     auth_token=auth_token,
   1008 )
   1009 monitor = self._get_internal_component('monitor')
   1010 if monitor is not None:

File c:\\Users\\...\\AppData\\Local\\miniconda3\\Lib\\site-packages\\botocore\\client.py:159, in ClientCreator.create_client(self, service_name, region_name, is_secure, endpoint_url, verify, credentials, scoped_config, api_version, client_config, auth_token)
    146 region_name, client_config = self._normalize_fips_region(
    147     region_name, client_config
    148 )
    149 endpoint_bridge = ClientEndpointBridge(
    150     self._endpoint_resolver,
    151     scoped_config,
   (...)
    157     ),
    158 )
--> 159 client_args = self._get_client_args(
    160     service_model,
    161     region_name,
    162     is_secure,
    163     endpoint_url,
    164     verify,
    165     credentials,
    166     scoped_config,
    167     client_config,
    168     endpoint_bridge,
    169     auth_token,
    170     endpoints_ruleset_data,
    171     partition_data,
    172 )
    173 service_client = cls(**client_args)
    174 self._register_retries(service_client)

File c:\\Users\\...\\AppData\\Local\\miniconda3\\Lib\\site-packages\\botocore\\client.py:490, in ClientCreator._get_client_args(self, service_model, region_name, is_secure, endpoint_url, verify, credentials, scoped_config, client_config, endpoint_bridge, auth_token, endpoints_ruleset_data, partition_data)
    466 def _get_client_args(
    467     self,
    468     service_model,
   (...)
    479     partition_data,
    480 ):
    481     args_creator = ClientArgsCreator(
    482         self._event_emitter,
    483         self._user_agent,
   (...)
    488         user_agent_creator=self._user_agent_creator,
    489     )
--> 490     return args_creator.get_client_args(
    491         service_model,
    492         region_name,
    493         is_secure,
    494         endpoint_url,
    495         verify,
    496         credentials,
    497         scoped_config,
    498         client_config,
    499         endpoint_bridge,
    500         auth_token,
    501         endpoints_ruleset_data,
    502         partition_data,
    503     )

File c:\\Users\\...\\AppData\\Local\\miniconda3\\Lib\\site-packages\\botocore\\args.py:137, in ClientArgsCreator.get_client_args(self, service_model, region_name, is_secure, endpoint_url, verify, credentials, scoped_config, client_config, endpoint_bridge, auth_token, endpoints_ruleset_data, partition_data)
    134 new_config = Config(**config_kwargs)
    135 endpoint_creator = EndpointCreator(event_emitter)
--> 137 endpoint = endpoint_creator.create_endpoint(
    138     service_model,
    139     region_name=endpoint_region_name,
    140     endpoint_url=endpoint_config['endpoint_url'],
    141     verify=verify,
    142     response_parser_factory=self._response_parser_factory,
    143     max_pool_connections=new_config.max_pool_connections,
    144     proxies=new_config.proxies,
    145     timeout=(new_config.connect_timeout, new_config.read_timeout),
    146     socket_options=socket_options,
    147     client_cert=new_config.client_cert,
    148     proxies_config=new_config.proxies_config,
    149 )
    151 serializer = botocore.serialize.create_serializer(
    152     protocol, parameter_validation
    153 )
    154 response_parser = botocore.parsers.create_parser(protocol)

File c:\\Users\\...\\AppData\\Local\\miniconda3\\Lib\\site-packages\\botocore\\endpoint.py:409, in EndpointCreator.create_endpoint(self, service_model, region_name, endpoint_url, verify, response_parser_factory, timeout, max_pool_connections, http_session_cls, proxies, socket_options, client_cert, proxies_config)
    406 endpoint_prefix = service_model.endpoint_prefix
    408 logger.debug('Setting %s timeout as %s', endpoint_prefix, timeout)
--> 409 http_session = http_session_cls(
    410     timeout=timeout,
    411     proxies=proxies,
    412     verify=self._get_verify_value(verify),
    413     max_pool_connections=max_pool_connections,
    414     socket_options=socket_options,
    415     client_cert=client_cert,
    416     proxies_config=proxies_config,
    417 )
    419 return Endpoint(
    420     endpoint_url,
    421     endpoint_prefix=endpoint_prefix,
   (...)
    424     http_session=http_session,
    425 )

File c:\\Users\\...\\AppData\\Local\\miniconda3\\Lib\\site-packages\\botocore\\httpsession.py:323, in URLLib3Session.__init__(self, verify, proxies, timeout, max_pool_connections, socket_options, client_cert, proxies_config)
    321     self._socket_options = []
    322 self._proxy_managers = {}
--> 323 self._manager = PoolManager(**self._get_pool_manager_kwargs())
    324 self._manager.pool_classes_by_scheme = self._pool_classes_by_scheme

File c:\\Users\\...\\AppData\\Local\\miniconda3\\Lib\\site-packages\\botocore\\httpsession.py:340, in URLLib3Session._get_pool_manager_kwargs(self, **extra_kwargs)
    336 def _get_pool_manager_kwargs(self, **extra_kwargs):
    337     pool_manager_kwargs = {
    338         'timeout': self._timeout,
    339         'maxsize': self._max_pool_connections,
--> 340         'ssl_context': self._get_ssl_context(),
    341         'socket_options': self._socket_options,
    342         'cert_file': self._cert_file,
    343         'key_file': self._key_file,
    344     }
    345     pool_manager_kwargs.update(**extra_kwargs)
    346     return pool_manager_kwargs

File c:\\Users\\...\\AppData\\Local\\miniconda3\\Lib\\site-packages\\botocore\\httpsession.py:349, in URLLib3Session._get_ssl_context(self)
    348 def _get_ssl_context(self):
--> 349     return create_urllib3_context()

File c:\\Users\\...\\AppData\\Local\\miniconda3\\Lib\\site-packages\\botocore\\httpsession.py:139, in create_urllib3_context(ssl_version, cert_reqs, options, ciphers)
    133     # TLSv1.2 only. Unless set explicitly, do not request tickets.
    134     # This may save some bandwidth on wire, and although the ticket is encrypted,
    135     # there is a risk associated with it being on wire,
    136     # if the server is not rotating its ticketing keys properly.
    137     options |= OP_NO_TICKET
--> 139 context.options |= options
    141 # Enable post-handshake authentication for TLS 1.3, see GH #1634. PHA is
    142 # necessary for conditional client cert authentication with TLS 1.3.
    143 # The attribute is None for OpenSSL <= 1.1.0 or does not exist in older
    144 # versions of Python.  We only enable on Python 3.7.4+ or if certificate
    145 # verification is enabled to work around Python issue #37428
    146 # See: https://bugs.python.org/issue37428
    147 if (
    148     cert_reqs == ssl.CERT_REQUIRED or sys.version_info >= (3, 7, 4)
    149 ) and getattr(context, \"post_handshake_auth\", None) is not None:

File c:\\Users\\...\\AppData\\Local\\miniconda3\\Lib\\ssl.py:624, in SSLContext.options(self, value)
    622 @options.setter
    623 def options(self, value):
--> 624     super(SSLContext, SSLContext).options.__set__(self, value)

File c:\\Users\\...\\AppData\\Local\\miniconda3\\Lib\\ssl.py:624, in SSLContext.options(self, value)
    622 @options.setter
    623 def options(self, value):
--> 624     super(SSLContext, SSLContext).options.__set__(self, value)

    [... skipping similar frames: SSLContext.options at line 624 (1478 times)]

File c:\\Users\\...\\AppData\\Local\\miniconda3\\Lib\\ssl.py:624, in SSLContext.options(self, value)
    622 @options.setter
    623 def options(self, value):
--> 624     super(SSLContext, SSLContext).options.__set__(self, value)

RecursionError: maximum recursion depth exceeded"
}

Reproduction Steps

import boto3
from arcgis.features import FeatureLayer
session = boto3.Session(
    profile_name="..."
)
client = session.client('s3')
# Fails with the RecursionError shown under Current Behavior

Possible Solution

A workaround is to swap the order of imports:

# With the order of imports swapped, it works again:
from arcgis.features import FeatureLayer
import boto3
session = boto3.Session(profile_name="...")
client = session.client('s3')
# Succeeds
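
Because the workaround depends on import order, it is easy to undo by accident, for example when a formatter re-sorts imports. As a minimal sketch, the hypothetical helper below (`already_imported` is an illustrative name, not part of boto3 or arcgis) reports which of the affected packages are already loaded, so a script can warn just before it imports arcgis:

```python
import sys


# Hypothetical helper for illustration; not part of boto3 or arcgis.
def already_imported(names, modules=None):
    """Return the subset of `names` already present in sys.modules."""
    modules = sys.modules if modules is None else modules
    return [name for name in names if name in modules]


# Call this just before the arcgis import.
loaded_early = already_imported(["boto3", "botocore", "urllib3"])
if loaded_early:
    print("Imported before arcgis:", ", ".join(loaded_early))
```

An empty result means none of the listed packages was loaded before arcgis, which is the order that worked above.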

Additional Information/Context

I raised a similar issue on the arcgis package here: Esri/arcgis-python-api#1698

SDK version used

boto3 1.28.71

Environment details (OS name and version, etc.)

Windows 10, Freshly reinstalled Python 3.11.5

thehappycheese added the bug and needs-triage labels on Oct 26, 2023
tim-finnigan self-assigned this on Nov 1, 2023
@tim-finnigan
Contributor

Hi @thehappycheese, thanks for reaching out. I could reproduce the issue as you described. Steps for quickly reproducing the issue:

Set up a virtual environment:

virtualenv venv
source venv/bin/activate
pip install boto3 arcgis

In a new file:

import boto3
import arcgis
boto3.client('s3')

Results in: RecursionError: maximum recursion depth exceeded while calling a Python object

The error does not occur when importing arcgis before boto3 as you mentioned. It also does not occur if a client is not created.

I tested this on Python 3.11.5/OpenSSL 1.1.1u, but the error did not occur when I tested on Python 3.8.2/OpenSSL 1.1.1d. Have you tried testing with any other Python or OpenSSL versions?
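
To compare environments, the Python and OpenSSL versions can be read from the standard library. This is a small sketch; `runtime_versions` is just an illustrative name:

```python
import ssl
import sys


def runtime_versions():
    """Report the interpreter version and the OpenSSL build it is linked against."""
    return {
        "python": sys.version.split()[0],
        "openssl": ssl.OPENSSL_VERSION,
    }


print(runtime_versions())
```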

I couldn't find any other reports of this specific issue. I'm wondering if this is some edge case involving the boto3/arcgis dependencies or SSL context initialization. We can continue tracking this here and try to narrow down the cause.

tim-finnigan added the response-requested, p2, and needs-review labels and removed the needs-triage label on Nov 1, 2023
@tim-finnigan
Contributor

@nateprewitt pointed out to me that this is occurring because arcgis is modifying the urllib3 import in the global modules. If boto3 is imported after arcgis then that will correct the monkey patching we're doing. You can find a related issue here with how eventlet does monkeypatching: eventlet/eventlet#618. Based on that information, arcgis would need to make changes for this to be fixed.

tim-finnigan added the closing-soon ("This issue will automatically close in 4 days unless further comments are made.") and third-party labels and removed the bug, response-requested, needs-review, and p2 labels on Nov 1, 2023
@nateprewitt
Contributor

Hi @thehappycheese, thanks for raising this report. I did a cursory search and think I've got leads on why this specific issue is occurring with arcgis. Could you confirm if truststore is installed in your environment, and if so what version?
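
One way to check this without importing the package is `importlib.metadata` from the standard library. A minimal sketch, where `installed_version` is a hypothetical helper name:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("truststore:", installed_version("truststore"))
```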

I unfortunately can't find the arcgis code hosted publicly to link to but if you dig through the modules after download, you can see they've integrated truststore (arcgis/auth/api.py) and are mutating the SSL library with inject_into_ssl. This has had issues in the past with similar members on the context. arcgis also has a custom LazyLoader class that is messing with the urllib3 import to defer its use that I haven't had a chance to dig into yet.

My initial hypothesis from what we're seeing in the stacktrace is that we're hitting this issue because truststore is mutating the import for urllib3's SSLContext. When this is done after we've imported botocore, we end up in this recursive state. When it's done before the boto3 import, we're able to properly establish the correct reference to SSLContext. The reason you're likely hitting this in your environment is that arcgis only uses truststore in Python 3.10+ on Windows.
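
The traceback shows the setter `super(SSLContext, SSLContext).options.__set__(self, value)` calling itself. One way that can happen, consistent with the hypothesis above, is that the name `SSLContext` inside the setter is looked up at call time, so rebinding that name to a subclass makes `super()` resolve back to the same property. The sketch below uses toy classes only: `_Base`, `Context`, and `PatchedContext` are illustrative names, and the rebinding step is an assumption standing in for what `inject_into_ssl` may do. It does not import ssl, urllib3, or truststore.

```python
# Toy stand-ins; none of these names come from ssl, urllib3, or truststore.
class _Base:
    """Plays the role of the low-level context that actually stores options."""

    def __init__(self):
        self._options = 0

    @property
    def options(self):
        return self._options

    @options.setter
    def options(self, value):
        self._options = value


class Context(_Base):
    """Plays the role of the public context class."""

    @property
    def options(self):
        return super(Context, Context).options.__get__(self)

    @options.setter
    def options(self, value):
        # The name "Context" is resolved when the setter runs, not when it is defined.
        super(Context, Context).options.__set__(self, value)


# A library imported early keeps a reference to the original class.
OriginalContext = Context


class PatchedContext(OriginalContext):
    """Plays the role of a replacement class injected later (assumption)."""


def set_options(ctx, value):
    """Try to set options and report whether it worked."""
    try:
        ctx.options = value
        return "ok"
    except RecursionError:
        return "RecursionError"


before = set_options(OriginalContext(), 1)

# Simulate the patch: the public name now points at the subclass.
Context = PatchedContext

# super(PatchedContext, PatchedContext).options is OriginalContext.options,
# so the setter calls itself until the recursion limit is hit.
after = set_options(OriginalContext(), 1)

print(before, after)
```

In this toy version, an instance of the class captured before the rebinding works until the name is rebound and fails afterwards, which matches the import-order sensitivity described in this thread.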

arcgis/auth/api.py
They skip the first conditional on Python 3.10+.

if sys.platform == "win32" and check_module_exists("certifi_win32"):  # pragma: no cover
    # when on Windows, append to the certifi
    # the users trusted certificate store
    # when certifi_win32 is present.
    try:
        import certifi_win32

        certifi_win32.wincerts.verify_combined_pem()
        certifi_win32.wincerts.where()
    except ImportError:
        pass
elif check_module_exists("truststore"):  # pragma: no cover
    try:
        import truststore

        truststore.inject_into_ssl()
    except ImportError as ie:
        __log__.warning(f"truststore raised a warning: {ie}")
    except Exception as e:
        __log__.warning(f"truststore raised a warning: {e}")

arcgis setup.py

dependencies = [
        ...
        "python-certifi-win32;python_version<'3.10'",
        "truststore>=0.7.0;python_version>'3.9'",
        ...
]

So this does appear to be stemming from how arcgis is mutating the ssl module with truststore. We can likely work with the truststore maintainers to see if there's a way to get this smoothed out but for the time being arcgis will likely break anyone using the SSLContext from urllib3 if it's not imported first.
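
The branch selection implied by the two excerpts above can be sketched as follows. Both functions are hypothetical helpers written for illustration, not arcgis code; `installed` stands in for whatever `check_module_exists` would report, and the version comparison is a simplified reading of the `setup.py` markers.

```python
# Hypothetical helpers mirroring the excerpts above; not arcgis code.
def default_installed(python_version):
    """Modules the quoted setup.py markers would pull in for a (major, minor) version."""
    # python_version<'3.10' -> python-certifi-win32; python_version>'3.9' -> truststore
    if python_version < (3, 10):
        return {"certifi_win32"}
    return {"truststore"}


def ssl_branch(platform, installed):
    """Return which branch of the quoted conditional would run."""
    if platform == "win32" and "certifi_win32" in installed:
        return "certifi_win32"
    if "truststore" in installed:
        return "truststore.inject_into_ssl"
    return "none"


for py in [(3, 9), (3, 11)]:
    print(py, ssl_branch("win32", default_installed(py)))
```

Under these assumptions, only the Python 3.10+ case reaches `truststore.inject_into_ssl()`.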

@nateprewitt
Contributor

I've opened sethmlarson/truststore#121 to track the issue in truststore with a minimal repro using only urllib3 and truststore. This may get addressed in a future truststore release, but for now imports will likely need to be managed by customers using arcgis on newer versions of Python.

github-actions bot removed the closing-soon label on Nov 2, 2023
@nateprewitt
Contributor

I'm going to put this ticket back in auto-close since we seem to have answers to the underlying problem. Please feel free to let us know if you have any further questions though. We've been able to reproduce this issue with just arcgis and urllib3, so this does appear to be unique to how arcgis is using truststore. Once they resolve their usage issue, things should work as expected.

nateprewitt added the closing-soon label on Nov 2, 2023
github-actions bot added the closed-for-staleness label and removed the closing-soon label on Nov 5, 2023
github-actions bot closed this as completed on Nov 5, 2023
@thehappycheese
Author

Hi @nateprewitt, thank you for your work to get to the bottom of this! Really appreciate it.

Could you confirm if truststore is installed in your environment, and if so what version?

truststore 0.8.0 appears to be installed.

Apologies for the delayed response and thanks again :)

@martimpassos

Hi, I'm facing the same issue. Introducing ArcGIS in my project broke my boto3 operations. Looking at the mentioned issues, there doesn't seem to be a fix yet. Is that right? I tried downgrading my project to Python 3.9, and arcgis API operations now hang forever.

@deepblue-phoenix

deepblue-phoenix commented Jun 25, 2024

We're seeing this problem on our side, and the boto3 team has reproduced the issue and is actively looking into a solution.
#4061 (comment)
