This repository has been archived by the owner on Aug 4, 2023. It is now read-only.
Add HEAD requests support to DelayedRequester #865
Merged: obulat merged 8 commits into WordPress:main from twstokes:feature/delayed-requester-head on Nov 18, 2022
Changes from 3 commits
Commits (8):
367118f Add HEAD requests support to DelayedRequester. (twstokes)
e4250c2 Use a HEAD request for StockSnap filesize method. (twstokes)
f14062b Use a HEAD request for WordPress filesize method. (twstokes)
bce40ff Merge remote-tracking branch 'origin/main' into feature/delayed-reque… (twstokes)
be9f43a Add Callable type constraint for Requests method. (twstokes)
8c5d0a2 Update docstrings params for _make_request. (twstokes)
af83245 Update openverse_catalog/dags/common/requester.py (twstokes)
cd4c714 Add docstrings for GET and HEAD methods. (twstokes)
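The StockSnap and WordPress changes named in the commit list are not shown in this view, so the following is only a sketch of the general idea: a HEAD request returns the response headers without the body, and a file size can often be read from the `Content-Length` header. The helper name `filesize_from_headers` is hypothetical, and reading `Content-Length` is an assumption about how the filesize methods work, not something this page confirms.

```python
from typing import Mapping, Optional


def filesize_from_headers(headers: Mapping[str, str]) -> Optional[int]:
    """Hypothetical helper: read a file size in bytes from response headers.

    Returns None when the header is missing or is not a valid integer.
    A plain dict is case-sensitive; `requests` response headers are not,
    so with a real response the lookup would match any capitalization.
    """
    value = headers.get("Content-Length")
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None
```

With a real response this would be called as `filesize_from_headers(response.headers)` on the object returned by a HEAD request, after checking that the response is not `None`.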
@@ -30,12 +30,11 @@ class RetriesExceeded(Exception):

 class DelayedRequester:
     """
-    Provides a method `get` that is a wrapper around `get` from the
-    `requests` module (i.e., it simply passes along whatever arguments it
-    receives). The difference is that when this class is initialized
-    with a non-zero `delay` parameter, it waits for at least that number
-    of seconds between consecutive requests. This is to avoid hitting
-    rate limits of APIs.
+    Provides methods `get` and `head` that are wrappers around the `requests`
+    module methods with the same name (i.e., it simply passes along whatever
+    arguments it receives). The difference is that when this class is initialized
+    with a non-zero `delay` parameter, it waits for at least that number of seconds
+    between consecutive requests. This is to avoid hitting rate limits of APIs.

     Optional Arguments:
         delay: an integer giving the minimum number of seconds to wait
@@ -50,23 +49,24 @@ def __init__(self, delay=0, headers=None):
         self._last_request = 0
         self.session = requests.Session()

-    def get(self, url, params=None, **kwargs):
+    def _make_request(self, method, url, **kwargs):
         """
-        Make a get request, and return the response object if it exists.
+        Make a request, and return the response object if it exists.

         Required Arguments:

         url: URL to make the request as a string.
-        params: Dictionary of query string params
-        **kwargs: Optional arguments that will be passed to `requests.get`
+        **kwargs: Optional arguments that will be passed to the `requests`
+                  module request
         """
         self._delay_processing()
         self._last_request = time.time()
         request_kwargs = kwargs or {}
         if "headers" not in kwargs:
             request_kwargs["headers"] = self.headers
         try:
-            response = self.session.get(url, params=params, **request_kwargs)
+            response = method(url, **request_kwargs)
             if response.status_code == requests.codes.ok:
                 logger.debug(f"Received response from url {response.url}")
             elif response.status_code == requests.codes.unauthorized:

Review comment (on the new `**kwargs` docstring line): The docstrings params need to be updated, too: add
Reply: Updated in 8c5d0a2.
@@ -90,10 +90,17 @@ def get(self, url, params=None, **kwargs):
         except Exception as e:
             logger.error(f"Error with the request for URL: {url}")
             logger.info(f"{type(e).__name__}: {e}")
-            logger.info(f"Using query parameters {params}")
+            if params := request_kwargs.get("params"):
+                logger.info(f"Using query parameters {params}")
             logger.info(f'Using headers {request_kwargs.get("headers")}')
             return None

+    def get(self, url, params=None, **kwargs):
+        self._make_request(self.session.get, url, params=params, **kwargs)
+
+    def head(self, url, **kwargs):
+        self._make_request(self.session.head, url, **kwargs)
+
     def _delay_processing(self):
         wait = self._DELAY - (time.time() - self._last_request)
         if wait >= 0:

Review conversations on the `get` definition, the `head` definition, and the `head` body were marked as resolved by twstokes.
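The pattern in the diff above, one private method that handles the delay and error handling and takes the bound session method as an argument, can be sketched as follows. This is a simplified illustration, not the merged code: `FakeSession`, `FakeResponse`, and `DelayedRequesterSketch` are hypothetical names, the fake session stands in for `requests.Session` so the sketch runs without a network, and logging and status-code handling are omitted. Note that `get` and `head` as displayed in this three-commit view do not return the result of `_make_request`; the sketch returns it, on the assumption that callers are meant to receive the response object.

```python
import time
from typing import Any, Optional


class FakeResponse:
    """Hypothetical stand-in for a response; only the fields used here."""

    def __init__(self, method: str, url: str, params: Optional[dict] = None):
        self.method = method
        self.url = url
        self.params = params
        self.status_code = 200


class FakeSession:
    """Hypothetical stand-in for `requests.Session` (no network access)."""

    def get(self, url: str, params: Optional[dict] = None, **kwargs: Any) -> FakeResponse:
        return FakeResponse("GET", url, params)

    def head(self, url: str, **kwargs: Any) -> FakeResponse:
        return FakeResponse("HEAD", url)


class DelayedRequesterSketch:
    def __init__(self, session: Any, delay: float = 0, headers: Optional[dict] = None):
        self._DELAY = delay
        self._last_request = 0.0
        self.headers = headers or {}
        self.session = session

    def _delay_processing(self) -> None:
        # Sleep only for the part of the delay that has not yet elapsed.
        wait = self._DELAY - (time.time() - self._last_request)
        if wait >= 0:
            time.sleep(wait)

    def _make_request(self, method, url: str, **kwargs: Any) -> Any:
        # `method` is a bound callable such as `session.get`, not a string.
        self._delay_processing()
        self._last_request = time.time()
        request_kwargs = dict(kwargs)
        request_kwargs.setdefault("headers", self.headers)
        try:
            return method(url, **request_kwargs)
        except Exception:
            return None

    def get(self, url: str, params: Optional[dict] = None, **kwargs: Any) -> Any:
        return self._make_request(self.session.get, url, params=params, **kwargs)

    def head(self, url: str, **kwargs: Any) -> Any:
        return self._make_request(self.session.head, url, **kwargs)
```

Because the HTTP verb is chosen by which bound method is passed in, adding another verb only needs one more thin wrapper, while the delay and default-header logic stay in one place.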
👋 Hi @obulat - thanks for the suggestion. Could you help me understand this literal type constraint better? I was under the impression that the calls to `_make_request` were passing methods from the Requests module and not string values, but I'm probably missing something.

To experiment I changed the `Literal` params to something nonsensical (instead of `get` and `head`) and `just lint` and `just test` passed when I was expecting them to fail.

Also, if you're suggesting that we make the Requests module calls only within `_make_request` based on the string passed then that makes sense, but I wanted to get confirmation. Thank you!
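One likely explanation for the experiment above, offered as an assumption since the thread does not say what `just lint` and `just test` run: Python does not enforce type annotations at runtime, so a mismatched `Literal` annotation only fails if a static type checker is part of the checks. The names `make_request` and `fake_head` below are hypothetical.

```python
from typing import Literal


def make_request(method: Literal["get", "head"], url: str) -> str:
    # The annotation says `method` is one of two strings, but Python never
    # checks that when the function is called.
    return f"{getattr(method, '__name__', method)} {url}"


def fake_head(url: str) -> str:
    """Hypothetical stand-in for a bound method such as `session.head`."""
    return url


# Passing a callable where a Literal string is annotated still runs.
result = make_request(fake_head, "https://example.com")
```

A static type checker would be expected to flag the final call; tests and style linters alone would not.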
Gah, that's right! This is actually of type `callable`, rather than `Literal`. I think the way it is presently is fine, but it might be worth adding the generic `callable` type annotation (trying to match the call signatures of both `.head` and `.get` will be a head-ache 😄)
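A generic callable annotation of the kind suggested above could look like the following sketch. `Callable[..., Any]` accepts any argument list, which avoids having to describe the differing signatures of `.get` and `.head`. The alias `RequestMethod` and the `fake_get` and `fake_head` functions are hypothetical, and the annotation actually merged in be9f43a is not shown on this page.

```python
from typing import Any, Callable, Optional

# Hypothetical alias: any callable, with no constraint on its arguments.
RequestMethod = Callable[..., Any]


def fake_get(url: str, params: Optional[dict] = None, **kwargs: Any) -> tuple:
    """Hypothetical stand-in for `session.get`."""
    return ("GET", url, params)


def fake_head(url: str, **kwargs: Any) -> tuple:
    """Hypothetical stand-in for `session.head`."""
    return ("HEAD", url)


def make_request(method: RequestMethod, url: str, **kwargs: Any) -> Any:
    # Both stand-ins satisfy the annotation despite different signatures.
    return method(url, **kwargs)
```

The trade-off is that the checker no longer verifies which keyword arguments each method accepts; that precision is given up to keep the annotation simple.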
Thank you for stepping in, @AetherUnbound, and sorry I couldn't reply sooner, @twstokes.