Use a persistent cache for tldextract across CI runs #3716
Labels
💻 aspect: code
Concerns the software code in the repository
✨ goal: improvement
Improvement to an existing user-facing feature
🟨 priority: medium
Not blocking but should be addressed soon
🧱 stack: mgmt
Related to repo management and automations
Current Situation
tldextract is a library used by the ingestion server to help extract TLDs (not a surprise, based on the library's name!).
It relies on up-to-date public suffix information, which it requests the first time it runs. This information is cached, and we should persist that cache across test runs.
Suggested Improvement
Configure tldextract's cache location to a place we can easily cache in GitHub Actions. Use `tldextract --update` with the configured cache location, as shown in the library docs, to update the cached data as needed and prevent unnecessary network requests in tests.
Benefit
Prevent unnecessary network requests during test runs.
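As a rough sketch of the suggested improvement, the ingestion server could build its extractor against an explicit cache directory that CI saves and restores. The `TLDExtract(cache_dir=...)` argument and the `TLDEXTRACT_CACHE` environment variable come from the tldextract docs; the `.tldextract-cache` default and the helper names `resolve_cache_dir` and `build_extractor` are hypothetical and not taken from the repository.

```python
import os
from pathlib import Path

# Hypothetical default location; in practice this would be whichever
# directory the GitHub Actions cache step is configured to persist.
DEFAULT_CACHE_DIR = Path(".tldextract-cache")


def resolve_cache_dir(env=None):
    """Return the cache directory, preferring TLDEXTRACT_CACHE when it is set."""
    env = os.environ if env is None else env
    return Path(env.get("TLDEXTRACT_CACHE") or DEFAULT_CACHE_DIR).expanduser()


def build_extractor(cache_dir=None):
    """Build a TLDExtract instance that reads and writes the given cache directory."""
    # Imported lazily so the path helper above works without the library installed.
    import tldextract

    # Passing cache_dir explicitly avoids depending on when the environment
    # variable is read by the library.
    return tldextract.TLDExtract(cache_dir=str(cache_dir or resolve_cache_dir()))
```

In CI, running `tldextract --update` with `TLDEXTRACT_CACHE` pointing at the same directory before the tests would populate the cache, so an extractor built this way should find the suffix list on disk instead of requesting it again. This assumes the tests use the library's default suffix list settings, matching what the update command cached.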