Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split Updater into logical components: redesign mirrors.py and download.py #1307

Closed
3 of 4 tasks
sechkova opened this issue Mar 12, 2021 · 4 comments
Closed
3 of 4 tasks
Assignees
Labels
experimental-client Items related to the development of a new client (see milestone/8 and theexperimental-client branch)

Comments

@sechkova
Copy link
Contributor

sechkova commented Mar 12, 2021

Description of issue or feature request:
A clear separation between file download logic and metadata verification in Updater. Design and implement appropriate class, part of the client internals, that performs all metadata/target files download operations.

This should happen together with a redesign of the mirrors and download modules and considering the recent addition of fetcher and requests_fetcher.

Some issues of the current implementation:

See related comments: #1291 (comment), #1291 (comment)

@sechkova sechkova added the experimental-client Items related to the development of a new client (see milestone/8 and theexperimental-client branch) label Mar 12, 2021
@sechkova sechkova added this to the Client Refactor milestone Mar 12, 2021
@sechkova sechkova mentioned this issue Mar 12, 2021
17 tasks
@sechkova sechkova self-assigned this Mar 15, 2021
sechkova added a commit to sechkova/tuf that referenced this issue Mar 23, 2021
Temporary disable (inline) undefined-loop-variable pylint
checks until the download functionality is revised (theupdateframework#1307)

Signed-off-by: Teodora Sechkova <[email protected]>
sechkova added a commit to sechkova/tuf that referenced this issue Mar 23, 2021
Temporary disable (inline) undefined-loop-variable pylint
checks until the download functionality is revised (theupdateframework#1307)

Signed-off-by: Teodora Sechkova <[email protected]>
sechkova added a commit to sechkova/tuf that referenced this issue Mar 23, 2021
Temporary disable (inline) undefined-loop-variable pylint
checks until the download functionality is revised (theupdateframework#1307)

Signed-off-by: Teodora Sechkova <[email protected]>
sechkova added a commit to sechkova/tuf that referenced this issue Mar 24, 2021
Temporary disable (inline) undefined-loop-variable pylint
checks in the new Updater code until the download functionality
is revised (theupdateframework#1307).

Signed-off-by: Teodora Sechkova <[email protected]>
@sechkova
Copy link
Contributor Author

sechkova commented Apr 7, 2021

There is an ongoing work addressing this issue #1331

@sechkova
Copy link
Contributor Author

sechkova commented Apr 27, 2021

Starting with a reminder of what we want to achieve in a reference implementation:
A readable codebase, easily mapped to the specification, less prone to errors. A clear separation between file download logic and metadata verification in Updater.

What is the outcome of the efforts on improving the mirrors download code so far, summarised well by @jku in this comment and curated a bit by me:

  • currently the TUF client supports the initialisation of Updater with a mirrors configuration: a list of base URLs. The client is capable of resolving metadata and target names to full URLs. On each file download, metadata or target one, the client takes the first mirror from the list, downloads the file, verifies it and if any of these two operations fail, tries the next mirror.

  • as long as the mirror downloading works as it does, it means there is going to be a "generic download loop" that is going to download files, parse the content, verify the metatadata and then decide if it needs to download another version of the file. Anything else is going to lead to a lot of repetition in the Updater code.

  • we recognised the download/mirrors code is what makes improving on the current code tricky -- this is what we tried to solve in some earlier PRs (WIP: Mirrors redesign #1331 for ex.) but ultimately failed to create something that improves readability, potentially reduces errors that and still supports mirrors...

pip, "forced" by the way PyPI hosts files, needs to be able to use different base URLs for different target files. However, we are not aware of a use case where metadata for a single repository is served by several mirrors.

Given the complications that such logic brings to a clean reference implementation, we suggest dropping the mirrors support for metadata. Going even further, fully drop the mirrors configuration and add the base URL as an optional argument to the download_target() method of Updater.

This proposal assumes the following scenario. Metadata for a repository is retrieved from a single host address (mirroring by CDNs, for example, is hidden from the client's point of view). TUF client resolves metadata file names to URLs with this single host URL. Target files may be downloaded from any URL provided by the user but the same host URL used for metadata can serve as a default value.

@jku
Copy link
Member

jku commented Apr 27, 2021

Thanks. I do agree that this seems like a direction worth exploring (thanks @trishankatdatadog for prodding into this direction). I think a hypothetical API that tries to look as much as it can like the current one would look like this:

class Updater:
    def __init__(
        repository_path: str,
        metadata_url: str, 
        default_target_url: str = None,
        fetcher: FetcherInterface = None)

    # ... skip refresh et al, they're not interesting here ...

    def download_target(
        target_name: str, 
        destination_dir: str, 
        url: str = None)

All *_url arguments are url-prefixes that resemble url_prefix+metadata_path or url_prefix+targets_path in the current mirrors implementation.

As Teodora says, "metadata mirroring" is explicitly not supported here. However, I believe this proposal does not prevent a client from building a setup that functions like current "mirrors configuration" with regards to targets.

@sechkova
Copy link
Contributor Author

The proposals from the discussion are implemented with #1373.
The last point of the description tasks's list related to directories structure will be continued in #1397.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
experimental-client Items related to the development of a new client (see milestone/8 and theexperimental-client branch)
Projects
None yet
Development

No branches or pull requests

3 participants