Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modularized client for state management #114

Merged
merged 11 commits into from
Jun 21, 2024
69 changes: 69 additions & 0 deletions micropip/manager.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,69 @@
from typing import Any, List

from micropip import package_index


class Manager:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we like the name Manager? It's a bit vague but seems alright to me. We should consider whether we can come up with something better. @jaraco maybe you have some thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love "Manager", but it's adequate.

I notice below the repodata_ prefix on a couple of member variables, suggesting that maybe this object is representing RepoData or State or RepositoryState or Repository or Repo (and maybe the repodata_ prefix can be removed). I also see that "repodata" is a legacy artifact and "lock" is relevant, so maybe LockState or some variant.

Since I see actions about .install, and .freeze, I'm thinking this is relevant to an Environment or Site.

I'm liking Repository or Environment best, but I'd be okay with any option. Feel free to pick/promote one you like best.

Am I right in thinking this class is an internal implementation detail (not part of a public API)? If so, the naming isn't critical and can be changed later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thoughts were that eventually this class will become representative of the current public API, with additional extensibility such as dependency injection as discussed in #112, although I would guess having a separate exposed layer and keeping the Manager internal is also a feasible choice. (Assuming that the most common case of a user wanting to instantiate their own Manager without accessing the current public APIs would be to utilize such extensibility)

I was thinking that Client or Micropip could suit it better but Manager seemed closest to what it was going to do: state management. I do agree its a vague name, and Repository also sounds alright too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I plan to make this class public. People would do mgr = micropip.Manager(); mgr.install(...).

I prefer Manager or Environment. The former emphasizes that this is a package manager of sorts, while the latter recognizes that each instance is like a virtual environment of sorts.

Let's go with Manager for now. We still have time before implementing all these APIs and making this class public.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe call it PackageManager then? or ???Environment? I feel like we could use an extra word in the name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PackageManager sounds good to me.

"""
Manager provides an extensible interface for customizing micropip's behavior.

Each Manager instance holds its own local state that is
independent of other instances, including the global state.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

including the global state.

Well it's not really global then. It's a bit unclear how this works since the result of installing stuff with micropip is generally files placed in the file system. I guess different Manager instances can have different install directories?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant to phrase it in a way that the global manager instance(representing the 'global state' for the exposed commands) is also included in the statement, but maybe it isn't necessary to specify it as such.

I would guess that the 'right way' to keep things independent would be to also have different install directories, but I'm not sure how nicely that would play out with sitepackages, or perhaps a mechanism to prevent managers having duplicate install directories. Could you provide your opinion on this? cc: @ryanking13

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess different Manager instances can have different install directories?

Ideally, we may try to have each instance behave completely independently as if each instance becomes a virtual environment. For example, by combining micropip and subinterpreter, we may be able to support different interpreters to install and run different sets of packages.

However, I think most of the time, people will end up using one instance, and we would recommend doing so, too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is super useful for testing purposes but probably not as useful for real world stuff. How does importlib manage this? It seems to me that it depends on sys.path and sys.meta_path for everything so it can't handle introspecting multiple environments without monkeypatching these. Presumably there are pypi packages that can inspect virtual environments that aren't active.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know importlib.metadata.distributions() takes an optional path argument, which defaults to sys.path, allowing a caller to inspect metadata for a set of paths. That could be pointed at a site-packages for a virtualenv to get the metadata for those packages, but it won't honor other paths that might get added by the site config (such as a venv --system-site-packages or a .pth file in the site-packages that might extend the path). In my experience, the only technique I've seen for inspecting the true packaging environment for an interpreter is to launch that interpreter (in a subprocess) and then inspect that interpreter's environment.

That said, I know recent versions of tox have been doing some innovative things around detecting installed packages (and diffing them against an install spec), so there may be some good ideas in that codebase.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is super useful for testing purposes but probably not as useful for real world stuff.

My main goal of making micropip modular is to allow us (or others) to customize some of its behavior. It's nice to have each instance behave independently, but it's not really something I'm very interested in right now (of course we can benefit from it in testing purpose).


TODO: Implement all of the following global commands to utilize local state.
"""

def __init__(self):
self.index_urls = package_index.DEFAULT_INDEX_URLS

# TODO: initialize the compatibility layer
self.repodata_packages: dict[str, dict[str, Any]] = {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

repodata is a legacy name. We used a lock file called repodata.json, but we've changed its name to pyodide-lock.json, so it would probably be better to rename this variable too.

(no action required in this PR)

self.repodata_info: dict[str, str] = {}

pass

def install(self):
raise NotImplementedError()

def list(self):
raise NotImplementedError()

def freeze(self):
raise NotImplementedError()

def add_mock_package(self):
raise NotImplementedError()

def list_mock_packages(self):
raise NotImplementedError()

def remove_mock_package(self):
raise NotImplementedError()

def uninstall(self):
raise NotImplementedError()

def set_index_urls(self, urls: List[str] | str):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def set_index_urls(self, urls: List[str] | str):
def set_index_urls(self, urls: list[str] | str):

from Python 3.9, standard types can be used as type helpers.

Copy link
Contributor Author

@RulerOfCakes RulerOfCakes Jun 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was because of the name conflict with the list() method, unlike the _list export from _commands.list.py I couldn't alias the method name to something else. Both my LSP and mypy won't let me make this change, but i'm not sure how I can avoid the warning from mypy for using List. Any suggestions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, that's funny. It was probably a bad decision to call it list at the first time. Anyway, it's not an issue. So let's keep this as-is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment next to the from typing import List please!

"""
Set the index URLs to use when looking up packages.

- The index URL should support the
`JSON API <https://warehouse.pypa.io/api-reference/json/>`__ .

- The index URL may contain the placeholder {package_name} which will be
replaced with the package name when looking up a package. If it does not
contain the placeholder, the package name will be appended to the URL.

- If a list of URLs is provided, micropip will try each URL in order until
it finds a package. If no package is found, an error will be raised.

Parameters
----------
urls
A list of URLs or a single URL to use as the package index.
"""

if isinstance(urls, str):
urls = [urls]

self.index_urls = urls[:]
29 changes: 29 additions & 0 deletions tests/test_manager.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
import micropip.package_index as package_index
from micropip.manager import Manager


def test_manager() -> Manager:
manager = Manager()
assert manager.index_urls == package_index.DEFAULT_INDEX_URLS

return manager


def test_set_index_urls():
manager = test_manager()

default_index_urls = package_index.DEFAULT_INDEX_URLS
assert manager.index_urls == default_index_urls

valid_url1 = "https://pkg-index.com/{package_name}/json/"
valid_url2 = "https://another-pkg-index.com/{package_name}"
valid_url3 = "https://another-pkg-index.com/simple/"
try:
manager.set_index_urls(valid_url1)
assert manager.index_urls == [valid_url1]

manager.set_index_urls([valid_url1, valid_url2, valid_url3])
assert manager.index_urls == [valid_url1, valid_url2, valid_url3]
finally:
manager.set_index_urls(default_index_urls)
assert manager.index_urls == default_index_urls