Reconstruct RepodataRecord from lock file #433
Comments
In rattler we added some additional fields to the The fields we added are:
/// Experimental: architecture field
pub arch: Option<String>,
/// Experimental: the subdir where the package can be found
pub subdir: Option<String>,
/// Experimental: conda build number of the package
pub build_number: Option<u64>,
/// Experimental: see: [Constrains](crate::repo_data::PackageRecord::constrains)
pub constrains: Vec<String>,
/// Experimental: see: [Features](crate::repo_data::PackageRecord::features)
pub features: Option<String>,
/// Experimental: see: [Track features](crate::repo_data::PackageRecord::track_features)
pub track_features: Vec<String>,
/// Experimental: the specific license of the package
pub license: Option<String>,
/// Experimental: the license family of the package
pub license_family: Option<String>,
/// Experimental: If this package is independent of architecture this field specifies in what way. See
/// [`NoArchType`] for more information.
pub noarch: NoArchType,
/// Experimental: The size of the package archive in bytes
pub size: Option<u64>,
/// Experimental: The date this entry was created.
pub timestamp: Option<chrono::DateTime<chrono::Utc>>,
We also discovered another important reason to do so. When rattler (and micromamba) create an environment from a lock-file without reading additional repodata all the information that is stored in the I propose we add the same fields in conda-lock! :)
Yes, I've been wanting to implement something like this for a while. One of the most confusing parts for me of the conda-lock codebase is understanding when a dependency is conda or pip or either. I've been especially fond for quite some time of the idea of including the timestamp data, since the maximum over timestamps gives an approximate but stable last-locked time.
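The "maximum over timestamps" idea can be sketched in a few lines of Rust. This is only an illustration: the function name is hypothetical, and timestamps are assumed to be stored as milliseconds since the Unix epoch rather than as the chrono type shown in the field listing.

```rust
/// Hypothetical helper: approximates the "last locked" time of a lock file
/// as the newest timestamp among its packages.
/// Assumption: timestamps are milliseconds since the Unix epoch.
/// Packages without a timestamp are skipped.
fn approximate_last_locked(timestamps: &[Option<u64>]) -> Option<u64> {
    timestamps
        .iter()
        .flatten() // drop the `None` entries
        .copied()
        .max() // `None` if no package carries a timestamp
}
```

The result depends only on the set of locked packages, not on their order or on when the lock command was run, which is what makes it stable across re-locks that select the same packages.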
To me, it makes sense to have two alternative data structures (
What is the idea?
While working on rattler I ran into a situation where I wanted to update lock files incrementally. To do so I need to pass the "currently installed packages" to the solver. The solver prioritizes the "installed" variants of a package over others, which nudges the solver into using the installed package variants.
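As a rough illustration of that nudging, the sketch below reorders candidate packages so that installed variants are tried first. It is not how rattler's solver is implemented; `Candidate`, `candidate`, and `prefer_installed` are hypothetical names used only for this example.

```rust
use std::collections::HashSet;

/// Simplified stand-in for a solver candidate (not rattler's actual type).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Candidate {
    name: String,
    version: String,
    build: String,
}

/// Small convenience constructor for the example.
fn candidate(name: &str, version: &str, build: &str) -> Candidate {
    Candidate {
        name: name.to_string(),
        version: version.to_string(),
        build: build.to_string(),
    }
}

/// Moves candidates that match an installed (locked) package to the front.
/// `sort_by_key` is a stable sort, so the original preference order is
/// kept within the installed group and within the remaining group.
fn prefer_installed(
    mut candidates: Vec<Candidate>,
    installed: &HashSet<Candidate>,
) -> Vec<Candidate> {
    // `false` orders before `true`, so installed candidates come first.
    candidates.sort_by_key(|c| !installed.contains(c));
    candidates
}
```

A solver that tries candidates in this order will keep the locked variant whenever it still satisfies the requirements, and only fall back to other variants when it does not.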
As far as I understand it, conda-lock facilitates incremental lock file updates by creating a fake environment with fake "reconstructed" conda-meta files. In Rattler (and in conda too), the files in the conda-meta folder are represented as PrefixRecords. These are a superset of RepoDataRecords, which are in turn a superset of PackageRecords. PackageRecord contains data read from repodata.json files. RepoDataRecord contains the same data as PackageRecord but amended with information about the channel (like the URL of the package, the filename, and a string representation of the channel). PrefixRecord contains the same data as RepoDataRecord but additionally includes information about how the package was installed.
To be able to do a "perfect" incremental lock file update we would ideally completely reconstruct the information of the RepoDataRecord from the conda-lock file and pass that to the solver. In a regular conda update this information is typically read from the conda-meta directory. Complete reconstruction is important because if a package was locked that is no longer available in the repodata.json (for whatever reason) the lock file remains valid, even when updating parts of the lock file. It's also important because according to MatchSpec dependencies can match on any of the RepoDataRecord fields.
The issue I run into is that the current conda-lock file format does not easily allow the proper reconstruction of RepoDataRecords. The current models have a single definition of a LockedDependency for both pip and conda packages. I propose we implement this differently through a union of a PipLockedDependency (of which I know not enough to describe what it should look like) and a CondaLockedDependency, which would allow the complete reconstruction of a RepoDataRecord. I believe that micromamba, mamba, and conda expose enough information to do so.
Currently, things that are hard to reproduce are:
- arch and subdir fields
- build_number field (very important)
- constrains
- features and track_features
- license and license_family
- noarch
- size
- timestamp
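A minimal sketch of the proposed union, written in Rust to match the field listing earlier in the thread. All types here are simplified stand-ins, not rattler's or conda-lock's actual definitions: the field set is trimmed, the layout of `PipLockedDependency` is a guess, the timestamp is assumed to be epoch milliseconds, and deriving the file name from the last URL segment is an assumption made for the example.

```rust
/// Trimmed-down stand-in for the data read from repodata.json.
#[derive(Debug, Clone, PartialEq)]
struct PackageRecord {
    name: String,
    version: String,
    build: String,
    build_number: u64,
    subdir: String,
    depends: Vec<String>,
    constrains: Vec<String>,
    license: Option<String>,
    size: Option<u64>,
    timestamp: Option<u64>, // assumption: milliseconds since the Unix epoch
}

/// Package data plus channel information.
#[derive(Debug, Clone, PartialEq)]
struct RepoDataRecord {
    package_record: PackageRecord,
    file_name: String,
    url: String,
    channel: String,
}

/// Hypothetical conda entry that keeps every field needed for reconstruction.
#[derive(Debug, Clone, PartialEq)]
struct CondaLockedDependency {
    record: PackageRecord,
    url: String,
    channel: String,
}

/// Hypothetical pip entry; its real shape is left open in the proposal.
#[derive(Debug, Clone, PartialEq)]
struct PipLockedDependency {
    name: String,
    version: String,
    url: String,
}

/// The union: a locked dependency is either a conda or a pip package.
#[derive(Debug, Clone, PartialEq)]
enum LockedDependency {
    Conda(CondaLockedDependency),
    Pip(PipLockedDependency),
}

impl CondaLockedDependency {
    /// Rebuilds the record without consulting any repodata.json.
    fn to_repodata_record(&self) -> RepoDataRecord {
        // Assumption: the file name is the last path segment of the URL.
        let file_name = self.url.rsplit('/').next().unwrap_or("").to_string();
        RepoDataRecord {
            package_record: self.record.clone(),
            file_name,
            url: self.url.clone(),
            channel: self.channel.clone(),
        }
    }
}

/// Collects the records that would be handed to the solver as "installed".
fn installed_records(deps: &[LockedDependency]) -> Vec<RepoDataRecord> {
    deps.iter()
        .filter_map(|dep| match dep {
            LockedDependency::Conda(conda) => Some(conda.to_repodata_record()),
            LockedDependency::Pip(_) => None, // pip packages are not conda records
        })
        .collect()
}
```

Because the conda variant carries the full record, fields such as build_number and constrains survive a round trip through the lock file, and a package that has since disappeared from repodata.json can still be passed to the solver.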
Why is this needed?
As explained in "what is the idea?", this is needed to be able to do proper incremental lock file updates.
What should happen?
No response
Additional Context
No response