Implement replacement for datalad status
#586
Comments
Few more thoughts on recursive operation and listing untracked files within
DataLad advertises restricted recursion as a feature (i.e., recursing a limited number of dataset levels). This represents a challenge. Git does not support that (it either recurses fully or not at all). Therefore the handling cannot be left to Git.
Conceptually, it is not clear to me why we support that. If it is about "seamless monorepos", it seems questionable to require an awareness of repository boundaries (which is necessary for specifying numerical limits). I vaguely remember that there was a use case for "immediate subdatasets", but I cannot recall specifics. Not providing restricted recursion would also encourage usage that does not involve needlessly deep repository hierarchies.
If this is to be discontinued, a main benefit could be added support for Git's pathspecs. Presently datalad largely ignores that possibility, and it would be rather complicated to implement mangling of pathspecs for seamless recursive Git calls on subdatasets without using Git's own recursion mechanism.
Listing untracked files recursively in subdatasets
Presently
Combining with
A recursive listing of untracked files would therefore require a recursive discovery of submodules and then individual
This means that an implementation of something like a
It appears to still make sense to implement a dedicated
Looking jointly at all Git commands that are involved in a status report ( What calls would need to be done:
An alternative look at the required capabilities from the POV of
Conclusions:
My gut feeling is that we should first implement (A), and then do a separate implementation of (B), followed by a refactoring to get a useful codebase, while maintaining two distinct API entrypoints.
This implementation is the first to emit `CommandResult` type result items, i.e., dataclass instances rather than result dicts. It also uses uniform parameter validation, enabling a substantially simplified implementation (e.g., of the result renderer). The user-facing appearance remains (largely?) the same. TODO more detailed analysis. The command options `untracked` and `recursive` now both take (optional) qualifying values, but also work without any value specification at the CLI. Closes datalad#586 (eventually)
This implementation is the first to emit `CommandResult` type result items, i.e., dataclass instances rather than result dicts. It also uses uniform parameter validation, enabling a substantially simplified implementation (e.g., of the result renderer). The user-facing appearance remains (largely) the same. The command documentation contains a summary of the key differences. Closes datalad#586
With #580 and related iterators I quickly hacked together an iterator for changes in git repos (see #323 for the background). This essentially comprises all pieces necessary to reimplement `datalad status`.
This implementation would differ significantly, both in code base and also in approach.
`ls-file-collection (git|annex)worktree`, and it is a lot faster with that approach
`datalad_next.iter_collections`
Based on initial performance tests of an unpublished draft, we can expect a speedup of up to 100x for sizable repos (a status report in <20ms instead of 1.5s for a test repo with >30k files).
`datalad status` behavior summary
Here is what the present `datalad status` in core does.
`bytesize` is only present for files in Git (not for annexed files, unless `--annex` is given).
`state` can be the expected `clean`, `added`, `deleted`, `modified`, ...
`datalad status --annex` behavior summary
`datalad status --annex availability` behavior summary
There is nothing here that could not be reported based on the existing iterators.
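
To illustrate how `state` labels such as `added`, `deleted`, and `modified` could be derived from a diff iterator, here is a minimal sketch that maps the status letters of Git's raw diff output format to state names. This is not the `iter_gitdiff()` draft: `DiffItem`, `STATE_MAP`, and `parse_raw_diff_line()` are hypothetical names made up for this example, and rename/copy records (which carry two paths) are not handled.

```python
from dataclasses import dataclass

# Status letters as used in Git's raw diff output (A, D, M).
# The state names on the right mirror the ones listed above.
STATE_MAP = {"A": "added", "D": "deleted", "M": "modified"}


@dataclass(frozen=True)
class DiffItem:
    """Hypothetical result item for a single changed path."""
    path: str
    state: str
    prev_mode: str
    mode: str


def parse_raw_diff_line(line: str) -> DiffItem:
    """Parse one line of (non -z) raw diff output.

    Expected shape: ':<prev_mode> <mode> <prev_sha> <sha> <status>\\t<path>'
    """
    meta, path = line.rstrip("\n").split("\t", 1)
    prev_mode, mode, _prev_sha, _sha, status = meta.lstrip(":").split(" ")
    # Unknown status letters are passed through unchanged (assumption
    # made for this sketch only).
    state = STATE_MAP.get(status[0], status)
    return DiffItem(path=path, state=state, prev_mode=prev_mode, mode=mode)


item = parse_raw_diff_line(":100644 100644 bcd1234 0123456 M\tfile0")
print(item.path, item.state)
```

Unchanged paths do not appear in a diff at all, so a `clean` state would presumably have to come from a worktree iterator rather than from a diff record.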
Beyond the already existing draft of `iter_gitdiff()` it makes sense to also implement `iter_annexdiff()`. It would also implement the pattern of `iter_annexworktree()` wrapping `iter_gitworktree()`, and injecting the corresponding annex attributes into the reported items.
API considerations
Right now, I see no need to make the API of the reimplementation be different from the one in `datalad-core`. Obviously, the `report_filetype` argument that is already ignored would not be supported.
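
The wrapping pattern mentioned for `iter_annexdiff()` (one iterator consuming another and injecting extra attributes into the items it passes on) could look roughly like the sketch below. All names here (`Item`, `iter_plain()`, `iter_annotated()`, the `annexkey` field, the example key) are hypothetical and do not reflect the actual `datalad_next.iter_collections` signatures; the annex lookup is faked with a plain mapping.

```python
from dataclasses import dataclass, replace
from typing import Iterable, Iterator, Mapping, Optional, Tuple


@dataclass(frozen=True)
class Item:
    """Hypothetical item type shared by both iterators."""
    path: str
    state: str
    annexkey: Optional[str] = None  # only set for annexed files


def iter_plain(records: Iterable[Tuple[str, str]]) -> Iterator[Item]:
    """Stand-in for the Git-level iterator: yields items without annex info."""
    for path, state in records:
        yield Item(path=path, state=state)


def iter_annotated(
    items: Iterable[Item], annex_info: Mapping[str, str]
) -> Iterator[Item]:
    """Stand-in for the annex-level iterator: wraps another iterator and
    injects the annex attribute where one is known for the path."""
    for item in items:
        key = annex_info.get(item.path)
        # Items without annex information are passed through unchanged.
        yield replace(item, annexkey=key) if key is not None else item


records = [("a.dat", "modified"), ("b.txt", "added")]
annex_info = {"a.dat": "EXAMPLEKEY-a.dat"}  # made-up key, for illustration
for item in iter_annotated(iter_plain(records), annex_info):
    print(item)
```

Keeping the item type the same on both levels means a consumer such as a result renderer would not need to distinguish between the plain and the annotated iterator.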