Add GitPathSpec support class and constraint #587
Comments
Turns out that the mangling of pathspecs for submodules (subdirectories) is not trivial. Supporting all possible combinations of magics is laborious, but seems doable. The main challenge is to deal with wildcards. When translating a pathspec for a subdirectory we do not have a full path to match against (only the subdirectory). I cannot come up with a rule for when and how to decide whether to strip the longest or the shortest match from the pathspec. The best I can come up with is to multiply the pathspecs whenever they contain |
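To illustrate the "multiply the pathspecs" idea, here is a minimal sketch and not the implementation discussed in this issue. It assumes plain `fnmatch` semantics in which `*` may also match `/`, handles only literals, `*` and `?` (no bracket expressions, no magic), and ignores Git's rule that a pathspec matching a leading directory also matches its content. The helper name `yield_remainder_candidates` is hypothetical.

```python
from fnmatch import fnmatchcase


def yield_remainder_candidates(pattern, subdir):
    """Yield candidate remainders of `pattern` for paths inside `subdir`.

    Hypothetical helper for illustration. Assumes `*` can match `/`
    (plain fnmatch behavior) and that `pattern` contains only literals,
    `*` and `?`.
    """
    target = subdir.rstrip("/") + "/"
    seen = set()
    # Try every position at which the subdirectory boundary could fall.
    for k in range(len(pattern) + 1):
        head, tail = pattern[:k], pattern[k:]
        splits = [(head, tail)]
        if tail.startswith("*"):
            # The boundary may fall inside what this `*` matches, so the
            # star has to be kept on both sides of the split.
            splits.append((head + "*", tail))
        for h, t in splits:
            # An empty remainder or one with a leading slash can never
            # match a relative path inside the subdirectory.
            if not t or t.startswith("/") or t in seen:
                continue
            if fnmatchcase(target, h):
                seen.add(t)
                yield t


print(list(yield_remainder_candidates("*/file", "sub")))
# -> ['*/file', 'file']
```

For `*/file` and the subdirectory `sub`, the star can match either exactly `sub` (leaving `file`) or `sub` plus further directory levels (leaving `*/file`). There is no way to pick one of the two without knowing the full path, so both candidates are reported.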
The main (if not only) purpose of this functionality is pathspec mangling/translation for handing them over to analogous Git command calls on submodules -- for any Git command that supports pathspecs, but not recursion. A simple example for such a command is `git ls-files --other`. It accepts pathspecs, but does not implement `--recurse-submodules` for listing untracked files. The goal of this functionality is to be able to take a pathspec that is valid in the context of a top-level repository, and translate it such that the set of pathspecs given to the same command running on/in a submodule/subdirectory gives the same results as if the initial top-level invocation reported them (if it even could). The included sketch of a test battery uses `git ls-files --other` for testing, rather than a formal description -- because the behavior of the implementation is more elaborate than the documentation at https://git-scm.com/docs/gitglossary#Documentation/gitglossary.txt-aiddefpathspecapathspec suggests. All testing is (for now) performed within a single repository, and with translation for execution in subdirectories. The implementation is a rough sketch for exploring the problem, rather than anything polished. Ping datalad#587 Also see datalad/datalad#6933
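The testing approach described above could look roughly like the following sketch. It is not the test battery from the commit. The helper names `ls_untracked` and `same_report` are hypothetical, and the full option name `--others` is used in place of the abbreviated `--other`. It assumes a `git` executable on the `PATH` and plain ASCII file names (so that Git does not quote paths in its output).

```python
import subprocess
from pathlib import Path


def ls_untracked(cwd, pathspec):
    """Return the untracked files Git reports for `pathspec`, relative to `cwd`."""
    out = subprocess.run(
        ["git", "ls-files", "--others", "--", pathspec],
        cwd=cwd, check=True, capture_output=True, text=True,
    ).stdout
    return sorted(out.splitlines())


def same_report(repo, subdir, top_spec, sub_spec):
    """Check that `sub_spec`, run inside `subdir`, reports the same files
    as `top_spec`, run at the top level, does for that subdirectory."""
    prefix = subdir.rstrip("/") + "/"
    # Top-level report, limited to `subdir` and made relative to it.
    from_top = sorted(
        p[len(prefix):]
        for p in ls_untracked(repo, top_spec)
        if p.startswith(prefix)
    )
    from_sub = ls_untracked(Path(repo) / subdir, sub_spec)
    return from_top == from_sub
```

A translation of `sub/*.txt` into `*.txt` for the subdirectory `sub` would then be checked with `same_report(repo, "sub", "sub/*.txt", "*.txt")`.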
It is a thin proxy for `GitPathSpec.from_pathspec_str()`. The error message template includes `__itemized_causes__` to ensure that underlying violation causes are reported to a user. Refs: datalad#587
The main (if not only) purpose of this functionality is pathspec mangling/translation for handing them over to analogous Git command calls on submodules -- for any Git command that supports pathspecs, but not recursion. A simple example for such a command is `git ls-files --other`. It accepts pathspecs, but does not implement `--recurse-submodules` for listing untracked files. The goal of this functionality is to be able to take a pathspec that is valid in the context of a top-level repository, and translate it such that the set of pathspecs given to the same command running on/in a submodule/subdirectory gives the same results as if the initial top-level invocation reported them (if it even could). The key algorithm lives in a standalone function `yield_subdir_match_remainder_pathspecs()` that performs a purely lexical analysis. It also comes with a dedicated test collection that is leaner and easier to extend than the previous one (which also remains). The additionally included sketch of a test battery uses `git ls-files --other` for testing, rather than a formal description -- because the behavior of the implementation is more elaborate than the documentation at https://git-scm.com/docs/gitglossary#Documentation/gitglossary.txt-aiddefpathspecapathspec suggests. All testing is (for now) performed within a single repository, and with translation for execution in subdirectories. Refs: datalad#587, datalad/datalad#6933
Looking into #586, it seems attractive to start supporting Git's pathspecs properly: https://git-scm.com/docs/gitglossary#Documentation/gitglossary.txt-aiddefpathspecapathspec
The main development challenge would be support for mangling pathspecs for subdataset recursions implemented outside Git. Something like stripping away a path prefix when a pathspec is passed on to another call on a subdataset -- or detecting that a particular pathspec would no longer apply to a subdataset.
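For the simplest case, a purely literal pathspec without magic or wildcards, the prefix stripping and the "no longer applies" detection could be sketched as follows. The function name `translate_literal_pathspec` is hypothetical. Returning `.` for "everything in the subdataset matches" and `None` for "does not apply" are choices made for this illustration only.

```python
def translate_literal_pathspec(pathspec, subdir):
    """Translate a literal, POSIX-style pathspec for use inside `subdir`.

    Returns the remainder after stripping `subdir`, "." if the pathspec
    covers the whole subdirectory, or None if it cannot match anything
    inside it. Hypothetical helper; no magic or wildcard support.
    """
    spec_parts = [p for p in pathspec.split("/") if p]
    sub_parts = [p for p in subdir.split("/") if p]
    # Compare the leading segments that both paths have in common.
    n = min(len(spec_parts), len(sub_parts))
    if spec_parts[:n] != sub_parts[:n]:
        return None  # pathspec points elsewhere
    remainder = spec_parts[len(sub_parts):]
    # No remainder: the pathspec names the subdirectory or a parent of it.
    return "/".join(remainder) if remainder else "."


print(translate_literal_pathspec("sub/dir/file.txt", "sub"))
# -> dir/file.txt
print(translate_literal_pathspec("other/file.txt", "sub"))
# -> None
```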
The specification suggests great flexibility. The key statement/property seems to be
This would imply that we need to implement pathspec parsing enough to be able to extract and manipulate that directory prefix.
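A minimal parsing sketch along these lines is shown below. It is an illustration under assumptions, not the `GitPathSpec` implementation: it recognizes the long form `:(word,...)` and the short-form characters `/` (top) and `!` or `^` (exclude), and it takes the "directory prefix" to be the leading `/`-separated segments that contain no wildcard characters, which is a simplification. The names `ParsedPathspec` and `parse_pathspec` are hypothetical, and escaping inside `attr` values is not handled.

```python
from __future__ import annotations

import re
from typing import NamedTuple

# Short-form magic characters and the long-form words they stand for.
SHORT_MAGIC = {"/": "top", "!": "exclude", "^": "exclude"}


class ParsedPathspec(NamedTuple):
    magic: tuple[str, ...]
    dirprefix: str
    pattern: str


def parse_pathspec(spec: str) -> ParsedPathspec:
    """Split a pathspec into magic words, literal directory prefix, pattern."""
    magic: list[str] = []
    rest = spec
    if spec.startswith(":("):
        # Long form: `:(word1,word2)pattern`
        end = spec.index(")")
        magic = [w for w in spec[2:end].split(",") if w]
        rest = spec[end + 1:]
    elif spec.startswith(":"):
        # Short form: `:` + magic characters + optional terminating `:`
        i = 1
        while i < len(spec) and spec[i] in SHORT_MAGIC:
            magic.append(SHORT_MAGIC[spec[i]])
            i += 1
        if i < len(spec) and spec[i] == ":":
            i += 1
        rest = spec[i:]
    segments = rest.split("/")
    prefix: list[str] = []
    # The last segment is never part of the directory prefix.
    for seg in segments[:-1]:
        if re.search(r"[*?\[\\]", seg):
            break
        prefix.append(seg)
    return ParsedPathspec(
        tuple(magic), "/".join(prefix), "/".join(segments[len(prefix):])
    )


print(parse_pathspec(":(top,icase)docs/api/*.rst"))
# -> ParsedPathspec(magic=('top', 'icase'), dirprefix='docs/api', pattern='*.rst')
```

With the prefix separated out, removing leading segments for a subdataset becomes a plain list operation on `dirprefix.split("/")`, while the magic words can be carried over unchanged or adjusted as needed.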
Such a manipulation might be as simple as removing segments of the `/`-delimited directory prefix or glob.
For operation on Windows we might also want to convert platform paths to POSIX paths for simplicity. But I have not yet tested how Git behaves there.
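The conversion of platform paths to POSIX paths could be sketched with the standard library as below. Whether Git on Windows needs or tolerates this is untested, as noted above, and the helper name `to_posix_path` is hypothetical. The conversion is only safe for literal paths: in a pattern, a backslash may be an escape character and must not be turned into a separator.

```python
from pathlib import PureWindowsPath


def to_posix_path(platform_path: str) -> str:
    """Convert a literal Windows-style relative path to POSIX notation.

    Hypothetical helper. `PureWindowsPath` accepts both `\\` and `/` as
    separators and works on any platform.
    """
    return PureWindowsPath(platform_path).as_posix()


print(to_posix_path("sub\\dir\\file.txt"))
# -> sub/dir/file.txt
```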