This package lets you interact with git-annex from within Python.
Necessary commands are executed using subprocess
and use their
batch versions whenever possible.
I'm developing this as needed, so feel free to ask if there's any functionality you want me to implement.
You might also like to check out datalad's annexrepo support package which is probably more featureful than this.
- Python 3
- git-annex 6.20170101 (or later)
- pygit2 0.24 (or later)
To create a git-annex repository from scratch:
>>> from pygit2 import init_repository >>> from git_annex_adapter import init_annex >>> init_repository('/path/to/repo') pygit2.Repository('/path/to/repo/.git/') >>> init_annex('/path/to/repo') git_annex_adapter.repo.GitAnnexRepo(/path/to/repo/.git/)
To wrap an existing git-annex repository:
>>> from git_annex_adapter.repo import GitAnnexRepo >>> repo = GitAnnexRepo('/tmp/repo')
The GitAnnexRepo is a subclass of pygit2.Repository. Git-annex specific
functionality is accessed via the annex
property of it, which is
a mapping object from git-annex keys to AnnexedFile
objects:
>>> for key in repo.annex: ... print(key) SHA256E-s3--2c26... SHA256E-s3--baa5... SHA256E-s3--fcde... >>> key = 'SHA256E-s3--2c26...' >>> repo.annex[key] git_annex_adapter.repo.AnnexedFile('SHA256E-s3--2c26...')
You can also get a tree representation of any git tree-ish object with
annexed file entries replaced with AnnexedFile
objects:
>>> tree = repo.annex.get_file_tree() # treeish='HEAD' >>> tree git_annex_adapter.repo.AnnexedFileTree(4d7f...) >>> set(tree) {'foo', 'bar', 'baz', 'README', 'directory'} >>> tree['foo'] git_annex_adapter.repo.AnnexedFile(SHA256E-s3--2c26...) >>> tree['directory'] git_annex_adapter.repo.AnnexedFileTree(8b54...) >>> tree['directory/file'] # or tree['directory']['file'] <pygit2.Blob object at 0x...>
The AnnexedFile
objects can be used to access and manipulate
information about a file.
The metadata
property of the AnnexedFile
is a mutable mapping
object from fields to sets of values:
>>> foo = tree['foo'] >>> for field, values in foo.metadata: ... print('{}: {}'.format(field, values)) author: {'me'} numbers: {'1', '2', '3'} >>> foo.metadata['numbers'] |= {'0'} >>> foo.metadata['numbers'] -= {'3'} >>> foo.metadata['numbers'] {'0', '2'} >>> del foo.metadata['author'] >>> 'author' in foo.metadata False >>> foo.metadata['lastchanged'] '2017-07-19@15-00-00'
If you need low-level access to the git-annex processes, you can do it
via the classes included in process
module:
>>> from git_annex_adapter.process import ...
Subclasses of GitAnnexBatchProcess
return relevant output (usually
one line or a dict object) whenever called with a line of input.
For example, git-annex metadata --batch --json
:
>>> proc = GitAnnexMetadataBatchJsonProcess('/path/to/repo') >>> proc(file='foo') {..., 'key':'SHA256E-s3--2c26...', 'fields': ...} >>> proc(file='foo', fields={'numbers': ['1', '2', '3']}) {..., 'key': ..., 'fields': {'numbers': ['1', '2', '3'], ...}}
Subclasses of GitAnnexRunner
call a single program with
different arguments. They return a subprocess.CompletedProcess
when called, which captures stdout and stderr. For example, to run
git-annex version
:
>>> runner = GitAnnexVersionRunner('/path/to/repo') >>> runner(raw=True) CompletedProcess(..., stdout='6.20170101', stderr='') >>> print(runner().stdout) git-annex version: 6.20170101 ...