Merge pull request #211 from UtrechtUniversity/develop
v0.2.3
chStaiger authored Jun 26, 2024
2 parents 54469e4 + a47e237 commit 1970de3
Showing 11 changed files with 564 additions and 25 deletions.
36 changes: 33 additions & 3 deletions docs/source/cli.rst
@@ -12,10 +12,40 @@ in your shell script without having to create a new python script.
There are no CLI commands to add/change metadata, instead use the iBridges API for this.


Setting up
----------
.. _cli-setup:

As with the ibridges API, you will need to create an `irods_environment.json`. It is the easiest if you put this file
Setup
-----

As with the ibridges API, you will need to create an `irods_environment.json`. We have created a plugin system that can
create the environment file for you. The currently known plugins are listed below; see the links for installation instructions:

.. list-table:: Server configuration plugins
    :widths: 50 50
    :header-rows: 1

    * - Organization
      - Link
    * - Utrecht University
      - https://github.com/UtrechtUniversity/ibridges-servers-uu

After installation, you will be able to create an `irods_environment.json` by answering a few questions, such as
your email address. First find the server name with:

.. code:: shell

    ibridges setup --list

Then finish the setup using the server name you just found:

.. code:: shell

    ibridges setup server_name

If your organization does not provide a plugin, then you will have to create the `irods_environment.json` yourself (with
the help of your iRODS administrator).

It is easiest to put this file
in the default location: `~/.irods/irods_environment.json`, because then it will be detected automatically. However,
if you keep it in another location (for example, because you have multiple environments), you can tell the
ibridges CLI where it is:
12 changes: 12 additions & 0 deletions docs/source/faq.rst
@@ -21,3 +21,15 @@ An advantage compared to the iCommands is that iBridges also works on Mac OS and

Our development is done on `GitHub <https://github.com/UtrechtUniversity/iBridges>`__. We welcome contributions
via pull requests. You can also request new features or share ideas in our issue tracker.


**I have installed iBridges and now get "ibridges: command not found"**
-----------------------------------------------------------------------

This can happen for a variety of reasons, but the most common one is that your PATH is not set up correctly on your system.
Often `pip` will warn about this when you install `ibridges`. To solve this, first find out where pip installs the
ibridges executable. Usually this will be something like `/home/your_username/.local/bin`, but this depends on your system. Then
add this directory to the PATH on the command line: `export PATH="${PATH}:/home/your_username/.local/bin"` (change the path according to your system). This should allow
your shell to find the `ibridges` command. You would have to type this command every time you start a new shell, which can be inconvenient.
To fix this permanently, add the command to the end of the `.bashrc` or `.zshrc` file in your home directory
(depending on your shell; type `echo ${SHELL}` to find out which one you use).
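
The check described in this FAQ entry can also be done from Python. The following is a minimal sketch, not part of iBridges: the helper `scripts_dir_on_path` is a hypothetical name, and `sysconfig.get_path("scripts")` reports the scripts directory of the interpreter's default install scheme, which may differ from the directory used by a `pip install --user`.

```python
import os
import shutil
import sysconfig


def scripts_dir_on_path(scripts_dir: str, path_value: str) -> bool:
    """Return True if scripts_dir is one of the entries of a PATH-style string."""
    entries = [os.path.normpath(p) for p in path_value.split(os.pathsep) if p]
    return os.path.normpath(scripts_dir) in entries


# Directory where this interpreter's default install scheme puts console scripts.
# Assumption: a user install may use another directory (often ~/.local/bin on Linux).
scripts_dir = sysconfig.get_path("scripts")
print("Scripts directory:", scripts_dir)
print("On PATH:", scripts_dir_on_path(scripts_dir, os.environ.get("PATH", "")))
# shutil.which returns None when the shell would not find the command either.
print("ibridges found at:", shutil.which("ibridges"))
```

If the scripts directory is reported as not on the PATH, that is the directory to add in the `export PATH=...` line.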
5 changes: 3 additions & 2 deletions docs/source/quickstart.rst
@@ -15,8 +15,9 @@ iBridges requires Python version 3.8 or higher. You can install iBridges with pi
Getting your iRODS environment file
-----------------------------------

To connect to an iRods server you need an iRods environment file (`irods_nevironment.json`).
You can obtain this by asking your local iRods administrator. An example of an environment file:
To connect to an iRODS server you need an iRODS environment file (`irods_environment.json`).
If your organization provides automatic setup, you can create this file yourself using the :ref:`CLI <cli-setup>`.
Otherwise, you can obtain this by asking your local iRODS administrator. An example of an environment file:

.. code:: json
83 changes: 80 additions & 3 deletions ibridges/__main__.py
@@ -8,10 +8,15 @@
from typing import Union

from ibridges.data_operations import download, sync, upload
from ibridges.interactive import interactive_auth
from ibridges.interactive import DEFAULT_IENV_PATH, interactive_auth
from ibridges.path import IrodsPath
from ibridges.session import Session
from ibridges.util import get_collection
from ibridges.util import (
find_environment_provider,
get_collection,
get_environment_providers,
print_environment_providers,
)

try: # Python < 3.10 (backport)
from importlib_metadata import version # type: ignore
@@ -40,6 +45,8 @@
List a collection and subcollections in a hierarchical way.
mkcoll:
Create the collection and all its parent collections.
setup:
Create an iRODS environment file to connect to an iRODS server.
The iBridges CLI does not implement the complete iBridges API. For example, there
are no commands to modify metadata on the irods server.
@@ -53,6 +60,7 @@
ibridges list irods:~/collection
ibridges mkcoll irods://~/bli/bla/blubb
ibridges tree irods:~/collection
ibridges setup uu-its
Program information:
@@ -88,6 +96,8 @@ def main() -> None:
ibridges_mkcoll()
elif subcommand == "tree":
ibridges_tree()
elif subcommand == "setup":
ibridges_setup()
else:
print(f"Invalid subcommand ({subcommand}). For help see ibridges --help")
sys.exit(1)
@@ -147,11 +157,78 @@ def _list_coll(session: Session, remote_path: IrodsPath):
print(str(remote_path)+':')
coll = get_collection(session, remote_path)
print('\n'.join([' '+sub.path for sub in coll.data_objects]))
print('\n'.join([' C- '+sub.path for sub in coll.subcollections]))
print('\n'.join([' C- '+sub.path for sub in coll.subcollections
if not str(remote_path) == sub.path]))
else:
raise ValueError(f"Irods path '{remote_path}' is not a collection.")


def ibridges_setup():
"""Use templates to create an iRODS environment json."""
parser = argparse.ArgumentParser(
prog="ibridges setup",
description="Tool to create a valid irods_environment.json"
)
parser.add_argument(
"server_name",
help="Server name to create your irods_environment.json for.",
type=str,
default=None,
nargs="?"
)
parser.add_argument(
"--list",
help="List all available server names.",
action="store_true"
)
parser.add_argument(
"-o", "--output",
help="Store the environment to a file.",
type=Path,
default=DEFAULT_IENV_PATH,
required=False,
)
parser.add_argument(
"--overwrite",
help="Overwrite the irods environment file.",
action="store_true"
)
args = parser.parse_args()
env_providers = get_environment_providers()
if args.list:
if len(env_providers) == 0:
print("No server information was found. To use this function, please install a plugin"
" such as:\n\nhttps://github.com/UtrechtUniversity/ibridges-servers-uu"
"\n\nAlternatively create an irods_environment.json by yourself or with the help "
"of your iRODS administrator.")
print_environment_providers(env_providers)
return

try:
provider = find_environment_provider(env_providers, args.server_name)
except ValueError:
print(f"Error: Unknown server with name {args.server_name}.\n"
"Use `ibridges setup --list` to list all available server names.")
sys.exit(123)

user_answers = {}
for question in provider.questions:
user_answers[question] = input(question + ": ")

json_str = provider.environment_json(args.server_name, **user_answers)
if args.output.is_file() and not args.overwrite:
print(f"File {args.output} already exists, use --overwrite or copy the below manually.")
print("\n")
print(json_str)
elif args.output.is_dir():
print(f"Output {args.output} is a directory, cannot export irods_environment"
" file.")
sys.exit(234)
else:
with open(args.output, "w", encoding="utf-8") as handle:
handle.write(json_str)
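
To make the plugin interface that `ibridges_setup` relies on concrete, here is a minimal sketch of a provider. The attributes and methods (`name`, `descriptions`, `questions`, `contains`, `environment_json`) are inferred from how the code in this diff uses them; the class `ExampleProvider`, the server name, the host and the zone are hypothetical and do not come from any real plugin.

```python
import json


class ExampleProvider:
    """Hypothetical provider exposing the attributes that `ibridges setup` uses."""

    name = "Example Organization"
    # Maps each server name to a short description (shown by `--list`).
    descriptions = {"example-server": "Example iRODS server"}
    # Each question is passed back as a keyword argument, so it must be a valid identifier.
    questions = ["email"]

    def contains(self, server_name: str) -> bool:
        """Return True if this provider has a template for server_name."""
        return server_name in self.descriptions

    def environment_json(self, server_name: str, **answers) -> str:
        """Fill in the (made-up) template and return it as a JSON string."""
        env = {
            "irods_host": f"{server_name}.example.org",
            "irods_port": 1247,
            "irods_user_name": answers["email"],
            "irods_zone_name": "exampleZone",
        }
        return json.dumps(env, indent=4)


provider = ExampleProvider()
# In the CLI the answers come from input(); fixed values are used here instead.
answers = {question: "user@example.org" for question in provider.questions}
print(provider.environment_json("example-server", **answers))
```

Because the answers are forwarded as `**user_answers`, every entry in `questions` doubles as both the prompt text and the keyword name.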


def ibridges_list():
"""List a collection on iRODS."""
parser = argparse.ArgumentParser(
17 changes: 10 additions & 7 deletions ibridges/data_operations.py
@@ -15,7 +15,6 @@
import irods.data_object
import irods.exception
import irods.keywords as kw
from irods import DEFAULT_CONNECTION_TIMEOUT
from tqdm import tqdm

from ibridges.path import CachedIrodsPath, IrodsPath
@@ -378,14 +377,18 @@ def perform_operations(session: Session, operations: dict, ignore_err: bool=Fals
pbar = tqdm(total=sum(up_sizes) + sum(down_sizes), unit="B", unit_scale=True, unit_divisor=1024,
disable=disable)

# The code below does not work as expected, since connections in the
# pool can be reused. Another solution for dynamic timeouts might be needed.
# The previous solution is left in here for documentation.

# For large files, the checksum computation might take too long, which can result in a timeout.
# This is why we increased the timeout for file sizes > 1 GB.
# This might still result in a timeout if your server is very busy or a potato.
max_size = max([*up_sizes, *down_sizes, 0])
original_timeout = session.irods_session.pool.connection_timeout
if max_size > 1e9 and original_timeout == DEFAULT_CONNECTION_TIMEOUT:
session.irods_session.pool.connection_timeout = int(
DEFAULT_CONNECTION_TIMEOUT*(max_size/1e9)+0.5)
# max_size = max([*up_sizes, *down_sizes, 0])
# original_timeout = session.irods_session.pool.connection_timeout
# if max_size > 1e9 and original_timeout == DEFAULT_CONNECTION_TIMEOUT:
# session.irods_session.pool.connection_timeout = int(
# DEFAULT_CONNECTION_TIMEOUT*(max_size/1e9)+0.5)

for col in operations["create_collection"]:
IrodsPath.create_collection(session, col)
@@ -406,7 +409,7 @@ def perform_operations(session: Session, operations: dict, ignore_err: bool=Fals
_obj_get(session, ipath, lpath, overwrite=True, ignore_err=ignore_err, options=options,
resc_name=resc_name)
pbar.update(size)
session.irods_session.pool.connection_timeout = original_timeout
# session.irods_session.pool.connection_timeout = original_timeout
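
# For reference, the arithmetic of the disabled scaling can be sketched in isolation.
# This is an illustration only: `scaled_timeout` is a hypothetical helper, and the
# default of 120 is an assumed placeholder for DEFAULT_CONNECTION_TIMEOUT.

```python
def scaled_timeout(max_size_bytes: float, default_timeout: int = 120) -> int:
    """Scale the timeout linearly with the largest file size above 1 GB.

    Adding 0.5 before int() rounds to the nearest whole number.
    """
    if max_size_bytes > 1e9:
        return int(default_timeout * (max_size_bytes / 1e9) + 0.5)
    return default_timeout


print(scaled_timeout(5e8))    # below the 1 GB threshold: 120
print(scaled_timeout(2.5e9))  # 120 * 2.5 = 300
```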


def sync(session: Session,
5 changes: 1 addition & 4 deletions ibridges/resources.py
@@ -44,10 +44,7 @@ def get_resource(self, resc_name: str) -> irods.resource.iRODSResource:
If the resource does not exist.
"""
try:
return self.session.irods_session.resources.get(resc_name)
except irods.exception.ResourceDoesNotExist as error:
return {'successful': False, 'reason': repr(error)}
return self.session.irods_session.resources.get(resc_name)

def get_free_space(self, resc_name: str) -> int:
"""Determine free space in a resource hierarchy.
12 changes: 6 additions & 6 deletions ibridges/search.py
@@ -60,17 +60,17 @@ def search_data(session: Session, path: Optional[Union[str, IrodsPath]] = None,
# one data search in case path is a collection path and we want to retrieve all data there
# one in case the path is or ends with a file name
if path:
path = IrodsPath(session, path)
parent = path.parent
name = path.name
path = str(path)
parent = path.rsplit("/", maxsplit=1)[0]
name = path.rsplit("/", maxsplit=1)[1]
# all collections starting with path
coll_query = coll_query.filter(icat.LIKE(icat.COLL_NAME, str(path)))
coll_query = coll_query.filter(icat.LIKE(icat.COLL_NAME, path))

# all data objects in path
data_query = data_query.filter(icat.LIKE(icat.COLL_NAME, str(path)))
data_query = data_query.filter(icat.LIKE(icat.COLL_NAME, path))
# all data objects on path.parent with name
data_name_query = data_name_query.filter(icat.LIKE(icat.DATA_NAME, name)).filter(
icat.LIKE(icat.COLL_NAME, str(parent)))
icat.LIKE(icat.COLL_NAME, parent))
if key_vals:
for key in key_vals:
data_query.filter(icat.LIKE(icat.META_DATA_ATTR_NAME, key))
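
The string-based split introduced in this hunk can be tried out on its own. The sketch below uses a hypothetical helper `split_parent_name` and made-up paths; it also shows that a path without any `/` yields a single element, in which case indexing `[1]` as the hunk does would raise an `IndexError`.

```python
def split_parent_name(path: str):
    """Split a path at its last '/' into (parent, name), as the hunk above does."""
    parts = path.rsplit("/", maxsplit=1)
    if len(parts) < 2:
        # No '/' in the path: there is no parent part to return.
        raise ValueError(f"Path '{path}' contains no '/' to split on.")
    return parts[0], parts[1]


print(split_parent_name("/zone/home/user/data.txt"))  # ('/zone/home/user', 'data.txt')
print(split_parent_name("/zone/home/user/%"))         # ('/zone/home/user', '%')
print("data.txt".rsplit("/", maxsplit=1))             # ['data.txt']
```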
3 changes: 3 additions & 0 deletions ibridges/session.py
@@ -72,6 +72,9 @@ def __init__(self,
raise TypeError(f"Error reading environment file '{irods_env_path}': "
f"expected dictionary, got {type(irods_env)}.")

if "connection_timeout" not in irods_env:
irods_env["connection_timeout"] = 25000

self._password = password
self._irods_env: dict = irods_env
self._irods_env_path = irods_env_path
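
The default added in this hunk only applies when the environment file does not set `connection_timeout` itself. A minimal sketch of the same behaviour, assuming a hypothetical helper `with_default_timeout` that works on a copy rather than modifying the dictionary in place as the hunk does:

```python
def with_default_timeout(irods_env: dict, default: int = 25000) -> dict:
    """Return a copy of irods_env with connection_timeout filled in if it is missing."""
    env = dict(irods_env)
    # setdefault leaves a user-provided value untouched.
    env.setdefault("connection_timeout", default)
    return env


print(with_default_timeout({"irods_host": "example.org"}))
print(with_default_timeout({"irods_host": "example.org", "connection_timeout": 60}))
```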
61 changes: 61 additions & 0 deletions ibridges/util.py
@@ -2,12 +2,18 @@

from __future__ import annotations

from collections.abc import Sequence
from typing import Union

import irods

from ibridges.path import IrodsPath

try:
from importlib_metadata import entry_points
except ImportError:
from importlib.metadata import entry_points # type: ignore


def get_dataobject(session,
path: Union[str, IrodsPath]) -> irods.data_object.iRODSDataObject:
@@ -76,3 +82,58 @@ def obj_replicas(obj: irods.data_object.iRODSDataObject) -> list[tuple[int, str,
r.size, repl_states.get(r.status, r.status)) for r in obj.replicas]

return replicas

def get_environment_providers() -> list:
"""Get a list of all environment template providers.
Returns
-------
The list that contains the providers.
"""
return [entry.load() for entry in entry_points(group="ibridges_server_template")]


def print_environment_providers(env_providers: Sequence):
"""Print the environment providers to the screen.
Parameters
----------
env_providers
A list of all installed environment providers.
"""
for provider in env_providers:
print(provider.name)
print("-"*len(provider.name))
print("\n")
max_len = max(len(x) for x in provider.descriptions)
for server_name, description in provider.descriptions.items():
print(f"{server_name: <{max_len+1}} - {description}")


def find_environment_provider(env_providers: list, server_name: str) -> object:
"""Find the provider that provides the right template.
Parameters
----------
env_providers
A list of all installed environment providers.
server_name
Name of the server for which the template is to be found.
Returns
-------
The provider that contains the template.
Raises
------
ValueError
If the server_name identifier can't be found in the providers.
"""
for provider in env_providers:
if provider.contains(server_name):
return provider
raise ValueError(f"Cannot find provider with name {server_name}; ensure that the plugin is "
"installed.")
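
Provider discovery relies on Python entry points in the `ibridges_server_template` group. The sketch below shows that lookup using only the standard library; `load_providers` is a hypothetical name, and the fallback branch assumes the Python 3.8/3.9 behaviour where `entry_points()` accepts no arguments and returns a dictionary keyed by group.

```python
from importlib.metadata import entry_points


def load_providers(group: str = "ibridges_server_template") -> list:
    """Load every object registered under the given entry point group."""
    try:
        selected = entry_points(group=group)  # selection API, Python 3.10+
    except TypeError:
        # Older standard library: entry_points() takes no keyword arguments.
        selected = entry_points().get(group, [])
    return [entry.load() for entry in selected]


providers = load_providers()
print(f"Found {len(providers)} provider(s)")
```

Without an installed plugin the list is simply empty, which is the case that `ibridges setup --list` reports with its "No server information was found" message.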
2 changes: 2 additions & 0 deletions pyproject.toml
@@ -18,6 +18,7 @@ classifiers = [
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Development Status :: 3 - Alpha",
"License :: OSI Approved :: MIT License",
]
@@ -60,6 +61,7 @@ write_to = "ibridges/_version.py"
[[tool.mypy.overrides]]
module = [
"irods.*",
"importlib_metadata.*",
]
ignore_missing_imports = true
