Add Galaxy JWD Python script #15618

Closed
sanjaysrikakulam wants to merge 4 commits

Conversation

sanjaysrikakulam
Contributor

WHAT:
Adds a new script with the following functionality:

  1. Returns the path of a job working directory (JWD) for a given Galaxy job id: useful when one wants to find and go to the JWD for debugging
  2. Deletes the JWDs of jobs that failed within the last X days: for cleaning up JWDs

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. For the script to run, the following environment variables (same as gxadmin) are needed:
      1.1 GALAXY_CONFIG_FILE: Path to the galaxy.yml file
      1.2 GALAXY_LOG_DIR: Path to the Galaxy log directory
      1.3 PGDATABASE: Name of the Galaxy database
      1.4 PGUSER: Galaxy database user
      1.5 PGHOST: Galaxy database host
    2. A ~/.pgpass file (same as gxadmin) with the following format:
      2.1 <pg_host>:5432:*:<pg_user>:<pg_password>
    3. Script Python dependencies (both are part of Galaxy, so nothing to do):
      3.1 psycopg2
      3.2 xml.dom.minidom
    4. To run:
      4.1 python galaxy_jwd.py get_jwd 12345678
      4.2 python galaxy_jwd.py clean_jwds --dry_run True --days 30 or python galaxy_jwd.py clean_jwds --dry_run False --days 30
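
Before running the script, it may help to confirm that the environment variables listed above are actually set. The following is only a minimal sketch and not part of the PR: the function name `missing_env_vars` and the constant `REQUIRED_ENV_VARS` are made up for illustration, and the check treats an empty value as missing, which is an assumption.

```python
import os
from typing import List, Mapping

# Variable names taken from the manual testing instructions above.
REQUIRED_ENV_VARS = (
    "GALAXY_CONFIG_FILE",
    "GALAXY_LOG_DIR",
    "PGDATABASE",
    "PGUSER",
    "PGHOST",
)


def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty in `env`.

    Hypothetical helper for illustration; galaxy_jwd.py may validate its
    environment differently.
    """
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing a plain dict instead of `os.environ` makes the check easy to try without touching the real environment.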

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

Functionality:
1. Can get you the path of a JWD: used when one wants to find and go to the JWD for debugging
2. Can delete the JWDs of jobs that failed within the last X days: for cleaning up JWDs
Member

@mvdbeek mvdbeek left a comment

This is a nice start, but I would recommend using Galaxy's config file parsing, Galaxy's database session, and Galaxy's object store instance, and creating this as part of the galaxy-data package. Take a look at

def build_app():
for a reasonably similar setup. If you do this, it would also be straightforward to turn this into a Celery cron task, and that would address all my comments.

backends = parse_object_store(object_store_conf)

# Connect to Galaxy database
db = Database(
Member

Can you use SQLAlchemy, please?

str: Path to the object_store_conf.xml file
"""
object_store_conf = ""
with open(galaxy_config_file, "r") as config:
Member

The config file is a YAML file; please parse it as such.

object_store_conf = ""
with open(galaxy_config_file, "r") as config:
    for line in config:
        if line.strip().startswith("object_store_config_file"):
Member

The object store config file doesn't have to be an XML file; there are different ways to parse it.

if metadata[0] not in backends_dict.keys():
    raise ValueError(f"Object store id '{metadata[0]}' does not exist in the object_store_conf.xml file")

jwd_path = f"{backends_dict[metadata[0]]}/0{job_id[0:2]}/{job_id[2:5]}/{job_id}"
Member

That's very brittle, I think you just want to load up the object store itself.
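
To illustrate the brittleness: the f-string in the diff prepends a single "0" and slices fixed positions, so it implicitly assumes an 8-digit job id. The sketch below compares it with one plausible generalisation (zero-pad the id to a multiple of three digits, split it into 3-character directories, and drop the last chunk). Both function names are hypothetical, and the padded scheme is an assumption that has not been checked against Galaxy's object store code, which is the reason to load the object store itself rather than rebuild paths by hand.

```python
from typing import List


def jwd_path_from_pr(base: str, job_id: str) -> str:
    """Build the path exactly as the f-string in the PR does."""
    return f"{base}/0{job_id[0:2]}/{job_id[2:5]}/{job_id}"


def jwd_path_padded(base: str, job_id: str) -> str:
    """Hypothetical generalisation, NOT confirmed to match Galaxy's layout.

    Zero-pad the id to a multiple of 3 digits, split it into 3-character
    chunks, and use every chunk except the last as a directory level.
    """
    width = -(-len(job_id) // 3) * 3  # round the length up to a multiple of 3
    padded = job_id.zfill(width)
    chunks: List[str] = [padded[i:i + 3] for i in range(0, len(padded) - 3, 3)]
    return "/".join([base, *chunks, job_id])


if __name__ == "__main__":
    for job_id in ("12345678", "1234567"):
        print(job_id, jwd_path_from_pr("/jwd", job_id), jwd_path_padded("/jwd", job_id))
```

The two functions agree for an 8-digit id such as 12345678 but diverge for a 7-digit id, where the fixed slicing puts the job under `012/345` while the padded scheme puts it under `001/234`.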

sanjaysrikakulam added a commit to sanjaysrikakulam/galaxyproject_galaxy that referenced this pull request Mar 16, 2023
@sanjaysrikakulam
Contributor Author

Closing this, as the revised version, now implemented as part of a Celery task, is here

sanjaysrikakulam added a commit to sanjaysrikakulam/galaxyproject_galaxy that referenced this pull request Nov 18, 2024