Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add pantax #50983

Merged
merged 21 commits into from
Oct 10, 2024
24 changes: 24 additions & 0 deletions recipes/pantax/build.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
#!/usr/bin/env bash

if [ "$(uname)" == "Darwin" ]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The meta.yaml says to skip OSX

cp -rf $SRC_DIR $PREFIX
else
martin-g marked this conversation as resolved.
Show resolved Hide resolved
zwh82 marked this conversation as resolved.
Show resolved Hide resolved
cd ${SRC_DIR}/gurobi11
${PYTHON} -m pip install gurobipy-11.0.2-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is not allowed to install dependencies with Pip like this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does it install it, by the way ? It looks like it is no inside $PREFIX

Copy link
Contributor Author

@zwh82 zwh82 Oct 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? Are there any rules that restrict this installation behavior? I have already installed the artifact locally and tested it sucessfully, and PanTax is works well. Gurobipy, as a Python package, should be installed in the /path/to/env/lib/python3.10/site-packages/gurobipy. ${PREFIX} should be /path/to/env/bin,so it will certainly not be installed there.

Copy link
Contributor Author

@zwh82 zwh82 Oct 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the question of vg version. I use vg version 1.52 instead of newer version to ensure the robustness and reproducibility of our tool. All my experiments are run on this version. In addition, I have a dependency PGGB. PGGB depends on vg version 1.40. Therefore, I cannot directly add vg to the dependency, otherwise the vg version 1.40 will be overwritten by the new version of vg.

rm gurobipy-11.0.2-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl
martin-g marked this conversation as resolved.
Show resolved Hide resolved

mkdir -p ${PREFIX}/bin/tools
cd ${SRC_DIR}/vg
cp vg* ${PREFIX}/bin/tools/vg
chmod +x ${PREFIX}/bin/tools/vg
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Improve vg installation process with error handling and specific file copying

The current vg installation process lacks error handling and uses wildcards for copying, which might include unnecessary files. Consider the following improvements:

mkdir -p "${PREFIX}/bin/tools" || exit 1
cd "${SRC_DIR}/vg" || exit 1
cp vg vg_* "${PREFIX}/bin/tools/" || exit 1
chmod +x "${PREFIX}/bin/tools/vg"* || exit 1

This ensures that the script fails if any critical operation fails and only copies the necessary files. Adjust the vg vg_* pattern as needed for your specific use case.

🧰 Tools
🪛 Shellcheck

[warning] 7-7: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)


cd ${SRC_DIR}/scripts
chmod +x pantax
chmod +x data_preprocessing
cp ${SRC_DIR}/scripts/pantax ${SRC_DIR}/scripts/data_preprocessing ${PREFIX}/bin
cp ${SRC_DIR}/scripts/*py ${PREFIX}/bin
zwh82 marked this conversation as resolved.
Show resolved Hide resolved

cd ${SRC_DIR}/tools/fastix
cargo install fastix --root ./
zwh82 marked this conversation as resolved.
Show resolved Hide resolved
cp ${SRC_DIR}/tools/fastix/bin/fastix ${PREFIX}/bin/tools
martin-g marked this conversation as resolved.
Show resolved Hide resolved
zwh82 marked this conversation as resolved.
Show resolved Hide resolved
fi
57 changes: 57 additions & 0 deletions recipes/pantax/meta.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
{% set name = "pantax" %}
{% set version = "1.0.1" %}

package:
name: {{ name }}
version: {{ version }}

source:
- url: https://github.com/LuoGroup2023/PanTax/releases/download/v{{ version }}/{{ name }}.tar.gz
sha256: c14a46b41f580cf39ad502049d267cd31bbfd4dc05028215cc669c0e52f58ce7

- url: https://files.pythonhosted.org/packages/07/0f/6039cf6e22f9cbec57cdedff103949e4856c18fd2cb714bc70cdc8b52941/gurobipy-11.0.2-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please require this as a dependency and do not pip install them

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gurobipy can be install with conda install gurobi::gurobi and pip. But I don't know how to set channels gurobi to install it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use only dependencies from conda-forge and bioconda channels. You cannot install from third-party channels and/or Pip

sha256: 164e72462c2f1a705fcec73acc060aa639e745827bfe877714ff7b2ccabc5237
folder: gurobi11

- url: https://github.com/vgteam/vg/releases/download/v1.52.0/vg
sha256: 07776b51aa6494b45982e4134eb84005ddc743e81c67e61fa52bff211d81777c
folder: vg
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you instead require VG in the requirements section?

Copy link
Contributor Author

@zwh82 zwh82 Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is impossible. The requirement PGGB works with vg 1.40, but I need updated versions (vg 1.52) in other parts of my codes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://anaconda.org/bioconda/vg provides a newer version


build:
skip: True # [osx]
number: 0
run_exports:
- {{ pin_subpackage(name, max_pin="x.x") }}
zwh82 marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Review pin_subpackage directive and remove trailing space.

The build section is generally well-defined. However, consider the following points:

  1. The pin_subpackage directive uses "x.x", which might be too strict for a 1.x version. For version 1.x, using "x" is typically sufficient unless you expect API breaks in minor versions.
  2. There's a trailing space on line 16 that should be removed.

Consider updating the run_exports section as follows:

  run_exports:
    - {{ pin_subpackage(name, max_pin="x") }}

This change allows for minor version updates while still maintaining compatibility.

🧰 Tools
🪛 yamllint

[error] 16-16: trailing spaces

(trailing-spaces)


requirements:
build:
- {{ compiler('rust') }}
host:
- python ==3.10
- pip
run:
- python =3.10
martin-g marked this conversation as resolved.
Show resolved Hide resolved
- samtools =1.19.2
- bcftools =1.19
- htslib =1.19.1
- pggb =0.5.4
- graphaligner =1.0.17
- h5py =3.10.0
- pandas =2.2.0
- tqdm =4.66.1
- numpy =1.26.3
- networkx =3.2.1
- pyarrow =14.0.2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Review dependency version constraints.

The run requirements section is comprehensive, but there are some points to consider:

  1. The Python version is strictly pinned to 3.10, which limits flexibility. Consider allowing a range of compatible Python versions.

  2. Many dependencies have very specific version requirements. While this ensures reproducibility, it may make updates challenging. Consider using version ranges (e.g., >=1.19.2,<1.20) for dependencies that are likely to remain compatible across minor versions.

  3. For bioinformatics tools like samtools, bcftools, and htslib, consider if such strict version pinning is necessary. These tools often maintain backwards compatibility, so allowing minor version updates might be beneficial.

  4. Verify if all listed dependencies are directly required by pantax. Remove any that are indirect dependencies to keep the requirements list focused.

Consider relaxing version constraints where appropriate to improve maintainability while preserving compatibility.

🧰 Tools
🪛 yamllint

[error] 37-37: trailing spaces

(trailing-spaces)


[error] 38-38: trailing spaces

(trailing-spaces)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Review dependency version constraints and remove trailing spaces.

The requirements section is comprehensive, but there are some points to consider:

  1. Consider relaxing the Python version constraint to allow for minor version updates (e.g., >=3.10,<3.11).
  2. Review the strict version pinning for other dependencies. While this ensures reproducibility, it may make updates challenging. Consider using version ranges (e.g., >=1.19.2,<1.20) for dependencies that are likely to remain compatible across minor versions.
  3. Remove trailing spaces from lines 33, 34, 37, and 38.

Here's an example of how you might adjust the Python version constraint:

  host:
    - python >=3.10,<3.11
    - pip
  run:
    - python >=3.10,<3.11
    # Other dependencies...

Consider applying similar adjustments to other dependencies where appropriate.

🧰 Tools
🪛 yamllint

[error] 33-33: trailing spaces

(trailing-spaces)


[error] 34-34: trailing spaces

(trailing-spaces)


test:
commands:
- pantax -h
zwh82 marked this conversation as resolved.
Show resolved Hide resolved

about:
home: https://github.com/LuoGroup2023/PanTax
license: GPL-3.0-or-later
zwh82 marked this conversation as resolved.
Show resolved Hide resolved
summary: Strain-level taxonomic classification of metagenomic data using pangenome graphs

extra:
skip-lints:
- should_be_noarch_python