Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JOSS Review: nested config #91

Closed
beardymcjohnface opened this issue Nov 7, 2024 · 2 comments
Closed

JOSS Review: nested config #91

beardymcjohnface opened this issue Nov 7, 2024 · 2 comments

Comments

@beardymcjohnface
Copy link

openjournals/joss-reviews#7410

I really like the way snk attempts to handle nested config. However, I think there are still cases where users would run into problems. Example, the mapache pipeline (github.com/sneuensc/mapache) config lets the user list multiple genomes:

# (e.g., ind1.hg19.bam)
genome: 
    hg19: /work/FAC/FBM/DBC/amalaspi/popgen/shared_resources/reference_human/hs.build37.1/hs.build37.1.fa
    #test: /users/sneuensc/work_sneuensc/mapping_pipeline/mapache/test_genome.fa
    #GRCh38: /work/FAC/FBM/DBC/amalaspi/popgen/shared_resources/reference_human/GCA_000001405.15_GRCh38.p13_onlyChrom/GCA_000001405.15_GRCh38.fa

However, this translates into --genome-hg19 and expects the text filepath for hg19, and it doesn't look like there is a way to use the CLI to add more genomes. I can't think of a fix that snk could implement to address this (you might be able to but it's not worth the time and effort involved), but it might be helpful to highlight that snk might not work for some config files.

@Wytamma
Copy link
Owner

Wytamma commented Nov 16, 2024

Hi @beardymcjohnface, thanks for the review and great issue!

I think this will need to be solved with the snk.yaml file e.g. some special case like choices.

I think the best way to do it will be to add a Tuple type and let user specify the the key value pairs with --genome <TEXT TEXT>..... I'm not sure if that will work with multiple values but I'll test that and get back to you.

@Wytamma
Copy link
Owner

Wytamma commented Nov 17, 2024

Okay I've kinda solved this... I added a dict and pair type to the snk types so you can do this

# snk.yaml
cli: 
    genome:
        type: "dict"

Which will result in

❯ mapache run -h
...
╭─ Workflow Configuration ──────────────────────────────────────────╮
│ --sample-file                      TEXT            [default:      │
│                                                    config/sample… │
│ --result-dir                       TEXT            [default:      │
│                                                    results]       │
│ --genome                           <TEXT TEXT>...  [default:      │
│                                                    hg19,          │
│                                                    /work/FAC/FBM… │
...

You can add genomes to the dict with mapache run --genome new path/to/genome. However, because Snakemake updates dicts in the config with values passed from the CLI the new genome will not replace the old but add to the dict. Which results in {"genome":{"hg19":"/work/FAC/FBM…", "new": "path/to/genome"}. I don't think there's a way around this without modify the workflow config, however, required changes are minimal.

You just need to delete genome from the configfile and add it to the default value to the snk.yaml file then it'll all work as expected.

# snk.yaml
cli: 
    genome:
        type: "dict"
        default: [["hg19", "/work/FAC/FBM…"]]

@Wytamma Wytamma closed this as completed Nov 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants