Use gxformat2 to convert .ga to .cwl? #33
Comments
The file generated with gxformat2 does not validate. Building the docker container from https://github.com/ResearchObject/ro-crate-py/tree/9c2c74506226f4508985e86df7b1fa72f657f8b2:
The code was missing the
Here is the generated CWL:
class: Workflow
cwlVersion: v1.2
inputs:
  'GenBank file ':
    id: 'GenBank file '
    type: File
  Paired Collection (fastqsanger):
    id: Paired Collection (fastqsanger)
    type: File[]
outputs:
  _anonymous_output_1:
    outputSource: 'GenBank file '
    type: File
  _anonymous_output_2:
    outputSource: Paired Collection (fastqsanger)
    type: File
  _anonymous_output_3:
    outputSource: 2/snpeff_output
    type: File
  _anonymous_output_4:
    outputSource: 2/output_fasta
    type: File
  _anonymous_output_5:
    outputSource: 3/output_paired_coll
    type: File
  _anonymous_output_6:
    outputSource: 3/report_html
    type: File
  _anonymous_output_7:
    outputSource: 4/bam_output
    type: File
  FASTP_report:
    outputSource: 5/html_report
    type: File
  _anonymous_output_8:
    outputSource: 6/output1
    type: File
  _anonymous_output_9:
    outputSource: '7'
    type: File
  _anonymous_output_10:
    outputSource: 8/metrics_file
    type: File
  _anonymous_output_11:
    outputSource: 8/outFile
    type: File
  mapping_report:
    outputSource: 9/html_report
    type: File
  _anonymous_output_12:
    outputSource: 10/realigned
    type: File
  DeDup_Report:
    outputSource: 11/html_report
    type: File
  _anonymous_output_13:
    outputSource: 12/variants
    type: File
  _anonymous_output_14:
    outputSource: 13/statsFile
    type: File
  _anonymous_output_15:
    outputSource: 13/snpeff_output
    type: File
  _anonymous_output_16:
    outputSource: '14'
    type: File
  SnpEff vcf.gz:
    outputSource: 15/output1
    type: File
  _anonymous_output_17:
    outputSource: '16'
    type: File
steps:
  '10':
    in:
      reads:
        source: 8/outFile
      reference_source|ref:
        source: 2/output_fasta
    out:
    - realigned
    run:
      class: Operation
      doc: ''
      inputs: {}
      outputs: {}
  '11':
    in:
      results_0|software_cond|output_0|input:
        source: 8/metrics_file
    out:
    - plots
    - stats
    - html_report
    run:
      class: Operation
      doc: ''
      inputs: {}
      outputs: {}
  '12':
    in:
      reads:
        source: 10/realigned
      reference_source|ref:
        source: 2/output_fasta
    out:
    - variants
    run:
      class: Operation
      doc: ''
      inputs: {}
      outputs: {}
  '13':
    in:
      input:
        source: 12/variants
      snpDb|snpeff_db:
        source: 2/snpeff_output
    out:
    - snpeff_output
    - statsFile
    run:
      class: Operation
      doc: ''
      inputs: {}
      outputs: {}
  '14':
    in:
      input:
        source: 13/snpeff_output
    out: []
    run:
      class: Operation
      doc: ''
      inputs: {}
      outputs: {}
  '15':
    in:
      input1:
        source: 13/snpeff_output
    out:
    - output1
    run:
      class: Operation
      doc: ''
      inputs: {}
      outputs: {}
  '16':
    in:
      input_list:
        source: '14'
    out: []
    run:
      class: Operation
      doc: ''
      inputs: {}
      outputs: {}
  '2':
    in:
      input_type|input_gbk:
        source: 'GenBank file '
    out:
    - output_fasta
    - snpeff_output
    run:
      class: Operation
      doc: ''
      inputs: {}
      outputs: {}
  '3':
    in:
      single_paired|paired_input:
        source: Paired Collection (fastqsanger)
    out:
    - report_json
    - report_html
    - output_paired_coll
    run:
      class: Operation
      doc: ''
      inputs: {}
      outputs: {}
  '4':
    in:
      fastq_input|fastq_input1:
        source: 3/output_paired_coll
      reference_source|ref_file:
        source: 2/output_fasta
    out:
    - bam_output
    run:
      class: Operation
      doc: ''
      inputs: {}
      outputs: {}
  '5':
    in:
      results_0|software_cond|input:
        source: 3/report_json
    out:
    - plots
    - stats
    - html_report
    run:
      class: Operation
      doc: ''
      inputs: {}
      outputs: {}
  '6':
    in:
      input1:
        source: 4/bam_output
    out:
    - output1
    run:
      class: Operation
      doc: ''
      inputs: {}
      outputs: {}
  '7':
    in:
      input:
        source: 6/output1
    out: []
    run:
      class: Operation
      doc: ''
      inputs: {}
      outputs: {}
  '8':
    in:
      inputFile:
        source: 6/output1
    out:
    - outFile
    - metrics_file
    run:
      class: Operation
      doc: ''
      inputs: {}
      outputs: {}
  '9':
    in:
      results_0|software_cond|output_0|type|input:
        source: '7'
    out:
    - plots
    - stats
    - html_report
    run:
      class: Operation
      doc: ''
      inputs: {}
      outputs: {}
Note that the
class: Workflow
cwlVersion: v1.2.0-dev2
doc: 'Abstract CWL Automatically generated from the Galaxy workflow file: COVID-19:
  PE Variation'
inputs:
  'GenBank file ':
    format: data
    type: File
  Paired Collection (fastqsanger):
    format: data
    type: File
outputs: {}
steps:
  10_Realign reads:
    in:
      reads: 8_MarkDuplicates/outFile
      reference_source|ref: 2_SnpEff build/output_fasta
    out:
    - realigned
    run:
      class: Operation
      id: toolshed_g2_bx_psu_edu_repos_iuc_lofreq_viterbi_lofreq_viterbi_2_1_3_1+galaxy1
      inputs:
        reads:
          format: Any
          type: File
        reference_source|ref:
          format: Any
          type: File
      outputs:
        realigned:
          doc: bam
          type: File
  11_MultiQC:
    in:
      results_0|software_cond|output_0|input: 8_MarkDuplicates/metrics_file
    out:
    - stats
    - plots
    - html_report
    run:
      class: Operation
      id: toolshed_g2_bx_psu_edu_repos_iuc_multiqc_multiqc_1_7_1
      inputs:
        results_0|software_cond|output_0|input:
          format: Any
          type: File
      outputs:
        html_report:
          doc: html
          type: File
        plots:
          doc: input
          type: File
        stats:
          doc: input
          type: File
  12_Call variants:
    in:
      reads: 10_Realign reads/realigned
      reference_source|ref: 2_SnpEff build/output_fasta
    out:
    - variants
    run:
      class: Operation
      id: toolshed_g2_bx_psu_edu_repos_iuc_lofreq_call_lofreq_call_2_1_3_1+galaxy0
      inputs:
        reads:
          format: Any
          type: File
        reference_source|ref:
          format: Any
          type: File
      outputs:
        variants:
          doc: vcf
          type: File
  13_SnpEff eff:
    in:
      input: 12_Call variants/variants
      snpDb|snpeff_db: 2_SnpEff build/snpeff_output
    out:
    - snpeff_output
    - statsFile
    run:
      class: Operation
      id: toolshed_g2_bx_psu_edu_repos_iuc_snpeff_snpEff_4_3+T_galaxy1
      inputs:
        input:
          format: Any
          type: File
        snpDb|snpeff_db:
          format: Any
          type: File
      outputs:
        snpeff_output:
          doc: vcf
          type: File
        statsFile:
          doc: html
          type: File
  14_SnpSift Extract Fields:
    in:
      input: 13_SnpEff eff/snpeff_output
    out:
    - output
    run:
      class: Operation
      id: toolshed_g2_bx_psu_edu_repos_iuc_snpsift_snpSift_extractFields_4_3+t_galaxy0
      inputs:
        input:
          format: Any
          type: File
      outputs:
        output:
          doc: tabular
          type: File
  15_Convert VCF to VCF_BGZIP:
    in:
      input1: 13_SnpEff eff/snpeff_output
    out:
    - output1
    run:
      class: Operation
      id: CONVERTER_vcf_to_vcf_bgzip_0
      inputs:
        input1:
          format: Any
          type: File
      outputs:
        output1:
          doc: vcf_bgzip
          type: File
  16_Collapse Collection:
    in:
      input_list: 14_SnpSift Extract Fields/output
    out:
    - output
    run:
      class: Operation
      id: toolshed_g2_bx_psu_edu_repos_nml_collapse_collections_collapse_dataset_4_1
      inputs:
        input_list:
          format: Any
          type: File
      outputs:
        output:
          doc: input
          type: File
  2_SnpEff build:
    in:
      input_type|input_gbk: 'GenBank file '
    out:
    - snpeff_output
    - output_fasta
    run:
      class: Operation
      id: toolshed_g2_bx_psu_edu_repos_iuc_snpeff_snpEff_build_gb_4_3+T_galaxy4
      inputs:
        input_type|input_gbk:
          format: Any
          type: File
      outputs:
        output_fasta:
          doc: fasta
          type: File
        snpeff_output:
          doc: snpeffdb
          type: File
  3_fastp:
    in:
      single_paired|paired_input: Paired Collection (fastqsanger)
    out:
    - output_paired_coll
    - report_html
    - report_json
    run:
      class: Operation
      id: toolshed_g2_bx_psu_edu_repos_iuc_fastp_fastp_0_19_5+galaxy1
      inputs:
        single_paired|paired_input:
          format: Any
          type: File
      outputs:
        output_paired_coll:
          doc: input
          type: File
        report_html:
          doc: html
          type: File
        report_json:
          doc: json
          type: File
  4_Map with BWA-MEM:
    in:
      fastq_input|fastq_input1: 3_fastp/output_paired_coll
      reference_source|ref_file: 2_SnpEff build/output_fasta
    out:
    - bam_output
    run:
      class: Operation
      id: toolshed_g2_bx_psu_edu_repos_devteam_bwa_bwa_mem_0_7_17_1
      inputs:
        fastq_input|fastq_input1:
          format: Any
          type: File
        reference_source|ref_file:
          format: Any
          type: File
      outputs:
        bam_output:
          doc: bam
          type: File
  5_MultiQC:
    in:
      results_0|software_cond|input: 3_fastp/report_json
    out:
    - stats
    - plots
    - html_report
    run:
      class: Operation
      id: toolshed_g2_bx_psu_edu_repos_iuc_multiqc_multiqc_1_7_1
      inputs:
        results_0|software_cond|input:
          format: Any
          type: File
      outputs:
        html_report:
          doc: html
          type: File
        plots:
          doc: input
          type: File
        stats:
          doc: input
          type: File
  6_Filter SAM or BAM, output SAM or BAM:
    in:
      input1: 4_Map with BWA-MEM/bam_output
    out:
    - output1
    run:
      class: Operation
      id: toolshed_g2_bx_psu_edu_repos_devteam_samtool_filter2_samtool_filter2_1_8+galaxy1
      inputs:
        input1:
          format: Any
          type: File
      outputs:
        output1:
          doc: sam
          type: File
  7_Samtools stats:
    in:
      input: 6_Filter SAM or BAM, output SAM or BAM/output1
    out:
    - output
    run:
      class: Operation
      id: toolshed_g2_bx_psu_edu_repos_devteam_samtools_stats_samtools_stats_2_0_2+galaxy2
      inputs:
        input:
          format: Any
          type: File
      outputs:
        output:
          doc: tabular
          type: File
  8_MarkDuplicates:
    in:
      inputFile: 6_Filter SAM or BAM, output SAM or BAM/output1
    out:
    - metrics_file
    - outFile
    run:
      class: Operation
      id: toolshed_g2_bx_psu_edu_repos_devteam_picard_picard_MarkDuplicates_2_18_2_2
      inputs:
        inputFile:
          format: Any
          type: File
      outputs:
        metrics_file:
          doc: txt
          type: File
        outFile:
          doc: bam
          type: File
  9_MultiQC:
    in:
      results_0|software_cond|output_0|type|input: 7_Samtools stats/output
    out:
    - stats
    - plots
    - html_report
    run:
      class: Operation
      id: toolshed_g2_bx_psu_edu_repos_iuc_multiqc_multiqc_1_7_1
      inputs:
        results_0|software_cond|output_0|type|input:
          format: Any
          type: File
      outputs:
        html_report:
          doc: html
          type: File
        plots:
          doc: input
          type: File
        stats:
          doc: input
          type: File
I've opened a draft PR from the branch to make it easier to track changes.
Came up at the 2020 Elixir biohackathon.
Experimented with this in https://github.com/ResearchObject/ro-crate-py/tree/gxformat2_cwl_conv. Here are the changes. I checked the output from converting test/test-data/test_galaxy_wf.ga, and the one output by gxformat2 is very different from the one obtained with galaxy2cwl. I'm not even sure the latter is a valid CWL workflow. Did I use the gxformat2 API in the wrong way? If not, maybe this needs to be checked by a CWL expert.
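One thing that could be checked without a CWL expert is whether every `source` / `outputSource` in a converted file points at something that exists. In CWL a source has to name either a workflow input or a `step/output` pair, so a bare step id such as `'7'` is not a valid reference. The sketch below is only a rough, hypothetical helper (the names `valid_sources`, `dangling_sources` and `EXCERPT` are made up here, not part of gxformat2 or cwltool), it covers only this one class of problem, and it may well not be the actual reason the file fails to validate. `EXCERPT` is a hand-copied subset of the gxformat2 output quoted in this thread.

```python
from typing import Dict, List, Set


def valid_sources(workflow: Dict) -> Set[str]:
    """Collect every name a `source` / `outputSource` may point at:
    workflow inputs plus `step_id/output_name` for each declared step output."""
    names = set(workflow.get("inputs", {}))
    for step_id, step in workflow.get("steps", {}).items():
        for out in step.get("out", []):
            names.add(f"{step_id}/{out}")
    return names


def dangling_sources(workflow: Dict) -> List[str]:
    """Return a description of every reference that resolves to nothing."""
    known = valid_sources(workflow)
    problems = []
    for out_id, out in workflow.get("outputs", {}).items():
        if out["outputSource"] not in known:
            problems.append(f"output {out_id}: {out['outputSource']}")
    for step_id, step in workflow.get("steps", {}).items():
        for in_id, spec in step.get("in", {}).items():
            # Accept both `in: {x: {source: ...}}` and the shorthand `in: {x: ...}`.
            source = spec["source"] if isinstance(spec, dict) else spec
            if source not in known:
                problems.append(f"step {step_id} input {in_id}: {source}")
    return problems


# Hand-copied excerpt (assumption: a subset is enough to illustrate the check).
EXCERPT = {
    "inputs": {
        "GenBank file ": {"type": "File"},
        "Paired Collection (fastqsanger)": {"type": "File[]"},
    },
    "outputs": {
        "_anonymous_output_8": {"outputSource": "6/output1", "type": "File"},
        "_anonymous_output_9": {"outputSource": "7", "type": "File"},
        "mapping_report": {"outputSource": "9/html_report", "type": "File"},
    },
    "steps": {
        "2": {
            "in": {"input_type|input_gbk": {"source": "GenBank file "}},
            "out": ["output_fasta", "snpeff_output"],
        },
        "3": {
            "in": {
                "single_paired|paired_input": {
                    "source": "Paired Collection (fastqsanger)"
                }
            },
            "out": ["report_json", "report_html", "output_paired_coll"],
        },
        "4": {
            "in": {
                "fastq_input|fastq_input1": {"source": "3/output_paired_coll"},
                "reference_source|ref_file": {"source": "2/output_fasta"},
            },
            "out": ["bam_output"],
        },
        "6": {"in": {"input1": {"source": "4/bam_output"}}, "out": ["output1"]},
        "7": {"in": {"input": {"source": "6/output1"}}, "out": []},
        "9": {
            "in": {
                "results_0|software_cond|output_0|type|input": {"source": "7"}
            },
            "out": ["plots", "stats", "html_report"],
        },
    },
}

if __name__ == "__main__":
    for problem in dangling_sources(EXCERPT):
        print(problem)
```

On this excerpt the helper flags the two references to `7`, the step that was exported with `out: []`. The same function accepts the shorthand `in:` form used in the galaxy2cwl output, so both files could be loaded with a YAML parser and passed through it for a first comparison.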