Commit

Merge pull request #171 from Plant-Food-Research-Open/report/hicqc
Added the HiC QC report to the final report
GallVp authored Oct 31, 2024
2 parents 26025a4 + 5f373d6 commit 14ca56e
Showing 8 changed files with 52 additions and 8 deletions.
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -3,7 +3,7 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## v2.2.0dev - [25-Oct-2024]
## v2.2.0dev - [31-Oct-2024]

### `Added`

@@ -14,6 +14,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
5. Added `text/html` as content mime type for the report file [#146](https://github.com/Plant-Food-Research-Open/assemblyqc/issues/146)
6. Added a sequence labels table below the HiC contact map [#147](https://github.com/Plant-Food-Research-Open/assemblyqc/issues/147)
7. Added parameter `hic_samtools_ext_args` and set its default value to `-F 3852` [#159](https://github.com/Plant-Food-Research-Open/assemblyqc/issues/159)
8. Added the HiC QC report to the final report so that users don't have to navigate to the results folder [#162](https://github.com/Plant-Food-Research-Open/assemblyqc/issues/162)

### `Fixed`

19 changes: 15 additions & 4 deletions bin/report_modules/parsers/hic_parser.py
@@ -15,28 +15,38 @@ def parse_hic_folder(folder_name="hic_outputs"):
return {}

list_of_hic_files = hic_folder_path.glob("*.html")
list_of_hic_files = [
x for x in list_of_hic_files if re.match(r"^\w+\.html$", x.name)
]

data = {"HIC": []}

for hic_path in list_of_hic_files:
hic_file_name = os.path.basename(str(hic_path))

file_tokens = re.findall(
tag = re.findall(
r"([\w]+).html",
hic_file_name,
)[0]

labels_table = pd.read_csv(f"{folder_name}/{file_tokens}.agp.assembly", sep=" ")

# Get the labels table
labels_table = pd.read_csv(f"{folder_name}/{tag}.agp.assembly", sep=" ")
labels_table = labels_table[labels_table.iloc[:, 0].str.startswith(">")].iloc[
:, [0, 2]
]
labels_table.columns = ["Sequence", "Length"]
labels_table.Length = labels_table.Length.astype(int)

# Get the HiC QC report
hicqc_report = [
x
for x in hic_folder_path.glob("*.pdf")
if re.match(rf"[\S]+\.on\.{tag}_qc_report\.pdf", x.name)
][0]

data["HIC"].append(
{
"hap": file_tokens,
"hap": tag,
"hic_html_file_name": hic_file_name,
"labels_table": labels_table.to_dict("records"),
"labels_table_html": tabulate(
@@ -46,6 +56,7 @@ def parse_hic_folder(folder_name="hic_outputs"):
numalign="left",
showindex=False,
),
"hicqc_report_pdf": os.path.basename(str(hicqc_report)),
}
)
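
Reviewer note on the `hicqc_report` lookup in `hic_parser.py` above: the list comprehension is indexed with `[0]`, so it would raise `IndexError` if no PDF matches the tag. The following is a minimal sketch of the same matching idea that returns `None` instead. The helper name `find_hicqc_report` and the example file names are hypothetical, and the file-name layout `<prefix>.on.<tag>_qc_report.pdf` is assumed from the regular expression in the diff.

```python
import re
from typing import Iterable, Optional


def find_hicqc_report(pdf_names: Iterable[str], tag: str) -> Optional[str]:
    """Return the first PDF name of the form '<prefix>.on.<tag>_qc_report.pdf'.

    Hypothetical helper: returns None when nothing matches instead of raising.
    """
    # re.escape guards against regex metacharacters in the tag.
    pattern = re.compile(rf"\S+\.on\.{re.escape(tag)}_qc_report\.pdf")
    # Sort for a deterministic choice when several files match.
    matches = [name for name in sorted(pdf_names) if pattern.fullmatch(name)]
    return matches[0] if matches else None


# Example with made-up file names.
names = ["reads.on.hap1_qc_report.pdf", "reads.on.hap2_qc_report.pdf"]
report = find_hicqc_report(names, "hap1")
```

Whether a missing report should be skipped or treated as an error is a design choice for the pipeline; the sketch only shows the non-raising variant.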

12 changes: 12 additions & 0 deletions bin/report_modules/templates/header.html
@@ -213,6 +213,18 @@

.iframe-wrapper {
text-align: center;
width: 90%;
margin-left: auto;
margin-right: auto;
margin-bottom: 32px;
}

.iframe-wrapper-hic {
width: 700px;
height: 850px;
margin-left: auto;
margin-right: auto;
margin-bottom: 32px;
}

.tab {
13 changes: 11 additions & 2 deletions bin/report_modules/templates/hic/report_contents.html
@@ -5,14 +5,23 @@
<div class="section-heading-wrapper">
<div class="section-heading">{{ all_stats_dicts['HIC'][item]['hap'] }}</div>
</div>
<div class="iframe-wrapper">
<iframe src="./hic/{{ all_stats_dicts['HIC'][item]['hic_html_file_name'] }}" width="100%" height="100%"></iframe>
<div class="iframe-wrapper-hic">
<iframe src="./hic/{{ all_stats_dicts['HIC'][item]['hic_html_file_name'] }}" width="700px" height="850px"></iframe>
</div>
</div>
<div class="results-section">
<div class="section-para-wrapper">
<p class="section-para"><b>Sequence labels and lengths</b></p>
</div>
<div class="table-outer">
<div class="table-wrapper">{{ all_stats_dicts['HIC'][item]['labels_table_html'] }}</div>
</div>
<div class="section-para-wrapper">
<p class="section-para"><b>HiC QC report</b></p>
</div>
<div class="iframe-wrapper">
<iframe src="./hic/hicqc/{{ all_stats_dicts['HIC'][item]['hicqc_report_pdf'] }}" width="100%" height="100%"></iframe>
</div>
</div>
</div>
{% if vars.update({'is_first': False}) %} {% endif %} {% endfor %}
Binary file added docs/images/hicqc.png
7 changes: 6 additions & 1 deletion docs/output.md
@@ -198,7 +198,12 @@ Kraken2 [assigns taxonomic labels](https://ccb.jhu.edu/software/kraken2/) to seq

Hi-C contact mapping experiments measure the frequency of physical contact between loci in the genome. The resulting dataset, called a “contact map,” is represented using a [two-dimensional heatmap](https://github.com/igvteam/juicebox.js) where the intensity of each pixel indicates the frequency of contact between a pair of loci.

<div align="center"><img src="images/hic_map.png" alt="AssemblyQC - HiC interactive contact map" width="50%"><hr><em>AssemblyQC - HiC interactive contact map</em></div>
<div align="center">
<img src="images/hicqc.png" alt="AssemblyQC - HiC QC report" width="44.59%">
<img src="images/hic_map.png" alt="AssemblyQC - HiC interactive contact map" width="40%">
<hr>
<em>AssemblyQC - HiC results</em>
</div>

### Synteny
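
Reviewer note on the contact-map description in `docs/output.md` above: a toy illustration of how pairwise contacts become a two-dimensional matrix of counts may help readers. This is only a sketch under simplifying assumptions (a single sequence, fixed-size bins, contacts given as pairs of positions); the function name `build_contact_map` and the input values are made up and are not part of AssemblyQC.

```python
from typing import List, Sequence, Tuple


def build_contact_map(
    contacts: Sequence[Tuple[int, int]], genome_length: int, bin_size: int
) -> List[List[int]]:
    """Count contacts per pair of bins; each cell is one heatmap pixel."""
    n_bins = -(-genome_length // bin_size)  # ceiling division
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos_a, pos_b in contacts:
        i, j = pos_a // bin_size, pos_b // bin_size
        matrix[i][j] += 1
        if i != j:
            # Contacts are unordered, so keep the matrix symmetric.
            matrix[j][i] += 1
    return matrix


# Three made-up contacts on a 30 bp sequence with 10 bp bins.
contact_map = build_contact_map([(5, 15), (7, 18), (25, 3)], 30, 10)
```

Higher counts correspond to more intense pixels in the heatmap.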

2 changes: 2 additions & 0 deletions subworkflows/local/fq2hic.nf
@@ -64,6 +64,7 @@ workflow FQ2HIC {

HICQC ( ch_bam_and_ref.map { meta3, bam, fa -> [ meta3, bam ] } )

ch_hicqc_pdf = HICQC.out.pdf
ch_versions = ch_versions.mix(HICQC.out.versions)

// MODULE: MAKEAGPFROMFASTA | AGP2ASSEMBLY | ASSEMBLY2BEDPE
@@ -95,6 +96,7 @@ workflow FQ2HIC {
ch_versions = ch_versions.mix(HIC2HTML.out.versions.first())

emit:
hicqc_pdf = ch_hicqc_pdf
hic = ch_hic
html = HIC2HTML.out.html
assembly = AGP2ASSEMBLY.out.assembly
4 changes: 4 additions & 0 deletions workflows/assemblyqc.nf
@@ -590,12 +590,16 @@ workflow ASSEMBLYQC {
params.hic_skip_fastqc
)

ch_hicqc_pdf = FQ2HIC.out.hicqc_pdf
ch_hic_html = FQ2HIC.out.html
ch_hic_assembly = FQ2HIC.out.assembly
ch_hic_report_files = ch_hic_html
| mix(
ch_hic_assembly.map { tag, assembly -> assembly }
)
| mix(
ch_hicqc_pdf.map { meta, pdf -> pdf }
)
ch_versions = ch_versions.mix(FQ2HIC.out.versions)

// SUBWORKFLOW: FASTA_SYNTENY