Add a script that can batch import and explain Lancet data in the November format #428

adamnovak · 2024-05-03T20:58:12Z

This sort of fixes #427 by generating a BED file that explains which variant is supposed to be visible.

I ran it like this:

cd ~/workspace/sequenceTubeMap
mkdir -p webhost
cd webhost

~/workspace/sequenceTubeMap/scripts/prepare_lancet_output.sh ~/Downloads/STM_DataShare_Nov07_2023\ 2/ ./lancet_2023-11-07

cd lancet_2023-11-07

python3 -m http.server

Then I could put http://[::]:8000/index.bed into my local tube map as the BED file and browse the data.
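To illustrate what a BED line that "explains" a variant might look like, here is a minimal sketch. It is not the output format of `prepare_lancet_output.sh`, which this PR does not show; the function name `variant_bed_line` and the four-column layout (chromosome, start, end, description) are assumptions for illustration. It relies only on the standard BED convention of 0-based, half-open intervals, while the description text uses the 1-based position.

```python
def variant_bed_line(chrom, pos, ref, alt, note=""):
    """Build one tab-separated BED line describing a VCF-style variant.

    pos is the 1-based position of the first REF base (hypothetical input).
    BED intervals are 0-based and half-open, so start = pos - 1.
    """
    start = pos - 1
    end = start + len(ref)
    # Human-readable explanation of what should be visible in the region.
    description = f"{note} {ref} -> {alt} at {chrom}:{pos}".strip()
    return "\t".join([chrom, str(start), str(end), description])


if __name__ == "__main__":
    # Small made-up variant, not taken from the Lancet data.
    print(variant_bed_line("chr1", 100, "CTG", "C", "Tumor-specific"))
```

The description column carries the 1-based coordinate so that it matches what a person would read in a variant call, even though the interval columns follow BED conventions.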

However, I think the data provided is not exactly what I want to look at. It looks like the called tumor variants are not actually in the graphs. For example, at chr1:38506973-38506974 I am supposed to see a Tumor-specific CTGGAATCCAGCAGCCCAGACTTCCACATCATAATTTTCTGGGGCAATGGTTTTCAAACTTCACTGTACG -> C DEL variant, but the graph doesn't show a large deletion. Instead, the deleted sequence occurs 1 base into the leftmost node here, and I can see softclips at the left ends of the aligned tumor reads, where they would read across the deletion edge that isn't present. Here's a screenshot of the softclips, with the tumor reads in red:

So I now have a way to generate and host tumor-normal Lancet examples, including examples for larger indels, but the graphs coming from Lancet don't really seem to be the right ones.
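The observation that the deleted sequence sits 1 base into the node follows from how a VCF-style deletion is written: the first REF base is an anchor that is kept, and everything after it is removed. A minimal sketch of that decomposition is below. The helper name `describe_deletion` is hypothetical, and the position passed in the usage example is an assumption (the start of the region quoted above), not something the PR states.

```python
def describe_deletion(pos, ref, alt):
    """Split an anchored REF -> ALT deletion into its kept and removed parts.

    pos is the 1-based position of the first REF base (assumed input).
    Only handles simple deletions where ALT is a prefix of REF.
    """
    if not (len(ref) > len(alt) and ref.startswith(alt)):
        raise ValueError("not a simple anchored deletion")
    deleted = ref[len(alt):]
    first_deleted = pos + len(alt)          # first removed base, 1-based
    last_deleted = first_deleted + len(deleted) - 1
    return {
        "anchor": alt,
        "deleted": deleted,
        "first_deleted": first_deleted,
        "last_deleted": last_deleted,
    }


if __name__ == "__main__":
    # REF and ALT copied from the variant described above.
    REF = "CTGGAATCCAGCAGCCCAGACTTCCACATCATAATTTTCTGGGGCAATGGTTTTCAAACTTCACTGTACG"
    info = describe_deletion(38506973, REF, "C")  # position is an assumption
    print(info["anchor"], len(info["deleted"]), info["first_deleted"], info["last_deleted"])
```

Reads that carry the deletion should jump from the anchor base to the base after `last_deleted`; if the graph has no edge for that jump, an aligner can only softclip them, which matches what the screenshot shows.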

adamnovak · 2024-05-07T21:26:08Z

OK, I talked to Rajeeva and he told me that Lancet makes several graphs per variant. I added code to pick, for each variant, the graph that is most centered on it, and now I can indeed see this variant. Here's that deletion being taken by just the tumor reads:

These are now probably usable examples.
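The idea of choosing the most centered graph when Lancet emits several per variant can be sketched as follows. This is not the code added in this PR: the function name `most_centered_region`, the `(start, end)` tuple representation, and the use of midpoint distance as the measure of "centered" are all assumptions for illustration.

```python
def most_centered_region(variant_pos, regions):
    """Pick the region that covers variant_pos and is best centered on it.

    regions is an iterable of (start, end) pairs, inclusive, in the same
    coordinate system as variant_pos (hypothetical representation).
    Returns None if no region covers the variant.
    """
    covering = [(s, e) for s, e in regions if s <= variant_pos <= e]
    if not covering:
        return None
    # Smaller distance between the region midpoint and the variant is better.
    return min(covering, key=lambda r: abs((r[0] + r[1]) / 2 - variant_pos))


if __name__ == "__main__":
    # Made-up overlapping regions around a made-up variant position.
    candidates = [(100, 160), (120, 180), (140, 300)]
    print(most_centered_region(150, candidates))
```

Preferring a centered region keeps context on both sides of the variant, so reads spanning either edge of an indel still have nodes to align to.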

adamnovak added 6 commits May 3, 2024 16:51

09c3ddb Add a script that can batch import and explain Lancet data in the November format
8fa2e46 Add error reporting for things that are not BED files
e27bf8e Find the Lancet region that best covers the variant, and emit 1-based coordinates
0070fd8 Fix visibility toggle for paths
3df5a0a Make link decoding support multiple things that are flags
38ec2e6 Lay out in node order and then track order to fix reads missing their nodes because they moved

adamnovak merged commit 5b1a3f5 into master May 7, 2024
1 check passed
