Deduplicate CDR tables #165

douweschulte · 2022-06-02T22:27:02Z

The CDR tables very often contain multiple copies of the same sequence, so it would be nice to deduplicate them. For now the approach would be to group the rows by template and then by sequence, merging the reads of identical rows. So this:

Template	Sequence	Read
IGHV3-11	S.....	R001
IGHV3-11	S.....	R002
IGHV3-11	TAS...	R003
IGHV3-23	S.....	R001

Would become:

Template	Sequence	Reads
IGHV3-11	S.....	R001,R002
IGHV3-11	TAS...	R003
IGHV3-23	S.....	R001
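
The grouping above can be sketched in a few lines. This is only a minimal illustration in Python, not the code used in the project: the function name `deduplicate` and the representation of a row as a `(template, sequence, read)` tuple are assumptions made for the example.

```python
def deduplicate(rows):
    """Merge rows sharing the same (template, sequence) into a single row.

    `rows` is an iterable of (template, sequence, read) tuples (an assumed
    representation). Returns (template, sequence, reads) tuples in order of
    first appearance, with the reads joined by commas.
    """
    grouped = {}  # dicts keep insertion order, so output order is stable
    for template, sequence, read in rows:
        reads = grouped.setdefault((template, sequence), [])
        if read not in reads:  # do not list the same read twice
            reads.append(read)
    return [(t, s, ",".join(reads)) for (t, s), reads in grouped.items()]


rows = [
    ("IGHV3-11", "S.....", "R001"),
    ("IGHV3-11", "S.....", "R002"),
    ("IGHV3-11", "TAS...", "R003"),
    ("IGHV3-23", "S.....", "R001"),
]

for row in deduplicate(rows):
    print("\t".join(row))
```

Keying on the pair `(template, sequence)` keeps identical sequences under different templates separate, as in the last row of the example.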

douweschulte · 2022-06-22T15:33:58Z

Another thought: the table could be turned around so that it is focussed on the read instead of the template. If it is then deduplicated, it becomes easier to spot the diversity of reads, which is what users want to see here.
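
One possible reading of this idea, sketched in Python: key the table on the sequence and collect the templates and reads it occurs in. The function name `group_by_sequence`, the tuple layout, and the choice to key on the sequence alone are all assumptions for illustration; the comment above does not specify the exact layout.

```python
def group_by_sequence(rows):
    """Collect, per distinct sequence, the templates and reads it occurs in.

    `rows` is an iterable of (template, sequence, read) tuples (an assumed
    representation). Returns (sequence, templates, reads) tuples in order of
    first appearance, with templates and reads joined by commas.
    """
    grouped = {}
    for template, sequence, read in rows:
        templates, reads = grouped.setdefault(sequence, ([], []))
        if template not in templates:
            templates.append(template)
        if read not in reads:
            reads.append(read)
    return [
        (sequence, ",".join(templates), ",".join(reads))
        for sequence, (templates, reads) in grouped.items()
    ]


rows = [
    ("IGHV3-11", "S.....", "R001"),
    ("IGHV3-11", "S.....", "R002"),
    ("IGHV3-11", "TAS...", "R003"),
    ("IGHV3-23", "S.....", "R001"),
]

for row in group_by_sequence(rows):
    print("\t".join(row))
```

With this layout each distinct sequence appears exactly once, so the number of rows directly reflects the diversity of the sequences found.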

douweschulte added the A-html-report (Area: Related to the HTML output report) label Jun 2, 2022

douweschulte closed this as completed in 02798a0 Sep 9, 2022
