PEPMatch trims sequence ID #19

HA1-biocopy · 2024-11-26T22:26:04Z

Hi,
I'm testing PEPMatch with my custom FASTA file (--proteome_file). However, I noticed that PEPMatch trims the protein ID. For example, the IDs in my file follow this syntax: xx|GENE-ID|genome
but what PEPMatch reports is GENE-ID.1. Why is that, and how can I make it return the full ID?
Thank you

dmx2 · 2024-11-27T02:17:18Z

PEPMatch is mostly built around UniProt proteomes, since that is what most of our users work with, and in those FASTA headers the protein ID typically sits between pipe (|) characters.

I think there should be a way to just use the entire FASTA header for output. I'll consider this as a feature to add.
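
To illustrate the pipe-delimited convention described above, here is a minimal sketch of how an ID can be pulled out of a UniProt-style header. This is not PEPMatch's actual parsing code; the function name `extract_pipe_id` is hypothetical, and the fallback to the first word of the header is an assumption made only for this example.

```python
def extract_pipe_id(header: str) -> str:
    """Return the field between the first two pipes of a FASTA header.

    Hypothetical helper for illustration only; it does not reproduce
    PEPMatch's internal parsing.
    """
    # Drop the leading '>' and surrounding whitespace.
    header = header.lstrip(">").strip()
    fields = header.split("|")
    # UniProt-style headers look like 'db|ID|name description'.
    if len(fields) >= 3:
        return fields[1]
    # Assumed fallback: use the first whitespace-delimited word.
    return header.split()[0] if header else ""


print(extract_pipe_id(">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha"))
print(extract_pipe_id(">xx|GENE-ID|genome"))
```

With a header such as xx|GENE-ID|genome, this kind of parsing keeps only the middle field, which matches the trimming reported in this issue.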

HA1-biocopy · 2024-11-27T09:16:55Z

Hi,

I noticed that adding '.1' as a suffix to the name can introduce complications downstream, especially since the name may already contain other symbols. I’ve tried changing the ID delimiter, but it still gets stripped and the suffix is added.

Do you have any suggestions on how to resolve this issue in the meantime, until the feature is implemented?

For context: we’re testing the tool as part of our evaluation process, with plans to purchase a license if it meets our internal needs.

Thanks in advance for your help!

dmx2 · 2024-11-27T19:32:02Z

I need to update the README, but if you pass sequence_version=False into the Matcher class (or -v if using the command line), this will drop the '.#' suffix from the IDs. Let me know if that works for you.
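
Until full headers are supported in the output, one possible stopgap is to map the reported IDs back to the original headers in a post-processing step. The sketch below is plain Python and does not call PEPMatch; `build_id_map` and `restore_full_id` are hypothetical helper names, and it assumes the reported ID is the field between the first two pipes, optionally followed by a '.<digits>' suffix.

```python
import re


def build_id_map(fasta_lines):
    """Map the pipe-delimited ID of each FASTA header to the full header."""
    id_map = {}
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        header = line[1:].strip()
        if not header:
            continue  # skip empty headers
        fields = header.split("|")
        # Assumption: the short ID is the field between the first two pipes.
        key = fields[1] if len(fields) >= 3 else header.split()[0]
        id_map[key] = header
    return id_map


def restore_full_id(reported_id, id_map):
    """Return the full header for a reported ID, or the ID itself if unknown."""
    if reported_id in id_map:
        return id_map[reported_id]
    # Strip a trailing '.<digits>' version suffix and try again.
    stripped = re.sub(r"\.\d+$", "", reported_id)
    return id_map.get(stripped, reported_id)


fasta = [">xx|GENE-ID|genome", "MKTAYIAKQR", ">yy|OTHER.2|genome", "MSLLTEVETY"]
id_map = build_id_map(fasta)
print(restore_full_id("GENE-ID.1", id_map))
print(restore_full_id("OTHER.2", id_map))
```

Trying the exact ID first and stripping the suffix only on a miss keeps IDs that legitimately end in a dot and digits from being altered.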

HA1-biocopy · 2024-12-09T20:38:28Z

Thank you again for your help.
As I'm still actively testing PEPMatch, I noticed that the IDs of the query sequences are not included in the output. Could you please add them as well? Or at least offer a format similar to BLAST, where we can specify the columns we need in the output? Thank you

dmx2 · 2024-12-11T01:46:19Z

Yes, I will make that a part of the output as well.

I will update here when I make the changes.
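
In the meantime, query IDs can be re-attached after the fact if the results include the query peptide sequence itself. That is an assumption about the output, not something confirmed in this thread. The sketch below only parses a query FASTA file into a lookup table; `query_ids_by_sequence` is a hypothetical helper and not part of PEPMatch.

```python
from collections import defaultdict


def query_ids_by_sequence(fasta_lines):
    """Return {peptide sequence: [query IDs]} parsed from FASTA lines."""
    ids_by_seq = defaultdict(list)
    current_id, chunks = None, []
    # A sentinel header at the end flushes the final record.
    for line in list(fasta_lines) + [">"]:
        line = line.strip()
        if line.startswith(">"):
            if current_id is not None and chunks:
                ids_by_seq["".join(chunks)].append(current_id)
            current_id, chunks = line[1:].strip(), []
        elif line:
            chunks.append(line)  # sequences may span several lines
    return dict(ids_by_seq)


queries = [">pep1", "YLLDLHSYL", ">pep2", "GLCTLVAML", ">pep3", "YLLDLHSYL"]
lookup = query_ids_by_sequence(queries)
print(lookup["YLLDLHSYL"])
```

Each sequence maps to a list because the same peptide can appear under more than one query ID.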
