_rankN pepXML files #334

Open
chambm opened this issue Nov 14, 2024 · 9 comments

chambm commented Nov 14, 2024

Why does MSFragger in DIA mode write pepXML files with rank suffixes instead of using the hit_rank="N" attribute in a single pepXML? Is there an option to always write the DDA way?

fcyu self-assigned this on Nov 14, 2024
Member

fcyu commented Nov 14, 2024

Two major reasons:

  1. AFAIK, each spectrum_query can only have one precursor_neutral_mass and assumed_charge, which is not suitable for DIA
  2. Some downstream tools, such as PeptideProphet and Philosopher, don't read >1 ranked search_hit

Best,

Fengchao

Author

chambm commented Nov 14, 2024

True. Maybe multiple spectrum_query elements in that case? That is the way it's supposed to work for assumed_charge in the chimeric DDA case.

Member

fcyu commented Nov 14, 2024

Yes, multiple spectrum_query elements will work if spectrum, spectrumNativeID, and start_scan can be non-unique.

But then, how would the ranks within the same spectrum be specified?

Best,

Fengchao

Author

chambm commented Nov 14, 2024

Yes, because of the assumed_charge thing they don't have to be unique. The ranks would be specific to a hypothetical precursor (theoretical mass and charge).

Member

fcyu commented Nov 14, 2024

Sorry that my previous question was not clear. How do I specify ranks 1, 2, and 3 for the search_hit elements of the same spectrum if they are listed in separate spectrum_query elements? Can hit_rank start at a value greater than 1?

Thanks,

Fengchao

Author

chambm commented Nov 14, 2024

Each spectrum_query should start with hit_rank=1. Think of the rank as being for the hypothetical precursor ion rather than for the spectrum. Isn't that how it already works with the _rank1, _rank2 separate files?
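
A minimal sketch of the layout described above, assuming a hand-written, simplified pepXML fragment (no namespace, made-up masses and peptides; it is not actual MSFragger output). Two spectrum_query elements share the same spectrum and start_scan, each carries its own precursor_neutral_mass and assumed_charge, and each numbers its hits from hit_rank=1:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Illustrative fragment only: all values are invented for this sketch.
FRAGMENT = """
<msms_run_summary>
  <spectrum_query spectrum="run1.00100.00100.2" start_scan="100" end_scan="100"
                  precursor_neutral_mass="1000.50" assumed_charge="2" index="1">
    <search_result>
      <search_hit hit_rank="1" peptide="PEPTIDEK"/>
      <search_hit hit_rank="2" peptide="PEPTLDEK"/>
    </search_result>
  </spectrum_query>
  <spectrum_query spectrum="run1.00100.00100.2" start_scan="100" end_scan="100"
                  precursor_neutral_mass="1200.75" assumed_charge="2" index="2">
    <search_result>
      <search_hit hit_rank="1" peptide="SAMPLER"/>
    </search_result>
  </spectrum_query>
</msms_run_summary>
""".strip()

root = ET.fromstring(FRAGMENT)

# Group the queries by scan; each entry is one hypothetical precursor
# (neutral mass, charge) together with the hit ranks listed under it.
queries_by_scan = defaultdict(list)
for sq in root.iter("spectrum_query"):
    precursor = (float(sq.get("precursor_neutral_mass")),
                 int(sq.get("assumed_charge")))
    ranks = [int(h.get("hit_rank")) for h in sq.iter("search_hit")]
    queries_by_scan[int(sq.get("start_scan"))].append((precursor, ranks))

for scan, queries in queries_by_scan.items():
    for precursor, ranks in queries:
        print(scan, precursor, ranks)
```

Here the rank is scoped to the (mass, charge) hypothesis rather than to the scan, so both queries for scan 100 contain a hit_rank of 1.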

Member

fcyu commented Nov 14, 2024

> Isn't that how it already works with the _rank1, _rank2 separate files?

In the _rank1, _rank2 files, we get the rank information from the file name. If we put all ranks in the same file and separate them into different spectrum_query elements, is there a way to mark the different ranks? Or do I have to rank them by hyperscore when loading the data?
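
For illustration, reading the rank from the file name could look roughly like the sketch below. The file names and the helper `rank_from_filename` are hypothetical, and the pattern assumes a single `.pepXML` extension directly after the `_rankN` suffix:

```python
import re
from pathlib import Path

# Matches a trailing "_rank<N>" on the file stem (assumed naming pattern).
RANK_SUFFIX = re.compile(r"_rank(\d+)$")

def rank_from_filename(path):
    """Return the integer rank encoded in the file name, or None if absent."""
    stem = Path(path).stem  # "sample_rank2" for "sample_rank2.pepXML"
    match = RANK_SUFFIX.search(stem)
    return int(match.group(1)) if match else None

for name in ["sample_rank1.pepXML", "sample_rank2.pepXML", "sample.pepXML"]:
    print(name, rank_from_filename(name))
```

A file without the suffix yields None, which a loader could treat as a single-file (DDA-style) result.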

Thanks,

Fengchao

Author

chambm commented Nov 14, 2024

Yes, I suppose if you need to aggregate everything back at the spectrum level, you'll have to regenerate the ranks by whatever score you want to use. For Percolator I'd expect to use its q-value to do the reranking, so potentially something that was rank 2 will become rank 1.
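
A rough sketch of that spectrum-level reranking, assuming the hits have already been loaded into plain dicts. The record layout, the `rerank` helper, and the score values are hypothetical and only illustrate the idea:

```python
from collections import defaultdict

def rerank(hits, score_key="hyperscore", higher_is_better=True):
    """Group hits by scan and assign a per-spectrum rank from the chosen score."""
    grouped = defaultdict(list)
    for hit in hits:
        grouped[hit["start_scan"]].append(hit)

    ranked = []
    for scan in sorted(grouped):
        ordered = sorted(grouped[scan], key=lambda h: h[score_key],
                         reverse=higher_is_better)
        for rank, hit in enumerate(ordered, start=1):
            # Copy the record so the input list is left unchanged.
            ranked.append({**hit, "spectrum_rank": rank})
    return ranked

# Made-up example: two precursor hypotheses for the same scan.
hits = [
    {"start_scan": 100, "peptide": "PEPTIDEK", "hyperscore": 30.0, "q_value": 0.02},
    {"start_scan": 100, "peptide": "SAMPLER", "hyperscore": 25.0, "q_value": 0.001},
]

by_hyperscore = rerank(hits)  # higher hyperscore ranks first
by_q_value = rerank(hits, score_key="q_value", higher_is_better=False)
print([(h["peptide"], h["spectrum_rank"]) for h in by_hyperscore])
print([(h["peptide"], h["spectrum_rank"]) for h in by_q_value])
```

With these invented values the order flips when the q-value is used: the hit that is second by hyperscore becomes first by q-value.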

Member

fcyu commented Nov 14, 2024

Thanks. Then I need to make the changes and test whether downstream tools such as PeptideProphet and Philosopher support it.

Best,

Fengchao
