MSP files management

IDSL.FSA was designed to manage MSP-format mass spectrometry files with different structures. IDSL.FSA provides various tools to manage .msp files, a number of which are summarized below:

`msp2FSdb`

The msp2FSdb module generates organized Fragmentation Spectra DataBase (FSDB) libraries from .msp files for data parsing.

msp2FSdb(path = getwd(), MSPfile_vector = "", massIntegrationWindow = 0, allowedNominalMass = FALSE,
allowedWeightedSpectralEntropy = TRUE, noiseRemovalRatio = 0.01, number_processing_threads = 1)

path: address of .msp file

MSPfile_vector: a vector of .msp file names

massIntegrationWindow: Mass accuracy in Da

allowedNominalMass: c(TRUE, FALSE). Select TRUE only for nominal mass analysis.

allowedWeightedSpectralEntropy: c(TRUE, FALSE). Whether to use weighted entropy when measuring the entropy similarity score.

noiseRemovalRatio: Noise removal threshold, as a fraction (0-1) of the base peak intensity, applied when measuring the entropy similarity score.

number_processing_threads: number of parallel processing threads compatible with the Windows and Linux environments.
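To make the noiseRemovalRatio parameter concrete, here is a minimal base-R sketch of thresholding relative to the base peak. The helper name `remove_noise` and the example spectrum are hypothetical, and the inclusive cutoff is an assumption; this is not the package's internal code.

```r
# Hypothetical helper: drop fragment peaks below a fraction of the base peak.
# `spectrum` is a two-column matrix (m/z, intensity); `ratio` mirrors noiseRemovalRatio.
remove_noise <- function(spectrum, ratio = 0.01) {
  stopifnot(is.matrix(spectrum), ncol(spectrum) == 2, ratio >= 0, ratio <= 1)
  base_peak <- max(spectrum[, 2])
  # Assumption: peaks at or above ratio * base peak are kept (inclusive cutoff)
  keep <- spectrum[, 2] >= ratio * base_peak
  spectrum[keep, , drop = FALSE]
}

# Made-up example spectrum with a base peak of 1000
spectrum <- cbind(mz = c(55.05, 83.09, 101.10, 129.13),
                  intensity = c(4, 250, 1000, 9))
cleaned <- remove_noise(spectrum, ratio = 0.01)
```

With a ratio of 0.01 the cutoff is 10 intensity units, so the two peaks with intensities 4 and 9 are discarded and the two stronger peaks remain.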

`mgf2msp`

The mgf2msp module converts Mascot generic format (.mgf) files into the NIST mass spectra format (.msp). The module is fast, requiring <2 sec for .mgf files with ~5,000 fragmentation blocks on a single thread.

mgf2msp(path = getwd(), MGFfileName = "")

path: Location of the original .mgf file

MGFfileName: Name of the mgf file with its extension

The converted files are stored in the same directory with .msp extensions.
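As a rough illustration of what such a conversion involves, the sketch below maps one MGF block to MSP-style lines in base R. The helper `mgf_block_to_msp` is hypothetical, and the choice of MSP fields (Name, PrecursorMZ, Num Peaks) is an assumption; the fields that mgf2msp actually writes are not described on this page.

```r
# Hypothetical sketch: convert one MGF block (character vector of lines) to MSP lines
mgf_block_to_msp <- function(block) {
  # Drop the MGF block delimiters
  block <- block[!block %in% c("BEGIN IONS", "END IONS")]
  # MGF header lines have the form KEY=value; the remaining lines are "m/z intensity" pairs
  is_field <- grepl("=", block, fixed = TRUE)
  fields <- block[is_field]
  peaks <- block[!is_field]
  values <- sub("^[^=]*=", "", fields)
  names(values) <- toupper(sub("=.*$", "", fields))
  c(paste0("Name: ", values[["TITLE"]]),
    # PEPMASS may carry an optional intensity after the m/z; keep the m/z only
    paste0("PrecursorMZ: ", strsplit(values[["PEPMASS"]], "[[:space:]]+")[[1]][1]),
    paste0("Num Peaks: ", length(peaks)),
    peaks)
}

# Made-up MGF block
mgf_block <- c("BEGIN IONS", "TITLE=Scan 101", "PEPMASS=181.0707 12000",
               "CHARGE=1+", "91.05 100", "163.06 35", "END IONS")
msp_block <- mgf_block_to_msp(mgf_block)
```

The sketch assumes every block has TITLE and PEPMASS entries; a real converter would also need to handle missing fields and carry over additional metadata.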

`mspSplitterPosNeg`

In many instances, public .msp libraries include both positive and negative fragmentation data in one .msp file. IDSL.FSA therefore provides a module, mspSplitterPosNeg, that separates positive and negative MSP blocks for rapid and efficient annotation. The module is called as follows:

mspSplitterPosNeg(path = getwd(), MSPfile = "", number_processing_threads = 1)

path: Location of the original .msp file

MSPfile: Name of the .msp file with its extension

number_processing_threads: number of parallel processing threads compatible with the Windows and Linux environments

The separated MSP blocks are stored in the same directory with "_Pos" and "_Neg" suffixes.
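The idea of separating blocks by polarity can be sketched in a few lines of base R. The helper `split_msp_by_polarity` is hypothetical. It assumes that MSP blocks are separated by blank lines and that each block carries an `Ion_mode:` (or `Ionmode:`) field; how mspSplitterPosNeg itself detects polarity is not stated on this page.

```r
# Hypothetical sketch: split MSP text lines into positive and negative blocks
split_msp_by_polarity <- function(msp_lines) {
  # Assumption: a blank line ends one MSP block and starts the next
  is_blank <- !nzchar(trimws(msp_lines))
  block_id <- cumsum(is_blank)
  blocks <- split(msp_lines[!is_blank], block_id[!is_blank])
  # Assumption: polarity is read from an "Ion_mode:" or "Ionmode:" line
  polarity <- vapply(blocks, function(block) {
    mode_line <- grep("^ion_?mode[[:space:]]*:", block, ignore.case = TRUE, value = TRUE)
    if (length(mode_line) == 0) return(NA_character_)
    value <- toupper(trimws(sub("^[^:]*:", "", mode_line[1])))
    if (startsWith(value, "P")) "Pos" else if (startsWith(value, "N")) "Neg" else NA_character_
  }, character(1))
  list(Pos = unname(blocks[polarity %in% "Pos"]),
       Neg = unname(blocks[polarity %in% "Neg"]))
}

# Made-up two-block MSP content
msp_lines <- c(
  "Name: Compound A", "PrecursorMZ: 181.0707", "Ion_mode: Positive", "Num Peaks: 2",
  "91.05 100", "163.06 35",
  "",
  "Name: Compound B", "PrecursorMZ: 179.0561", "Ion_mode: Negative", "Num Peaks: 1",
  "89.02 100"
)
parts <- split_msp_by_polarity(msp_lines)
```

Blocks without a recognizable polarity field end up in neither list in this sketch; a production tool would have to decide how to treat them, for example by inspecting the precursor type.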

`FSdb2precursorType`

The FSdb2precursorType module detects potential ionization pathways for molecular formulas using a vector of InChIKey values from an FSDB. The function matches only the first 14 InChIKey letters and may therefore return multiple potential precursor types.

FSdb2precursorType(InChIKeyVector, libFSdb, tableIndicator = "Frequency", number_processing_threads = 1)

InChIKeyVector: A vector of InChIKey values. The vector may contain whole InChIKey strings or only their first 14 letters.

libFSdb: An FSDB library, i.e. an MSP reference library converted with the msp2FSdb module of the IDSL.FSA package.

tableIndicator: c("Frequency", "PrecursorMZ"). Whether the output shows the frequency or the median of the PrecursorMZ values for each precursor type.

number_processing_threads: number of parallel processing threads compatible with the Windows and Linux environments

The function returns a matrix of frequencies (or median PrecursorMZ values, depending on tableIndicator) for each InChIKey in the FSDB. The matrix column headers represent precursor types.
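The effect of matching on the first 14 InChIKey letters can be illustrated with base R. In the sketch below the data frame `library_entries` is a made-up stand-in for FSDB annotations (the real structure of a libFSdb object is not described on this page), and the InChIKeys are placeholders rather than real compounds.

```r
# Made-up stand-in for FSDB annotations: one row per library spectrum
library_entries <- data.frame(
  InChIKey = c("AAAAAAAAAAAAAA-UHFFFAOYSA-N", "AAAAAAAAAAAAAA-UHFFFAOYSA-N",
               "AAAAAAAAAAAAAA-BBBBBBBBSA-N", "CCCCCCCCCCCCCC-UHFFFAOYSA-N"),
  PrecursorType = c("[M+H]+", "[M+Na]+", "[M+H]+", "[M-H]-"),
  stringsAsFactors = FALSE
)

# Compare only the first 14 characters (the connectivity block of an InChIKey)
library_entries$key14 <- substr(library_entries$InChIKey, 1, 14)

# Queries may be whole InChIKeys or 14-letter prefixes; substr() handles both
query <- substr(c("AAAAAAAAAAAAAA-UHFFFAOYSA-N", "CCCCCCCCCCCCCC"), 1, 14)
hits <- library_entries[library_entries$key14 %in% query, ]

# Frequency of each precursor type per 14-letter key
freq <- table(hits$key14, hits$PrecursorType)
```

Because the third library entry shares its first 14 letters with the first query, it is counted under the same key even though the rest of its InChIKey differs, which is why one query can map to several precursor types.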

`FSA_msp2Cytoscape`

The FSA_msp2Cytoscape module performs pairwise MSP block analysis to create Cytoscape network files. This function is especially useful for finding related peaks in an analysis.

FSA_msp2Cytoscape(path = getwd(), MSPfile = "", mspVariableVector = NULL, mspNodeID = NULL,
massError = 0.01, RTtolerance = NA, minEntropySimilarity = 0.75, allowedNominalMass = FALSE,
allowedWeightedSpectralEntropy = TRUE, noiseRemovalRatio = 0.01, number_processing_threads = 1)

path: address of .msp file

MSPfile: name of .msp file

mspVariableVector: a vector of MSP variables

mspNodeID: MSP node ID, i.e. the ID required for the `specsim` ID generation

massError: Mass accuracy in Da

RTtolerance: Retention time tolerance (min) to match MSP blocks. Select NA to ignore retention time matching. This option is helpful for finding co-occurring compounds.

minEntropySimilarity: Minimum entropy similarity score

allowedNominalMass: c(TRUE, FALSE). Select TRUE only for nominal mass analysis.

allowedWeightedSpectralEntropy: c(TRUE, FALSE). Whether to use weighted entropy when measuring the entropy similarity score.

noiseRemovalRatio: Noise removal threshold, as a fraction (0-1) of the base peak intensity, applied when measuring the entropy similarity score.

number_processing_threads: number of parallel processing threads compatible with the Windows and Linux environments
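For readers unfamiliar with the score behind minEntropySimilarity, the following base-R sketch computes an entropy similarity for two spectra. It assumes the score follows the published spectral entropy similarity definition, 1 - (2 S_AB - S_A - S_B) / ln(4), with the published weighting for low-entropy spectra, and it assumes the two intensity vectors are already aligned on a common m/z axis. The helper names are hypothetical and this is not the IDSL.FSA implementation.

```r
# Shannon entropy of a normalized intensity vector (zero intensities contribute 0)
spectral_entropy <- function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
}

# Hypothetical helper: entropy similarity of two spectra aligned on a common m/z axis
entropy_similarity <- function(int_a, int_b, weighted = TRUE) {
  stopifnot(length(int_a) == length(int_b))
  normalize <- function(x) x / sum(x)
  reweight <- function(x) {
    p <- normalize(x)
    s <- spectral_entropy(p)
    # Published weighting: spectra with entropy < 3 have intensities raised to 0.25 + 0.25 * S
    if (weighted && s < 3) p <- normalize(p^(0.25 + 0.25 * s))
    p
  }
  p_a <- reweight(int_a)
  p_b <- reweight(int_b)
  p_ab <- (p_a + p_b) / 2  # 1:1 merged spectrum
  1 - (2 * spectral_entropy(p_ab) - spectral_entropy(p_a) - spectral_entropy(p_b)) / log(4)
}

# Made-up aligned intensity vectors
sim_identical <- entropy_similarity(c(100, 50, 10, 0), c(100, 50, 10, 0))
sim_disjoint  <- entropy_similarity(c(1, 0), c(0, 1))
sim_partial   <- entropy_similarity(c(100, 50, 0), c(100, 0, 50))

# Only pairs reaching the minimum score would become network edges
passes_threshold <- sim_partial >= 0.75
```

Identical spectra score 1 and spectra without shared peaks score 0, so minEntropySimilarity = 0.75 keeps only fairly similar pairs.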

`FSA_uniqueMSPblockTagger`

The FSA_uniqueMSPblockTagger module performs pairwise MSP block analysis to remove similar MSP blocks in an .msp file.

FSA_uniqueMSPblockTagger(path = getwd(), MSPfile = "", aggregateBy = "Name", massError = 0.01,
RTtolerance = NA, minEntropySimilarity = 0.75, noiseRemovalRatio = 0.01, allowedNominalMass = FALSE,
allowedWeightedSpectralEntropy = TRUE, number_processing_threads = 1)

path: address of .msp file

MSPfile: name of .msp file

aggregateBy: the MSP variable by which the MSP blocks are aggregated

massError: Mass accuracy in Da

RTtolerance: Retention time tolerance (min) to match MSP blocks. Select NA to ignore retention time matching. This option is helpful for finding co-occurring compounds.

minEntropySimilarity: Minimum entropy similarity score

noiseRemovalRatio: Noise removal threshold, as a fraction (0-1) of the base peak intensity, applied when measuring the entropy similarity score.

allowedNominalMass: c(TRUE, FALSE). Select TRUE only for nominal mass analysis.

allowedWeightedSpectralEntropy: c(TRUE, FALSE). Whether to use weighted entropy when measuring the entropy similarity score.

number_processing_threads: number of parallel processing threads compatible with the Windows and Linux environments.
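The role of RTtolerance, including the NA setting, can be sketched as a simple base-R predicate. The helper `rt_within_tolerance` is hypothetical and only illustrates the documented behavior (NA disables the retention time check); the inclusive comparison is an assumption.

```r
# Hypothetical helper: TRUE where two MSP blocks pass the retention time check
rt_within_tolerance <- function(rt_a, rt_b, RTtolerance = NA) {
  # NA means retention time is ignored, so every pair passes
  if (is.na(RTtolerance)) return(rep(TRUE, length(rt_a)))
  # Assumption: a difference exactly equal to the tolerance still matches
  abs(rt_a - rt_b) <= RTtolerance
}

# Made-up retention times in minutes
close_pair   <- rt_within_tolerance(5.10, 5.18, RTtolerance = 0.1)
distant_pair <- rt_within_tolerance(5.10, 7.40, RTtolerance = 0.1)
ignored_rt   <- rt_within_tolerance(5.10, 7.40, RTtolerance = NA)
```

With a tolerance of 0.1 min only the first pair is eligible for comparison, whereas RTtolerance = NA lets both pairs proceed to the entropy similarity test.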
