MSP files management

IDSL.FSA was designed to manage mass spectrometry files in the MSP format with different structures. IDSL.FSA provides various tools to manage msp files, a number of which are summarized below:

`mgf2msp`

The mgf2msp module converts Mascot generic format (.mgf) files into the NIST mass spectra format (.msp). The module is fast: it requires <2 sec on a single thread for mgf files with ~5,000 fragmentation blocks.

mgf2msp(path = getwd(), MGFfileName = "")

path: Location of the original mgf file

MGFfileName: Name of the mgf file with its extension

The converted files are stored in the same directory with an .msp extension.
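As a rough illustration of what such a conversion involves, the sketch below turns one MGF block into MSP-style lines in base R. It is not the mgf2msp implementation: the helper name `mgf_block_to_msp` and the field mapping (TITLE to Name, PEPMASS to PrecursorMZ) are assumptions made for this example only.

```r
# Toy converter: one MGF block (character vector of lines) -> MSP-style lines.
# Hypothetical helper for illustration; not part of IDSL.FSA.
mgf_block_to_msp <- function(mgf_lines) {
  # Drop the block delimiters used by the MGF format
  body <- mgf_lines[!mgf_lines %in% c("BEGIN IONS", "END IONS")]
  # Header lines look like KEY=VALUE; the remaining non-empty lines are peaks
  is_header <- grepl("=", body, fixed = TRUE)
  headers <- body[is_header]
  peaks <- body[!is_header & nzchar(trimws(body))]
  # Split each header at the first "=" only
  keys <- sub("=.*$", "", headers)
  values <- sub("^[^=]*=", "", headers)
  names(values) <- keys
  get_value <- function(key) if (key %in% keys) values[[key]] else NA_character_
  # PEPMASS may carry an intensity after the m/z; keep the first token
  precursor_mz <- strsplit(get_value("PEPMASS"), "[[:space:]]+")[[1]][1]
  c(
    paste0("Name: ", get_value("TITLE")),
    paste0("PrecursorMZ: ", precursor_mz),
    paste0("Num Peaks: ", length(peaks)),
    peaks,
    ""  # MSP blocks are separated by a blank line
  )
}

# Made-up MGF block
mgf <- c("BEGIN IONS", "TITLE=Scan 1", "PEPMASS=301.1410 12000", "CHARGE=1+",
         "100.0 10", "150.5 200", "END IONS")
out <- mgf_block_to_msp(mgf)
writeLines(out)
```

For real files, call mgf2msp with the folder and file name as described above.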

`mspSpiltterPosNeg`

In many instances, public msp libraries include both positive and negative fragmentation data in one msp file. Therefore, IDSL.FSA provides a module, mspSpiltterPosNeg, to separate positive and negative msp blocks for rapid and efficient annotation. The module is called as follows:

mspSpiltterPosNeg(path = getwd(), mspFileName = "", number_processing_threads = 1)

path: Location of the original msp file

mspFileName: Name of the msp file with its extension

number_processing_threads: number of parallel processing threads compatible with the Windows and Linux environments

The isolated MSP blocks are stored in the same directory with "POS_" and "NEG_" prefixes.
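The idea of separating blocks by polarity can be sketched in base R as follows. This is a simplified illustration, not the mspSpiltterPosNeg code: the helper name `split_msp_by_polarity` is hypothetical, and the sketch assumes that blocks are separated by blank lines and that each block carries an `Ion_mode:` or `Ionmode:` field whose value starts with P or N.

```r
# Hypothetical helper for illustration; not part of IDSL.FSA.
split_msp_by_polarity <- function(msp_lines) {
  # Blank lines separate MSP blocks; count them to number the blocks
  is_blank <- !nzchar(trimws(msp_lines))
  block_id <- cumsum(is_blank)
  blocks <- split(msp_lines[!is_blank], block_id[!is_blank])

  # Read the polarity from the ion mode field of one block (assumed field names)
  polarity_of <- function(block) {
    mode_line <- grep("^ion_?mode:", block, ignore.case = TRUE, value = TRUE)
    if (length(mode_line) == 0) return("UNKNOWN")
    value <- toupper(trimws(sub("^[^:]*:", "", mode_line[1])))
    if (startsWith(value, "P")) "POS" else if (startsWith(value, "N")) "NEG" else "UNKNOWN"
  }
  polarity <- vapply(blocks, polarity_of, character(1))

  list(POS = unname(blocks[polarity == "POS"]),
       NEG = unname(blocks[polarity == "NEG"]))
}

# Made-up msp content with mixed polarities
msp <- c("Name: A", "Ion_mode: P", "Num Peaks: 1", "100 10", "",
         "Name: B", "Ionmode: Negative", "Num Peaks: 1", "80 5", "",
         "Name: C", "Ion_mode: positive", "Num Peaks: 1", "60 3")
res <- split_msp_by_polarity(msp)
length(res$POS)  # number of positive blocks
length(res$NEG)  # number of negative blocks
```

Public libraries label polarity in different ways, so the field pattern in this sketch would need adjusting for a given file.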

`FSdb2precursorType`

The FSdb2precursorType module detects potential ionization pathways for molecular formulas using a vector of InChIKey values from an FSDB. The function matches only the first 14 letters of each InChIKey and may therefore return multiple potential precursor types.

FSdb2precursorType(InChIKeyVector, libFSdb, tableIndicator = "Frequency", number_processing_threads = 1)

InChIKeyVector: A vector of InChIKey values. The values may be whole InChIKey strings or only the first 14 InChIKey letters.

libFSdb: An FSDB reference library, produced by converting an MSP library file with the msp2FSdb module of the IDSL.FSA package.

tableIndicator: c("Frequency", "PrecursorMZ"). Selects whether the output shows the frequency or the median of the PrecursorMZ values for each precursor type.

number_processing_threads: number of parallel processing threads compatible with the Windows and Linux environments

The output is a matrix of frequencies (or median PrecursorMZ values, depending on tableIndicator) for each InChIKey in the FSDB. The matrix column headers represent precursor types.
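To make the 14-letter matching concrete, here is a small base R sketch that counts precursor types per InChIKey first block. It does not use an FSDB object or the FSdb2precursorType code: the data frame `library_entries`, its column names, and the InChIKey strings are made up for illustration.

```r
# Made-up library table; column names are assumptions for this sketch
library_entries <- data.frame(
  InChIKey = c("AAAAAAAAAAAAAA-BBBBBBBBSA-N",
               "AAAAAAAAAAAAAA-CCCCCCCCSA-N",
               "AAAAAAAAAAAAAA-BBBBBBBBSA-N",
               "DDDDDDDDDDDDDD-UHFFFAOYSA-N"),
  PrecursorType = c("[M+H]+", "[M+Na]+", "[M+H]+", "[M-H]-"),
  stringsAsFactors = FALSE
)

# Queries may be full InChIKeys or only the first 14 letters
query <- c("AAAAAAAAAAAAAA-BBBBBBBBSA-N", "DDDDDDDDDDDDDD")

# Only the first 14 letters take part in the match
first_block <- function(x) substr(x, 1, 14)
hits <- library_entries[first_block(library_entries$InChIKey) %in% first_block(query), ]

# Rows: InChIKey first blocks; columns: precursor types; cells: counts
freq <- table(first_block(hits$InChIKey), hits$PrecursorType)
print(freq)
```

Because only the first block is compared, the two library entries that share it but differ in the second block are counted together, which is why one query can yield several precursor types.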

`FSA_msp2Cytoscape`

The FSA_msp2Cytoscape module performs pairwise analysis of MSP blocks to create Cytoscape network files. This function is especially useful for finding related peaks in an analysis.

FSA_msp2Cytoscape(path, MSPfile, mspVariableVector = NULL, mspNodeID = NULL,
massError = 0.01, RTtolerance = NA, minEntropySimilarity = 0.75, noiseRemovalRatio = 0.01,
allowedNominalMass = FALSE, allowedWeightedSpectralEntropy = TRUE, number_processing_threads = 1)

path: address of msp file

MSPfile: name of msp file

mspVariableVector: a vector of msp variables

mspNodeID: msp node ID, which is the ID required for the `specsim` ID generation

massError: Mass accuracy in Da

RTtolerance: Retention time tolerance (min) to match msp blocks. Select NA to ignore retention time matching. This option is helpful for finding co-occurring compounds.

minEntropySimilarity: Minimum entropy similarity score

noiseRemovalRatio: Noise removal level relative to the base peak for measuring the entropy similarity score (in percent)

allowedNominalMass: c(TRUE, FALSE). Select TRUE only for nominal mass analysis.

allowedWeightedSpectralEntropy: c(TRUE, FALSE). Weighted entropy to measure entropy similarity score.

number_processing_threads: number of parallel processing threads compatible with the Windows and Linux environments
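For readers unfamiliar with the score behind minEntropySimilarity, the sketch below computes an unweighted spectral entropy similarity between two spectra in base R. It is a minimal illustration under stated assumptions, not the IDSL.FSA implementation: the function names are hypothetical, spectra are data frames with `mz` and `intensity` columns, peaks within one spectrum are assumed to be further apart than twice the mass error, and noise removal and entropy weighting are left out.

```r
# Shannon entropy of a normalized intensity vector
spectral_entropy <- function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
}

# Hypothetical helper: unweighted entropy similarity of two spectra
entropy_similarity <- function(spec_a, spec_b, mass_error = 0.01) {
  # Normalize intensities so that each spectrum sums to 1
  p_a <- spec_a$intensity / sum(spec_a$intensity)
  p_b <- spec_b$intensity / sum(spec_b$intensity)
  # For each peak in A: index of the closest peak in B within mass_error, else NA
  match_b <- vapply(spec_a$mz, function(mz) {
    d <- abs(spec_b$mz - mz)
    if (min(d) <= mass_error) which.min(d) else NA_integer_
  }, integer(1))
  matched <- !is.na(match_b)
  # 1:1 mixture of both spectra; matched peaks are merged into one peak
  merged <- p_a / 2
  merged[matched] <- merged[matched] + p_b[match_b[matched]] / 2
  unmatched_b <- setdiff(seq_along(p_b), match_b[matched])
  merged <- c(merged, p_b[unmatched_b] / 2)
  # 1 for identical spectra, 0 for spectra with no shared peaks
  1 - (2 * spectral_entropy(merged) - spectral_entropy(p_a) - spectral_entropy(p_b)) / log(4)
}

# Made-up spectra
spec_a <- data.frame(mz = c(100.000, 150.050, 200.100), intensity = c(10, 50, 100))
spec_b <- data.frame(mz = c(100.005, 150.055, 200.105), intensity = c(10, 50, 100))
spec_c <- data.frame(mz = c(300, 350), intensity = c(1, 1))

sim_ab <- entropy_similarity(spec_a, spec_b)  # same peaks within the mass error
sim_ac <- entropy_similarity(spec_a, spec_c)  # no shared peaks
sim_ab >= 0.75  # would this pair pass a 0.75 threshold?
```

In a pairwise analysis, only pairs that reach the threshold would be linked, which is how a minimum score such as 0.75 controls the density of the resulting network.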

`FSA_uniqueMSPblockTagger`

The FSA_uniqueMSPblockTagger module performs pairwise analysis of MSP blocks to tag unique blocks in an MSP file. Blocks are aggregated by the variable given in aggregateBy and compared using the entropy similarity settings below.

FSA_uniqueMSPblockTagger(path, MSPfile, aggregateBy = "Name", massError = 0.01, RTtolerance = NA,
minEntropySimilarity = 0.75, noiseRemovalRatio = 0.01, allowedNominalMass = FALSE,
allowedWeightedSpectralEntropy = TRUE, number_processing_threads = 1)

path: address of msp file

MSPfile: name of msp file

aggregateBy: the msp variable by which the MSP blocks are aggregated

massError: Mass accuracy in Da

RTtolerance: Retention time tolerance (min) to match msp blocks. Select NA to ignore retention time matching. This option is helpful for finding co-occurring compounds.

minEntropySimilarity: Minimum entropy similarity score

noiseRemovalRatio: Noise removal level relative to the base peak for measuring the entropy similarity score (in percent)

allowedNominalMass: c(TRUE, FALSE). Select TRUE only for nominal mass analysis.

allowedWeightedSpectralEntropy: c(TRUE, FALSE). Weighted entropy to measure entropy similarity score.

number_processing_threads: number of parallel processing threads compatible with the Windows and Linux environments
