CycloBranch
Dereplication is the annotation of already known compounds (i.e. a database search). To perform dereplication, click on Search->Settings in the main window of the application and configure the tool.
The most important parameters are:
Mode - select Compare Peaklist(s) with Database - MS, LC-MS, MSI.
File - select an input file with mass spectra. The list of supported file formats is available here (you may need third-party software for some file formats). Enable the checkbox if you would like to extract profile mass spectra from the file for later visualization (note: the input file must include profile spectra, the extraction may be time-consuming, and the visualization must be enabled in the spectrum detail window).
Charge - the maximum charge of generated theoretical peaks. If the value is equal to 2, then the theoretical peaks of singly and doubly charged ions are generated. Use -1 for negative mode.
FWHM - inspect the profile peaks in your spectra and estimate the full width at half maximum (i.e. the width of a representative peak at half of its height; the value corresponds to the m/z ratio of a peak divided by its resolution). The theoretical peaks are split or merged using the FWHM value before they are matched with experimental peaks.
Sequence/Compound Database File - select a text file serving as a database. A suitable text file can be prepared using the Sequence/Compound Database Editor or created from a template in Excel. Several in-house prepared databases are also available. In this mode, it is sufficient to define only a name and a (neutral) molecular formula for every compound. The type of compound can be defined as 'other', and the monoisotopic mass is calculated from the molecular formula automatically. Each row of the text file contains items separated by tabs, so the file can also be easily prepared using a script (please remember to keep the correct number of columns/tabs, otherwise the search engine cannot read the file).
Ion Types - select the types of ions to be searched. Remember that for negative ions (e.g. [M-H]-), the value of charge must be negative (i.e. -1).
Generate Full Isotope Patterns - enable the checkbox to generate the full isotopic patterns of compounds (the parameter FWHM must be estimated correctly). Optionally, you can disable this feature to generate only the theoretical peaks of monoisotopic ions. It may be advantageous for a fast check of a dataset.
Minimum Number of Isotopic Peaks - define the minimum number of peaks which must be matched in an isotopic pattern to report a given compound. The feature Generate Full Isotope Patterns must be enabled.
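The relationship between a neutral molecular formula, the Charge parameter, and FWHM can be illustrated with a short Python sketch. This is not CycloBranch code: the function names, the small table of element masses, and the resolution of 100,000 are assumptions chosen only for illustration, and only protonated or deprotonated ions are considered. The formula C14H16N2O3S2 (pyochelin) is the one used in the MS/MS example of this document.

```python
import re

# Monoisotopic masses (Da) of the lightest stable isotopes; a small illustrative table.
MONO = {"H": 1.00782503207, "C": 12.0, "N": 14.0030740048,
        "O": 15.99491461956, "P": 30.97376163, "S": 31.97207100}
PROTON = 1.00727646688  # mass of a proton (Da)

def parse_formula(formula):
    """Turn 'C14H16N2O3S2' into {'C': 14, 'H': 16, ...} (brackets are not supported)."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts

def monoisotopic_mass(formula):
    """Monoisotopic mass of the neutral compound."""
    return sum(MONO[el] * n for el, n in parse_formula(formula).items())

def protonated_mz(formula, charge):
    """m/z of [M+zH]z+ for charge > 0, or of [M-zH]z- for charge < 0."""
    z = abs(charge)
    sign = 1 if charge > 0 else -1
    return (monoisotopic_mass(formula) + sign * z * PROTON) / z

def fwhm_from_resolution(mz, resolution):
    """FWHM estimate: the m/z ratio of a peak divided by its resolution."""
    return mz / resolution

neutral = monoisotopic_mass("C14H16N2O3S2")   # neutral pyochelin
mz_pos = protonated_mz("C14H16N2O3S2", 1)     # [M+H]+
mz_neg = protonated_mz("C14H16N2O3S2", -1)    # [M-H]-, i.e. Charge = -1
fwhm = fwhm_from_resolution(mz_pos, 100_000)  # assumed resolution
print(f"{neutral:.4f} {mz_pos:.4f} {mz_neg:.4f} {fwhm:.5f}")
```

In practice, the FWHM should be estimated from the profile peaks of your own data rather than from a nominal instrument resolution.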
We recommend saving the configuration to an '.ini' file.
Click OK. In the main window select Search->Run to process the dataset.
Double-click on the row to display the details. Click on the highlighted icon to display the profile spectrum (if the option was enabled in the settings dialog).
Follow the instructions in Tutorial 1 and use the following extra parameter:
Minimum Number of Spectra - the minimum number of consecutive scans in which a compound must be matched to be reported.
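The effect of Minimum Number of Spectra can be sketched as follows. This is only an illustration under stated assumptions: scans are identified by consecutive integer indices, the dictionary of matched scans is invented example data, and the helper names are hypothetical. The sketch does not claim to reproduce how CycloBranch counts scans internally.

```python
def longest_consecutive_run(scan_numbers):
    """Length of the longest run of consecutive integers in scan_numbers."""
    best = run = 0
    previous = None
    for scan in sorted(set(scan_numbers)):
        # Extend the current run if this scan directly follows the previous one.
        run = run + 1 if previous is not None and scan == previous + 1 else 1
        best = max(best, run)
        previous = scan
    return best

def passes_min_spectra(scan_numbers, minimum):
    """True if the compound was matched in at least `minimum` consecutive scans."""
    return longest_consecutive_run(scan_numbers) >= minimum

# Invented example: scan indices in which each compound was matched.
matched = {"compound A": [10, 11, 12, 13, 40], "compound B": [5, 9, 17]}
reported = [name for name, scans in matched.items() if passes_min_spectra(scans, 3)]
print(reported)
```

A compound matched only in isolated scans (like "compound B" here) is more likely to be a random match than a real chromatographic peak, which is why it is filtered out.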
All MS1 scans included in the input file are listed in the output report. Double-click on any row to display the spectrum from the corresponding scan.
In the main window, click on Tools->Chromatogram to see the chromatographic peaks of detected compounds and their retention times. Double-click on any chromatographic peak to open the corresponding mass spectrum.
In the main window, click on Tools->Summary Table of Matched Peaks to see the list of all matched peaks in all the scans. Double-click on any row to open the corresponding mass spectrum. The table can be filtered by any column, e.g. Name or Theoretical m/z. Start typing into the text field and the names of identified compounds are suggested automatically. Use the arrows to browse the identified compounds individually.
The table is interactively connected with the chromatogram: whatever is shown in the table is automatically sent to the Chromatogram window, so you can easily browse the extracted ion chromatograms of all identified compounds.
See Tutorial 1 and Tutorial 2. In this example, no compound database is used; instead, the molecular formulas of compounds are generated from an input list of chemical elements, e.g. H, C, O, N, S, and P. Because the search space grows exponentially with increasing m/z value and with the number of input elements, carefully configure the following parameters to limit the number of false-positive formulas reported, to speed up the search, and to save the computer's main memory:
Mode - select Compound Search - MS, LC-MS, MSI.
m/z Error Tolerance - define the m/z error tolerance, e.g. 2 ppm. You can use a larger value, e.g. 5 or 10 ppm, but in this case also define the parameter Isotope m/z Tolerance.
Minimum Threshold of Relative Intensity [%] - use e.g. 3 or 5% for an initial check of a dataset. Check the results and optionally reduce the value to 1% or lower. Use 0 to disable this feature. The value is used together with Minimum Threshold of Absolute Intensity.
Minimum Threshold of Absolute Intensity - use a larger value (e.g. 100000) for an initial check of a dataset and then optionally reduce it. Use 0 to disable this feature. The value is used together with Minimum Threshold of Relative Intensity.
m/z Ratio - define the range of m/z values in which the compounds should be detected. If a narrow interval is used (e.g. 100 or 200 Da), the search is faster. If a wide interval is used (e.g. 1000 Da), the search is slower.
FWHM - inspect the profile peaks in your spectra and estimate the full width at half maximum (i.e. the width of a representative peak at half of its height; the value corresponds to the m/z ratio of a peak divided by its resolution). The theoretical peaks are split or merged using the FWHM value before they are matched with experimental peaks.
Ion Types - select the types of ions to be searched. Remember that for negative ions (e.g. [M-H]-), the value of charge must be negative (i.e. -1). For an initial check of a dataset, use only one or a very small number of ions.
Neutral Losses / Chemical Elements - define an input list of chemical elements. Although any custom element can be added, use as few elements as possible. Use the colon to limit the maximum number of occurrences of elements like sulfur or phosphorus (e.g. S:2 or P:1; double-click on an element to edit this value). If ions like [M+Na]+, [M+K]+ and [M+Fe-2H]+ are searched, do not add the elements Na, K, and Fe, respectively; they are used automatically. Use the HCON button to quickly define the basic elements H, C, O, N, S, and P.
Maximum Number of Combined Losses / Elements - define the maximum number of elements (i.e. atoms) in a molecule (e.g. 150 or 200). For example, the ion [C39H60N6O15+H]+ has 39+60+6+15+1=121 elements.
Generate Full Isotope Patterns - enable the checkbox to generate the full isotopic patterns of compounds (the parameter FWHM must be estimated correctly).
Minimum Number of Isotopic Peaks - define the minimum number of peaks which must be matched in an isotopic pattern to report a given compound. The feature Generate Full Isotope Patterns must be enabled. Use e.g. 2 if a large value of Minimum Threshold of Relative Intensity [%] is used.
Minimum Number of Spectra - the minimum number of consecutive scans in which a compound must be matched to be reported.
Basic Formula Check - apply Senior's filtering rules to remove unlikely molecular formulas. See https://doi.org/10.1186/1471-2105-8-105 for more details.
Advanced Formula Check - advanced filtering rules are applied if molecular formulas of compounds are generated. See Tables 1, 2, and 3 in https://doi.org/10.1186/1471-2105-8-105 for more details.
N/O Ratio Check - check if the number of nitrogen atoms is less than or equal to the number of oxygen atoms.
Isotope m/z Tolerance - the maximum m/z error of difference between an isotopic peak and the most intense peak in an experimental and theoretical isotopic pattern (0 = disabled); see the definition of Isotope m/z Tolerance.
Isotope Intensity Tolerance - the maximum error tolerance of intensities of matched isotopes (e.g. 10%); see the definition of Isotope Intensity Tolerance.
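To see why the m/z error tolerance and the list of elements matter so much, consider the following brute-force Python sketch. It is not the algorithm used by CycloBranch; it is a naive enumeration under stated assumptions (only [M+H]+ ions, a hand-picked table of element masses, hypothetical per-element limits, no chemical filtering) that simply counts how many element combinations fall within a given ppm window around one measured m/z value.

```python
from itertools import product

# Monoisotopic masses (Da); illustrative table only.
MONO = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048,
        "O": 15.99491461956, "S": 31.97207100}
PROTON = 1.00727646688  # mass of a proton (Da)

def candidate_formulas(target_mz, ppm, limits, max_atoms=200):
    """Brute-force all [M+H]+ formulas whose m/z lies within `ppm` of target_mz.

    `limits` maps each element to its maximum allowed count (cf. S:2 or P:1).
    """
    elements = list(limits)
    tolerance = target_mz * ppm / 1e6  # absolute m/z tolerance
    hits = []
    for counts in product(*(range(limits[e] + 1) for e in elements)):
        if sum(counts) + 1 > max_atoms:  # +1 for the hydrogen of the adduct
            continue
        mz = sum(MONO[e] * n for e, n in zip(elements, counts)) + PROTON
        if abs(mz - target_mz) <= tolerance:
            hits.append(dict(zip(elements, counts)))
    return hits

# Hypothetical limits; every additional element multiplies the search space.
limits = {"C": 25, "H": 50, "N": 5, "O": 8, "S": 2}
hits_2ppm = candidate_formulas(325.06733, 2, limits)
hits_10ppm = candidate_formulas(325.06733, 10, limits)
print(len(hits_2ppm), len(hits_10ppm))
```

Many of the candidates returned by such an enumeration are chemically implausible, which is the motivation for the formula checks and the isotope tolerances listed above.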
As in Tutorial 2, the compounds can be browsed in the Summary Table of Matched Peaks while their extracted ion chromatograms are shown directly in the Chromatogram window. The difference is that the formulas of compounds are suggested automatically instead of the names of compounds.
Follow the instructions in Tutorial 1 and Tutorial 2. Modify the following parameters:
File - select an input imzML file.
Minimum Number of Spectra - the minimum number of single-pixel mass spectra in which a compound must be matched to be reported.
Every row in the output report includes coordinates [x, y] and corresponds to a single-pixel mass spectrum. Double-click on any row to display the spectrum from the corresponding pixel.
In the main window, click on Tools->Image Fusion. In the Image Fusion tool, click on File->Open, select Optical Image, and click Ok. Then select the image file which was used during the acquisition of your data. The optical image is automatically correlated with pixels (or squares) corresponding to the individual mass spectra. In some cases, the correlation of data may be vendor-specific; see the details here.
By default, the color of every square represents the sum of absolute/relative intensities of all matched peaks in the corresponding mass spectrum. However, the Image Fusion tool is interactively connected with the Summary Table of Matched Peaks. To display a single compound or to browse the spatial distributions of all identified compounds, open Tools->Summary Table of Matched Peaks and filter the list of all matched peaks by the name of a compound or by the theoretical m/z ratio. Start typing into the text field and the names of identified compounds are suggested automatically. Use the arrows to browse the identified compounds individually.
If you would like to make a fusion with another image (e.g. a histology image), click on File->Open, select Histology Image, and click Ok. Then select an image in *.jpg, *.jpeg, *.png, *.tif, *.tiff, *.bmp or *.gif file format. Click on View->Show Selection to see a bounding box around the opened histology image. Make sure that the Histology Image layer is selected in the list of layers. The histology image must be correlated with the optical image manually. To make the fusion, move the histology image with the mouse so that it overlaps the analyzed sample on the optical image. If needed, adjust the size and rotation angle of the histology image on the corresponding toolbar (the active toolbar is green). Optionally, adjust the transparency and order of any layer on the right panel. See more details here.
In this example, no compound database is used; instead, the molecular formulas of compounds are generated from an input list of chemical elements, e.g. H, C, O, N, S, and P. Because the search space grows exponentially with increasing m/z value and with the number of input elements, carefully configure the following parameters to limit the number of false-positive formulas reported, to speed up the search, and to save the computer's main memory:
Mode - select Compound Search - MS, LC-MS, MSI.
File - select an input imzML file. The file commonly contains profile mass spectra. In this case, make sure that you have installed OpenMS. Enable the checkbox if you would like to visualize the profile mass spectra when the data processing is completed (note: the input file must include profile spectra, the extraction may be time-consuming, and the visualization must be enabled in the spectrum detail window).
Charge - the maximum charge of generated theoretical peaks. If the value is equal to 2, then the theoretical peaks of singly and doubly charged ions are generated. Use -1 for negative mode.
m/z Error Tolerance - define the m/z error tolerance, e.g. 2 ppm. You can use a larger value, e.g. 5 or 10 ppm, but in this case also define the parameter Isotope m/z Tolerance.
Minimum Threshold of Relative Intensity [%] - use e.g. 3 or 5% for an initial check of a dataset. Check the results and optionally reduce the value to 1% or lower. Use 0 to disable this feature. The value is used together with Minimum Threshold of Absolute Intensity.
Minimum Threshold of Absolute Intensity - use a larger value (e.g. 100000) for an initial check of a dataset and then optionally reduce it. Use 0 to disable this feature. The value is used together with Minimum Threshold of Relative Intensity.
m/z Ratio - define the range of m/z values in which the compounds should be detected. If a narrow interval is used (e.g. 100 or 200 Da), the search is faster. If a wide interval is used (e.g. 1000 Da), the search is slower.
FWHM - inspect the profile peaks in your spectra and estimate the full width at half maximum (i.e. the width of a representative peak at half of its height; the value corresponds to the m/z ratio of a peak divided by its resolution). The theoretical peaks are split or merged using the FWHM value before they are matched with experimental peaks.
Ion Types - select the types of ions to be searched. Remember that for negative ions (e.g. [M-H]-), the value of charge must be negative (i.e. -1). For an initial check of a dataset, use only one or a very small number of ions.
Neutral Losses / Chemical Elements - define an input list of chemical elements. Although any custom element can be added, use as few elements as possible. Use the colon to limit the maximum number of occurrences of elements like sulfur or phosphorus (e.g. S:2 or P:1; double-click on an element to edit this value). If ions like [M+Na]+, [M+K]+ and [M+Fe-2H]+ are searched, do not add the elements Na, K, and Fe, respectively; they are used automatically. Use the HCON button to quickly define the basic elements H, C, O, N, S, and P.
Maximum Number of Combined Losses / Elements - define the maximum number of elements (i.e. atoms) in a molecule (e.g. 150 or 200). For example, the ion [C39H60N6O15+H]+ has 39+60+6+15+1=121 elements.
Generate Full Isotope Patterns - enable the checkbox to generate the full isotopic patterns of compounds (the parameter FWHM must be estimated correctly).
Minimum Number of Isotopic Peaks - define the minimum number of peaks which must be matched in an isotopic pattern to report a given compound. The feature Generate Full Isotope Patterns must be enabled. Use e.g. 2 if a large value of Minimum Threshold of Relative Intensity [%] is used.
Minimum Number of Spectra - the minimum number of pixels in which a compound must be matched to be reported.
Basic Formula Check - apply Senior's filtering rules to remove unlikely molecular formulas. See https://doi.org/10.1186/1471-2105-8-105 for more details.
Advanced Formula Check - advanced filtering rules are applied if molecular formulas of compounds are generated. See Tables 1, 2, and 3 in https://doi.org/10.1186/1471-2105-8-105 for more details.
N/O Ratio Check - check if the number of nitrogen atoms is less than or equal to the number of oxygen atoms.
Isotope m/z Tolerance - the maximum m/z error of difference between an isotopic peak and the most intense peak in an experimental and theoretical isotopic pattern (0 = disabled); see the definition of Isotope m/z Tolerance.
Isotope Intensity Tolerance - the maximum error tolerance of intensities of matched isotopes (e.g. 10%); see the definition of Isotope Intensity Tolerance.
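The element count used by Maximum Number of Combined Losses / Elements and the two simplest formula filters can be sketched in Python as follows. This is a rough illustration, not the CycloBranch implementation: the valence table (one fixed valence per element) is an assumption, the function names are hypothetical, and the Senior check follows the three conditions as summarized in the paper cited above (even sum of valences; sum of valences at least twice the maximum valence; sum of valences at least twice the number of atoms minus one).

```python
# Assumed valences, one fixed value per element (a simplification).
VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2, "S": 2, "P": 3}

def atom_count(counts, adduct_atoms=0):
    """Number of atoms in a formula, optionally including adduct atoms."""
    return sum(counts.values()) + adduct_atoms

def senior_check(counts):
    """Simplified Senior's rules for a neutral formula given as {element: count}."""
    valences = [VALENCE[el] for el, n in counts.items() for _ in range(n)]
    if not valences:
        return False
    total = sum(valences)
    return (total % 2 == 0                          # sum of valences is even
            and total >= 2 * max(valences)          # at least twice the maximum valence
            and total >= 2 * (len(valences) - 1))   # enough bonds to connect all atoms

def n_o_ratio_check(counts):
    """True if the number of N atoms is less than or equal to the number of O atoms."""
    return counts.get("N", 0) <= counts.get("O", 0)

example = {"C": 39, "H": 60, "N": 6, "O": 15}
print(atom_count(example, adduct_atoms=1))   # counts the H of [M+H]+ as well
print(senior_check(example), n_o_ratio_check(example))
print(senior_check({"C": 1, "H": 5}))        # odd sum of valences
print(n_o_ratio_check({"C": 2, "H": 7, "N": 1}))  # more N than O
```

Note that the N/O Ratio Check rejects legitimate compounds such as simple amines, so it is a heuristic to be enabled only when it suits the expected class of compounds.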
The output report and the visualization of results are described in Tutorial 4. The only difference is that the molecular formulas of compounds are suggested in the Summary Table of Matched Peaks instead of names of compounds.
The following tutorial shows how to perform the annotation of peaks corresponding to fragment ions of a small molecule in a single MS/MS spectrum. If you would like to annotate single MS/MS spectra of (non-)ribosomal peptides and siderophores, see the tutorials for CycloBranch (1.x). See also the extra pages for linear, cyclic, branched, and branch-cyclic peptides; the building blocks editor, sequence/compound database editor, modifications editor, and draw peptide tool.
In this example, we annotate the MS/MS spectrum of the [M+H]+ ion of pyochelin with m/z 325.06733.
The following parameters in the Settings are used:
Mode - select Compare Peaklist(s) with Spectrum of Searched Sequence - MS/MS.
Peptide Type - select Other.
File - select an input file with an MS/MS spectrum. Because the mode Compare Peaklist(s) with Spectrum of Searched Sequence - MS/MS is used, the file can also include multiple MS/MS spectra. In such a case, all the spectra in the file are compared with the same theoretical spectrum and listed in the output report. (Note: This feature is not supported if the modes De Novo Search Engine - MS/MS and Compare Peaklist with Database - MS/MS are used. These modes require only one MS/MS spectrum. If a file contains multiple spectra, the spectrum can be selected using the parameter Scan no., which determines the position of the MS/MS spectrum in the file.)
Precursor m/z Ratio - define the m/z ratio of the precursor ion. The value is not read automatically from a file.
Precursor Ion Adduct - leave empty for the [M+H]+ ion. For [M+Na]+, [M+K]+, and [M+Fe-2H]+ ions, use the values Na, K, and FeH-2, respectively. See more details here.
Charge - the maximum charge of generated theoretical peaks. If the value is equal to 2, then the theoretical peaks of singly and doubly charged ions are generated. Use -1 for negative mode.
Precursor m/z Error Tolerance - define m/z error tolerance of precursor ion (the value can be the same as m/z Error Tolerance).
m/z Error Tolerance - define m/z error tolerance (e.g. 2 ppm).
Minimum Threshold of Relative Intensity [%] - use e.g. 3 or 5% for an initial check of a spectrum. Check the results and optionally reduce to 1% or lower value. Use 0 to disable this feature. The value is used together with Minimum Threshold of Absolute Intensity.
Minimum Threshold of Absolute Intensity - in this mode, you can use 0 to disable this feature. The value is used together with Minimum Threshold of Relative Intensity.
Minimum m/z Ratio - define the minimum m/z ratio. In this mode, it is advantageous to keep the window between the singly charged precursor ion and the minimum m/z ratio as small as possible to speed up the search. In our example, the minimum m/z ratio is 100, so the window is small (325.06733 - 100 = 225.06733), and thus the number of generated combinations of elements which may drop out from the precursor ion is also small. See the parameter Neutral Losses / Chemical Elements.
FWHM - inspect the profile peaks in your spectra and estimate the full width at half maximum (i.e. the width of a representative peak at half of its height; the value corresponds to the m/z ratio of a peak divided by its resolution). The theoretical peaks are split or merged using the FWHM value before they are matched with experimental peaks. The value is used only if the parameter Generate Full Isotope Patterns is enabled.
Neutral Losses / Chemical Elements - Define an input list of chemical elements which can drop out from the precursor ion. Use the colon to limit the maximum number of occurrences of elements like sulfur or phosphorus (e.g. S:2 or P:1; double-click on an element to edit this value). Use the HCON button to quickly define the basic elements H, C, O, N, S, and P.
Maximum Number of Combined Losses / Elements - define the maximum number of combined elements. In our example, we fragment the ion [C14H16N2O3S2+H]+, so there is no need to use a value larger than 14+16+2+3+2+1=38.
Generate Full Isotope Patterns - enable the checkbox to generate the full isotopic patterns of fragment ions (the parameter FWHM must be estimated correctly). In this mode, enable the feature only if the isolation window was wide enough and you expect the presence of isotopic peaks of fragment ions in the MS/MS spectrum. You can disable this feature to generate only the theoretical peaks of monoisotopic ions and to speed up the annotation.
Minimum Number of Isotopic Peaks - define the minimum number of peaks which must be matched in an isotopic pattern to report a match of a fragment ion. The feature Generate Full Isotope Patterns must be enabled.
Basic Formula Check - Senior's filtering rules are applied to the combinations of elements which may drop out from the precursor ion. See https://doi.org/10.1186/1471-2105-8-105 for more details.
Formula - define the formula corresponding to the precursor ion (neutral state; no adducts).
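The influence of Minimum m/z Ratio on the number of generated element combinations can be illustrated with a naive Python sketch. It is not the CycloBranch fragment generator: it assumes an [M+H]+ precursor, considers only losses of atoms from the neutral formula, applies no chemical filtering, and uses hypothetical function names and a hand-picked mass table.

```python
from itertools import product

# Monoisotopic masses (Da); illustrative table only.
MONO = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048,
        "O": 15.99491461956, "S": 31.97207100}
PROTON = 1.00727646688  # mass of a proton (Da)

def fragment_candidates(precursor, min_mz, max_lost_atoms):
    """Enumerate element combinations that can drop out from the [M+H]+ precursor
    while the remaining fragment stays at or above min_mz."""
    elements = list(precursor)
    precursor_mz = sum(MONO[e] * n for e, n in precursor.items()) + PROTON
    results = []
    for lost in product(*(range(precursor[e] + 1) for e in elements)):
        if not 0 < sum(lost) <= max_lost_atoms:
            continue  # skip the empty loss and losses with too many atoms
        loss_mass = sum(MONO[e] * n for e, n in zip(elements, lost))
        fragment_mz = precursor_mz - loss_mass
        if fragment_mz >= min_mz:
            results.append((dict(zip(elements, lost)), fragment_mz))
    return results

pyochelin = {"C": 14, "H": 16, "N": 2, "O": 3, "S": 2}  # neutral formula
narrow = fragment_candidates(pyochelin, min_mz=100, max_lost_atoms=38)
wide = fragment_candidates(pyochelin, min_mz=50, max_lost_atoms=38)
print(len(narrow), len(wide))

# Example: the fragment left after a loss of water (H2O).
water = {"C": 0, "H": 2, "N": 0, "O": 1, "S": 0}
water_fragment_mz = next(mz for lost, mz in narrow if lost == water)
print(f"{water_fragment_mz:.4f}")
```

Lowering the minimum m/z ratio widens the window below the precursor and therefore increases the number of combinations that have to be generated and matched.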
Click OK. In the main window select Search->Run to process the dataset. Double-click on the row in the output report to see the details of identification.
In the detail window you can see the annotations of fragment ions, visualize the profile spectra (if they were included in the input file), etc.