CycloBranch
Settings

Settings - Dereplication (MS)

The settings dialog is opened using the command "Search -> Settings...". Use F1 to open this help.

The application works in five different modes. This page summarizes the parameters used for dereplication. The differences in the configuration for other modes are described on extra pages, see the parameter Mode.

See also the following tutorials: Tutorial 1: Dereplication in MS1 data, Tutorial 2: Dereplication in LC-MS data, Tutorial 4: Dereplication in MSI data and fusion with histology image, and the page Dereplication.

Settings dialog - dereplication (MS).

Basic Buttons

  • OK (Enter) - Accept changes and hide window.
  • Cancel (Esc) - Drop changes and hide window.
  • Apply - Accept changes and keep window opened.
  • Load (CTRL + L) - Load settings from a file (*.ini).
  • Save (CTRL + S) - Save settings in the current file (*.ini). When a file has not been loaded yet, the "Save As ..." file dialog is opened.
  • Save As... (CTRL + D) - Save settings into a file (*.ini).

Search

Mode

Maximum Number of Threads

The maximum number of threads used during a search process.


Experimental Spectrum/Spectra

File (CTRL + P)

Select an input file with experimental spectra. Supported file formats:

File format | Supported types of spectra (data processing) | Supported types of spectra (visualization) | Description | Notes
txt | centroid | centroid | plain text file | Mass-to-charge ratio and intensity on each line separated by a tab character; multiple peaklists must be separated by an empty line.
mgf | centroid | centroid | Mascot Generic File format | -
mzML | profile and centroid | profile and centroid | standard data format | m/z values and intensities must be stored as 64-bit/32-bit floats; no compression and zlib compression are supported. OpenMS 2.x must be installed.
mzXML | centroid | centroid | standard data format | OpenMS 2.x must be installed; the FileConverter tool is used to automatically convert this file format into mzML.
imzML | profile and centroid | profile and centroid | standard data format | Processed or continuous data file format; m/z values and intensities must be stored as 64-bit/32-bit floats; no compression. OpenMS 2.x must be installed.
baf | profile and centroid | profile and centroid | Bruker's Analysis File | CompassXport 3.0.13.1 or CompassXtract 3.2.201 (64-bit) must be installed; see details here; Windows only.
raw | profile and centroid | profile and centroid | Thermo raw file | OpenMS 2.x including ProteoWizard must be installed; Windows only.
raw | profile and centroid | profile and centroid | Waters raw directory | Select a *.dat file in the raw directory; Windows only.
mis (deprecated) | profile and centroid | centroid | Bruker's flexImaging File (old data format) | CompassXport 3.0 must be installed; Windows only. If the file "[name].mis" is selected, all the "analysis.baf" files in the subfolders of "[name]" directory are searched and processed.
ser (deprecated) | profile and centroid | centroid | Apex file format | CompassXport 3.0 must be installed; Windows only.
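
The txt layout described in the table (one m/z and intensity pair per line, separated by a tab, with an empty line between peaklists) can be illustrated with a short Python sketch. The function name parse_peaklists is hypothetical and is not part of CycloBranch; the sketch only shows how such a file could be read or checked before it is loaded.

```python
def parse_peaklists(text):
    """Split plain-text peaklist data into a list of peaklists.

    Assumed layout (from the table above): each non-empty line holds
    'm/z<TAB>intensity'; an empty line starts a new peaklist.
    """
    peaklists, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # An empty line closes the current peaklist (if any).
            if current:
                peaklists.append(current)
                current = []
            continue
        mz, intensity = line.split("\t")[:2]
        current.append((float(mz), float(intensity)))
    if current:
        peaklists.append(current)
    return peaklists


example = "100.5\t10\n200.25\t50\n\n300.75\t5\n"
print(parse_peaklists(example))
```

The example input contains two peaklists, the first with two peaks and the second with one.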

Multiple files can be selected in the modes 'Compare Peaklist(s) with Database - MS, LC-MS, MSI' and 'Compound Search - MS, LC-MS, MSI'. The feature is not available for imzML file format. If a directory name is specified, it is searched recursively and all found files are processed - the feature is currently available for baf files (Bruker) and raw directories (Waters) only.

The visualization of profile mass spectra is enabled if the checkbox next to the file name is checked. This feature is available for the following file formats:

  • baf files (Bruker; Windows only)
  • raw directories (Waters; Windows only)
  • raw files (Thermo; Windows only)
  • mzML
  • imzML

CycloBranch supports the standard imzML data format for imaging mass spectra (see imzml.org). If the file includes profile spectra, they are automatically converted to centroided spectra using the OpenMS pipeline NoiseFilterGaussian >> BaseLineFilter >> PeakPickerHiRes (requires OpenMS 2.x to be installed). An external script performs the conversion, so the parameters of the pipeline (e.g., S/N ratio) can be adjusted by editing the file "External/windows/raw2peaks.bat", "External/linux/raw2peaks.sh" or "External/macosx/raw2peaks.sh", depending on your platform and data requirements. Since the conversion may be very time-consuming (hours to days) for large datasets (e.g., tens of gigabytes), CycloBranch stores the converted files with centroided spectra for further use. The new filenames are "filename_converted_fwhm_value.imzML" and "filename_converted_fwhm_value.ibd". When the search process is repeated, CycloBranch automatically detects these files and opens a dialog recommending their use. The file "filename_converted_fwhm_value.imzML" can also be selected directly as a peaklist file in Settings to skip the recommendation dialog.

The dialog which recommends the use of centroided spectra instead of profile spectra.

Charge

MS mode - the value determines the maximum charge of generated theoretical peaks; a negative value is allowed. For example, the value 3 means that theoretical peaks of compounds are generated with charges 1+, 2+, and 3+. The value -1 means that theoretical peaks of compounds are generated with charge 1-.

MS/MS mode - the value defines the charge of the precursor ion; a negative value is allowed. The charge of the precursor ion in an input peaklist file is ignored.
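
The MS-mode rule can be sketched in Python as follows. The function charge_states is a hypothetical helper, not a CycloBranch function. The behaviour for negative values below -1 (e.g., -2 giving 1- and 2-) is an assumption made by symmetry with the positive case; the text above only states the result for -1.

```python
def charge_states(max_charge):
    """Return the charges generated for a 'Charge' setting in MS mode.

    Assumption: negative settings mirror the positive case,
    e.g. -2 gives charges 1- and 2-.
    """
    if max_charge == 0:
        raise ValueError("the charge must be non-zero")
    sign = 1 if max_charge > 0 else -1
    return [sign * z for z in range(1, abs(max_charge) + 1)]


print(charge_states(3))   # [1, 2, 3]  i.e. 1+, 2+, 3+
print(charge_states(-1))  # [-1]       i.e. 1-
```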

m/z Error Tolerance

Enter the m/z error tolerance in MS mode or the fragment m/z error tolerance in MS/MS mode [ppm].
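
As a reminder of how a tolerance in ppm relates to absolute m/z values, the following Python sketch computes the relative error of an experimental peak against a theoretical one. The function names are hypothetical, and treating the tolerance as an inclusive, symmetric interval is an assumption.

```python
def ppm_error(experimental_mz, theoretical_mz):
    """Relative m/z error in parts per million."""
    return (experimental_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(experimental_mz, theoretical_mz, tolerance_ppm):
    """True if the absolute ppm error does not exceed the tolerance (assumed inclusive)."""
    return abs(ppm_error(experimental_mz, theoretical_mz)) <= tolerance_ppm


# A peak measured at m/z 500.0025 against a theoretical m/z 500.0
# has an error of approximately 5 ppm.
print(round(ppm_error(500.0025, 500.0), 3))
print(within_tolerance(500.0025, 500.0, 10))  # True
print(within_tolerance(500.0025, 500.0, 2))   # False
```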

Minimum Threshold of Relative Intensity

Enter the minimum threshold of relative intensity in %. Peaks with relative intensities below the threshold are removed from an input experimental peaklist.

Minimum Threshold of Absolute Intensity

Enter the minimum threshold of absolute intensity. Peaks with absolute intensities below the threshold are removed from an input experimental peaklist.

The relative and absolute thresholds are used simultaneously.

m/z Ratio

Enter the minimum m/z ratio and maximum m/z ratio. Experimental peaks with m/z ratios below/above the thresholds are removed from the input experimental peaklist(s). For maximum m/z, 0 = disabled.
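
The combined effect of the relative threshold, the absolute threshold, and the m/z range can be sketched in Python. This is an illustration only: filter_peaks and its parameter names are hypothetical, and the sketch assumes that relative intensity is expressed in percent of the most intense peak of the unfiltered peaklist.

```python
def filter_peaks(peaks, min_rel_percent=0.0, min_abs=0.0, min_mz=0.0, max_mz=0.0):
    """Filter a peaklist given as (m/z, intensity) pairs.

    Both intensity thresholds are applied simultaneously.
    max_mz = 0 disables the upper m/z limit.
    """
    if not peaks:
        return []
    # Assumption: relative intensity refers to the most intense input peak.
    base = max(intensity for _, intensity in peaks)
    kept = []
    for mz, intensity in peaks:
        relative = 100.0 * intensity / base if base > 0 else 0.0
        if relative < min_rel_percent or intensity < min_abs:
            continue
        if mz < min_mz or (max_mz > 0 and mz > max_mz):
            continue
        kept.append((mz, intensity))
    return kept


peaks = [(100.0, 5.0), (250.0, 1000.0), (400.0, 40.0), (900.0, 300.0)]
print(filter_peaks(peaks, min_rel_percent=1.0, min_abs=20.0, min_mz=150.0, max_mz=800.0))
```

In this example the peak at m/z 100 is removed by the intensity thresholds and the peak at m/z 900 by the upper m/z limit; with max_mz=0 the peak at m/z 900 would be kept.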

Retention Time

Limit the range of retention time in which the compounds are searched in 'Compare Peaklist(s) with Database - MS, LC-MS, MSI' and 'Compound Search - MS, LC-MS, MSI' modes if LC-MS data are processed.

FWHM

Full width at half maximum. The value is used if the profile spectra are converted into peaklists (mzML and imzML input files) and if the full isotope patterns of compounds are generated.


Isotope Ratios

54Fe/56Fe Ratio

Enter the minimum and maximum intensity ratios of peaks corresponding to 54Fe and 56Fe. If the option 'Generate Full Isotope Patterns' is enabled, the value of 'Minimum Number of Isotopic Peaks' must be at least 2 to enable this feature. The feature is enabled only if the checkbox is checked.

60Ni/58Ni Ratio

Enter the minimum and maximum intensity ratios of peaks corresponding to 60Ni and 58Ni. If the option 'Generate Full Isotope Patterns' is enabled, the value of 'Minimum Number of Isotopic Peaks' must be at least 2 to enable this feature. The feature is enabled only if the checkbox is checked. The default maximum value is 0.5 (for FWHM <= 0.001 only). The minimum and maximum values must be estimated empirically for FWHM > 0.001, because the intensity of 60Ni may be accumulated with 13C2, etc.

65Cu/63Cu Ratio

Enter the minimum and maximum intensity ratios of peaks corresponding to 65Cu and 63Cu. If the option 'Generate Full Isotope Patterns' is enabled, the value of 'Minimum Number of Isotopic Peaks' must be at least 2 to enable this feature. The feature is enabled only if the checkbox is checked. The default maximum value is 0.6 (for FWHM <= 0.001 only). The minimum and maximum values must be estimated empirically for FWHM > 0.001, because the intensity of 65Cu may be accumulated with 13C2, etc.

66Zn/64Zn Ratio

Enter the minimum and maximum intensity ratios of peaks corresponding to 66Zn and 64Zn. If the option 'Generate Full Isotope Patterns' is enabled, the value of 'Minimum Number of Isotopic Peaks' must be at least 2 to enable this feature. The feature is enabled only if the checkbox is checked. The default maximum value is 0.7 (for FWHM <= 0.001 only). The minimum and maximum values must be estimated empirically for FWHM > 0.001, because the intensity of 66Zn may be accumulated with 13C2, etc.

68Zn/64Zn Ratio

Enter the minimum and maximum intensity ratios of peaks corresponding to 68Zn and 64Zn. If the option 'Generate Full Isotope Patterns' is enabled, the value of 'Minimum Number of Isotopic Peaks' must be at least 3 to enable this feature. The feature is enabled only if the checkbox is checked. The default maximum value is 0.5 (for FWHM <= 0.001 only). The minimum and maximum values must be estimated empirically for FWHM > 0.001, because the intensity of 68Zn may be accumulated with 13C4, etc.
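
The isotope ratio filters in this section all follow the same pattern: the intensity of the peak assigned to one isotope is divided by the intensity of the peak assigned to the reference isotope, and the result must fall between the entered minimum and maximum. A minimal Python sketch follows; ratio_in_range is a hypothetical name and the bounds used in the example are illustrative, not CycloBranch defaults.

```python
def ratio_in_range(isotope_intensity, reference_intensity, min_ratio, max_ratio):
    """Check an intensity ratio such as I(54Fe) / I(56Fe) against [min_ratio, max_ratio]."""
    if reference_intensity <= 0:
        return False
    ratio = isotope_intensity / reference_intensity
    return min_ratio <= ratio <= max_ratio


# From natural abundances (54Fe about 5.845 %, 56Fe about 91.754 %),
# the expected 54Fe/56Fe intensity ratio is roughly 0.064.
print(ratio_in_range(6.0, 100.0, min_ratio=0.01, max_ratio=0.1))   # True
print(ratio_in_range(30.0, 100.0, min_ratio=0.01, max_ratio=0.1))  # False
```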


Theoretical Spectrum/Spectra

Sequence/Compound Database File (CTRL + T)

A text file containing a database of sequences/compounds. See Format of Sequence/Compound Databases.

Ion Types

The types of ions for MS, LC-MS, and MSI data can be defined in the global Preferences dialog. The default list of ions is shown here.

If a metal ion is selected, the isotopes having natural abundance greater than 3% are generated automatically in theoretical peaklists next to the most abundant isotopes. If the option Generate Full Isotope Patterns is selected, this feature is disabled because full isotopic patterns are calculated.
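
The 3% rule can be illustrated for iron with a short Python sketch. The abundance values are commonly tabulated natural abundances; the function name and the data structure are hypothetical and do not reflect how CycloBranch stores isotopes internally.

```python
# Natural isotopic abundances of iron in percent.
FE_ABUNDANCE = {"54Fe": 5.845, "56Fe": 91.754, "57Fe": 2.119, "58Fe": 0.282}


def isotopes_to_generate(abundances, threshold_percent=3.0):
    """Return the most abundant isotope and the additional isotopes above the threshold."""
    most_abundant = max(abundances, key=abundances.get)
    extra = [
        isotope
        for isotope, abundance in abundances.items()
        if abundance > threshold_percent and isotope != most_abundant
    ]
    return most_abundant, extra


print(isotopes_to_generate(FE_ABUNDANCE))  # ('56Fe', ['54Fe'])
```

For iron, only 54Fe passes the threshold next to 56Fe, which matches the 54Fe/56Fe pair used in the Isotope Ratios settings.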

The following types of fragment ions are defined for MS/MS spectra:

  • A (a-ion)
  • B (b-ion)
  • C (c-ion)
  • X (x-ion)
  • Y (y-ion)
  • Z (z-ion)

Note that y-ions are not supported when cyclic peptides are searched. The fragment ion series of linear and cyclic polyketides are described on extra pages. See Nomenclature of Linear Polyketide Series and Nomenclature of Cyclic Polyketide Series.

Buttons:

  • "Select All" - select all fragment ions in the list.
  • "Clear All" - unselect all fragment ions in the list.
  • "Reset" - b-ion is selected automatically if a cyclic peptide is chosen as "Peptide Type". B-ion and y-ion are selected if a linear, branched or branch-cyclic peptide is chosen.

Neutral Losses

Define and select the types of neutral losses which will be used when generating theoretical spectra.

Buttons:

  • "Select All" - select all neutral losses in the list.
  • "Clear All" - unselect all neutral losses in the list.
  • "Add" - add a new item at the end of the list - a valid molecular formula must be entered.
  • "Remove" - remove selected items from the list.
  • "Default" - clear the list and define a list of default neutral losses.
  • "HCON" - clear the list and set the default chemical elements H, C, O, N, S, and P.

The default neutral losses are defined as follows:

Formula | Description
H2O | water
NH3 | ammonia
CO | formyl group
CO2 | carboxyl group
CONH | carbamyl group
CH2 | shortened carbon chain
C6H4 | benzene ring removal
CH2N2 | part of Arg side chain
CH2O | Ser side chain
CH2S | Cys side chain
C4H9N | Lys side chain
C4H4N2 | His side chain
C4H9N3 | Arg side chain
C9H7N | Trp side chain

In MS/MS mode, the molecular formulas of proteinogenic amino acids side chains can be optionally combined from the default values as follows:

Amino acid | Side chain | Amino acid | Side chain
Gly | - | Asp | C2H2O2 (CH2+CO2)
Ala | CH2 | Gln | C3H5NO (CH2+CH2+CONH)
Ser | CH2O | Lys | C4H9N
Pro | C3H4 (not defined) | Glu | C3H4O2 (CH2+CH2+CO2)
Val | C3H6 (CH2+CH2+CH2) | Met | C3H6S (CH2+CH2+CH2S)
Thr | C2H4O (CH2+CH2O) | His | C4H4N2
Cys | CH2S | Phe | C7H6 (CH2+C6H4)
Leu | C4H8 (CH2+CH2+CH2+CH2) | Arg | C4H9N3
Ile | C4H8 (CH2+CH2+CH2+CH2) | Tyr | C7H6O (CH2O+C6H4)
Asn | C2H3NO (CH2+CONH) | Trp | C9H7N
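
The way the side chain formulas in the table are composed from the default neutral losses can be checked with a small Python sketch that sums elemental formulas. The helpers parse_formula and combine are hypothetical; the parser handles only simple formulas without brackets or charges, and the output order (C, H, then the remaining elements alphabetically) is a simplified Hill notation.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Parse a simple formula such as 'CH2S' into element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def combine(*formulas):
    """Sum several formulas and return the combined formula as a string."""
    total = Counter()
    for formula in formulas:
        total += parse_formula(formula)
    # C and H first, then the remaining elements alphabetically.
    order = [e for e in ("C", "H") if e in total]
    order += sorted(e for e in total if e not in ("C", "H"))
    return "".join(f"{e}{total[e] if total[e] > 1 else ''}" for e in order)


print(combine("CH2", "CH2", "CONH"))  # C3H5NO (Gln)
print(combine("CH2", "CH2", "CH2S"))  # C3H6S  (Met)
print(combine("CH2O", "C6H4"))        # C7H6O  (Tyr)
```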

In some cases, the list of neutral losses can be replaced by a list of chemical elements (e.g. H, C, O, N, S, and P). This is useful if an experimental spectrum contains peaks corresponding to ions with unknown neutral losses, if a metabolite (i.e. a non-peptidic compound) is fragmented or if the mode 'Compound Search - MS, LC-MS, MSI' is used.

Maximum Number of Combined Losses

Maximum number of combined neutral losses.
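
To give a feeling for how quickly the number of combinations grows with this parameter, the following Python sketch enumerates combinations of up to a given number of losses. It assumes that the same loss may be used more than once in a combination (as suggested by entries such as CH2+CH2+CH2 in the side chain table); how CycloBranch enumerates combinations internally is not described here, and loss_combinations is a hypothetical name.

```python
from itertools import combinations_with_replacement


def loss_combinations(losses, max_combined):
    """List all combinations of 1..max_combined losses, repetition allowed (assumption)."""
    combos = []
    for size in range(1, max_combined + 1):
        combos.extend(combinations_with_replacement(losses, size))
    return combos


for combo in loss_combinations(["H2O", "NH3"], 2):
    print("+".join(combo))
```

With two losses and a maximum of 2, five combinations are produced: H2O, NH3, H2O+H2O, H2O+NH3, and NH3+NH3.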

Report Unmatched Theoretical Peaks

If checked, all unmatched theoretical peaks are reported. If unchecked, unmatched theoretical peaks are reported only if a corresponding isotope pattern has been matched. This feature may consume a lot of main memory; keep it disabled if possible.

Generate Full Isotope Patterns

The full isotope patterns of compounds are generated in theoretical spectra. The FWHM value is used for this purpose. If checked, the deisotoping is disabled automatically.

Minimum Number of Isotopic Peaks

The minimum number of peaks which must be annotated in an isotopic pattern. The option "Generate Full Isotope Patterns" must be enabled.

Minimum Number of Spectra

The minimum number of spectra in which a compound must be identified to be reported.
LC-MS data = the minimum number of consecutive scans;
MSI data = the minimum number of pixels.
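
For LC-MS data, the criterion of consecutive scans can be sketched in Python as a search for the longest run of adjacent scan numbers in which a compound was identified. The function names are hypothetical, and the sketch assumes that scans are identified by consecutive integers.

```python
def longest_consecutive_run(scan_indices):
    """Length of the longest run of consecutive scan numbers."""
    best = run = 0
    previous = None
    for scan in sorted(set(scan_indices)):
        # Extend the run if this scan directly follows the previous one.
        run = run + 1 if previous is not None and scan == previous + 1 else 1
        best = max(best, run)
        previous = scan
    return best


def passes_min_spectra(scan_indices, minimum):
    """True if the compound was found in at least 'minimum' consecutive scans."""
    return longest_consecutive_run(scan_indices) >= minimum


scans = [3, 4, 5, 9, 10]
print(longest_consecutive_run(scans))  # 3
print(passes_min_spectra(scans, 4))    # False
```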

Minimum Number of Ion Types

The minimum number of ion types which must be matched to report a given compound. For example, let Ion Types: [M+H]+, [M+Na]+; Charge: 2; Neutral Losses: H2O; Maximum Number of Combined Losses: 1. If Minimum Number of Ion Types is 2, then at least two ions from the set [M+H]+, [M+Na]+, [M+2H]2+, [M+Na+H]2+, [M+H-H2O]+, [M+Na-H2O]+, [M+2H-H2O]2+, [M+Na+H-H2O]2+ must be matched to report a given compound.
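
The example can be expressed as a simple count over the set of candidate ions. The Python sketch below uses the eight ions listed above; the function enough_ion_types is hypothetical and only illustrates the counting rule.

```python
# The eight candidate ions from the example above.
CANDIDATE_IONS = {
    "[M+H]+", "[M+Na]+", "[M+2H]2+", "[M+Na+H]2+",
    "[M+H-H2O]+", "[M+Na-H2O]+", "[M+2H-H2O]2+", "[M+Na+H-H2O]2+",
}


def enough_ion_types(matched_ions, minimum):
    """True if at least 'minimum' distinct candidate ions were matched."""
    return len(set(matched_ions) & CANDIDATE_IONS) >= minimum


print(enough_ion_types(["[M+H]+", "[M+Na]+"], 2))  # True: the compound is reported
print(enough_ion_types(["[M+H]+"], 2))             # False: the compound is not reported
```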

Basic Formula Check

Apply Senior's filtering rules. See https://doi.org/10.1186/1471-2105-8-105 for more details.

In 'Compound Search - MS, LC-MS, MSI' mode, the rules are applied to generated compounds. In other modes, the rules are applied to combinations of neutral losses, see https://doi.org/10.1021/acs.analchem.0c00170 for more details.
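
For orientation, Senior's rules as commonly stated for neutral molecular formulas can be sketched in Python: (1) the sum of valences is even, (2) the sum of valences is at least twice the maximum valence, and (3) the sum of valences is at least twice the number of atoms minus one, i.e. 2 * (atoms - 1). The valence table below uses the lowest common valences and is an assumption, as is the function name; the exact variant applied by CycloBranch, in particular to combinations of neutral losses, is described in the cited papers and may differ.

```python
# Assumed valences (lowest common values); S and P can take higher valences.
VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2, "S": 2, "P": 3}


def passes_senior(counts):
    """Check Senior's rules for a formula given as {element: count}."""
    present = {e: n for e, n in counts.items() if n > 0}
    atoms = sum(present.values())
    if atoms == 0:
        return False
    valence_sum = sum(VALENCE[e] * n for e, n in present.items())
    max_valence = max(VALENCE[e] for e in present)
    rule1 = valence_sum % 2 == 0             # even sum of valences
    rule2 = valence_sum >= 2 * max_valence   # enough bonds for the highest valence
    rule3 = valence_sum >= 2 * (atoms - 1)   # enough bonds to connect all atoms
    return rule1 and rule2 and rule3


print(passes_senior({"C": 1, "H": 4}))  # True  (methane)
print(passes_senior({"C": 1, "H": 5}))  # False (odd sum of valences)
print(passes_senior({"C": 1, "H": 6}))  # False (too many atoms to connect)
```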

Calculate FDRs

Calculate FDRs for LC-MS and MSI data (experimental feature). It can optionally be disabled to cut the processing time in half.

Minimum Intensity of Highest Peak in Isotopic Pattern

Enter the minimum relative/absolute intensity of the most intense peak in an isotopic pattern. Isotopic patterns with relative/absolute intensities below this value will be kept in the spectrum but not annotated.

Isotope m/z Tolerance

The maximum error of the m/z difference between an isotopic peak and the most intense peak when the experimental isotopic pattern is compared with the theoretical one (0 = disabled) [ppm].
Example: suppose two peaks are matched in the isotopic pattern of a compound, an isotopic peak with an m/z error of 7 ppm and the most intense peak with an m/z error of 4 ppm:
if m/z Error Tolerance = 10 ppm (i.e. <-10,10>), Isotope m/z Tolerance = 0 ppm (i.e. disabled) => the compound is reported as a correct hit;
if m/z Error Tolerance = 10 ppm (i.e. <-10,10>), Isotope m/z Tolerance = 2 ppm (i.e. <0,2>) => the compound is discarded because 7-4=3 is greater than 2.
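
The two cases of the example can be reproduced with a short Python sketch. The function name and parameters are hypothetical, and treating both tolerances as inclusive limits is an assumption.

```python
def isotope_pattern_accepted(isotope_error_ppm, base_error_ppm,
                             mz_tolerance_ppm, isotope_tolerance_ppm):
    """Apply the m/z Error Tolerance and then the Isotope m/z Tolerance.

    isotope_tolerance_ppm = 0 disables the second check.
    """
    # Both peaks must first lie within the general m/z error tolerance.
    if abs(isotope_error_ppm) > mz_tolerance_ppm or abs(base_error_ppm) > mz_tolerance_ppm:
        return False
    if isotope_tolerance_ppm == 0:
        return True
    # The errors of the two peaks must not differ by more than the isotope tolerance.
    return abs(isotope_error_ppm - base_error_ppm) <= isotope_tolerance_ppm


print(isotope_pattern_accepted(7, 4, 10, 0))  # True:  reported as a correct hit
print(isotope_pattern_accepted(7, 4, 10, 2))  # False: discarded, 7-4=3 exceeds 2
```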

Isotope Intensity Tolerance

The maximum error tolerance of intensities of matched isotopes (0% or 100% = disabled) [in % of relative intensity of the most intense peak].
Example:
Isotope Intensity Tolerance = 10%, Relative Intensity of the Most Intense Peak = 100% => the tolerance of relative intensities of isotopes is 10%;
Isotope Intensity Tolerance = 10%, Relative Intensity of the Most Intense Peak = 50% => the tolerance of relative intensities of isotopes is 5%; etc.
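
The scaling in the example can be written as a one-line calculation. In the Python sketch below, intensity_tolerance reproduces the two cases above, while isotope_intensity_ok shows one possible way to apply the result, assuming that the experimental and theoretical relative intensities of an isotope are compared by their absolute difference; both function names are hypothetical.

```python
def intensity_tolerance(tolerance_percent, base_relative_intensity):
    """Allowed deviation of isotope relative intensities, in percentage points."""
    return tolerance_percent * base_relative_intensity / 100.0


def isotope_intensity_ok(experimental, theoretical,
                         tolerance_percent, base_relative_intensity):
    """Compare relative intensities of an isotope; 0 % or 100 % disables the check."""
    if tolerance_percent in (0, 100):
        return True
    allowed = intensity_tolerance(tolerance_percent, base_relative_intensity)
    # Assumption: the absolute difference of relative intensities is compared.
    return abs(experimental - theoretical) <= allowed


print(intensity_tolerance(10, 100))  # 10.0
print(intensity_tolerance(10, 50))   # 5.0
print(isotope_intensity_ok(18.0, 25.0, 10, 50))  # False: difference 7 exceeds 5
```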