CycloBranch
The settings dialog is opened using the command "Search -> Settings...". Use F1 to open this help.
The application works in five different modes. This page summarizes the parameters used for dereplication. The differences in the configuration for other modes are described on separate pages; see the parameter Mode.
See also the following tutorials: Tutorial 1: Dereplication in MS1 data, Tutorial 2: Dereplication in LC-MS data, Tutorial 4: Dereplication in MSI data and fusion with histology image, and the page Dereplication.
The maximum number of threads used during a search process.
Select an input file with experimental spectra. Supported file formats:
File format | Supported types of spectra (data processing) | Supported types of spectra (visualization) | Description | Notes |
---|---|---|---|---|
txt | centroid | centroid | plain text file | Mass-to-charge ratio and intensity on each line, separated by a tab; multiple peaklists must be separated by an empty line. |
mgf | centroid | centroid | Mascot Generic File format | |
mzML | profile and centroid | profile and centroid | standard data format | m/z values and intensities must be stored as 64-bit/32-bit floats; no compression and zlib compression are supported. OpenMS 2.x must be installed. |
mzXML | centroid | centroid | standard data format | OpenMS 2.x must be installed; the FileConverter tool is used to automatically convert this file format into mzML. |
imzML | profile and centroid | profile and centroid | standard data format | Processed or continuous data file format; m/z values and intensities must be stored as 64-bit/32-bit floats; no compression. OpenMS 2.x must be installed. |
baf | profile and centroid | profile and centroid | Bruker's Analysis File | CompassXport 3.0.13.1 or CompassXtract 3.2.201 (64-bit) must be installed; see details here; Windows only. |
raw | profile and centroid | profile and centroid | Thermo raw file | OpenMS 2.x including ProteoWizard must be installed; Windows only. |
raw | profile and centroid | profile and centroid | Waters raw directory | Select a *.dat file in the raw directory; Windows only. |
mis (deprecated) | profile and centroid | centroid | Bruker's flexImaging File (old data format) | CompassXport 3.0 must be installed; Windows only. If the file "[name].mis" is selected, all the "analysis.baf" files in the subfolders of "[name]" directory are searched and processed. |
ser (deprecated) | profile and centroid | centroid | Apex file format | CompassXport 3.0 must be installed; Windows only. |
Multiple files can be selected in the modes 'Compare Peaklist(s) with Database - MS, LC-MS, MSI' and 'Compound Search - MS, LC-MS, MSI'. The feature is not available for imzML file format. If a directory name is specified, it is searched recursively and all found files are processed - the feature is currently available for baf files (Bruker) and raw directories (Waters) only.
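The plain-text (txt) layout described in the table above can be illustrated with a short Python sketch. The function name `parse_txt_peaklists` is hypothetical and not part of CycloBranch; the sketch only assumes what the table states, namely that each non-empty line holds an m/z value and an intensity separated by a tab, and that an empty line separates peaklists.

```python
def parse_txt_peaklists(text):
    """Split 'm/z<TAB>intensity' lines into a list of peaklists.

    An empty line closes the current peaklist. This is an illustrative
    reader, not CycloBranch's own parser.
    """
    peaklists, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # An empty line ends the current peaklist (if any).
            if current:
                peaklists.append(current)
                current = []
            continue
        mz, intensity = line.split("\t")[:2]
        current.append((float(mz), float(intensity)))
    if current:
        peaklists.append(current)
    return peaklists


example = "100.5\t1200\n101.5\t300\n\n250.25\t50\n"
for number, peaklist in enumerate(parse_txt_peaklists(example), start=1):
    print(number, peaklist)
```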
The visualization of profile mass spectra is enabled if the checkbox next to the file name is checked. This feature is available for the following file formats:
CycloBranch supports the standard imzML data format for imaging mass spectra (see imzml.org). If the file includes profile spectra, they are automatically converted to centroided spectra using the OpenMS pipeline NoiseFilterGaussian >> BaseLineFilter >> PeakPickerHiRes (requires OpenMS 2.x installed). An external script is used to convert the raw data, so the parameters of the pipeline (e.g., S/N ratio) can be adjusted by editing the file "External/windows/raw2peaks.bat", "External/linux/raw2peaks.sh" or "External/macosx/raw2peaks.sh", depending on your data requirements and the platform. Since the data conversion may be very time-consuming (hours or days) due to the size of a dataset (e.g., tens of gigabytes), CycloBranch stores the converted files with centroided spectra for further use. The new filenames are "filename_converted_fwhm_value.imzML" and "filename_converted_fwhm_value.ibd". CycloBranch automatically detects these files when the search process is repeated and opens a dialog to recommend their use. The "filename_converted_fwhm_value.imzML" file can be used directly as a peaklist file in Settings to skip the recommendation dialog.
MS mode - the value determines the maximum charge of generated theoretical peaks; a negative value is allowed. For example, the value 3 means that theoretical peaks of compounds are generated with charges 1+, 2+, and 3+. The value -1 means that theoretical peaks of compounds are generated with charge 1-.
MS/MS mode - the value defines the charge of the precursor ion; a negative value is allowed. The charge of the precursor ion in an input peaklist file is ignored.
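The MS-mode interpretation of the charge value can be sketched in Python as follows. The helper `charge_states` is hypothetical. The cases 3 and -1 come from the description above; extending a negative value -n to the charges 1- through n- is an assumption made by analogy with the positive case.

```python
def charge_states(max_charge):
    """Return the charges implied by the charge setting in MS mode.

    3 -> [1, 2, 3] and -1 -> [-1] follow the help text. Treating -n as
    [-1, ..., -n] is an assumption, not confirmed CycloBranch behaviour.
    """
    sign = 1 if max_charge > 0 else -1
    return [sign * z for z in range(1, abs(max_charge) + 1)]


for setting in (3, -1):
    labels = [f"{abs(z)}{'+' if z > 0 else '-'}" for z in charge_states(setting)]
    print(setting, "->", ", ".join(labels))
```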
Enter the m/z error tolerance in MS mode or the fragment m/z error tolerance in MS/MS mode [ppm].
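As a reminder of how a tolerance in ppm relates to m/z values, here is a minimal Python sketch. The function names are hypothetical. Whether CycloBranch treats the boundary value as inside the tolerance is not stated on this page, so the inclusive comparison is an assumption.

```python
def ppm_error(experimental_mz, theoretical_mz):
    """Signed m/z error in parts per million relative to the theoretical value."""
    return (experimental_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(experimental_mz, theoretical_mz, tolerance_ppm):
    """True if the absolute ppm error does not exceed the tolerance.

    The inclusive boundary (<=) is an assumption.
    """
    return abs(ppm_error(experimental_mz, theoretical_mz)) <= tolerance_ppm


# A peak measured at m/z 500.0025 against a theoretical m/z of 500.0
# deviates by roughly 5 ppm.
print(round(ppm_error(500.0025, 500.0), 3))
print(within_tolerance(500.0025, 500.0, 10.0))
```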
Enter the minimum threshold of relative intensity in %. Peaks with relative intensities below the threshold are removed from an input experimental peaklist.
Enter the minimum threshold of absolute intensity. Peaks with absolute intensities below the threshold are removed from an input experimental peaklist.
The relative and absolute thresholds are used simultaneously.
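The simultaneous use of both thresholds can be sketched as follows. This is an illustration only: the function name `filter_by_intensity` is hypothetical, and the sketch assumes that relative intensity is expressed as a percentage of the most intense peak in the peaklist.

```python
def filter_by_intensity(peaks, min_relative_percent=0.0, min_absolute=0.0):
    """Keep peaks that pass BOTH the relative and the absolute threshold.

    peaks is a list of (mz, intensity) tuples. Relative intensity is taken
    as a percentage of the most intense peak (an assumption).
    """
    if not peaks:
        return []
    base = max(intensity for _, intensity in peaks)
    kept = []
    for mz, intensity in peaks:
        relative = 100.0 * intensity / base if base > 0 else 0.0
        if relative >= min_relative_percent and intensity >= min_absolute:
            kept.append((mz, intensity))
    return kept


peaks = [(100.0, 1000.0), (150.0, 40.0), (200.0, 8.0)]
print(filter_by_intensity(peaks, min_relative_percent=1.0))
print(filter_by_intensity(peaks, min_relative_percent=1.0, min_absolute=50.0))
```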
Enter the minimum m/z ratio and maximum m/z ratio. Experimental peaks with m/z ratios below/above the thresholds are removed from the input experimental peaklist(s). For maximum m/z, 0 = disabled.
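The convention that a maximum m/z of 0 disables the upper limit can be expressed in a few lines of Python. The function name `filter_by_mz` is hypothetical, and treating the limits as inclusive is an assumption.

```python
def filter_by_mz(peaks, min_mz=0.0, max_mz=0.0):
    """Keep (mz, intensity) peaks inside the m/z window.

    max_mz == 0 means that no upper limit is applied, as described above.
    Inclusive limits are an assumption.
    """
    return [
        (mz, intensity)
        for mz, intensity in peaks
        if mz >= min_mz and (max_mz == 0 or mz <= max_mz)
    ]


peaks = [(90.0, 10.0), (450.0, 20.0), (1800.0, 30.0)]
print(filter_by_mz(peaks, min_mz=100.0))               # upper limit disabled
print(filter_by_mz(peaks, min_mz=100.0, max_mz=1500.0))
```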
Limit the range of retention time in which the compounds are searched in 'Compare Peaklist(s) with Database - MS, LC-MS, MSI' and 'Compound Search - MS, LC-MS, MSI' modes if LC-MS data are processed.
Full width at half maximum. The value is used if the profile spectra are converted into peaklists (mzML and imzML input files) and if the full isotope patterns of compounds are generated.
Enter the minimum and maximum intensity ratios of peaks corresponding to 54Fe and 56Fe. If the option 'Generate Full Isotope Patterns' is enabled, the value of 'Minimum Number of Isotopic Peaks' must be at least 2 to enable this feature. The feature is enabled only if the checkbox is checked.
Enter the minimum and maximum intensity ratios of peaks corresponding to 60Ni and 58Ni. If the option 'Generate Full Isotope Patterns' is enabled, the value of 'Minimum Number of Isotopic Peaks' must be at least 2 to enable this feature. The feature is enabled only if the checkbox is checked. The default maximum value is 0.5 (for FWHM <= 0.001 only). The minimum and maximum values must be estimated empirically for FWHM > 0.001, because the intensity of 60Ni may be accumulated with 13C2, etc.
Enter the minimum and maximum intensity ratios of peaks corresponding to 65Cu and 63Cu. If the option 'Generate Full Isotope Patterns' is enabled, the value of 'Minimum Number of Isotopic Peaks' must be at least 2 to enable this feature. The feature is enabled only if the checkbox is checked. The default maximum value is 0.6 (for FWHM <= 0.001 only). The minimum and maximum values must be estimated empirically for FWHM > 0.001, because the intensity of 65Cu may be accumulated with 13C2, etc.
Enter the minimum and maximum intensity ratios of peaks corresponding to 66Zn and 64Zn. If the option 'Generate Full Isotope Patterns' is enabled, the value of 'Minimum Number of Isotopic Peaks' must be at least 2 to enable this feature. The feature is enabled only if the checkbox is checked. The default maximum value is 0.7 (for FWHM <= 0.001 only). The minimum and maximum values must be estimated empirically for FWHM > 0.001, because the intensity of 66Zn may be accumulated with 13C2, etc.
Enter the minimum and maximum intensity ratios of peaks corresponding to 68Zn and 64Zn. If the option 'Generate Full Isotope Patterns' is enabled, the value of 'Minimum Number of Isotopic Peaks' must be at least 3 to enable this feature. The feature is enabled only if the checkbox is checked. The default maximum value is 0.5 (for FWHM <= 0.001 only). The minimum and maximum values must be estimated empirically for FWHM > 0.001, because the intensity of 68Zn may be accumulated with 13C4, etc.
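The isotope ratio settings above can be read as a simple range check on the quotient of two peak intensities. The Python sketch below is an illustration under stated assumptions: the function name is hypothetical, the ratio is taken as the first-named isotope divided by the second-named one (e.g., 60Ni / 58Ni), and the minimum of 0.3 in the example is an arbitrary value chosen for demonstration. Only the maximum of 0.5 is the documented default for 60Ni/58Ni.

```python
def isotope_ratio_in_range(intensity_first, intensity_second, min_ratio, max_ratio):
    """Check whether intensity_first / intensity_second lies in [min_ratio, max_ratio].

    For the 60Ni/58Ni setting, intensity_first is the 60Ni peak and
    intensity_second is the 58Ni peak (assumed order of the quotient).
    """
    if intensity_second <= 0:
        return False
    ratio = intensity_first / intensity_second
    return min_ratio <= ratio <= max_ratio


# 60Ni peak of 380 against a 58Ni peak of 1000 gives a ratio of 0.38.
print(isotope_ratio_in_range(380.0, 1000.0, min_ratio=0.3, max_ratio=0.5))
# A ratio of 0.9 falls outside the window.
print(isotope_ratio_in_range(900.0, 1000.0, min_ratio=0.3, max_ratio=0.5))
```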
A text file containing a database of sequences/compounds. See Format of Sequence/Compound Databases.
The types of ions for MS, LC-MS, and MSI data can be defined in the global Preferences dialog. The default list of ions is shown here.
If a metal ion is selected, the isotopes having natural abundance greater than 3% are generated automatically in theoretical peaklists next to the most abundant isotopes. If the option Generate Full Isotope Patterns is selected, this feature is disabled because full isotopic patterns are calculated.
The following types of fragment ions are defined for MS/MS spectra:
Note that y-ions are not supported when cyclic peptides are searched. Fragment ion series of linear and cyclic polyketides are described on separate pages. See Nomenclature of Linear Polyketide Series and Nomenclature of Cyclic Polyketide Series.
Buttons:
Define and select the types of neutral losses which will be used when generating theoretical spectra.
Buttons:
The default neutral losses are defined as follows:
Formula | Description |
---|---|
H2O | water |
NH3 | ammonia |
CO | formyl group |
CO2 | carboxyl group |
CONH | carbamyl group |
CH2 | shortened carbon chain |
C6H4 | benzene ring removal |
CH2N2 | part of Arg side chain |
CH2O | Ser side chain |
CH2S | Cys side chain |
C4H9N | Lys side chain |
C4H4N2 | His side chain |
C4H9N3 | Arg side chain |
C9H7N | Trp side chain |
In MS/MS mode, the molecular formulas of proteinogenic amino acids side chains can be optionally combined from the default values as follows:
Amino acid | Side chain | Amino acid | Side chain | |
---|---|---|---|---|
Gly | - | Asp | C2H2O2 (CH2+CO2) | |
Ala | CH2 | Gln | C3H5NO (CH2+CH2+CONH) | |
Ser | CH2O | Lys | C4H9N | |
Pro | C3H4 (not defined) | Glu | C3H4O2 (CH2+CH2+CO2) | |
Val | C3H6 (CH2+CH2+CH2) | Met | C3H6S (CH2+CH2+CH2S) | |
Thr | C2H4O (CH2+CH2O) | His | C4H4N2 | |
Cys | CH2S | Phe | C7H6 (CH2+C6H4) | |
Leu | C4H8 (CH2+CH2+CH2+CH2) | Arg | C4H9N3 | |
Ile | C4H8 (CH2+CH2+CH2+CH2) | Tyr | C7H6O (CH2O+C6H4) | |
Asn | C2H3NO (CH2+CONH) | Trp | C9H7N |
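The side-chain formulas in the table above are element-wise sums of the default neutral losses. The following Python sketch reproduces a few of them. The helper names are hypothetical, and the parser handles only simple formulas without brackets or charges.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Parse a simple formula such as 'C4H9N3' into element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def combine_losses(*formulas):
    """Sum several formulas element by element."""
    total = Counter()
    for formula in formulas:
        total += parse_formula(formula)
    return total


def to_formula(counts):
    """Write counts as a formula: C first, then H, then the rest alphabetically."""
    order = [e for e in ("C", "H") if e in counts]
    order += sorted(e for e in counts if e not in ("C", "H"))
    return "".join(e + (str(counts[e]) if counts[e] > 1 else "") for e in order)


print(to_formula(combine_losses("CH2", "CO2")))          # Asp side chain
print(to_formula(combine_losses("CH2", "CH2", "CONH")))  # Gln side chain
print(to_formula(combine_losses("CH2O", "C6H4")))        # Tyr side chain
```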
In some cases, the list of neutral losses can be replaced by a list of chemical elements (e.g. H, C, O, N, S, and P). This is useful if an experimental spectrum contains peaks corresponding to ions with unknown neutral losses, if a metabolite (i.e. a non-peptidic compound) is fragmented or if the mode 'Compound Search - MS, LC-MS, MSI' is used.
Maximum number of combined neutral losses.
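To give a feeling for how this limit affects the number of loss combinations, the sketch below enumerates all combinations up to a given size. It is an assumption that the same loss may occur more than once in a combination (the side-chain table above, e.g. Val = CH2+CH2+CH2, suggests so), and the function name is hypothetical.

```python
from itertools import combinations_with_replacement


def loss_combinations(losses, max_combined):
    """List all combinations of 1..max_combined losses, repetition allowed."""
    combos = []
    for size in range(1, max_combined + 1):
        combos.extend(combinations_with_replacement(losses, size))
    return combos


for combo in loss_combinations(["H2O", "NH3"], max_combined=2):
    print("+".join(combo))
```

With two losses and a limit of 2, five combinations result; the count grows quickly with longer loss lists and higher limits.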
If checked, all unmatched theoretical peaks are reported. If unchecked, unmatched theoretical peaks are reported only if a corresponding isotope pattern has been matched. This feature may consume a lot of main memory; keep it disabled if possible.
The full isotope patterns of compounds are generated in theoretical spectra. The FWHM value is used for this purpose. If checked, the deisotoping is disabled automatically.
The minimum number of peaks which must be annotated in an isotopic pattern. The option "Generate Full Isotope Patterns" must be enabled.
The minimum number of spectra in which a compound must be identified to be reported.
LC-MS data = the minimum number of consecutive scans;
MSI data = the minimum number of pixels.
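The two interpretations of this threshold can be sketched as follows. The function names are hypothetical, and the sketch assumes that scans are identified by integer scan numbers and pixels by (x, y) coordinates.

```python
def longest_consecutive_run(scan_numbers):
    """Length of the longest run of consecutive integer scan numbers."""
    best = run = 0
    previous = None
    for scan in sorted(set(scan_numbers)):
        run = run + 1 if previous is not None and scan == previous + 1 else 1
        best = max(best, run)
        previous = scan
    return best


def reported_in_lcms(scan_numbers, min_spectra):
    """LC-MS: the compound must occur in enough consecutive scans."""
    return longest_consecutive_run(scan_numbers) >= min_spectra


def reported_in_msi(pixels, min_spectra):
    """MSI: the compound must occur in enough distinct pixels."""
    return len(set(pixels)) >= min_spectra


print(reported_in_lcms([3, 4, 5, 9, 10], min_spectra=3))
print(reported_in_msi([(0, 0), (0, 1)], min_spectra=3))
```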
The minimum number of ion types which must be matched to report a given compound. For example, consider Ion Types: [M+H]+, [M+Na]+; Charge: 2; Neutral Losses: H2O; Maximum Number of Combined Losses: 1. If Minimum Ion Types: 2, then any pair of ions from the set of ions [M+H]+, [M+Na]+, [M+2H]2+, [M+Na+H]2+, [M+H-H2O]+, [M+Na-H2O]+, [M+2H-H2O]2+, [M+Na+H-H2O]2+ must be matched to report a given compound.
Apply Senior's filtering rules. See https://doi.org/10.1186/1471-2105-8-105 for more details.
In 'Compound Search - MS, LC-MS, MSI' mode, the rules are applied to generated compounds. In other modes, the rules are applied to combinations of neutral losses, see https://doi.org/10.1021/acs.analchem.0c00170 for more details.
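For orientation, Senior's rules are commonly stated as three conditions on an elemental composition: the sum of valences is even, the sum of valences is at least twice the maximum valence, and the sum of valences is at least twice the number of atoms minus one. The Python sketch below checks these conditions. It is not CycloBranch's implementation; the function name is hypothetical and the fixed valence table (lowest common valences for N, S, and P) is a simplifying assumption.

```python
# Assumed valences; elements such as N, S, and P can adopt higher valences.
VALENCES = {"H": 1, "C": 4, "N": 3, "O": 2, "S": 2, "P": 3}


def satisfies_senior(counts):
    """Check Senior's three conditions for a dict of element counts."""
    present = {e: n for e, n in counts.items() if n > 0}
    atoms = sum(present.values())
    if atoms == 0:
        return False
    valence_sum = sum(VALENCES[e] * n for e, n in present.items())
    max_valence = max(VALENCES[e] for e in present)
    return (
        valence_sum % 2 == 0                 # (i) even sum of valences
        and valence_sum >= 2 * max_valence   # (ii) at least twice the max valence
        and valence_sum >= 2 * (atoms - 1)   # (iii) enough bonds for a connected graph
    )


print(satisfies_senior({"C": 1, "H": 4}))  # CH4
print(satisfies_senior({"C": 1, "H": 5}))  # CH5, odd sum of valences
print(satisfies_senior({"H": 4, "O": 1}))  # H4O, too few valences to connect 5 atoms
```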
Calculate FDRs for LC-MS and MSI data (experimental feature). It can optionally be disabled to cut the processing time in half.
Enter the minimum relative/absolute intensity of the most intense peak in an isotopic pattern. Isotopic patterns with relative/absolute intensities below this value will be kept in the spectrum but not annotated.
The maximum m/z error of difference between an isotopic peak and the most intense peak in an experimental and theoretical isotopic pattern (0 = disabled) [ppm].
Example: consider two matched peaks in an isotopic pattern of a compound, an isotopic peak with an m/z error of 7 ppm and the most intense peak with an m/z error of 4 ppm:
if m/z Error Tolerance = 10 ppm (i.e. <-10,10>) and Isotope m/z Tolerance = 0 ppm (i.e. disabled) => the compound is reported as a correct hit;
if m/z Error Tolerance = 10 ppm (i.e. <-10,10>) and Isotope m/z Tolerance = 2 ppm (i.e. <0,2>) => the compound is discarded because 7 - 4 = 3 is greater than 2.
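The example above can be restated as a small Python check. The function name is hypothetical, and treating a difference exactly equal to the tolerance as acceptable is an assumption.

```python
def isotope_mz_ok(isotope_error_ppm, most_intense_error_ppm, isotope_tolerance_ppm):
    """Compare the m/z errors of an isotopic peak and the most intense peak.

    A tolerance of 0 disables the check. The inclusive boundary is an
    assumption.
    """
    if isotope_tolerance_ppm == 0:
        return True
    difference = abs(isotope_error_ppm - most_intense_error_ppm)
    return difference <= isotope_tolerance_ppm


print(isotope_mz_ok(7, 4, 0))  # check disabled, hit is kept
print(isotope_mz_ok(7, 4, 2))  # |7 - 4| = 3 exceeds 2, hit is discarded
```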
The maximum error tolerance of intensities of matched isotopes (0% or 100% = disabled) [in % of relative intensity of the most intense peak].
Example:
Isotope Intensity Tolerance = 10%, Relative Intensity of the Most Intense Peak = 100% => the tolerance of relative intensities of isotopes is 10%;
Isotope Intensity Tolerance = 10%, Relative Intensity of the Most Intense Peak = 50% => the tolerance of relative intensities of isotopes is 5%; etc.
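The scaling in the example above amounts to a single multiplication, shown here as a Python sketch. The function name is hypothetical; returning `None` for the disabled settings (0% or 100%) is a choice made for this illustration.

```python
def effective_intensity_tolerance(tolerance_percent, most_intense_relative_percent):
    """Scale the isotope intensity tolerance by the most intense peak.

    Returns the tolerance in % of relative intensity, or None when the
    setting is 0 or 100 (disabled).
    """
    if tolerance_percent in (0, 100):
        return None
    return tolerance_percent * most_intense_relative_percent / 100.0


print(effective_intensity_tolerance(10, 100))  # 10.0
print(effective_intensity_tolerance(10, 50))   # 5.0
```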