CycloBranch
Settings

Settings Dialog

The settings dialog is opened using the command "Search -> Settings...". Press F1 to open this help.

settings.png
Main settings of the engine.

Basic Buttons

  • OK (Enter) - Accept changes and hide window.
  • Cancel (Esc) - Drop changes and hide window.
  • Apply - Accept changes and keep window opened.
  • Load (CTRL + L) - Load settings from a file (*.ini).
  • Save (CTRL + S) - Save settings in the current file (*.ini). When a file has not been loaded yet, the "Save As ..." file dialog is opened.
  • Save As... (CTRL + D) - Save settings into a file (*.ini).
  • Edit (CTRL + E) - Edit the peptide sequence using the Draw Peptide Tool. The sequence is imported into the draw peptide tool and visualized if its syntax is correct (see also Format of Sequence Databases).

Search

Mode

  • 'De Novo Search Engine' - the default mode of the application.
  • 'Compare Peaklist with Spectrum of Searched Sequence' - a theoretical spectrum is generated for the input "Searched Sequence" and is compared with the peaklist.
  • 'Compare Peaklist with Database - MS/MS data' - a peaklist is compared with theoretical spectra generated from a database of sequences.
  • 'Compare Peaklist(s) with Database - MS or MSI data' - compound search; dereplication; the peaklists are compared with theoretical peaks generated from a database of compounds/sequences.

Maximum Number of Threads

The maximum number of threads used when an experimental peaklist is compared with theoretical spectra of peptide sequence candidates.


Experimental Spectrum/Spectra

Peptide Type

Select the type of peptide or polyketide.

File (CTRL + P)

Select a file. Supported file formats:

File format | Supported data type | Description | Notes
txt | peaklists | plain text file | Mass-to-charge ratio and intensity on each line, separated by a tabulator; multiple peaklists must be separated by an empty line.
mgf | peaklists | Mascot Generic File format |
mzML | profile spectra or peaklists | standard data format | m/z values and intensities must be stored as 64-bit/32-bit floats; no compression. OpenMS 2.x must be installed.
mzXML | peaklists | standard data format | OpenMS 2.x must be installed; the FileConverter tool is used to automatically convert this file format into mgf.
imzML | profile spectra or peaklists | standard data format | Processed or continuous data file format; m/z values and intensities must be stored as 64-bit/32-bit floats; no compression. OpenMS 2.x must be installed.
baf | profile spectra | Bruker's Analysis File | CompassXport 3.0 must be installed; Windows only.
raw | profile spectra or peaklists | Thermo raw file | OpenMS 2.x including ProteoWizard must be installed; Windows only.
raw | peaklists | Waters raw directory | Select a *.dat file in the raw directory; Windows only.
mis (deprecated) | profile spectra | Bruker's flexImaging File (old data format) | CompassXport 3.0 must be installed; Windows only. If the file "[name].mis" is selected, all the "analysis.baf" files in the subfolders of the "[name]" directory are searched and processed.
ser (deprecated) | profile spectra | Apex file format | CompassXport 3.0 must be installed; Windows only.
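The txt peaklist layout described above (one m/z and intensity pair per line, tabulator-separated, peaklists separated by an empty line) can be read with a few lines of Python. This is only an illustrative sketch; the function name parse_txt_peaklists is hypothetical and is not part of CycloBranch.

```python
def parse_txt_peaklists(text):
    """Split plain-text content into peaklists of (mz, intensity) tuples."""
    peaklists, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # An empty line closes the current peaklist.
            if current:
                peaklists.append(current)
                current = []
            continue
        # Each data line holds m/z and intensity separated by a tabulator.
        mz, intensity = line.split("\t")[:2]
        current.append((float(mz), float(intensity)))
    if current:
        peaklists.append(current)
    return peaklists


example = "100.5\t20\n200.25\t80\n\n150.0\t5\n"
peaklists = parse_txt_peaklists(example)
```

In this example, the empty line splits the input into two peaklists.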

CycloBranch supports the standard imzML data format for imaging mass spectra (see imzml.org). If the file includes profile spectra, they are automatically converted to centroided spectra using the OpenMS pipeline NoiseFilterGaussian >> BaseLineFilter >> PeakPickerHiRes (requires OpenMS 2.x installed). The raw data are converted by an external script, so the parameters of the pipeline (e.g., S/N ratio) can be adjusted by editing the file "External/windows/raw2peaks.bat", "External/linux/raw2peaks.sh" or "External/macosx/raw2peaks.sh", depending on your data requirements and the platform. Since the data conversion may take hours or days for large datasets (e.g., tens of gigabytes), CycloBranch stores the converted files with centroided spectra for further use. The new filenames are "filename_converted_fwhm_value.imzML" and "filename_converted_fwhm_value.ibd". CycloBranch automatically detects these files when the search process is repeated and opens a dialog to recommend their use. The file "filename_converted_fwhm_value.imzML" can also be selected directly as a peaklist file in Settings to skip the recommendation dialog.

msi-dialog.png
The dialog which recommends the use of centroided spectra instead of profile spectra.

Scan no.

The number of the spectrum to be processed in an LC-MS/MS data file. The number may differ from the internal scan number stored in the input data file.

Precursor m/z Ratio

Enter the precursor mass-to-charge (m/z) ratio. The value is decharged automatically. The precursor m/z ratio in the input peaklist file is ignored.

Precursor Ion Adduct

Enter the formula of a precursor ion adduct (e.g., Na, K, Li). H is used by default if the field is empty. If a metal is used (e.g., FeH-2) and the option "Generate Full Isotope Patterns" is disabled, the peaks of isotopes having natural abundances greater than 3% are generated automatically in theoretical peaklists next to the peaks of most abundant isotopes.

Examples of precursor ion adducts:

Precursor Ion | Precursor Adduct
[M+H]+ | H or empty
[M+Na]+ | Na
[M+K]+ | K
[M-H]- | H or empty
[M+Fe-2H]+ | FeH-2
[M+Fe-3H+Na]+ | FeH-3Na
[M+Fe-4H]- | FeH-4
[M+Al-2H]+ | AlH-2
[M+Si-3H]+ | SiH-3
[M+Zn-H]+ | ZnH-1
[M+Zn-2H+Na]+ | ZnH-2Na
[M+Zn-3H]- | ZnH-3
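The adduct notation used in the examples above places an optional signed count after each element symbol (e.g., FeH-3Na means one Fe, minus three H, plus one Na). The following Python sketch shows one way to read such a string. The grammar is inferred from the examples only, and the function name parse_adduct is hypothetical; it is not the parser used by CycloBranch.

```python
import re


def parse_adduct(formula):
    """Parse an adduct formula such as 'FeH-3Na' into signed element counts."""
    if not formula:
        # An empty field means that H is used by default.
        return {"H": 1}
    counts = {}
    # Element symbol followed by an optional, possibly negative, count.
    for element, number in re.findall(r"([A-Z][a-z]?)(-?\d+)?", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


print(parse_adduct("FeH-3Na"))
```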

Charge

Enter the charge of the precursor ion. Negative values are allowed. The charge of the precursor ion in the input peaklist file is ignored. If the mode "Compare Peaklist(s) with Database - MS or MSI data" is used, the value determines the maximum charge of generated theoretical peaks. For example, 3 means that theoretical peaks of compounds are generated with charges 1+, 2+, and 3+. The value -1 means that theoretical peaks of compounds are generated with charge 1-.
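The relation between the "Charge" value and the generated charge states can be sketched as follows. The behaviour for values below -1 (e.g., -2 giving 1- and 2-) is an assumption made by analogy with the positive case. The m/z formula covers only plain protonation or deprotonation, which is also an assumption; the function names are hypothetical.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da (rounded)


def charge_states(max_charge):
    """Charge states generated for a given 'Charge' setting."""
    sign = 1 if max_charge > 0 else -1
    return [sign * z for z in range(1, abs(max_charge) + 1)]


def mz_protonated(neutral_mass, charge):
    """m/z of [M+zH]z+ (charge > 0) or [M-zH]z- (charge < 0)."""
    return (neutral_mass + charge * PROTON_MASS) / abs(charge)


for z in charge_states(3):
    print(z, round(mz_protonated(1000.0, z), 6))
```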

Precursor m/z Error Tolerance

Enter the precursor m/z error tolerance in ppm.

m/z Error Tolerance

Enter the m/z error tolerance in MS mode or the fragment m/z error tolerance in MS/MS mode [ppm].
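A tolerance given in ppm is relative to the m/z value being compared. The sketch below shows the usual definition of the ppm error and a tolerance check; the function names are illustrative and the choice of the theoretical m/z as the reference value is an assumption.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed relative error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz, theoretical_mz, tolerance_ppm):
    """True if the measured peak lies within the ppm tolerance."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tolerance_ppm


# A deviation of 0.002 at m/z 500 corresponds to about 4 ppm.
print(within_tolerance(500.002, 500.0, 5.0))
```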

Minimum Threshold of Relative Intensity

Enter the minimum threshold of relative intensity in %. Peaks with relative intensities below the threshold are removed from the peaklist.

Minimum Threshold of Absolute Intensity

Enter the minimum threshold of absolute intensity. Peaks with absolute intensities below the threshold are removed from the peaklist.

Minimum m/z Ratio

Enter the minimum m/z ratio. Peaks with m/z ratios below the threshold are removed from the peaklist.
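The three thresholds above act as independent filters on the experimental peaklist. A minimal sketch, assuming that the relative intensity is expressed in % of the most intense peak (the function name filter_peaks and its parameters are hypothetical):

```python
def filter_peaks(peaks, min_rel_intensity=0.0, min_abs_intensity=0.0, min_mz=0.0):
    """Keep (mz, intensity) peaks that pass all three thresholds."""
    if not peaks:
        return []
    # Assumption: relative intensity is measured against the base peak.
    base = max(intensity for _, intensity in peaks)
    kept = []
    for mz, intensity in peaks:
        relative = 100.0 * intensity / base if base > 0 else 0.0
        if (relative >= min_rel_intensity
                and intensity >= min_abs_intensity
                and mz >= min_mz):
            kept.append((mz, intensity))
    return kept


peaks = [(50.0, 500.0), (150.0, 10.0), (300.0, 1000.0), (450.0, 40.0)]
filtered = filter_peaks(peaks, min_rel_intensity=2.0,
                        min_abs_intensity=20.0, min_mz=100.0)
```

Here the peak at m/z 50 is removed by the minimum m/z ratio and the peak at m/z 150 by the intensity thresholds.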

FWHM

Full width at half maximum. The value is used if the profile spectra are converted into peaklists (mzML and imzML) and if the full isotope patterns of compounds are generated.


Database of Building Blocks

Building Blocks Database File (CTRL + B)

A text file containing a database of building blocks. See Format of Building Blocks Databases.

Maximum Number of Combined Blocks (start / middle / end)

Maximum number of combined building blocks to skip a gap in a de novo graph. A small value speeds up the search and vice versa. Three different values are set, depending on the position of the gap between peaks in a spectrum:

  • Start: Maximum number of combined building blocks to skip a gap leading from a start point in a de novo graph.
  • Middle: Maximum number of combined building blocks to skip a gap in the middle of the graph, i.e., a gap that neither leads from a start point nor leads to an end point.
  • End: Maximum number of combined building blocks to skip a gap leading to an end point in the de novo graph.

In the following scheme, beauverolide I is detected. Since the peak between m/z 1 and 304 is missing, the minimum values start/middle/end are 2/1/1. Otherwise, beauverolide I is not detected when "Incomplete Paths in De Novo Graph" is set to "keep" or "remove".

denovograph-v4.png
De novo graph created from mass spectrum of beauverolide I.
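To illustrate why the maximum number of combined blocks matters, the sketch below searches for combinations of building blocks whose summed mass explains a gap between two peaks. It is a simplified illustration, not the CycloBranch algorithm: the block list is a small hypothetical database of amino acid residue masses, and the function name blocks_for_gap and its parameters are assumptions.

```python
from itertools import combinations_with_replacement

# Monoisotopic residue masses (Da) of a few amino acids.
BLOCKS = {"Gly": 57.021464, "Ala": 71.037114, "Val": 99.068414,
          "Leu": 113.084064, "Phe": 147.068414}


def blocks_for_gap(gap_mass, max_blocks, tolerance_ppm=5.0, max_cumulative_mass=0.0):
    """Return block combinations (up to max_blocks) matching the gap mass."""
    hits = []
    for k in range(1, max_blocks + 1):
        for combo in combinations_with_replacement(sorted(BLOCKS), k):
            total = sum(BLOCKS[name] for name in combo)
            # 0 means that the cumulative mass is unlimited.
            if max_cumulative_mass > 0 and total > max_cumulative_mass:
                continue
            if abs(total - gap_mass) / gap_mass * 1e6 <= tolerance_ppm:
                hits.append(combo)
    return hits


gap = 170.105528
print(blocks_for_gap(gap, max_blocks=1))  # no single block fits
print(blocks_for_gap(gap, max_blocks=2))  # two isobaric pairs fit
```

With a limit of 1 the gap cannot be bridged, while a limit of 2 yields two isobaric candidates (Ala+Val and Gly+Leu). Larger limits increase the number of combinations to test, which is why small values speed up the search.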

Tip: When an unknown molecule is analyzed, check peptide sequence tags first. Use the following settings:

  • Set start/middle/end to 1/1/1.
  • Set "Peptide Type" to "linear" or "cyclic".
  • Set "Incomplete Paths in De Novo Graph" to "connect".

When a branched peptide is analyzed, the minimum values are 1/2/1 (the branch is in the middle), 1/1/3 (the branch is at the end) or 3/1/1 (the branch is at the beginning). If smaller values are used, the branch cannot be detected. See also Branched Sequence Detection.

When a branch-cyclic peptide is analyzed, the minimum values are 1/2/1, 1/1/2 or 2/1/1. See also Branch-cyclic Sequence Detection.

Maximum Cumulative Mass of Blocks

Enter the maximum cumulative mass of combined blocks (0 = the maximum mass is unlimited). A small value speeds up the search and vice versa.

N-/C-terminal Modifications File (CTRL + M)

A text file containing a list of N-terminal and C-terminal modifications. See Format of Modification Databases.


Miscellaneous

Incomplete Paths in De Novo Graph

The operation to be performed with edges forming incomplete paths (i.e., the paths which do not lead from the start node or do not lead to the end node). The following options can be selected:

  • keep (edges are kept - useful if you would like to see the whole de novo graph in 'View -> Graph')
  • remove (edges are removed - speeds up the search)
  • connect (edges are connected - useful if you are looking for sequence tags)
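The "remove" option can be pictured as pruning every edge that does not lie on some path from the start node to the end node. The following sketch shows one common way to do this with two reachability passes. It is an assumption about how such pruning can be implemented, not a description of the CycloBranch code, and the "connect" option is not covered.

```python
def remove_incomplete_edges(edges, start, end):
    """Keep only edges lying on at least one path from start to end."""
    forward, backward = {}, {}
    for u, v in edges:
        forward.setdefault(u, []).append(v)
        backward.setdefault(v, []).append(u)

    def reachable(source, adjacency):
        # Iterative depth-first search over the given adjacency lists.
        seen, stack = {source}, [source]
        while stack:
            node = stack.pop()
            for nxt in adjacency.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    from_start = reachable(start, forward)   # nodes reachable from start
    to_end = reachable(end, backward)        # nodes from which end is reachable
    return [(u, v) for u, v in edges if u in from_start and v in to_end]


edges = [("S", "A"), ("A", "E"), ("S", "B"), ("C", "E")]
pruned = remove_incomplete_edges(edges, "S", "E")
```

In this example, the edge S-B is dropped because B does not lead to the end node, and C-E is dropped because C is not reachable from the start node.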

Cyclic N-terminus

The water molecule is subtracted from all theoretical N-terminal fragment ions and the theoretical precursor mass. This feature is useful if a linear peptide includes a small cycle close to the N-terminus. If the linear polyketide is selected as the peptide type, the water molecule is subtracted only from the precursor ion.

Cyclic C-terminus

The water molecule is subtracted from all theoretical C-terminal fragment ions and the theoretical precursor mass. This feature is useful if a linear peptide includes a small cycle close to the C-terminus. If the linear polyketide is selected as the peptide type, the water molecule is subtracted only from the precursor ion.

Enable Scrambling

Generate scrambled fragment ions of cyclic peptides in theoretical spectra.

Disable Precursor Mass Filter

Disable the filtering of sequence candidates by precursor mass. This option can be used to determine a peptide family when a modified peptide is included in a sequence database.

Regular Order of Ketide Blocks

Keep only polyketide sequence candidates whose ketide building blocks are in the regular order [water eliminating block]-[2H eliminating block]-[water eliminating block]-[2H eliminating block], etc.
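The regular-order condition can be sketched as an alternation check over the elimination type of each ketide block. The labels "H2O" and "H2" and the function name are hypothetical, and the sketch checks alternation only; whether a sequence must start with a water eliminating block is left open here.

```python
def has_regular_order(eliminations):
    """True if consecutive blocks never eliminate the same group."""
    return all(a != b for a, b in zip(eliminations, eliminations[1:]))


print(has_regular_order(["H2O", "H2", "H2O", "H2"]))
print(has_regular_order(["H2O", "H2O", "H2"]))
```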


Theoretical Spectrum/Spectra

Sequence/Compound Database File (CTRL + T)

A text file containing a database of sequences/compounds. See Format of Sequence/Compound Databases.

Score Type

A score for peptide-spectrum matches. The following scores are predefined:

  • Number of matched peaks
  • Sum of relative intensities of matched peaks
  • Number of b-ions
  • Number of y-ions
  • Number of b-ions + y-ions
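The predefined scores are simple counts and sums over the matched peaks. A minimal sketch, assuming that each matched peak is available as a pair of ion type and relative intensity (this data layout and the function name are hypothetical):

```python
def score_matches(matches):
    """Compute the predefined scores from (ion_type, relative_intensity) pairs."""
    return {
        "matched_peaks": len(matches),
        "sum_relative_intensity": sum(i for _, i in matches),
        "b_ions": sum(1 for t, _ in matches if t == "b"),
        "y_ions": sum(1 for t, _ in matches if t == "y"),
        "b_plus_y_ions": sum(1 for t, _ in matches if t in ("b", "y")),
    }


matches = [("b", 100.0), ("b", 40.0), ("y", 25.5), ("a", 10.0)]
scores = score_matches(matches)
```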

Maximum Number of Sequence Candidates Reported

The maximum number of peptide sequence candidates listed in the output report.

Peptide Sequence Tag

Each peptide sequence candidate generated from a de novo graph must fulfil the peptide sequence tag. Otherwise, its theoretical spectrum is not generated and the peptide sequence candidate is excluded from the search. A name of a building block must be enclosed in '[' and ']'. Enclosed building blocks must be separated by '-'. A branch of a branched or a branch-cyclic peptide must be enclosed in "\(" and "\)". Additional backslashes must be used because of a special meaning of '(' and ')' in regular expressions. The tag can be a regular expression in ECMAScript syntax except symbols '[' and ']' which have a special meaning and enclose a building block. Examples of peptide sequence tags:

  • [Val]-[Lac]-[Val]-[Hiv] (a peptide sequence candidate must contain the following subsequence: valine, lactic acid, valine, 2-hydroxyisovaleric acid)
  • ([Val]-[Lac]-[Val]-[Hiv]-*){3} (the tag must be repeated exactly 3x in a peptide sequence candidate, i.e., the tag corresponds to valinomycin; the symbol '*' means that the previous symbol '-' may or may not be present)
  • [Pro]-[Ile]-[Ile]\([Orn]-[N-Ac-Ile]\)[Phe] (a tag corresponding to the branched peptide linearized pseudacyclin A)

Note that a peptide sequence candidate is kept if the tag is detected in any linearized sequence of a cyclic, branched or branch-cyclic peptide sequence. See also Format of Sequence Databases.
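The tag rules above can be tried out with Python's re module. The sketch treats '[' and ']' as literal delimiters of building blocks and leaves the rest of the tag as a regular expression. Python's syntax is close to, but not identical with, ECMAScript, so this only approximates the behaviour. The helper names are hypothetical, and only rotations in one direction are used as linearized sequences of a cyclic peptide, which is a simplification.

```python
import re


def tag_to_regex(tag):
    """Make '[' and ']' literal; keep the rest of the tag as a regex."""
    return re.compile(tag.replace("[", r"\[").replace("]", r"\]"))


def rotations(blocks):
    """Linearized sequences of a cyclic peptide (one direction only)."""
    return ["-".join(blocks[i:] + blocks[:i]) for i in range(len(blocks))]


def cyclic_matches(tag, blocks):
    """True if the tag is found in any rotation of the cyclic sequence."""
    pattern = tag_to_regex(tag)
    return any(pattern.search(seq) for seq in rotations(blocks))


valinomycin = "-".join(["[Val]-[Lac]-[Val]-[Hiv]"] * 3)
repeat_tag = "([Val]-[Lac]-[Val]-[Hiv]-*){3}"
branch_tag = r"[Pro]-[Ile]-[Ile]\([Orn]-[N-Ac-Ile]\)[Phe]"

print(bool(tag_to_regex(repeat_tag).search(valinomycin)))
print(bool(tag_to_regex(branch_tag).search("[Pro]-[Ile]-[Ile]([Orn]-[N-Ac-Ile])[Phe]")))
print(cyclic_matches("[Val]-[Lac]-[Val]-[Hiv]", ["[Hiv]", "[Val]", "[Lac]", "[Val]"]))
```

The last call shows a tag that is absent from the sequence as written but present in one of its rotations.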

Ion Types

The following types of fragment ions are defined for MS/MS spectra:

  • A (a-ion)
  • B (b-ion)
  • C (c-ion)
  • X (x-ion)
  • Y (y-ion)
  • Z (z-ion)

Note that y-ions are not supported when cyclic peptides are searched. Fragment ions series of linear and cyclic polyketides are described on extra pages. See Nomenclature of Linear Polyketide Series and Nomenclature of Cyclic Polyketide Series.

Buttons:

  • "Select All" - select all fragment ions in the list.
  • "Clear All" - unselect all fragment ions in the list.
  • "Reset" - b-ion is selected automatically if a cyclic peptide is chosen as "Peptide Type". B-ion and y-ion are selected if a linear, branched or branch-cyclic peptide is chosen. [M+H]+ ion is selected if the mode 'Compare Peaklist(s) with Database - MS or MSI data' is selected.

The following types of ions are defined for MS and MSI spectra:

  • [M+H]+
  • [M+Na]+
  • [M+K]+
  • [M-H]-
  • [M+Na-2H]-
  • [M+K-2H]-
  • [M+Fe-2H]+
  • [M+Fe-3H+Na]+
  • [M+Fe-3H+K]+
  • [2M+Fe-2H]+
  • [2M+Fe-3H+Na]+
  • [2M+Fe-3H+K]+
  • [3M+Fe-2H]+
  • [3M+Fe-3H+Na]+
  • [3M+Fe-3H+K]+
  • [3M+2Fe-5H]+
  • [3M+2Fe-6H+Na]+
  • [3M+2Fe-6H+K]+
  • [M+Fe-4H]-
  • [2M+Fe-4H]-
  • [3M+Fe-4H]-
  • [3M+2Fe-7H]-
  • [M+NH4]+
  • [M]+
  • [M]-
  • [M+Li]+
  • [M+Mg-H]+
  • [M+Mg-2H+Na]+
  • [M+Mg-2H+K]+
  • [M+Mg-3H]-
  • [M+Al-2H]+
  • [M+Al-3H+Na]+
  • [M+Al-3H+K]+
  • [M+Al-4H]-
  • [M+Si-3H]+
  • [M+Si-4H+Na]+
  • [M+Si-4H+K]+
  • [M+Si-5H]-
  • [M+Ca-H]+
  • [M+Ca-2H+Na]+
  • [M+Ca-2H+K]+
  • [M+Ca-3H]-
  • [M+Sc-2H]+
  • [M+Sc-3H+Na]+
  • [M+Sc-3H+K]+
  • [M+Sc-4H]-
  • [M+Cr-2H]+
  • [M+Cr-3H+Na]+
  • [M+Cr-3H+K]+
  • [M+Cr-4H]-
  • [M+Mn-H]+
  • [M+Mn-2H+Na]+
  • [M+Mn-2H+K]+
  • [M+Mn-3H]-
  • [M+Co-H]+
  • [M+Co-2H+Na]+
  • [M+Co-2H+K]+
  • [M+Co-3H]-
  • [M+Ni-H]+
  • [M+Ni-2H+Na]+
  • [M+Ni-2H+K]+
  • [M+Ni-3H]-
  • [M+Cu-H]+
  • [M+Cu-2H+Na]+
  • [M+Cu-2H+K]+
  • [M+Cu-3H]-
  • [M+Zn-H]+
  • [M+Zn-2H+Na]+
  • [M+Zn-2H+K]+
  • [M+Zn-3H]-
  • [M+Ga-2H]+
  • [M+Ga-3H+Na]+
  • [M+Ga-3H+K]+
  • [M+Ga-4H]-

If a metallic ion is selected, the isotopes having natural abundance greater than 3% are generated automatically in theoretical peaklists next to the most abundant isotopes. If the option Generate Full Isotope Patterns is selected, this feature is disabled and vice versa.
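The 3% rule can be illustrated with the isotopes of iron and copper. The natural abundances below are approximate reference values added for illustration; they are not taken from CycloBranch, and the function name is hypothetical.

```python
# Approximate natural abundances in % (mass number -> abundance).
FE_ISOTOPES = {54: 5.845, 56: 91.754, 57: 2.119, 58: 0.282}
CU_ISOTOPES = {63: 69.15, 65: 30.85}


def reported_isotopes(abundances, threshold=3.0):
    """Mass numbers of isotopes whose abundance exceeds the threshold (%)."""
    return sorted(n for n, a in abundances.items() if a > threshold)


print(reported_isotopes(FE_ISOTOPES))
print(reported_isotopes(CU_ISOTOPES))
```

Under these values, an iron-containing ion would get peaks for 54Fe and 56Fe only, while both copper isotopes would be generated.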

Neutral Losses

Define and select the types of neutral losses which will be generated in theoretical spectra.

Buttons:

  • "Select All" - select all neutral losses in the list.
  • "Clear All" - unselect all neutral losses in the list.
  • "Add" - add a new item at the end of the list - a valid molecular formula must be entered.
  • "Remove" - remove selected items from the list.
  • "Default" - clear the list and restore the default neutral losses.

The default neutral losses are defined as follows:

Formula | Description
H2O | water
NH3 | ammonia
CO | formyl group
CO2 | carboxyl group
CONH | carbamyl group
CH2 | shortened carbon chain
C6H4 | benzene ring removal
CH2N2 | part of Arg side chain
CH2O | Ser side chain
CH2S | Cys side chain
C4H9N | Lys side chain
C4H4N2 | His side chain
C4H9N3 | Arg side chain
C9H7N | Trp side chain

The molecular formulas of proteinogenic amino acids side chains can be optionally combined from the default values as follows:

Amino acid | Side chain | Amino acid | Side chain
Gly | - | Asp | C2H2O2 (CH2+CO2)
Ala | CH2 | Gln | C3H5NO (CH2+CH2+CONH)
Ser | CH2O | Lys | C4H9N
Pro | C3H4 (not defined) | Glu | C3H4O2 (CH2+CH2+CO2)
Val | C3H6 (CH2+CH2+CH2) | Met | C3H6S (CH2+CH2+CH2S)
Thr | C2H4O (CH2+CH2O) | His | C4H4N2
Cys | CH2S | Phe | C7H6 (CH2+C6H4)
Leu | C4H8 (CH2+CH2+CH2+CH2) | Arg | C4H9N3
Ile | C4H8 (CH2+CH2+CH2+CH2) | Tyr | C7H6O (CH2O+C6H4)
Asn | C2H3NO (CH2+CONH) | Trp | C9H7N
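The combinations in the table can be verified by summing the element counts of the default neutral losses. A short sketch with hypothetical helper names; the output order (C, H, then the remaining elements alphabetically) is a formatting choice made here.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count the elements in a simple molecular formula such as 'CH2O'."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def combine(*formulas):
    """Sum several formulas and return the combined formula as a string."""
    total = Counter()
    for formula in formulas:
        total += parse_formula(formula)
    order = [e for e in ("C", "H") if e in total]
    order += sorted(e for e in total if e not in ("C", "H"))
    return "".join(e + (str(total[e]) if total[e] > 1 else "") for e in order)


print(combine("CH2", "CH2", "CONH"))  # Gln side chain
print(combine("CH2O", "C6H4"))        # Tyr side chain
```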

Maximum Number of Combined Losses

Maximum number of combined neutral losses.

Report Unmatched Theoretical Peaks

If checked, all unmatched theoretical peaks are reported. If unchecked, unmatched theoretical peaks are reported only if the corresponding isotope pattern has been matched. This feature may consume a lot of main memory; keep it disabled if possible.

Generate Full Isotope Patterns

The full isotope patterns of compounds are generated in theoretical spectra. The FWHM value is used for this purpose. If checked, the deisotoping is disabled automatically.

Minimum Pattern Size

The minimum number of peaks that must be matched in an isotope pattern of a compound to be reported (MS and MSI). "Generate Full Isotope Patterns" must be enabled.


Searched Sequence

Sequence

A peptide sequence which you are searching for, or a peptide sequence tag. A peptide sequence must be entered if "Mode" is set to "Compare Peaklist with Spectrum of Searched Sequence". Otherwise, the option is similar to "Peptide Sequence Tag", with the difference that a peptide sequence candidate is not removed from the search; it is only highlighted in the output list of peptide sequence candidates.

If the "Compare Peaklist with Spectrum of Searched Sequence" mode is used, the only possible format of the expression is a peptide sequence, e.g.,

[Val]-[Lac]-[Val]-[Hiv]-[Val]-[Lac]-[Val]-[Hiv]-[Val]-[Lac]-[Val]-[Hiv]

or

[Pro]-[Ile]-[Ile]\([Orn]-[N-Ac-Ile]\)[Phe]

Regular expressions like

([Val]-[Lac]-[Val]-[Hiv]-*){3}

are not supported in this mode. See also Format of Sequence Databases.

N-terminal Modification

A name of an N-terminal modification which belongs to the searched peptide. The name must be defined in N-/C-terminal Modifications File.

C-terminal Modification

A name of a C-terminal modification which belongs to the searched peptide. The name must be defined in N-/C-terminal Modifications File.

Branch Modification

A name of an N-terminal or C-terminal modification which belongs to a branch of a searched peptide (branched and branch-cyclic peptides only). The name must be defined in N-/C-terminal Modifications File.