CycloBranch
Settings

Settings Dialog

The settings dialog is opened using the command "Search -> Settings...". Use F1 to open this help.

settings.png
Main settings of the engine.

Basic Buttons

  • OK (Enter) - Accept changes and hide window.
  • Cancel (Esc) - Drop changes and hide window.
  • Apply - Accept changes and keep the window open.
  • Load (CTRL + L) - Load settings from a file (*.ini).
  • Save (CTRL + S) - Save settings in the current file (*.ini). When a file has not been loaded yet, the "Save As ..." file dialog is opened.
  • Save As... (CTRL + D) - Save settings into a file (*.ini).
  • Edit (CTRL + E) - Edit the peptide sequence using the Draw Peptide Tool. The sequence is imported into the draw peptide tool and visualized if its syntax is correct (see also Format of Sequence Databases).

Spectrum

Peptide Type

Select the type of peptide or polyketide.

File (CTRL + P)

Select a file. Supported file formats:

File format | Supported data type | Description | Notes
txt | peaklists | plain text file | Mass-to-charge ratio and intensity on each line, separated by a tab; multiple peaklists must be separated by an empty line.
mgf | peaklists | Mascot Generic File format | -
mzML | profile spectra or peaklists | standard data format | m/z values and intensities must be stored as 64-bit/32-bit floats; no compression. OpenMS 2.0/1.11 must be installed.
mzXML | peaklists | standard data format | OpenMS 2.0/1.11 must be installed; the FileConverter tool is used to automatically convert this file format into mgf.
imzML | profile spectra or peaklists | standard data format | Processed or continuous data file format; m/z values and intensities must be stored as 64-bit/32-bit floats; no compression. OpenMS 2.0/1.11 must be installed.
baf | profile spectra | Bruker's Analysis File | CompassXport 3.0 must be installed; Windows only.
raw | peaklists | Waters raw directory | Select a *.dat file in the raw directory; Windows only.
mis (deprecated) | profile spectra | Bruker's flexImaging File (old data format) | CompassXport 3.0 must be installed; Windows only. If the file "[name].mis" is selected, all the "analysis.baf" files in the subfolders of the "[name]" directory are searched and processed.
ser (deprecated) | profile spectra | Apex file format | CompassXport 3.0 must be installed; Windows only.
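The txt peaklist layout described in the table (one m/z and intensity pair per line, separated by a tab, with an empty line between peaklists) can be read with a few lines of code. The following is only a minimal sketch of such a reader; the function name parse_txt_peaklists is hypothetical and is not part of CycloBranch.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def parse_txt_peaklists(text: str) -> List[List[Peak]]:
    """Split plain text into peaklists; an empty line ends the current peaklist."""
    peaklists: List[List[Peak]] = []
    current: List[Peak] = []
    for line in text.splitlines():
        if not line.strip():
            # An empty line separates two peaklists.
            if current:
                peaklists.append(current)
                current = []
            continue
        # Each data line holds m/z and intensity separated by a tab.
        mz, intensity = line.split("\t")[:2]
        current.append((float(mz), float(intensity)))
    if current:
        peaklists.append(current)
    return peaklists


example = "100.5\t20.0\n200.25\t80.0\n\n300.0\t5.0\n"
for number, peaklist in enumerate(parse_txt_peaklists(example), start=1):
    print(number, peaklist)
```

The sketch assumes well-formed input; a line without a tab raises an error rather than being skipped.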

CycloBranch supports the standard imzML data format for imaging mass spectra (see imzml.org). If the file includes profile spectra, they are automatically converted to centroided spectra using the OpenMS pipeline NoiseFilterGaussian >> BaseLineFilter >> PeakPickerHiRes (requires OpenMS 2.0/1.11 to be installed). The raw data are converted by an external script, so the parameters of the pipeline (e.g., S/N ratio) can be adjusted by editing the file "External/windows/raw2peaks.bat", "External/linux/raw2peaks.sh" or "External/macosx/raw2peaks.sh", depending on your data requirements and platform. Since the conversion may be very time-consuming (hours or days) for large datasets (e.g., tens of gigabytes), CycloBranch stores the converted files with centroided spectra for further use. The new filenames are "filename_converted_fwhm_value.imzML" and "filename_converted_fwhm_value.ibd". CycloBranch automatically detects these files when the search is repeated and opens a dialog recommending their use. The file "filename_converted_fwhm_value.imzML" can also be selected directly as a peaklist file in Settings to skip the recommendation dialog.

msi-dialog.png
The dialog which recommends the use of centroided spectra instead of profile spectra.

Scan no.

Number of a spectrum to be processed in an LC-MS/MS data file. The number may be different from an internal scan number stored in the input data file.

Precursor m/z Ratio

Enter the precursor mass-to-charge (m/z) ratio. The value is decharged automatically. The precursor m/z ratio in the input peaklist file is ignored.
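The exact formula CycloBranch applies when decharging is not given on this page. As an illustration only, the sketch below assumes a protonated or deprotonated ion (the default H adduct) and converts an m/z value and a signed charge into a neutral monoisotopic mass. The function name neutral_mass is hypothetical, and other adducts would need their own mass correction.

```python
PROTON_MASS = 1.00727646688  # mass of a proton in Da


def neutral_mass(precursor_mz: float, charge: int) -> float:
    """Neutral mass for [M+zH]z+ (charge > 0) or [M-zH]z- (charge < 0).

    Assumption: the charge carriers are protons only.
    """
    if charge == 0:
        raise ValueError("charge must be non-zero")
    # Positive mode removes the added protons, negative mode restores the lost ones.
    return abs(charge) * precursor_mz - charge * PROTON_MASS


print(neutral_mass(501.00727646688, 1))
print(neutral_mass(498.99272353312, -1))
```

Both calls describe the same hypothetical compound of about 500 Da, once as [M+H]+ and once as [M-H]-.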

Precursor Ion Adduct

Enter the formula of a precursor ion adduct (e.g., Na, K, Li). H is used by default if the field is empty. If a metal is used (e.g., FeH-2) and the option "Generate Full Isotope Patterns" is disabled, the peaks of isotopes having natural abundances greater than 3% are generated automatically in theoretical peaklists next to the peaks of most abundant isotopes.

Examples of precursor ion adducts:

Precursor Ion | Precursor Adduct
[M+H]+ | H or empty
[M+Na]+ | Na
[M+K]+ | K
[M-H]- | H or empty
[M+Fe-2H]+ | FeH-2
[M+Fe-3H+Na]+ | FeH-3Na
[M+Fe-4H]- | FeH-4
[M+Al-2H]+ | AlH-2
[M+Zn-H]+ | ZnH-1
[M+Zn-2H+Na]+ | ZnH-2Na
[M+Zn-3H]- | ZnH-3
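The adduct formulas in the examples above combine element symbols with optional signed counts (e.g., FeH-3Na means one Fe, minus three H, plus one Na). The following sketch shows one way such a string could be broken into element counts. The function name parse_adduct and the parsing rules are assumptions made for illustration; they are not taken from the CycloBranch source code.

```python
import re
from typing import Dict

# An element symbol (one capital letter, optional lowercase letter)
# followed by an optional signed integer count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(-?\d+)?")


def parse_adduct(formula: str) -> Dict[str, int]:
    """Return element counts for an adduct formula such as 'FeH-3Na'."""
    formula = formula.strip()
    if not formula:
        # H is used by default if the field is empty.
        return {"H": 1}
    counts: Dict[str, int] = {}
    pos = 0
    while pos < len(formula):
        match = _TOKEN.match(formula, pos)
        if match is None:
            raise ValueError(f"cannot parse adduct formula at position {pos}: {formula!r}")
        element, count = match.group(1), match.group(2)
        counts[element] = counts.get(element, 0) + (int(count) if count else 1)
        pos = match.end()
    return counts


print(parse_adduct("FeH-3Na"))  # {'Fe': 1, 'H': -3, 'Na': 1}
print(parse_adduct("ZnH-1"))    # {'Zn': 1, 'H': -1}
```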

Charge

Enter the charge of the precursor ion. Negative values are allowed. The charge of the precursor ion in the input peaklist file is ignored. If the mode "Compare Peaklist(s) with Database - MS or MSI data" is used, the value determines the maximum charge of generated theoretical peaks. For example, 3 means that theoretical peaks of compounds are generated with charges 1+, 2+, and 3+. The value -1 means that theoretical peaks of compounds are generated with charge 1-.
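The rule for the maximum charge can be written as a small helper. This is a sketch only: the name charge_states is hypothetical, and the behavior for values below -1 (e.g., -2 giving 1- and 2-) is an assumption extrapolated from the two examples above.

```python
from typing import List


def charge_states(max_charge: int) -> List[int]:
    """List the charges generated for a given maximum charge setting."""
    if max_charge == 0:
        raise ValueError("charge must be non-zero")
    step = 1 if max_charge > 0 else -1
    # Count from 1 (or -1) up to and including the entered value.
    return list(range(step, max_charge + step, step))


print(charge_states(3))   # [1, 2, 3]
print(charge_states(-1))  # [-1]
```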

Precursor m/z Error Tolerance

Enter the precursor m/z error tolerance in ppm.

m/z Error Tolerance

Enter the m/z error tolerance in MS mode or the fragment m/z error tolerance in MS/MS mode [ppm].
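A tolerance in ppm scales with the m/z value: 10 ppm corresponds to 0.005 at m/z 500 and to 0.01 at m/z 1000. The sketch below uses the usual definition of the relative error in ppm; the function names are hypothetical and the code is not taken from CycloBranch.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed relative error of an observed m/z value in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float, tolerance_ppm: float) -> bool:
    """True if the observed peak lies within the ppm tolerance of the theoretical peak."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tolerance_ppm


print(within_tolerance(500.004, 500.0, 10.0))  # about 8 ppm
print(within_tolerance(500.006, 500.0, 10.0))  # about 12 ppm
```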

m/z Error Tolerance for Deisotoping

Enter the m/z error tolerance for deisotoping in MS/MS mode [ppm] (the same value as "m/z Error Tolerance" is recommended by default; 0 = the deisotoping is disabled).

Minimum Threshold of Relative Intensity

Enter the minimum threshold of relative intensity in %. Peaks with relative intensities below the threshold are removed from the peaklist.

Minimum Threshold of Absolute Intensity

Enter the minimum threshold of absolute intensity. Peaks with absolute intensities below the threshold are removed from the peaklist.

Minimum m/z Ratio

Enter the minimum m/z ratio. Peaks with m/z ratios below the threshold are removed from the peaklist.
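The three thresholds above (relative intensity, absolute intensity, and minimum m/z) can be pictured as a single filtering pass over a peaklist. The sketch below assumes that relative intensity is expressed in % of the most intense peak in the peaklist; this assumption, like the function name filter_peaks, is illustrative and not confirmed by this page.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, absolute intensity)


def filter_peaks(peaks: List[Peak], min_relative_pct: float = 0.0,
                 min_absolute: float = 0.0, min_mz: float = 0.0) -> List[Peak]:
    """Keep only the peaks that pass all three thresholds."""
    if not peaks:
        return []
    # Assumption: 100 % corresponds to the most intense peak.
    base = max(intensity for _, intensity in peaks)
    kept: List[Peak] = []
    for mz, intensity in peaks:
        relative = 100.0 * intensity / base if base > 0 else 0.0
        if relative >= min_relative_pct and intensity >= min_absolute and mz >= min_mz:
            kept.append((mz, intensity))
    return kept


peaks = [(50.0, 10.0), (150.0, 1.0), (250.0, 100.0), (350.0, 30.0)]
print(filter_peaks(peaks, min_relative_pct=5.0, min_absolute=20.0, min_mz=100.0))
```

In this example, the first peak falls below the minimum m/z and the second below the relative intensity threshold, so only the last two peaks remain.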

FWHM

Full width at half maximum. The value is used if the profile spectra are converted into peaklists (mzML and imzML) and if full isotope patterns of compounds are generated (MS and MSI).


Database of Building Blocks

Building Blocks Database File (CTRL + B)

A text file containing a database of building blocks. See Format of Building Blocks Databases.

Maximum Number of Combined Blocks (start / middle / end)

Maximum number of combined building blocks to skip a gap in a de novo graph. A small value speeds up the search and vice versa. Three different values are set, depending on the position of a gap between peaks in a spectrum:

  • Start: Maximum number of combined building blocks to skip a gap leading from a start point in a de novo graph.
  • Middle: Maximum number of combined building blocks to skip a gap in the middle of the graph, i.e., any gap except those leading from a start point or to an end point.
  • End: Maximum number of combined building blocks to skip a gap leading to an end point in the de novo graph.

In the following scheme, beauverolide I is detected. Since the peak between m/z 1 and 304 is missing, the minimum values start/middle/end are 2/1/1. Otherwise, beauverolide I is not detected when "Incomplete Paths in De Novo Graph" is set up to "keep" or "remove".

denovograph-v4.png
De novo graph created from mass spectrum of beauverolide I.

Tip: When an unknown molecule is analyzed, check peptide sequence tags first. Use the following settings:

  • Set up start/middle/end to 1/1/1.
  • Set up the "Peptide Type" to "linear" or "cyclic".
  • Set up the "Incomplete Paths in De Novo Graph" to "connect".

When a branched peptide is analyzed, the minimum values are 1/2/1 (the branch is in the middle), 1/1/3 (the branch is at the end) or 3/1/1 (the branch is at the beginning). If smaller values are used, the branch cannot be detected. See also Branched Sequence Detection.

When a branch-cyclic peptide is analyzed, the minimum values are 1/2/1, 1/1/2 or 2/1/1. See also Branch-cyclic Sequence Detection.

Maximum Cumulative Mass of Blocks

Enter the maximum cumulative mass of combined blocks (0 = the maximum mass is unlimited). A small value speeds up the search and vice versa.

N-/C-terminal Modifications File (CTRL + M)

A text file containing a list of N-terminal and C-terminal modifications. See Format of Modification Databases.


Miscellaneous

Incomplete Paths in De Novo Graph

The operation to be performed with edges forming incomplete paths (i.e., the paths which do not lead from the start node or do not lead to the end node). The following options can be selected:

  • keep (edges are kept - useful if you would like to see the whole de novo graph in 'View -> Graph')
  • remove (edges are removed - speeds up the search)
  • connect (edges are connected - useful if you are looking for sequence tags)

Cyclic N-terminus

The water molecule is subtracted from all theoretical N-terminal fragment ions and the theoretical precursor mass. This feature is useful if a linear peptide includes a small cycle close to the N-terminus. If the linear polyketide is selected as the peptide type, the water molecule is subtracted only from the precursor ion.

Cyclic C-terminus

The water molecule is subtracted from all theoretical C-terminal fragment ions and the theoretical precursor mass. This feature is useful if a linear peptide includes a small cycle close to the C-terminus. If the linear polyketide is selected as the peptide type, the water molecule is subtracted only from the precursor ion.

Enable Scrambling

Generate scrambled fragment ions of cyclic peptides in theoretical spectra.

Disable Precursor Mass Filter

Disable the filtering of sequence candidates by precursor mass. This option can be used to determine a peptide family when a modified peptide is included in a sequence database.

Regular Order of Ketide Blocks

Keep only polyketide sequence candidates whose ketide building blocks are in the regular order [water eliminating block]-[2H eliminating block]-[water eliminating block]-[2H eliminating block], etc.


Application

Mode

  • 'De Novo Search Engine' - the default mode of the application.
  • 'Compare Peaklist with Spectrum of Searched Sequence' - a theoretical spectrum is generated for the input "Searched Sequence" and is compared with the peaklist.
  • 'Compare Peaklist with Database - MS/MS data' - a peaklist is compared with theoretical spectra generated from a database of sequences.
  • 'Compare Peaklist(s) with Database - MS or MSI data' - compound search; dereplication; the peaklists are compared with theoretical peaks generated from a database of compounds/sequences.

Sequence/Compound Database File (CTRL + T)

A text file containing a database of sequences/compounds. See Format of Sequence/Compound Databases.

Maximum Number of Threads

A maximum number of threads used if the peaklist is compared with theoretical spectra of peptide sequence candidates.

Score Type

A score for peptide-spectrum matches. The following scores are predefined:

  • Number of b-ions
  • Number of b-ions + dehydrated b-ions
  • Number of b-ions + deamidated b-ions
  • Number of y-ions + b-ions (not for cyclic peptides)
  • Number of y-ions (not for cyclic peptides)
  • Sum of relative intensities of matched peaks
  • Number of matched peaks
  • Number of matched bricks (cyclic peptides; see Cyclic Peptides)

Maximum Number of Sequence Candidates Reported

The maximum length of an output report with peptide sequence candidates. A large value may slow down the search and consume a lot of main memory.

Peptide Sequence Tag

Each peptide sequence candidate generated from a de novo graph must match the peptide sequence tag. Otherwise, its theoretical spectrum is not generated and the peptide sequence candidate is excluded from the search. The name of a building block must be enclosed in '[' and ']'. Enclosed building blocks must be separated by '-'. A branch of a branched or a branch-cyclic peptide must be enclosed in "\(" and "\)". The additional backslashes are required because '(' and ')' have a special meaning in regular expressions. The tag can be a regular expression in ECMAScript syntax, except that the symbols '[' and ']' have a special meaning and enclose a building block. Examples of peptide sequence tags:

  • [Val]-[Lac]-[Val]-[Hiv] (a peptide sequence candidate must contain the following subsequence: valine, lactic acid, valine, 2-hydroxyisovaleric acid)
  • ([Val]-[Lac]-[Val]-[Hiv]-*){3} (the tag must be repeated exactly 3x in a peptide sequence candidate, i.e., the tag corresponds to valinomycin; the symbol '*' means that the previous symbol '-' may or may not be present)
  • [Pro]-[Ile]-[Ile]\([Orn]-[N-Ac-Ile]\)[Phe] (a tag corresponding to the branched peptide linearized pseudacyclin A)

Note that a peptide sequence candidate is kept if the tag is detected in any linearized sequence of a cyclic, branched or branch-cyclic peptide sequence. See also Format of Sequence Databases.
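To see how such a tag behaves as a regular expression, it can be turned into an ordinary pattern by escaping the block delimiters '[' and ']'. The sketch below uses Python's re module, whose syntax differs from ECMAScript in some details but agrees with it for the examples above. The function names are hypothetical, and it is an assumption that candidate sequences are compared as plain strings with unescaped parentheses around a branch.

```python
import re


def tag_to_regex(tag: str) -> re.Pattern:
    """Escape '[' and ']' so that they delimit building blocks literally."""
    escaped = tag.replace("[", r"\[").replace("]", r"\]")
    return re.compile(escaped)


def matches_tag(tag: str, sequence: str) -> bool:
    """True if the tag occurs anywhere in the (linearized) sequence."""
    return tag_to_regex(tag).search(sequence) is not None


valinomycin = "-".join(["[Val]-[Lac]-[Val]-[Hiv]"] * 3)
print(matches_tag("[Val]-[Lac]-[Val]-[Hiv]", valinomycin))
print(matches_tag("([Val]-[Lac]-[Val]-[Hiv]-*){3}", valinomycin))

# Assumption: the candidate string holds the branch in plain parentheses.
pseudacyclin = "[Pro]-[Ile]-[Ile]([Orn]-[N-Ac-Ile])[Phe]"
print(matches_tag(r"[Pro]-[Ile]-[Ile]\([Orn]-[N-Ac-Ile]\)[Phe]", pseudacyclin))
```

Without the backslashes, the parentheses in the last tag would form a regular-expression group instead of matching the branch delimiters.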

Ion Types in Theoretical Spectra

The following types of fragment ions are predefined for MS/MS spectra:

  • A (a-ions)
  • A* (dehydrated a-ions)
  • Ax (deamidated a-ions)
  • A*x (dehydrated and deamidated a-ions)
  • B (b-ions)
  • B* (dehydrated b-ions)
  • Bx (deamidated b-ions)
  • B*x (dehydrated and deamidated b-ions)
  • C (c-ions)
  • C* (dehydrated c-ions)
  • Cx (deamidated c-ions)
  • C*x (dehydrated and deamidated c-ions)
  • X (x-ions)
  • X* (dehydrated x-ions)
  • Xx (deamidated x-ions)
  • X*x (dehydrated and deamidated x-ions)
  • Y (y-ions)
  • Y* (dehydrated y-ions)
  • Yx (deamidated y-ions)
  • Y*x (dehydrated and deamidated y-ions)
  • Z (z-ions)
  • Z* (dehydrated z-ions)
  • Zx (deamidated z-ions)
  • Z*x (dehydrated and deamidated z-ions)

Note that y-ions and their derivatives are not supported when cyclic peptides are searched. The button "Select All" selects all fragment ions in the list. The button "Clear All" unselects all fragment ions in the list. The button "Reset" selects b-ions when a cyclic peptide is chosen as "Peptide Type". It selects b-ions and y-ions when a linear, branched or branch-cyclic peptide is chosen. When the mode is set up to 'Compare Peaklist(s) with Database - MS or MSI data', the [M+H]+ ion is selected.

Fragment ions series of linear and cyclic polyketides are described on extra pages. See Nomenclature of Linear Polyketide Series and Nomenclature of Cyclic Polyketide Series.

The following types of ions are predefined for MS and MSI spectra:

  • [M+H]+
  • [M+Na]+
  • [M+K]+
  • [M-H]-
  • [M+Na-2H]-
  • [M+K-2H]-
  • [M+Fe-2H]+
  • [M+Fe-3H+Na]+
  • [M+Fe-3H+K]+
  • [2M+Fe-2H]+
  • [2M+Fe-3H+Na]+
  • [2M+Fe-3H+K]+
  • [3M+Fe-2H]+
  • [3M+Fe-3H+Na]+
  • [3M+Fe-3H+K]+
  • [3M+2Fe-5H]+
  • [3M+2Fe-6H+Na]+
  • [3M+2Fe-6H+K]+
  • [M+Fe-4H]-
  • [2M+Fe-4H]-
  • [3M+Fe-4H]-
  • [3M+2Fe-7H]-
  • [M+Li]+
  • [M+Mg-H]+
  • [M+Mg-2H+Na]+
  • [M+Mg-2H+K]+
  • [M+Mg-3H]-
  • [M+Al-2H]+
  • [M+Al-3H+Na]+
  • [M+Al-3H+K]+
  • [M+Al-4H]-
  • [M+Ca-H]+
  • [M+Ca-2H+Na]+
  • [M+Ca-2H+K]+
  • [M+Ca-3H]-
  • [M+Sc-2H]+
  • [M+Sc-3H+Na]+
  • [M+Sc-3H+K]+
  • [M+Sc-4H]-
  • [M+Cr-2H]+
  • [M+Cr-3H+Na]+
  • [M+Cr-3H+K]+
  • [M+Cr-4H]-
  • [M+Mn-H]+
  • [M+Mn-2H+Na]+
  • [M+Mn-2H+K]+
  • [M+Mn-3H]-
  • [M+Co-H]+
  • [M+Co-2H+Na]+
  • [M+Co-2H+K]+
  • [M+Co-3H]-
  • [M+Ni-H]+
  • [M+Ni-2H+Na]+
  • [M+Ni-2H+K]+
  • [M+Ni-3H]-
  • [M+Cu-H]+
  • [M+Cu-2H+Na]+
  • [M+Cu-2H+K]+
  • [M+Cu-3H]-
  • [M+Zn-H]+
  • [M+Zn-2H+Na]+
  • [M+Zn-2H+K]+
  • [M+Zn-3H]-
  • [M+Ga-2H]+
  • [M+Ga-3H+Na]+
  • [M+Ga-3H+K]+
  • [M+Ga-4H]-
  • [M+NH4]+

If a metallic ion is selected, the isotopes having natural abundance greater than 3% are generated automatically in theoretical peaklists next to the most abundant isotopes. If the option Generate Full Isotope Patterns is selected, this feature is disabled for MS and MSI data.

Remove Hits of Fragments without Hits of Parent Fragments

If checked, a peak is not matched if the corresponding parent peak is not matched (e.g., a dehydrated b-ion is not matched if corresponding b-ion is not matched). Parent peaks are defined as follows:

  • A (parent ion is b-ion)
  • A* (parent ion is a-ion)
  • Ax (parent ion is a-ion)
  • A*x (parent ion is a-ion)
  • B (parent ion is b-ion)
  • B* (parent ion is b-ion)
  • Bx (parent ion is b-ion)
  • B*x (parent ion is b-ion)
  • C (parent ion is c-ion)
  • C* (parent ion is c-ion)
  • Cx (parent ion is c-ion)
  • C*x (parent ion is c-ion)
  • X (parent ion is x-ion)
  • X* (parent ion is x-ion)
  • Xx (parent ion is x-ion)
  • X*x (parent ion is x-ion)
  • Y (parent ion is y-ion)
  • Y* (parent ion is y-ion)
  • Yx (parent ion is y-ion)
  • Y*x (parent ion is y-ion)
  • Z (parent ion is z-ion)
  • Z* (parent ion is z-ion)
  • Zx (parent ion is z-ion)
  • Z*x (parent ion is z-ion)
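The parent definitions above form a simple lookup table. The sketch below builds that table and drops every hit whose parent ion is not matched. Two details are assumptions made for illustration: a parent is looked up at the same position in the series, and the removal is repeated until nothing changes (so a dehydrated a-ion is also dropped when its a-ion is removed for lacking a b-ion). The names PARENT and remove_orphan_hits are hypothetical.

```python
from typing import Dict, Set, Tuple

# Every ion type points to the plain ion of its own series ...
PARENT: Dict[str, str] = {
    series + suffix: series
    for series in "ABCXYZ"
    for suffix in ("", "*", "x", "*x")
}
# ... except the a-ion, whose parent is the b-ion.
PARENT["A"] = "B"

Hit = Tuple[str, int]  # (ion type, position in the series)


def remove_orphan_hits(hits: Set[Hit]) -> Set[Hit]:
    """Drop hits whose parent ion at the same position is not matched."""
    kept = set(hits)
    while True:
        survivors = {(ion, pos) for ion, pos in kept if (PARENT[ion], pos) in kept}
        if survivors == kept:
            return kept
        kept = survivors


hits = {("B", 2), ("B*", 2), ("B*", 3), ("A", 2), ("A*", 2), ("A", 4), ("A*", 4)}
print(sorted(remove_orphan_hits(hits)))
```

In this example, the hits at positions 3 and 4 are removed because no b-ion is matched there.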

Generate Full Isotope Patterns

Full isotope patterns of compounds are generated in theoretical spectra (MS and MSI). The FWHM value is used when theoretical patterns are generated.

Minimum Pattern Size

The minimum number of peaks that must be matched in an isotope pattern of a compound to be reported (MS and MSI). "Generate Full Isotope Patterns" must be enabled.


Searched Sequence

Sequence

A peptide sequence which you are searching for or a peptide sequence tag. A peptide sequence must be entered if "Mode" is set up to "Compare Peaklist with Spectrum of Searched Sequence". Otherwise, the option is similar to "Peptide Sequence Tag", except that a matching peptide sequence candidate is not removed from the search but is only highlighted in the output list of peptide sequence candidates. If the "Compare Peaklist with Spectrum of Searched Sequence" mode is used, the only possible format of the expression is a peptide sequence, e.g.,

[Val]-[Lac]-[Val]-[Hiv]-[Val]-[Lac]-[Val]-[Hiv]-[Val]-[Lac]-[Val]-[Hiv]

or

[Pro]-[Ile]-[Ile]\([Orn]-[N-Ac-Ile]\)[Phe]

Regular expressions like

([Val]-[Lac]-[Val]-[Hiv]-*){3}

are not supported in this mode. See also Format of Sequence Databases.

N-terminal Modification

A name of an N-terminal modification which belongs to the searched peptide. The name must be defined in N-/C-terminal Modifications File.

C-terminal Modification

A name of a C-terminal modification which belongs to the searched peptide. The name must be defined in N-/C-terminal Modifications File.

Branch Modification

A name of an N-terminal or C-terminal modification which belongs to a branch of a searched peptide (branched and branch-cyclic peptides only). The name must be defined in N-/C-terminal Modifications File.