GNP Database Search

Upload a database of molecules to identify known and predicted
compounds within an LC-MS/MS chromatogram

Mass spectrometry settings

The .mzXML format is accepted. An example .mzXML file is provided. The input can be either a full LC-MS/MS run or simply a series of MS/MS scans. For better results, we suggest basic pre-processing of the input spectra prior to analysis:
* 1. All peaks in MS/MS scans should be centroided.
* 2. Isotopic peaks of MS/MS fragments should NOT be removed.

Specify the minimum relative intensity for an MS2 fragment peak to be considered in peak matching. GNP refines the input MS2 spectra by removing peaks with intensity below this filter.

  %
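
The intensity filter can be sketched as follows. This is a minimal illustration, not GNP's own code: it assumes that relative intensity is measured against the most intense peak in the scan (the base peak) and that a scan is a list of (m/z, intensity) pairs. The function name and the example values are hypothetical.

```python
def filter_peaks(peaks, min_rel_intensity_pct):
    """Keep peaks whose intensity is at least the given percentage of the base peak.

    peaks: list of (mz, intensity) tuples for one MS2 scan.
    Assumption: "relative" means relative to the most intense peak in the scan.
    """
    if not peaks:
        return []
    base = max(intensity for _, intensity in peaks)
    if base <= 0:
        return []
    threshold = base * min_rel_intensity_pct / 100.0
    return [(mz, intensity) for mz, intensity in peaks if intensity >= threshold]


# Made-up scan: the base peak is 1000.0, so a 1% filter removes peaks below 10.0.
scan = [(120.08, 50.0), (245.13, 1000.0), (358.21, 9.0), (471.30, 400.0)]
print(filter_peaks(scan, 1.0))
```

With a 1% setting, only the peak at m/z 358.21 is dropped in this example.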

Specify the window size to filter the database for each MS/MS scan. Only the database structures within the mass window will be scored.

  ±Da

GNP fetches the charge state of MS/MS scan precursor ions from the input mzXML files. GNP will process an MS/MS scan multiple times, using all charge states in the specified range.

  •  to 
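
To illustrate how the mass window and the charge state range might interact, the sketch below computes a neutral mass for each charge state and keeps the database structures that fall inside the window. It assumes protonated positive-mode ions and monoisotopic masses; the function names, the database layout, and the example masses are hypothetical and are not taken from GNP.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def neutral_mass(precursor_mz, charge):
    """Neutral mass of an [M + zH]z+ ion (assumption: protons are the charge carriers)."""
    return (precursor_mz - PROTON_MASS) * charge


def candidates_for_scan(precursor_mz, charge_min, charge_max, database, window_da):
    """Return (charge, name) pairs for structures within +/- window_da of the neutral mass.

    database: list of (name, monoisotopic_mass) tuples.
    """
    hits = []
    for charge in range(charge_min, charge_max + 1):
        mass = neutral_mass(precursor_mz, charge)
        for name, structure_mass in database:
            if abs(structure_mass - mass) <= window_da:
                hits.append((charge, name))
    return hits


# Made-up database and precursor, for illustration only.
database = [("structure_A", 1000.52), ("structure_B", 2001.90)]
print(candidates_for_scan(1001.507276, 1, 2, database, 0.5))
print(candidates_for_scan(1001.507276, 1, 2, database, 1.0))
```

Widening the window from 0.5 Da to 1.0 Da brings in a second candidate at charge 2 in this example, which shows why a narrow window reduces the number of structures that need to be scored.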

In the default mode of precise search, iSNAP searches for structures that can be precisely matched with the input MS/MS scans. If analogue search mode is selected, iSNAP will also search for analogue structures that differ from the seed structure at one monomer.

  • Precise Search
  • Precise and Analogue Search

Specify the minimum P1 score required for displaying a search result (see Ibrahim et al. for description).

Specify the minimum P2 score required for displaying a search result (see Ibrahim et al. for description).
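
As a rough sketch of how the two thresholds could act on a result list, the example below keeps a result only if it meets both minimum scores and then sorts by P1 score. It assumes that both thresholds must be satisfied and that sorting is in descending order; the dictionary keys and all score values are made up for illustration.

```python
def visible_results(results, min_p1, min_p2):
    """Filter results by both score thresholds, then sort by P1 (highest first).

    results: list of dicts with hypothetical keys "scan", "p1" and "p2".
    """
    kept = [r for r in results if r["p1"] >= min_p1 and r["p2"] >= min_p2]
    return sorted(kept, key=lambda r: r["p1"], reverse=True)


# Made-up scores, for illustration only.
results = [
    {"scan": 812, "p1": 35.2, "p2": 20.1},
    {"scan": 640, "p1": 12.0, "p2": 25.0},
    {"scan": 955, "p1": 48.7, "p2": 8.3},
    {"scan": 1003, "p1": 41.0, "p2": 15.5},
]
for row in visible_results(results, min_p1=20.0, min_p2=10.0):
    print(row["scan"], row["p1"], row["p2"])
```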

Theoretical fragmentation settings

The built-in NRP database contains ~1100 NRP structures compiled from Antibase and the Dictionary of Natural Products. Users can also define compounds to be included in a search.

  • Built-in NRP database
  • Built-in lantibiotics database
  • Built-in ribosomal peptide database
  • User-defined compounds

Set up the theoretical fragmentation rules for database structures. Mark the allowed fragmentation types, and specify the number of sites that can be cleaved simultaneously.


Specify the m/z tolerance for matching a theoretical fragment with mass peaks in MS/MS spectra.

  ±Da
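
A tolerance match of this kind can be sketched as a nearest-peak lookup. The example assumes an absolute tolerance in Da and that each theoretical fragment is paired with the closest observed peak inside the tolerance; how iSNAP resolves multiple peaks within the tolerance is not stated here, so that choice is an assumption. The function name and values are hypothetical.

```python
from bisect import bisect_left


def match_fragments(theoretical_mzs, observed_peaks, tol_da):
    """Pair each theoretical m/z with the closest observed peak within +/- tol_da.

    observed_peaks: list of (mz, intensity) tuples, in any order.
    Returns a list of (theoretical_mz, (observed_mz, intensity)) tuples.
    """
    peaks = sorted(observed_peaks)
    mzs = [mz for mz, _ in peaks]
    matches = []
    for target in theoretical_mzs:
        i = bisect_left(mzs, target)
        best = None
        # Only the neighbours on either side of the insertion point can be closest.
        for j in (i - 1, i):
            if 0 <= j < len(mzs) and abs(mzs[j] - target) <= tol_da:
                if best is None or abs(mzs[j] - target) < abs(mzs[best] - target):
                    best = j
        if best is not None:
            matches.append((target, peaks[best]))
    return matches


# Made-up spectrum and fragment list, for illustration only.
observed = [(100.05, 10.0), (200.10, 20.0), (300.20, 5.0)]
print(match_fragments([100.06, 250.00, 300.19], observed, 0.02))
```

In this example the fragment at m/z 250.00 has no observed peak within 0.02 Da and is left unmatched.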

Specify the charge states of generated theoretical fragment ions.

  •   to 

To aid in matched fragment identification, GNP can automatically generate images of the structure identified for each fragment. GNP can also generate images of the structure of each result. Both options extend the runtime and are disabled by default.

  • Generate result images
  • Generate matched fragment images

Help

Learn how to use the GNP natural product discovery platform

Introduction

GNP is an integrated online genomics and chemoinformatics platform for natural product discovery. The GNP natural product discovery platform has three main components: genome search, scaffold library generation, and compound identification with iSNAP database search.

Genome Search

Step 1: Upload sequence file

Upload a whole genome, DNA cluster or contig, or amino acid cluster. Sequence must be in FASTA format.
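
Before uploading, it can help to confirm that a file really is in FASTA format. The sketch below is a minimal reader that assumes the usual layout: a header line starting with '>' followed by one or more sequence lines. The function name and the sample records are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header line")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up sequences, for illustration only.
sample = ">contig_1 example\nATGGCC\nTTGA\n>contig_2\nATGAAATAG\n"
for name, sequence in parse_fasta(sample):
    print(name, len(sequence))
```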

Step 2: Scaffold identification

Predicted chemical scaffolds for nonribosomal peptide and polyketide gene clusters will be loaded in the scaffold library generation screen automatically when the genome search has finished executing. These structures can be combinatorialized using the scaffold library generator.

Scaffold Library Generator

For each predicted scaffold, R groups can be added using the JSME molecule editor. See the JSME help page for more information regarding the molecule editor.

Step 1: Scaffold input

Scaffolds will be automatically generated for each gene cluster found by GNP's genome search. However, you can also input your own scaffold or library of scaffolds for combinatorialization by navigating to GNP's scaffold screen.

A single scaffold can be entered in SMILES format, or multiple scaffolds can be uploaded as a plain-text file of SMILES with each molecule on its own line. Uploaded scaffolds should not contain R groups, as the addition of R groups is only supported after upload.

Step 2: Library generation

To add R groups to a predicted scaffold, select 'X' in the editor, and type R and your desired R-group number (for example, R4). Each scaffold must contain at least one R group. R group numbers must be a series of consecutive integers beginning at 1. Once you have added all R groups for a given scaffold, you can move on to the next by selecting 'Next'. If you do not wish to include a given scaffold in your library, select 'Remove'.
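
The numbering rule can be checked with a few lines of code. This sketch assumes R sites are labelled "R" followed by an integer, as in the example above; the function name is hypothetical.

```python
import re


def r_group_numbers_valid(labels):
    """Return True if the labels form R1, R2, ..., Rn with no gaps and n >= 1."""
    numbers = set()
    for label in labels:
        match = re.fullmatch(r"R(\d+)", label)
        if match is None:
            return False
        numbers.add(int(match.group(1)))
    # At least one R group, and the numbers must be exactly 1..n.
    return bool(numbers) and numbers == set(range(1, len(numbers) + 1))


print(r_group_numbers_valid(["R1", "R2", "R3"]))  # consecutive from 1
print(r_group_numbers_valid(["R1", "R3"]))        # gap at R2
print(r_group_numbers_valid([]))                  # no R group at all
```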

To define R groups, continue selecting 'Next' until you have reached the end of the list of predicted scaffolds, at which point the 'Next' button will read 'Add R Group'. Then draw your R group using the molecule editor. Your R group must contain the pseudo-atom 'A', which signifies the site of attachment. If your first R group is a hydrogen atom, for example, draw the molecule A–H in the molecule editor. You can draw the pseudo-atom 'A' by selecting 'X' in the editor and entering the letter A.

To continue adding R groups, select 'Add R Group' again.

For your R group to be combinatorialized onto predicted scaffolds, you must check the box corresponding to the appropriate R group or R groups on the scaffolds. For example, if you wish to combinatorialize R-Group #1 onto R1, R3 and R5, these boxes must be checked.
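
To give a sense of how the checked boxes determine the size of the library, the sketch below enumerates every combination of substituents over the R sites of one scaffold. It assumes a full enumeration in which each site receives exactly one of the R groups checked for it; this is an illustration of the idea rather than a description of GNP's library generator, and all names are hypothetical.

```python
from itertools import product


def enumerate_library(sites, assignments):
    """List every assignment of R groups to sites.

    sites: ordered list of site labels, e.g. ["R1", "R2"].
    assignments: dict mapping an R group name to the set of sites checked for it.
    Returns a list of dicts mapping each site to one R group name.
    """
    options = []
    for site in sites:
        choices = [name for name, allowed in assignments.items() if site in allowed]
        if not choices:
            raise ValueError(f"no R group has been checked for site {site}")
        options.append(choices)
    return [dict(zip(sites, combo)) for combo in product(*options)]


# Two sites; A-H is checked for both, A-CH3 only for R1.
library = enumerate_library(
    ["R1", "R2"],
    {"A-H": {"R1", "R2"}, "A-CH3": {"R1"}},
)
for member in library:
    print(member)
```

Under this assumption the library size is the product of the number of R groups checked at each site, so checking many boxes on a scaffold with many sites grows the library quickly.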

Once all R groups have been added, select 'Submit'.

If the name of a scaffold or R group is displayed in red text along the right side, it is missing an R group or a site of attachment in its structure. This must be corrected before submitting the form to generate your library.

Libraries will be automatically loaded into the iSNAP database search screen for querying against LC-MS data. You can also download your full library in iSNAP database format by selecting the 'download' link beside the generated scaffold library option within the database selection settings.

iSNAP Database Search

iSNAP database search allows the user to perform both dereplication and prediction-guided discovery of natural products. A variety of mass spectrometry and theoretical fragmentation settings can be adjusted according to the quality of the LC-MS data and the fragmentation rules that are applied to the natural product database. Please refer to Ibrahim et al. and Wyatt et al. for more information.

Step 1: Convert instrument data to .mzXML

Instrument vendors usually provide free software that can convert native acquisitions to standard formats. For instance, ReAdW can be used to convert ThermoFinnigan raw files, and CompassXport can be used for Bruker raw files.

Third-party tools can also simplify the conversion. ProteoWizard's msconvert supports the conversion of Agilent, Bruker, Thermo, Waters and AB Sciex file formats into mzXML.
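
A conversion with msconvert can be scripted. The sketch below assumes that ProteoWizard is installed and that the msconvert executable is on the PATH; it uses the --mzXML option to select the output format and -o to set the output directory. The file and directory names are placeholders.

```python
import shutil
import subprocess
from pathlib import Path


def msconvert_command(raw_path, out_dir):
    """Build the msconvert argument list for converting one raw file to mzXML."""
    return ["msconvert", str(raw_path), "--mzXML", "-o", str(out_dir)]


command = msconvert_command("run01.raw", "converted")
print(" ".join(command))

# Run the conversion only if msconvert and the placeholder input file are present.
if shutil.which("msconvert") and Path("run01.raw").exists():
    subprocess.run(command, check=True)
else:
    print("msconvert or the input file was not found; nothing was converted")
```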

Step 2: Input a .mzXML file

Click “Choose File”, and select your .mzXML file in the pop-up dialog. If you don't have an .mzXML file of NRP compounds, we provide a test example, the LC-MS/MS of a Bacillus sp. fermentation, which can be downloaded by clicking the link “Example .mzXML file.”

Step 3: Select database

By default, iSNAP will use only its internal database of ~1100 nonribosomal peptides. Users can also select from curated databases of lantibiotics or ribosomal peptides. Alternatively, users can upload their own database, or a library of predicted chemical structures generated by the GNP platform. See Wyatt et al. for applications of prediction-guided discovery.

Step 4: Select mass spectrometry and fragmentation settings

Users can select the intensity cut-off for each MS2 fragment that is used by iSNAP's matching algorithm, the mass window or tolerance, the precursor charge, and the type of search mode (precise or analogue; see Ibrahim et al. and Johnston et al., respectively). Fragmentation rules can also be selected, as well as the fragment mass tolerance and structure charge.

Step 5: Submit the search

After you submit your task, iSNAP will perform NRP dereplication with our built-in database containing about 1100 NRP structures. Please keep the web page open until your files have finished uploading. You will then be shown a link to the page where your results will appear. The progress of the iSNAP search will be shown on this page until the analysis has completed.

Step 6: Prediction-guided discovery

The prediction-guided discovery plot identifies the MS2 scans within your LC-MS chromatogram that most closely match your user-defined or predicted library, if applicable.

Step 7: Understand reports

The results of the iSNAP database search will be summarized in a report for your inspection after iSNAP finishes the analysis. MS/MS scans identified as NRP compounds are listed in the report, sorted by P1 score. Both P1 and P2 scores indicate the confidence of the identification but are calculated in different ways (see Ibrahim et al.). All columns can be sorted. The brief report on the web page only shows the identifications with relatively high P1 and P2 scores. A complete NRP Search Report can also be downloaded. It provides detailed information for each input MS/MS scan.

To view the matched fragments for a given scan, select the scan number. This will open a new window containing the mass and structure of all matched fragments including each fragment ion’s intensity.

Reports can additionally be downloaded as Excel spreadsheets.

Your results will be stored by the iSNAP server for 60 days, after which time they will be automatically deleted. The iSNAP report includes the date on which your results will expire.

About

The GNP platform is an integrated platform for the genomic discovery of natural products. GNP is the first software package to integrate biosynthetic structural predictions based on genome data to generate a database of putative polyketide and/or nonribosomal peptide assembly-line products that is subsequently used to search real LC-MS/MS chromatograms. GNP uses statistical operations based on the iSNAP dereplication algorithm to identify the genetically encoded or ‘cryptic’ metabolite based on the prediction database generated from the genome and/or by user input. GNP integrates the Chemistry Development Kit [1], JSME [2], SmiLib [3], Glimmer [4], hmmer [5], BLAST [6] and iSNAP [7] to analyze microbial genomes and their extracts for cryptic natural products.

Note: the GNP server goes down for maintenance every Saturday at 4 AM EST. All jobs running at that time will be aborted.

References

1 Steinbeck, C., Hoppe, C., Kuhn, S., Floris, M., Guha, R., & Willighagen, E. L. Recent developments of the chemistry development kit (CDK)—an open-source java library for chemo- and bioinformatics. Current Pharmaceutical Design, 12(17), 2111-2120 (2006). doi:10.2174/138161206777585274

2 Bienfait, B., & Ertl, P. JSME: a free molecule editor in JavaScript. J. Cheminformatics, 5:24 (2013). doi:10.1186/1758-2946-5-24

3 Schüller, A., Hänke, V., & Schneider, G. SmiLib v2.0: a Java-based tool for rapid combinatorial library enumeration. QSAR & Combinatorial Science 26, 407-410 (2007). doi:10.1002/qsar.200630101

4 Delcher, A. L., Bratke, K. A., Powers, E. C., & Salzberg, S. L. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics, 23(6), 673-679 (2007). doi:10.1093/bioinformatics/btm009

5 Finn, R. D., Clements, J., & Eddy, S. R. HMMER web server: interactive sequence similarity searching. Nucleic Acids Research, 39(suppl 2), W29-W37 (2011). doi:10.1093/nar/gkr367

6 Altschul, S. F., Gish, W., Miller, W., Myers, E. W., & Lipman, D. J. Basic local alignment search tool. Journal of Molecular Biology, 215(3), 403-410 (1990). doi:10.1016/S0022-2836(05)80360-2

7 Ibrahim, A., Yang, L., Johnston, C., Liu, X., Ma, B., & Magarvey, N.A. Dereplicating nonribosomal peptides using an informatic search algorithm for natural products (iSNAP) discovery. PNAS 109:47, 19196-19201 (2012). doi:10.1073/pnas.1206376109

Scaffold Library Generator

Combinatorialize a single scaffold or a database of
scaffold structures for GNP database search

Enter a scaffold molecule in SMILES format...

...or upload a database of SMILES in plain text format, with each molecule on its own line.

Replace atoms or moieties of the scaffold molecule with numbered R groups (e.g. R1, R2, R3, etc.) using the 'X' tool. The indices of your R sites must be a series of consecutive integers beginning at 1. Each scaffold must contain at least 1 R group.

Draw substituents to attach to the numbered sites of variability (R1, R2, R3, etc.) with the pseudo-atom "A" at the R site and select the sites at which to combinatorialize with this R group. Each R group must contain at least one "A" atom and be associated with at least one R site. Click "submit" when you are done adding R groups.

  • Scaffold #
  • R-Group #

Genome Search

Detect biosynthetic clusters in a genome and predict
nonribosomal peptide and polyketide products

Upload a sequence file in FASTA format. A sample cluster is provided.

What kind of sequence is this?


Specify the maximum distance between ORFs, in base pairs, for them to be considered part of the same biosynthetic cluster.

  bp
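
The effect of this setting can be illustrated with a simple gap-based grouping. The sketch assumes ORFs are given as (start, end) coordinates on one sequence and that two neighbouring ORFs belong to the same cluster when the distance between them does not exceed the maximum. It is an illustration of the idea only; GNP's own clustering may differ, and the coordinates are made up.

```python
def group_orfs(orfs, max_gap_bp):
    """Group ORFs into clusters separated by gaps larger than max_gap_bp.

    orfs: list of (start, end) coordinates in base pairs, in any order.
    Returns a list of clusters, each a list of (start, end) tuples.
    """
    clusters = []
    cluster_end = None
    for start, end in sorted(orfs):
        if clusters and start - cluster_end <= max_gap_bp:
            clusters[-1].append((start, end))
            # Track the furthest end so overlapping ORFs are handled correctly.
            cluster_end = max(cluster_end, end)
        else:
            clusters.append([(start, end)])
            cluster_end = end
    return clusters


# Made-up coordinates: three ORFs lie close together, one is far away.
orfs = [(100, 900), (1000, 2500), (12000, 13000), (2600, 4000)]
for cluster in group_orfs(orfs, 500):
    print(cluster)
```

A larger maximum merges distant ORFs into fewer, longer clusters, while a smaller one splits them apart.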

Specify cutoff values for domain analysis scores. Results scoring below a cutoff will be treated as false positives, and will neither be considered for combinatorialization nor included in the generation of predicted structures. Default values are suggested.

Global cutoff:

Thiolation/thioesterase domain cutoff:

Adenylation domain cutoff:

Acyltransferase domain cutoff:

Fatty acyl-AMP ligase cutoff:

Fragmenter

Generate all possible theoretical fragments for a molecule with
GNP's rule-based fragmentation algorithms

Enter a molecule in SMILES format.

Set up the theoretical fragmentation rules for database structures. Mark the allowed fragmentation types, and specify the number of sites that can be cleaved simultaneously.


GNP: from Genes to Natural Products

An integrated online genomics and chemoinformatics platform for natural product discovery