Predicted proteins having a significant hit to only an environmental sequence are then described as "hypothetical protein", a minimally useful annotation. In contrast, because of the rich metadata accompanying each sequence within MGOL (Figure 3), for each predicted viral metagenome ORF run through the VIROME pipeline it is possible to extract additional biological meaning, such as the predominant ecosystems where the peptide occurs and whether the peptide is found only in viruses. In both VMGAP and VIROME, the inclusion of BLAST analysis against environmental peptides improves the informative sequence content of viral metagenomes as compared to analysis using the MetaVir pipeline, which is based solely on homology searches against known viral genome sequences within the NCBI RefSeq database [38].

Through the VIROME pipeline, typically 70% of Sanger read-length viral metagenome sequences from aquatic environments obtain a classification other than ORFan. Another strength of the VIROME analysis pipeline and web-application interface is the ability to retrieve read sequences, predicted ORFs, predicted peptides and top-hit BLAST results according to a large variety of search criteria. This functionality allows for a broad range of sequence retrieval, from individual sequences to whole libraries. For the researcher, the capability of customized sequence retrieval empowers subsequent sequence-based analyses, especially molecular phylogenetic analyses, which are a cornerstone of molecular ecological studies.
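
The distinction drawn above between ORFans, peptides with only environmental hits, and peptides with hits to known sequences can be sketched in a few lines of Python. This is a minimal illustration and not the VIROME implementation: the `Hit` record, the database labels `"UNIREF100"` and `"MGOL"`, the three category names, and the E-value cutoff are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass

# Assumed significance threshold; the source does not state the cutoff used.
E_VALUE_CUTOFF = 1e-3


@dataclass(frozen=True)
class Hit:
    """One BLAST hit for a predicted ORF (hypothetical record layout)."""
    orf_id: str
    database: str   # hypothetical labels: "UNIREF100" or "MGOL"
    e_value: float


def classify_orf(hits):
    """Assign an ORF to a coarse category based on where it has significant hits."""
    significant = {h.database for h in hits if h.e_value <= E_VALUE_CUTOFF}
    if "UNIREF100" in significant:
        return "known"                # hit to an annotated reference sequence
    if "MGOL" in significant:
        return "environmental_only"   # hit only to an environmental peptide
    return "ORFan"                    # no significant hit anywhere


def classify_library(orf_ids, hits):
    """Classify every ORF in a library, including ORFs with no hits at all."""
    by_orf = {orf_id: [] for orf_id in orf_ids}
    for h in hits:
        by_orf.setdefault(h.orf_id, []).append(h)
    return {orf_id: classify_orf(orf_hits) for orf_id, orf_hits in by_orf.items()}


def fraction_classified(categories):
    """Fraction of ORFs that received any classification other than ORFan."""
    counts = Counter(categories.values())
    total = sum(counts.values())
    return 0.0 if total == 0 else 1 - counts["ORFan"] / total
```

Under these assumptions, `fraction_classified` corresponds to the "classification other than ORFan" proportion discussed above, and the `environmental_only` category marks the peptides for which environmental metadata would be the main source of biological context.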

In addition to customized searches, the VIROME web-application provides a summary display of BLASTP results organized by criteria such as the taxonomic origin of sequence homologs or functional terms associated with sequence homologs from databases such as KEGG, COG, GO, ACLAME, and SEED. Because VIROME links the sequence information from five annotated databases with UniRef 100 sequences, it is possible to garner a great deal of functional information for those sequences hitting known sequences within UniRef 100. MG-RAST uses a similar strategy with the M5NR non-redundant protein database. In the VIROME web-application interface, views summarizing homology search results according to functional and taxonomic criteria are displayed using fully interactive charts (e.g., pie charts and bar charts) that are dynamically linked to BLAST data.

These summaries provide a ready means for researchers to effectively bin sequences according to a variety of criteria for subsequent analyses such as assembly and clustering. Finally, an important practical concern is that the VIROME pipeline is administered and maintained as a web resource and does not require researchers to have access to advanced computing infrastructure (e.g., a database server and computational grid).
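
The summary views and binning described above amount to tallying top BLASTP hits by taxonomic origin and grouping sequences by the functional database that annotates their homolog. The sketch below shows one way to do that. It is illustrative only: the record fields (`orf_id`, `domain`, `functional_dbs`) are hypothetical and do not reflect VIROME's actual schema.

```python
from collections import Counter, defaultdict


def summarize_top_hits(top_hits):
    """Tally top hits by taxonomic domain and bin ORF ids by functional database.

    Each element of `top_hits` is a dict with the hypothetical keys
    "orf_id", "domain", and "functional_dbs" (a list of database names).
    """
    taxon_counts = Counter()
    db_bins = defaultdict(list)
    for hit in top_hits:
        taxon_counts[hit["domain"]] += 1
        for db in hit["functional_dbs"]:
            db_bins[db].append(hit["orf_id"])
    return taxon_counts, dict(db_bins)
```

The counts map directly onto a pie or bar chart, and each bin is a list of ORF identifiers that could be passed on to assembly or clustering.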
