GO-Module
GO-Module FAQ
1) What is new in current GO-Module?
2) What is GO-Module used for?
3) How can one know that the resulting GO terms belong to a biomodule?
4) What version of Gene Ontology is used to generate GO-Module?
5) What does GO-Module do with retired GO terms?
6) What is the method behind the GO-Module application?
7) Why does GO-Module use a step length of 1 instead of 2 (grandparent-parent-child, or parent-child-grandchild)?
8) What are the features of GO-Module?
9) What is the best way to utilize the tabular formatted results?
1) What is new in current GO-Module?
The following features were added to GO-Module Version 1.2, as compared to GO-Module Version 1.0:
• All "retired" GO IDs in the input list are automatically replaced with up-to-date GO IDs (when possible according to the new version of the GO annotations). GO-Module now reports all "retired" GO IDs of the input list in the tabular results.
• The tabular output of GO-Module can now be sorted dynamically, and each GO term is linked via URL to geneontology.org.
• We corrected the issue of terms with no biomodule ID. We also explain why a "T" term can be part of more than one biomodule (FAQ No. 6).
• We have clarified and substantiated how GO-Module can simplify a GO input list. We now report GO-Module results and the reduction of features over 10 more GO lists from seven articles selected without bias from Genome Research issues published between May 2010 and December 2010.
• GO-Module now reports the version of the GO database on both the input and output pages. The GO database was updated from the version originally downloaded on March 15th, 2010 to a newer version downloaded on December 10th, 2010 from geneontology.org.
• Assuming all input nodes are 'significant', GO-Module annotates all contiguous descendants of a "K" term as "T". In the previous version, contiguous descendants of a "K" node that lay on a contiguous path between two "K" nodes were annotated as "F" (redundant false positives, e.g. node "n" in Figure A). As a result, that node's other contiguous "T" descendants had no biomodule IDs in GO-Module 1.0. This has been fixed in GO-Module 1.2, as Figure A illustrates.
Figure A. Comparison of GO-Module 1.2 with GO-Module 1.0.
Hierarchical relationships are illustrated as arrows pointing from parent nodes towards their child nodes. "K" terms are seen as blue octagons, "T" terms as black circles, and "F" terms as dashed circles.
2) What is GO-Module used for?
GO-Module is a web-accessible synthesis and visualization tool developed for end-user biologists to greatly simplify the interpretation of prioritized Gene Ontology terms. It is designed as a supplementary tool for current GO-analysis tools, e.g. DAVID enrichment analysis, GSEA.
GO lists with fewer than 500 terms are the norm (Table 1 in the manuscript) and can be clearly visualized in the PDF output. Because the PDF output is in a scalable vector graphic format, the zoom feature of Adobe's PDF viewer easily accommodates these graphical results. Further, the vector graphic components of each independent biomodule in the PDF output can be grouped and/or repositioned with standard PDF editors such as Adobe Illustrator.
3) How can one know that the resulting GO terms belong to a biomodule?
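Each "K" term and all of its "T" descendants receive a numerical biomodule label (Step 4 of FAQ No. 6). Note that this label is a number that has no bearing on the rank, and that a "T" node may have more than one biomodule label, in which case the labels are separated by semicolons.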
4) What version of Gene Ontology is used to generate GO-Module?
The version of the Gene Ontology file used by a specific version of GO-Module is given at the top of the output page. Currently, GO-Module version 1.2 uses the GO file downloaded on Dec. 10th, 2010 from http://www.geneontology.org/ontology/obo_format_1_2/gene_ontology_ext.obo.
5) What does GO-Module do with retired GO terms?
GO-Module stays current with GO database updates and provides interactive links to the resulting GO terms in the tabular result. Retired IDs in the input are replaced with up-to-date GO IDs when available, and each replacement is noted at the end of the tabular result. GO terms that could not be translated, and were therefore not analyzed, are also reported. Retired GO terms with no translation in the new GO file are not part of any class ('K', 'T', 'F') or any biomodule.
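The replacement logic described above can be sketched as follows. This is only an illustration: the function name, its arguments, and the placeholder IDs are assumptions made for the example, and how GO-Module actually derives the retired-to-current mapping from the GO file is not described in this FAQ.

```python
def translate_retired_ids(input_ids, replacements, current_ids):
    """Split an input list into analyzed, translated, and untranslated IDs.

    input_ids    -- GO IDs supplied by the user (hypothetical argument)
    replacements -- dict mapping a retired ID to its up-to-date ID (assumed given)
    current_ids  -- set of IDs present in the current GO file (assumed given)
    """
    analyzed = []      # IDs that go on to the K/T/F analysis
    translated = {}    # retired ID -> replacement, for the report
    untranslated = []  # retired IDs with no replacement; not analyzed

    def add_once(go_id):
        # Keep input order but avoid analyzing the same term twice.
        if go_id not in analyzed:
            analyzed.append(go_id)

    for go_id in input_ids:
        if go_id in current_ids:
            add_once(go_id)
        elif go_id in replacements:
            translated[go_id] = replacements[go_id]
            add_once(replacements[go_id])
        else:
            untranslated.append(go_id)
    return analyzed, translated, untranslated


# Placeholder IDs, made up for illustration only.
result = translate_retired_ids(
    ["GO:X1", "GO:OLD1", "GO:OLD2", "GO:X2"],
    {"GO:OLD1": "GO:X2"},
    {"GO:X1", "GO:X2"},
)
print(result)
```

In this toy input, "GO:OLD1" is translated, while "GO:OLD2" has no replacement and is reported without being assigned to any class or biomodule.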
6) What is the method behind the GO-Module application?
The algorithm follows the four steps described below and is illustrated in Suppl. Figure S1, where hierarchical relationships are represented by arrows pointing from parent nodes towards their child nodes.
• Step 1 (Panel A): Annotates a node as "K" (key GO term, seen as blue octagons) if every one of its children and parents is less prioritized than itself. These key terms are locally prioritized with a step length of 1 to represent the key features of the input GO terms. Note that in cases of equally prioritized terms, the descendant is noted as "K" (e.g., node "q" in Panel A).
• Step 2 (Panel B): As long as contiguous descendants of a "K" node are not themselves new "K" nodes, GO-Module annotates them as "T" (subsumed true positive GO terms that are members of the biomodule, seen as yellow circles).
• Step 3 (Panel C): Assigns "F" (false positive, seen as white circles with dashed outlines) to the remaining GO terms.
• Step 4 (Panel D): Assigns a unique numerical label to each "K" GO term and to all its "T" descendants (seen as large dotted ellipses around many GO terms in Panel D). Note that this assigns a biomodule number that has no bearing on the rank and that a "T" node may have more than one biomodule label (e.g. node "s" in Panel D).
Suppl. Figure S1. The illustration of the GO-Module algorithm.
Hierarchical relationships are illustrated as arrows pointing from parent nodes towards their child nodes. Key terms "K" are seen as blue octagons, true positive "T" terms as yellow circles, and false positive "F" terms as small dashed white circles. Large dashed ellipses correspond to GO biomodules. Note that "T" terms can be subsumed in more than one biomodule, and that each biomodule is identified by a single "K" term.
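The four steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the GO-Module implementation: the function name and data layout are hypothetical, prioritization is taken to be a p-value (lower means more prioritized), only edges between input terms are considered, and the walk down from a "K" term is assumed to stop when it reaches another "K" term.

```python
from collections import deque


def go_module_sketch(children, p_value):
    """Label input terms as K/T/F and assign biomodule numbers.

    children -- dict: term -> list of its child terms (hypothetical layout)
    p_value  -- dict: term -> p-value; lower is assumed to mean more prioritized
    """
    terms = list(p_value)
    # Keep only edges whose two endpoints are both input terms.
    kids = {t: [c for c in children.get(t, []) if c in p_value] for t in terms}
    parents = {t: [] for t in terms}
    for t in terms:
        for c in kids[t]:
            parents[c].append(t)

    # Step 1: a term is "K" if it beats every parent and child (step length 1).
    # On a tie the descendant wins, hence <= against parents, < against children.
    def is_key(t):
        return (all(p_value[t] <= p_value[p] for p in parents[t])
                and all(p_value[t] < p_value[c] for c in kids[t]))

    keys = [t for t in terms if is_key(t)]
    key_set = set(keys)

    # Steps 2 and 4: walk down from each "K" term, giving its biomodule number
    # to every contiguous non-"K" descendant (assumption: stop at another "K").
    modules = {t: [] for t in terms}
    for module_id, k in enumerate(keys, start=1):
        modules[k].append(module_id)
        queue, seen = deque(kids[k]), set()
        while queue:
            n = queue.popleft()
            if n in seen or n in key_set:
                continue
            seen.add(n)
            modules[n].append(module_id)
            queue.extend(kids[n])

    # Step 3: whatever is neither "K" nor reached from a "K" term is "F".
    labels = {t: "K" if t in key_set else ("T" if modules[t] else "F")
              for t in terms}
    return labels, modules


# Toy graph (made up): a -> b, a -> c, b -> d, c -> d.
labels, modules = go_module_sketch(
    {"a": ["b", "c"], "b": ["d"], "c": ["d"]},
    {"a": 0.05, "b": 0.001, "c": 0.002, "d": 0.01},
)
print(labels, modules)
```

In this toy graph, "b" and "c" become key terms, "d" is a "T" term carrying both biomodule numbers, and "a" is labeled "F".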
7) Why does GO-Module use a step length of 1 instead of 2 (grandparent-parent-child, or parent-child-grandchild)?
A step length of 1 was used in order to greedily search for the local minima (key terms) among all input GO terms in the directed acyclic graph, as described in Step 1 of FAQ No. 6.
The assignment of GO-Module IDs has no step length limitation. As long as contiguous descendants of a "K" node are not themselves new "K" nodes, GO-Module annotates them as "T" (subsumed true positive GO terms) and as members of the biomodule. If a contiguous descendant of a "K" node is itself a new "K" node (e.g., node "v" in Suppl. Figure S1), local prioritization (e.g. the lowest p-value compared with its parent nodes) separates this node as a new "K" term.
8) What are the features of GO-Module?
GO-Module's resulting features are the distinct key terms of the GO-Module analysis.
9) What is the best way to utilize the tabular formatted results?
The online web tool provides a basic sorting ability for the tabular results. However, we suggest downloading the file and opening it in another program (such as Microsoft Excel) to perform more sophisticated analysis functions such as filtering or multi-level sorting.
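For readers who prefer scripting to a spreadsheet, the same filtering and multi-level sorting can be sketched in Python. The column names ("GO_ID", "class", "biomodule", "p_value"), the tab-delimited layout, and the sample rows are assumptions made for illustration; adjust them to match the downloaded file.

```python
import csv
import io

# Made-up sample standing in for a downloaded results file (assumed tab-delimited).
SAMPLE = (
    "GO_ID\tclass\tbiomodule\tp_value\n"
    "GO:A\tK\t2\t0.002\n"
    "GO:B\tT\t1;2\t0.01\n"
    "GO:C\tK\t1\t0.001\n"
    "GO:D\tF\t\t0.05\n"
)


def load_rows(text):
    """Parse tab-delimited text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def filter_and_sort(rows, keep_classes=("K", "T")):
    """Keep the chosen classes, then sort by first biomodule label and p-value."""
    kept = [r for r in rows if r["class"] in keep_classes]

    def sort_key(row):
        # A "T" term may carry several labels separated by semicolons;
        # this sketch sorts on the first one only.
        first_label = int(row["biomodule"].split(";")[0])
        return (first_label, float(row["p_value"]))

    return sorted(kept, key=sort_key)


for row in filter_and_sort(load_rows(SAMPLE)):
    print(row["GO_ID"], row["class"], row["biomodule"], row["p_value"])
```

Sorting on a tuple gives the multi-level ordering: rows are grouped by biomodule first and ordered by p-value within each group.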