Frequently Asked Questions

Results - pathways, GO terms and compounds

The binomial distribution is used to test the null hypothesis that the user’s input genes are not over-represented within any SuperPath, GO term or compound in the databases from which GeneAnalytics extracts its data (see Resources and statistics).

The presented score is a transformation (-log2) of the resulting p-value, where higher scores indicate better matches.
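As a rough illustration of how a p-value from a binomial test turns into a score, the sketch below computes an upper-tail binomial probability and applies the -log2 transformation. The function names and the example numbers (input size, matched count, background probability) are assumptions made for illustration; the exact parameters GeneAnalytics feeds into its test are not stated here.

```python
import math


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): the chance of k or more matches."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i)
        for i in range(k, n + 1)
    )


def match_score(p_value: float) -> float:
    """-log2 transformation of a p-value; a smaller p-value gives a higher score."""
    return -math.log2(p_value)


# Hypothetical numbers: 5 of 20 input genes fall in an entity
# whose genes make up 2% of the background.
p_value = binomial_upper_tail(k=5, n=20, p=0.02)
score = match_score(p_value)
print(f"p-value = {p_value:.3g}, score = {score:.2f}")
```

Note that a p-value of exactly 0 has no finite score, so a real implementation would need to cap or floor very small values.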

The score range is divided into three levels of quality, based on the p-value corrected for multiple comparisons (using the false discovery rate method):

High: corrected p-value smaller than or equal to 0.0001.

Medium: corrected p-value higher than 0.0001 but smaller than or equal to 0.05.

Low: corrected p-value higher than 0.05.
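The three levels can be sketched in code as follows. The text only says "false discovery rate method", so the Benjamini-Hochberg procedure used here is an assumption, as are the function names; only the 0.0001 and 0.05 cut-offs come from the definitions above.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Assumption: BH is the FDR method; the source does not name one.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def quality_level(corrected_p: float) -> str:
    """Map a corrected p-value to the High / Medium / Low match quality."""
    if corrected_p <= 0.0001:
        return "High"
    if corrected_p <= 0.05:
        return "Medium"
    return "Low"


# Hypothetical raw p-values for four entities.
raw = [0.00001, 0.02, 0.04, 0.6]
levels = [quality_level(p) for p in benjamini_hochberg(raw)]
```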

The score is shown as a score bar whose color indicates the match quality: dark green for high, light green for medium and beige for low. This graphic visualization of the score enables the user to evaluate the overall quality of the results.

Results are ranked in descending order of their score. If several matches have the same score, they are ordered by the ratio of matched to total number of genes in the entity (from highest to lowest).
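The ordering rule can be expressed as a two-part sort key: score first, then the matched-to-total gene ratio as a tie-breaker. The `Match` record and its field names below are hypothetical and chosen only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Match:
    name: str           # entity name (SuperPath, GO term or compound)
    score: float        # -log2 of the p-value
    matched_genes: int  # input genes found in the entity
    total_genes: int    # all genes in the entity


def rank_matches(matches):
    """Sort by score (descending), then by matched/total ratio (descending)."""
    return sorted(
        matches,
        key=lambda m: (-m.score, -(m.matched_genes / m.total_genes)),
    )


# Hypothetical results: A and C tie on score, so the ratio decides.
ranked = rank_matches([
    Match("A", 10.0, 2, 100),
    Match("B", 12.0, 1, 50),
    Match("C", 10.0, 5, 50),
])
print([m.name for m in ranked])
```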

The detailed result tables in the pathways, GO terms and compounds sections present all results whose score was derived from a p-value (after correction for multiple comparisons) smaller than or equal to 1.

If fewer than 20 results pass this threshold, the 20 best results are displayed, even though they have lower statistical significance.
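A minimal sketch of this display rule is shown below. The function name and the tuple layout are assumptions, and the threshold is passed in as a parameter rather than hard-coded.

```python
def results_to_display(results, threshold, minimum=20):
    """Select the results shown in a detailed table.

    results: list of (name, corrected_p_value) tuples.
    Returns every result at or below the threshold; if fewer than
    `minimum` pass, returns the `minimum` best results instead.
    """
    ordered = sorted(results, key=lambda r: r[1])
    passing = [r for r in ordered if r[1] <= threshold]
    if len(passing) >= minimum:
        return passing
    return ordered[:minimum]


# Hypothetical data: 30 results with increasing corrected p-values.
results = [(f"entity_{i}", i / 1000) for i in range(1, 31)]
shown = results_to_display(results, threshold=0.0055)
```

When fewer than `minimum` results exist in total, the function simply returns all of them.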

A SuperPath clusters one or more pathways from various PathCards data sources, based on similarity in their associated genes.

In GeneAnalytics, the SuperPaths are presented with links to their cards in PathCards, the list of their constituent pathways and the number of matched genes versus their total number of genes.

(Read more about pathway analysis in PathCards.)

The Pathways section includes a filter that enables viewing results derived from a specific data source only.

Note that for pathways originating from Reactome, the matched genes are highlighted in the original source pathway illustration.

The compounds section in GeneAnalytics is the only section that does not yet rely on a single database that unifies multiple compound sources and provides one web page for each compound (in contrast, tissues and cells data are unified in LifeMap Discovery, diseases in MalaCards, pathways in PathCards and GO terms in the Gene Ontology database). The GeneAnalytics Compounds section takes advantage of multiple sources which relate to more than 83,000 compounds, including those found in GeneCards® (for more information about the compounds data sources click here).

The Novoseek data source extracts knowledge from biological databases and text repositories, providing relationships between chemical compounds and genes based on a scoring algorithm run on PubMed articles. Note that the Novoseek website is no longer available. Read more about Novoseek data in GeneCards and about its literature text-mining algorithm.

We have applied a unification process which seeks out similar compounds described in different data sources, to enable gene aggregation for unified compounds and to avoid redundancy in the resulting compounds list. Compound unification is established by an identical name and/or a combination of other identifiers, such as CAS number, PubChem ID and synonyms. Unified compounds are shown with links to all relevant data sources (the exact compound name is shown near the original data source name).
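One way to picture such a unification step is to merge any records that share a key, using a union-find structure. This is only a sketch under a simplifying assumption: a single shared name, synonym, CAS number or PubChem ID is enough to merge two records. The actual GeneAnalytics matching rules are not described in that detail, and the record field names are hypothetical.

```python
from collections import defaultdict


def unify_compounds(records):
    """Group record indices that share a name, synonym, CAS number or PubChem ID.

    records: list of dicts with a "name" key and optional
    "synonyms", "cas" and "pubchem_id" keys (hypothetical layout).
    """
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # key -> index of the first record carrying it
    for idx, rec in enumerate(records):
        keys = [("name", rec["name"].lower())]
        keys += [("name", s.lower()) for s in rec.get("synonyms", [])]
        if rec.get("cas"):
            keys.append(("cas", rec["cas"]))
        if rec.get("pubchem_id"):
            keys.append(("pubchem", rec["pubchem_id"]))
        for key in keys:
            if key in first_seen:
                parent[find(idx)] = find(first_seen[key])
            else:
                first_seen[key] = idx

    groups = defaultdict(list)
    for idx in range(len(records)):
        groups[find(idx)].append(idx)
    return sorted(groups.values())
```

Each returned group lists the indices of the source records that would be presented as one unified compound, with the genes of all members aggregated.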

Metabolites unification: the following compound families contain thousands of metabolites which were unified based on their primary name and associated genes.

If genes associated with these compounds are matched to your gene set, GeneAnalytics presents only the matched group, to avoid a multitude of identical results. The evidence link enables viewing all the relevant metabolites in the original database. The unified compounds and their specific groups are as follows:

1. Triglycerides

Group name              # of associated genes   # of metabolites in the group
Triglycerides group A   26                      170
Triglycerides group B   30                      113
Triglycerides group C   39                      6
Triglycerides group D   34                      13631

2. Diglycerides

Group name              # of associated genes   # of metabolites in the group
Diglycerides group A    130                     803
Diglycerides group B    131                     39
Diglycerides group C    131                     1
Diglycerides group D    115                     435

3. Phosphatidylcholines

Group name                     # of associated genes   # of metabolites in the group
Phosphatidylcholines group A   78                      955
Phosphatidylcholines group B   72                      119
Phosphatidylcholines group C   44                      73

4. Phosphatidylethanolamines

Group name                          # of associated genes   # of metabolites in the group
Phosphatidylethanolamines group A   43                      959
Phosphatidylethanolamines group B   30                      114

 
