You must indicate the input species before inserting your gene set. This information is only required in order to identify your gene symbols and their orthologs.
The matching algorithm considers genes and gene orthologs, and differs between the distinct sections:
LifeMap Discovery®, regardless of their species (human, mouse, rat, chicken, pig).
Please note that changing the input species after inserting gene symbols will activate a new identification process.
GeneAnalytics identifies official human and mouse gene symbols only.
Currently, GeneAnalytics is recommended for the analysis of gene sets that contain 300 or fewer genes. Analyzing longer lists may yield biased results, with over-representation of entities that contain a higher number of genes.
If you insert a gene set with more than 300 genes, you will be asked whether you want to proceed with your long set, or to trim the list to 300 genes. If you choose to trim the list, the first 300 genes will be used (duplicate genes will be removed automatically).
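The trimming behavior described above can be sketched as follows. This is only an illustration, not the GeneAnalytics implementation: the function name `prepare_gene_set`, the case-insensitive duplicate check, and the choice to remove duplicates before trimming are assumptions, since the text does not state the order of the two steps.

```python
MAX_GENES = 300  # recommended upper limit stated in the text


def prepare_gene_set(symbols, trim=True, limit=MAX_GENES):
    """Remove duplicate symbols (keeping first occurrences), then optionally trim."""
    seen = set()
    unique = []
    for symbol in symbols:
        cleaned = symbol.strip()
        key = cleaned.upper()  # assumption: duplicates are matched case-insensitively
        if cleaned and key not in seen:
            seen.add(key)
            unique.append(cleaned)
    # Assumption: trimming keeps the first `limit` unique genes, in input order.
    return unique[:limit] if trim else unique


genes = ["TP53", "BRCA1", "tp53", " EGFR "]
print(prepare_gene_set(genes))
```

Passing `trim=False` corresponds to choosing to proceed with the long set.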
You can insert your gene symbols by either:
1. Typing in the gene symbol(s) in the input window, one gene at a time. Use the auto-complete feature to define the correct official gene symbol.
2. Pasting a list of genes into the input window. The pasted genes automatically undergo an identification procedure.
3. Uploading a file containing the gene list. Only text files are accepted. The uploaded genes automatically undergo an identification procedure.
Unidentified genes:
Unidentified genes are genes that were not recognized as official human or mouse gene symbols.
Unidentified genes are not included in the analysis and do not impact its results.
To correct an unidentified gene:
Following the gene symbol correction, the identified genes will be automatically added to the “ready for analysis” gene list.
Ready for analysis:
Only the genes included in this list will be analyzed.
Each gene in the ‘ready for analysis’ list is shown with its official symbol, full name and all available aliases/synonyms.
In order to edit gene symbols in this list, delete the gene symbol and re-type your desired gene symbol in the input box above.
This section presents all the queried genes that were identified and included in the analysis.
Click on ‘notes’ to see the genes in your query that were found to be abundant or are defined as housekeeping genes in human (read more on abundant and housekeeping genes).
These genes get lower scores in the tissues and cells analysis. You may consider removing them from your query to optimize the results.
The detailed results table presents all entities in which at least one of the analyzed genes is expressed, along with links to their cards in LifeMap Discovery.
The entities are presented in descending order of their matching score. If several entities have the same score, they are ordered by the ratio of matched to total number of genes in the entity (from highest to lowest). In single-gene queries, in vivo entities appear before in vitro entities with the same score.
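The default ordering described above can be sketched as a sort key. The `EntityResult` class and its field names are hypothetical, and the way the in vivo rule combines with the ratio tie-break in single-gene queries is an assumption; the text only states that in vivo entities precede in vitro entities with the same score.

```python
from dataclasses import dataclass


@dataclass
class EntityResult:
    name: str
    score: float
    matched_genes: int
    total_genes: int
    in_vivo: bool = True


def sort_results(results, single_gene_query=False):
    """Order by score (descending), then by matched/total ratio (descending)."""

    def key(r):
        ratio = r.matched_genes / r.total_genes if r.total_genes else 0.0
        if single_gene_query:
            # Assumption: among equal scores, in vivo entities come first,
            # and the ratio is only used after that.
            return (-r.score, not r.in_vivo, -ratio)
        return (-r.score, -ratio)

    return sorted(results, key=key)


results = [
    EntityResult("A", 5.0, 2, 10),
    EntityResult("B", 5.0, 3, 10),
    EntityResult("C", 9.0, 1, 100),
]
print([r.name for r in sort_results(results)])
```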
The list can be sorted by any other parameter presented in the table, by simply clicking on the column title. Please note that sorting by the number of matched genes per entity can provide important information, but should be considered with caution due to the large variance in the total number of genes per entity.
Each gene in each entity has a score, which is based on the entity type and the gene annotations. These annotations are based on information from the scientific literature and/or on bioinformatics calculations performed using expression data in LifeMap Discovery. Each gene can have one or more of the following annotations:
To receive a list of all genes expressed in a specific tissue, organ or developmental path, including annotations for selective markers, specific genes and tissue-enriched genes, please contact us.
The entity score calculations are based on the gene score, and are different between single-gene and multiple-gene queries:
The entity scores are classified by their quality (high, medium or low), indicated by the color of the score bar (dark green, light green or beige, respectively). The distribution of the scores among the different quality levels is shown by clicking on the pie icon.
Clicking on the entity name will lead you to the entity card in LifeMap Discovery, which contains a full list of all genes known to be expressed in the entity, and additional information about its development, signaling pathways, related diseases and more.
Please note the following:
The number of genes in the entity that match the query and the total number of genes in the entity (presented in parentheses).
Sorting the list by this parameter can be informative but should be considered with caution since the total number of genes per entity varies significantly.
Clicking on the number of matched genes opens a list of the genes that includes gene symbols, full name, links to GeneCards® and NCBI, and information about their expression, localization and evidence:
Expression information:
Matched genes can be indicated as either 'expressed' or 'positive selective marker':
Positive selective marker: This indication appears only in cells and describes genes that are either established cell markers, or that have been suggested to be characteristic of the cell, through their expression.
Expressed gene: A gene that is known to be expressed in the entity but is not defined as a cell marker.
Evidence:
Indication for the type of evidence supporting the expression of the indicated gene (clicking on the entity name will lead you to a table with active links to all supporting sources of evidence):
Scientific literature.
High throughput experiments, such as microarray and RNA sequencing, available at GEO and/or the scientific literature.
Large scale data set.
This panel has two functions:
Tissue/system results:
The tissue/system score calculations are based on the gene score (read more about gene scores in the tissue and cell analysis score section), and differ between single-gene and multiple-gene queries:
Tissue/system filter:
This filter can be used to show only in vivo or only in vitro results.
This filter can be used to show only specific entity types including:
This filter can be used to show only prenatal or only postnatal results.
Entities that are defined as “prenatal-postnatal” appear when filtering for both prenatal and postnatal results.
The disease matching score is based on the following parameters:
For each gene, the maximal score of all the above-mentioned possible scores is used as the final gene score. The disease score is based on the final scores of all the matched genes.
This panel filters results in accordance with the MalaCards disease categorization. The MalaCards algorithm categorizes each disease into 0-4 anatomical and 0-5 global disease categories based on:
Use this filter to focus the results.
The numbers indicate the number of hits per disease category. These numbers are modified upon use of the additional category filter.
Note that not all the diseases are categorized.
Association category | Genetic association | Source |
---|---|---|
Causative mutation | Pathogenic | ClinVar |
Causative mutation | Likely Pathogenic | ClinVar |
Causative mutation | Molecular basis known | OMIM |
Causative mutation | Causative germline mutation | Orphanet |
Causative mutation | Causative somatic mutation | Orphanet |
Causative mutation | Causative mutation | Uniprot |
Risk factor | Confers sensitivity | ClinVar |
Risk factor | Risk factor | ClinVar |
Risk factor | Genetic association | OMIM |
Risk factor | Susceptibility factor | OMIM |
Risk factor | Modifying Germline mutation | Orphanet |
Risk factor | Role in phenotype | Orphanet |
Risk factor | Modifying Somatic mutation | Orphanet |
Resistance factor | Protective | ClinVar |
Resistance factor | Resistance factor | OMIM |
Genetic tests | Genetic tests | GeneTest |
Drug response | Drug response | ClinVar |
Structural gene variation | Structural variation | OMIM |
Structural gene variation | Gene fusion | Orphanet |
Unconfirmed association | Unconfirmed association | OMIM |
Unconfirmed association | Candidate gene tested | Orphanet |
Unconfirmed association | Genetic linkage | OMIM |
This list presents pathways, GO terms, phenotypes or compounds that match your gene set, with links to their cards in the relevant database (PathCards or GeneCards).
(read more on pathways, GO terms, phenotypes or compounds analysis result types and data sources).
The matches are presented in descending order of the matching scores. If several matches have the same score, they are ordered by the ratio of matched to total number of genes associated with the matched entity (from highest to lowest).
Clicking on the number of matched genes for each match opens a list of these genes, which can be used as a new query.
The binomial distribution is used to test the null hypothesis that the user’s input genes are not over-represented within any SuperPath, GO term or compound. The presented score for each match is a transformation (-log2) of the resulting p-value, where higher scores indicate better matches. Results with p-values lower than 10⁻⁵⁰ are assigned the maximum score.
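As a rough illustration of this scoring, the sketch below computes an upper-tail binomial p-value and its -log2 transformation. The parameterization is an assumption, not something the text specifies: `n` is taken as the number of input genes, `k` as the number of matched genes, and `p` as the fraction of all genes that belong to the tested entity. Using -log2 of 10⁻⁵⁰ as the maximum score is likewise an assumption.

```python
import math

# Assumed cap: the score of a p-value of exactly 10^-50 (about 166.1).
MAX_SCORE = -math.log2(1e-50)


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def match_score(p_value):
    """-log2 transformation of a p-value, capped for p-values below 10^-50."""
    if p_value < 1e-50:
        return MAX_SCORE
    return -math.log2(p_value)


# Hypothetical example: 8 of 100 input genes fall in an entity
# that covers 1% of all genes.
p_value = binomial_upper_tail(8, 100, 0.01)
print(p_value, match_score(p_value))
```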
The score range is divided into three quality levels, based on the p-value corrected for multiple comparisons (using the false discovery rate method):
High: corrected p-value smaller than or equal to 0.0001
Medium: corrected p-value higher than 0.0001 but smaller than or equal to 0.05
Low: corrected p-value higher than 0.05
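The three quality levels can be expressed as a small classifier. The thresholds come from the list above; the correction step is an assumption, because the text says only "false discovery rate method" and the sketch uses the Benjamini-Hochberg procedure as one common choice. Both function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def quality_level(corrected_p):
    """Map a corrected p-value to the quality levels listed above."""
    if corrected_p <= 0.0001:
        return "high"
    if corrected_p <= 0.05:
        return "medium"
    return "low"


raw = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(p, q, quality_level(q))
```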
The scores are classified by their quality (high, medium or low), indicated by the color of the score bar (dark green, light green or beige, respectively). The distribution of the quality of the results can be viewed by clicking on the pie icon.
The entities are presented in descending order of their matching scores. If several matches have the same score, they are ordered by the ratio of matched to total number of genes associated with the entity (from highest to lowest).
The table presents all results whose score was derived from corrected p-values smaller than or equal to 1. If fewer than 20 results pass this threshold, the best 20 results will be presented, even if they are of lower statistical significance.
The GeneAnalytics Compounds section takes advantage of multiple sources related to more than 83,000 compounds, including those found in GeneCards (for more information about the data sources click here).
The Novoseek data source extracts knowledge from biological databases and text repositories, providing relationships between chemical compounds and genes based on a scoring algorithm run on PubMed articles. Note that the Novoseek website is no longer available. Read more about Novoseek data in GeneCards and about its literature-text mining algorithm.
We have applied a unification process which seeks out similar compounds described in different data sources, to enable gene aggregation for unified compounds, and to avoid redundancy in the resulting compounds list. Compound unification is established by an identical name and/or a combination of other identifiers, such as CAS number, PubChem ID and synonyms. Unified compounds are shown with links to all relevant data sources (the exact compound name is shown near the original data source name).
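One way to picture such a unification step is as a grouping of records that share an identifier. The sketch below is a simplification and not the actual GeneAnalytics procedure: the record fields (`name`, `synonyms`, `cas`, `pubchem_id`) are hypothetical, and any single shared identifier is treated as sufficient, whereas the text mentions a combination of identifiers.

```python
def unify_compounds(records):
    """Group records that share a name, synonym, CAS number or PubChem ID."""
    parent = list(range(len(records)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # identifier -> index of the first record that carried it
    for i, rec in enumerate(records):
        keys = [("name", rec["name"].lower())]
        keys += [("name", s.lower()) for s in rec.get("synonyms", [])]
        if rec.get("cas"):
            keys.append(("cas", rec["cas"]))
        if rec.get("pubchem_id"):
            keys.append(("pubchem", rec["pubchem_id"]))
        for key in keys:
            if key in first_seen:
                parent[find(i)] = find(first_seen[key])
            else:
                first_seen[key] = i

    groups = {}
    for i, rec in enumerate(records):
        groups.setdefault(find(i), []).append(rec)
    return list(groups.values())


# Illustrative records from two made-up sources, "A" and "B".
records = [
    {"name": "Aspirin", "cas": "50-78-2", "source": "A"},
    {"name": "Acetylsalicylic acid", "synonyms": ["aspirin"], "source": "B"},
    {"name": "Caffeine", "pubchem_id": 2519, "source": "A"},
]
for group in unify_compounds(records):
    print([(r["name"], r["source"]) for r in group])
```

Each resulting group keeps every original record, which mirrors the statement that unified compounds retain links to all of their data sources.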
Metabolites unification:
The following compound families contain thousands of metabolites which were unified based on their primary name and associated genes.
If genes associated with these compounds are matched to your gene set, GeneAnalytics presents only the matched group, to avoid a multitude of identical results. The evidence link enables viewing all the relevant metabolites in the original database. The unified compounds and their specific groups are as follows:
1. Triglycerides
Group name | # of associated genes | # of metabolites in the group |
---|---|---|
Triglycerides group A | 26 | 170 |
Triglycerides group B | 30 | 113 |
Triglycerides group C | 39 | 6 |
Triglycerides group D | 34 | 13631 |
2. Diglycerides
Group name | # of associated genes | # of metabolites in the group |
---|---|---|
Diglycerides group A | 130 | 803 |
Diglycerides group B | 131 | 39 |
Diglycerides group C | 131 | 1 |
Diglycerides group D | 115 | 435 |
3. Phosphatidylcholines
Group name | # of associated genes | # of metabolites in the group |
---|---|---|
Phosphatidylcholines group A | 78 | 955 |
Phosphatidylcholines group B | 72 | 119 |
Phosphatidylcholines group C | 44 | 73 |
4. Phosphatidylethanolamines
Group name | # of associated genes | # of metabolites in the group |
---|---|---|
Phosphatidylethanolamines group A | 43 | 959 |
Phosphatidylethanolamines group B | 30 | 114 |