Tutorial: Over-representation Analysis

This tutorial outlines a typical over-representation analysis workflow in GenMAPP-CS. GenMAPP-CS supports over-representation analysis on pathways and GO terms using the GO-Elite program. The example data used is a subset of a microarray experiment examining mouse myometrium during pregnancy. The data is available here and has been described here.

Loading data and creating coloring criteria

For this tutorial, we will be working with the EB vs ES data used in the Expression Analysis tutorial.

Running GO-Elite

  • To begin GO-Elite analysis, select Run GO-Elite... from the Actions drop-down.
  • At the top of the GO-Elite interface there are three options for input types: Criteria, Network and File. For the purposes of this tutorial, select Criteria.
  • In the GO-Elite interface, there are a few analysis parameters to set, described below. For detailed information on these parameters, refer to the GO-Elite manual.
  • Primary ID Column: Select the column in your data that contains the ID your data is annotated with.
  • System: Select the identifier type that your data is annotated with, which is the same identifier type that was used when the data was imported to GenMAPP-CS. For the example data, we will select "Affymetrix".
  • In the Criteria Set drop-down choose the "EB vs ES" criteria set.
  • In the Criteria drop-down, select "up-regulated". Note: GO-Elite analysis currently works on one criterion at a time. Support for selecting all criteria for analysis is being added.
  • Click Run analysis to start the analysis.
  • When the analysis is complete, repeat for the "down-regulated" criterion.


GO-Elite interface


The status of the analysis for each criterion is displayed in the Status tab of the GO-Elite Results tab, which appears in the Results Panel. The Status tab summarizes the analysis, shows ongoing status messages and includes a link to the web service job. This link can be useful for retrieving the full analysis results. There is also a Stdout tab in the Results Panel, which contains the raw output from the GO-Elite web service.

Once analysis is started, you will notice entries for the selected criteria in the Results panel of Workspaces. The coloring of these entries will indicate the status of analysis; they will remain red until analysis is complete, at which point they will turn green.

Viewing GO-Elite results

Once GO-Elite results are complete, the entries in the Results panel of Workspaces will appear green, and the GO-Elite Results in the Results Panel will be populated with additional tabs.

GO-Elite results


  • Results tabs are organized under main tabs representing each criterion included in the analysis, in this case up-regulated and down-regulated.
  • In addition to the Status and Stdout tabs, results for GO terms and pathways are displayed in tabs labeled GO and Pathway, respectively.
  • For each results tab, the organization is the same:
    1. Results are ordered based on descending Z-score.
    2. Results highlighted in yellow pass the default criteria of 3 or more genes changed.
    3. Number Changed for a particular pathway or GO term refers to the number of genes (or probes) in the input data that were changed as defined by the criterion.
    4. Number Measured refers to the number of genes (or probes) matching the pathway or GO term in the input data.
    5. PermuteP: The p-value calculated by permutations of the data, as described here.
    6. AdjustedP: The permuted p-value adjusted by the Benjamini-Hochberg method.
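To make the Z-score and AdjustedP columns more concrete, here is a minimal Python sketch. It assumes the Z-score is the normal approximation to the hypergeometric distribution commonly used by MAPPFinder-style tools, and it implements the standard Benjamini-Hochberg step-up adjustment. The function names and the example totals are hypothetical; the GO-Elite manual remains the authoritative description of what the program actually computes.

```python
import math


def over_representation_z(changed_in_term, measured_in_term,
                          changed_total, measured_total):
    """Z-score from the normal approximation to the hypergeometric.

    changed_in_term  ~ "Number Changed" for one pathway or GO term
    measured_in_term ~ "Number Measured" for that term
    changed_total / measured_total: totals over the whole input data
    """
    r, n = changed_in_term, measured_in_term
    R, N = changed_total, measured_total
    expected = n * R / N
    variance = n * (R / N) * (1 - R / N) * (1 - (n - 1) / (N - 1))
    return (r - expected) / math.sqrt(variance)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical totals: 500 changed genes out of 5000 measured.
    print(over_representation_z(17, 45, 500, 5000))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A term with many more changed genes than expected by chance gets a large positive Z-score, which is why sorting by descending Z-score puts the most over-represented terms first.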

By default, the analysis neither applies the GO-Elite pruning method nor permutes the data to calculate a permuted p-value. To rerun the analysis with either or both of these options, enter a number of permutations (2000 is recommended) and select a pruning method (z-score is recommended) at the bottom of the Results Panel.
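The idea behind a permuted p-value can be illustrated with a generic sketch. This is not GO-Elite's implementation: it assumes a simple scheme in which the "changed" label is reassigned at random among the measured genes, and the p-value is the fraction of permutations whose overlap with the term is at least as large as the observed one. All names are hypothetical.

```python
import random


def permutation_p_value(term_genes, changed_genes, measured_genes,
                        n_permutations=2000, seed=0):
    """Estimate how often a random gene set matches the term as well
    as the observed changed set does."""
    rng = random.Random(seed)
    measured = sorted(set(measured_genes))
    term = set(term_genes) & set(measured)
    changed = set(changed_genes) & set(measured)
    observed = len(term & changed)

    hits = 0
    for _ in range(n_permutations):
        # Draw a random "changed" set of the same size as the real one.
        sample = rng.sample(measured, len(changed))
        overlap = sum(1 for gene in sample if gene in term)
        if overlap >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)
```

With 2000 permutations the smallest p-value this estimate can report is 1/2001, which shows why the number of permutations limits the resolution of the result.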

Results can also be exported as tab-delimited text by clicking the Export button at the bottom right of the Results Panel.
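An exported file can be processed further outside GenMAPP-CS. The sketch below reads tab-delimited text, keeps rows with 3 or more changed genes and sorts them by descending Z-score, mirroring the ordering and highlighting described above. The column names "Z Score" and "Number Changed" are assumptions; check the header row of your own export and adjust them.

```python
import csv
import io


def load_results(handle, z_column="Z Score",
                 changed_column="Number Changed", min_changed=3):
    """Read exported results and return the rows passing the filter,
    ordered by descending Z-score. Column names are assumptions."""
    reader = csv.DictReader(handle, delimiter="\t")
    rows = [row for row in reader
            if int(row[changed_column]) >= min_changed]
    rows.sort(key=lambda row: float(row[z_column]), reverse=True)
    return rows


if __name__ == "__main__":
    # Made-up rows, used only to show the call; real exports differ.
    example = io.StringIO(
        "Name\tNumber Changed\tNumber Measured\tZ Score\n"
        "Term A\t2\t10\t4.0\n"
        "Term B\t5\t20\t2.5\n"
        "Term C\t8\t30\t3.5\n"
    )
    for row in load_results(example):
        print(row["Name"], row["Z Score"])
```

For a real export, pass an open file instead of the in-memory example, for instance `open(path, newline="")` with your own file path.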

For the pathway results, each pathway is clickable.

  • Click on any of the pathway results, for example "Endochondral Ossification", to open the pathway.
  • Pathway nodes will be colored according to the relevant criteria used for GO-Elite.


GO-Elite pathway


In the "Endochondral ossification" pathway, the results show 17 genes changed (green) out of 45 measured (grey). Since some nodes are associated with multiple probes, which may not all be significantly changed (as defined by the criteria), the pathway may not actually show all 17 nodes colored. In the case of the "Endochondral ossification" pathway, we see 14 nodes colored green. Looking at the Backpage for all genes, we can identify three additional nodes (Calm1, Igf1 and Akt1) that are also associated with significantly changed probes.


GO-Elite backpage

Accessing full GO-Elite results

The Results Panel displays only pruned and filtered GO-Elite results. The full GO-Elite results are accessible via web services.

  • In the Status tab, look for an entry labeled "Link to webservice job". Copy the URL and paste it into a browser window.
  • Go to "GO-Elite_results/CompleteResults/" to find both pruned and original complete results.

To learn more about the algorithms used by GO-Elite and the various output files, refer to the GO-Elite manual.