This tutorial presents an analysis workflow for combining proteomics and transcriptomics data in GenMAPP-CS. It shows how two datasets that correlate poorly at the level of individual genes and proteins can still be integrated and analyzed in the context of pathways. The data used for this tutorial comes from a study examining both protein abundance (mass spectrometry) and transcript abundance (expression arrays) during maturation of human dendritic cells. The data is described in detail here.
To download GenMAPP-CS, please visit our website.
GenMAPP-CS allows import of any number of datasets. The datasets can be annotated with different identifiers and can contain different data types. In this case, there are two datasets, both annotated with Entrez gene identifiers:
The transcript-level data is from the Affymetrix expression arrays (HG U133 Plus 2.0), and has been summarized at the gene level and annotated with Entrez gene IDs. The data looks like this:
| Entrez_ID | mRNA ratio mDC/iDC |
| --- | --- |
| 1 | 1.47 |
| 2 | 0.31 |
| 9 | 0.64 |
| 10 | 1 |
The expression data is available for download here: File:Buschow mRNA.txt.
Protein abundance data was collected using mass spectrometry, and after peptide identification the data was converted to IPI and Entrez gene identifiers using the biomaRt Bioconductor package. The data looks like this:
| Entrez_ID | ensembl_Id | ipi_id | empai_immature | empai_mature | protein ratio mDC/iDC | result_type | Detected in: |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2 | ENSG00000175899 | IPI00478003.1 | 0.57 | 0.03 | 0.05 | estimated | iDC |
| 9 | ENSG00000171428 | IPI00644361.2 | 0.13 | 0.15 | 1.22 | estimated | iDC |
| 16 | ENSG00000090861 | IPI00027442.4 | 0.73 | 0.93 | 1.27 | detected | both |
| 22 | ENSG00000131269 | IPI00306748.1 | 0.06 | 0.12 | 1.91 | estimated | mDC |
The proteomics data is available for download here: File:Buschow protein.txt.
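Before importing, it can help to check how many Entrez gene IDs the two files share. The following is a minimal sketch that uses only the example rows shown above. It assumes the files are tab-delimited with the column headers shown, which is not confirmed here; `read_ratios` is a hypothetical helper, not part of GenMAPP-CS.

```python
import csv
import io

# Example rows copied from the tables above.
# Assumption: the real files are tab-delimited with these headers.
MRNA_TEXT = (
    "Entrez_ID\tmRNA ratio mDC/iDC\n"
    "1\t1.47\n"
    "2\t0.31\n"
    "9\t0.64\n"
    "10\t1\n"
)
PROTEIN_TEXT = (
    "Entrez_ID\tensembl_Id\tipi_id\tempai_immature\tempai_mature\t"
    "protein ratio mDC/iDC\tresult_type\tDetected in:\n"
    "2\tENSG00000175899\tIPI00478003.1\t0.57\t0.03\t0.05\testimated\tiDC\n"
    "9\tENSG00000171428\tIPI00644361.2\t0.13\t0.15\t1.22\testimated\tiDC\n"
    "16\tENSG00000090861\tIPI00027442.4\t0.73\t0.93\t1.27\tdetected\tboth\n"
    "22\tENSG00000131269\tIPI00306748.1\t0.06\t0.12\t1.91\testimated\tmDC\n"
)


def read_ratios(text, ratio_column):
    """Return {Entrez_ID: ratio} from tab-delimited text (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["Entrez_ID"]: float(row[ratio_column]) for row in reader}


mrna = read_ratios(MRNA_TEXT, "mRNA ratio mDC/iDC")
protein = read_ratios(PROTEIN_TEXT, "protein ratio mDC/iDC")

# Keep only genes measured in both datasets, ordered by numeric ID.
shared_ids = sorted(set(mrna) & set(protein), key=int)
merged = {gene_id: (mrna[gene_id], protein[gene_id]) for gene_id in shared_ids}

for gene_id, (mrna_ratio, protein_ratio) in merged.items():
    print(gene_id, mrna_ratio, protein_ratio)
```

To run this on the downloaded files, the two text constants could be replaced by the file contents, provided the delimiter assumption holds.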
- In the Database panel of Workspaces, select the human database, since the data for this tutorial is from human dendritic cells. Detailed instructions on how to select a database are available in the Expression Analysis tutorial.
- Import both data files consecutively under Import dataset from table... in the Actions menu. For detailed instructions on how to use the GenMAPP-CS Dataset Import, refer to the Expression Analysis tutorial.
Creating coloring criteria
- Create two Criteria Sets, one for the expression data and one for the proteomics data, with the same cutoffs for the ratio for both data types:
- Expression data
- mRNA up 2-fold: mRNA ratio mDC/iDC > 2 → orange
- mRNA down 2-fold: mRNA ratio mDC/iDC < 0.5 → light blue
- Proteomics data
- protein up 2-fold: protein ratio mDC/iDC > 2 → red
- protein down 2-fold: protein ratio mDC/iDC < 0.5 → blue
The two criteria sets should look like this:
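As a cross-check outside the GenMAPP-CS interface, the same cutoffs can be written as a small function. This is only a sketch of the logic listed above: the names `classify`, `color_for` and `COLORS` are hypothetical, and the strict comparisons (`>` and `<`) follow the criteria as written.

```python
# Colors from the two criteria sets above, keyed by (data type, direction).
COLORS = {
    ("mRNA", "up"): "orange",
    ("mRNA", "down"): "light blue",
    ("protein", "up"): "red",
    ("protein", "down"): "blue",
}


def classify(ratio, up_cutoff=2.0, down_cutoff=0.5):
    """Label an mDC/iDC ratio as 'up', 'down' or 'unchanged'."""
    if ratio > up_cutoff:
        return "up"
    if ratio < down_cutoff:
        return "down"
    return "unchanged"


def color_for(data_type, ratio):
    """Return the criterion color, or None if no criterion matches."""
    return COLORS.get((data_type, classify(ratio)))


# Ratios taken from the example rows shown earlier in this tutorial.
for data_type, ratio in [("mRNA", 1.47), ("mRNA", 0.31),
                         ("protein", 0.05), ("protein", 1.91)]:
    print(data_type, ratio, classify(ratio), color_for(data_type, ratio))
```

Note that a ratio of exactly 2 or 0.5 meets neither criterion under these strict comparisons.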
Over-representation analysis with GO-Elite
- Set up and run GO-Elite for the "Protein up 2-fold" criterion. Make sure to indicate "EntrezGene" as Primary ID System. Note: GO-Elite analysis currently works on one criterion at a time. Support for selecting all criteria for analysis is being added. The GO-Elite interface should look like this:
- Repeat the GO-Elite analysis for the protein down-regulated criterion, and then for both criteria in the "mRNA" criteria set.
For details on how to run GO-Elite, please refer to the GO-Elite tutorial.
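For intuition about what an over-representation analysis measures, here is a minimal sketch of a generic hypergeometric test. It is not a description of GO-Elite's own statistics, which may differ; the function name and the counts in the example call are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_upper_tail(population, annotated, selected, hits):
    """Probability of seeing at least `hits` pathway genes by chance.

    population: all genes measured
    annotated:  measured genes that belong to the pathway
    selected:   genes meeting the criterion (e.g. up 2-fold)
    hits:       selected genes that belong to the pathway
    """
    total = comb(population, selected)
    upper = min(annotated, selected)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts, not taken from the tutorial data.
p_value = hypergeom_upper_tail(population=20000, annotated=100,
                               selected=500, hits=10)
print(p_value)
```

A small probability means the pathway contains more criterion-meeting genes than random selection would typically give.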
Once the GO-Elite analysis is complete, the Results Panel will have four tabs, one for each criterion in the two criteria sets. Comparing the pathway results for the 2-fold up criteria in the two datasets, we see the following lists:
Comparing the lists of pathways and GO terms between the two datasets, we can see the following:
- For 2-fold up, several of the top pathway hits are identical between the two datasets.
- Several of the up-regulated pathways represent the same processes as those identified in the publication, which include several known DC maturation markers (CD86, HLA I and II).
- The GO results for the up-regulated transcripts and proteins reveal many immune-related processes, such as cytokine- and inflammation-related processes.
Explore data on networks
- Open the "Toll-like receptor signaling pathway" by clicking on it in the GO-Elite results list.
- In the Criteria Set panel of Workspaces, select either of the two criteria sets, right-click and select Combine All Criteria.
As noted in the related publication, there is not much overall correlation between differential mRNA and protein expression at the level of individual genes and proteins. However, the two data types are complementary when analyzed in the context of pathways.
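One way to quantify the gene-level correlation mentioned above is a Pearson correlation of log2 ratios over the genes present in both datasets. The sketch below is an illustration only, not the method used in the publication; `log2_ratios` and `pearson` are hypothetical helper names. The short excerpts shown earlier share too few genes for a meaningful coefficient, so only the log2 conversion is demonstrated on them.

```python
import math


def log2_ratios(ratios):
    """Convert mDC/iDC ratios to log2 scale (2-fold up -> 1, 2-fold down -> -1)."""
    return [math.log2(r) for r in ratios]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length with at least 3 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Genes 2 and 9 appear in both example tables shown earlier.
mrna_excerpt = [0.31, 0.64]
protein_excerpt = [0.05, 1.22]
print(log2_ratios(mrna_excerpt))
print(log2_ratios(protein_excerpt))

# With the full merged dataset, the call would be:
# pearson(log2_ratios(all_mrna_ratios), log2_ratios(all_protein_ratios))
```

Working on the log2 scale treats a 2-fold increase and a 2-fold decrease symmetrically, which raw ratios do not.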