Tutorial:ClusterMakerScenario3

From OpenTutorials


Tutorial Sources
Tutorial Curators: Scooter Morris
Data Files: VOCSuperfamily.cys
Version: Applies to clusterMaker 1.10. Last updated: 10/21/2011


clusterMaker is a Cytoscape plugin that brings different clustering techniques and displays together in a single interface. Current clustering algorithms include hierarchical, AutoSOME, k-medoid, and k-Means for clustering expression or genetic data, and MCL, AutoSOME, TransClust, SCPS, Affinity Propagation, and MCODE for clustering similarity networks to look for protein families. A recent BMC Bioinformatics publication, "clusterMaker: A Multi-algorithm Clustering Plugin for Cytoscape", demonstrates the capabilities of clusterMaker through three scenarios. The third scenario, covered in this tutorial, uses clusterMaker to analyze protein similarity data in support of functional annotation.

Biological Use Case: Functional annotation by clustering protein similarity networks

Procedure

  1. Download VOCSuperfamily.cys
  2. Go to File → Open and select VOCSuperfamily.cys from the location where you saved it.

This session file contains a protein similarity network downloaded from the Structure-Function Linkage Database (SFLD). In this network (superfamily_6), nodes represent proteins and edges represent the similarity (measured by pairwise BLAST scores) between the two connected proteins. The network represents the VOC (vicinal oxygen chelate) superfamily, a group of metal-dependent enzymes that share a common fold motif and catalyze a variety of reactions.

Perform MCL Clustering

We'll start by using MCL to cluster the network.

  1. Select Plugins → Cluster → MCL cluster to bring up the MCL cluster Settings dialog.
  2. Set the Granularity parameter to 2.5
  3. Set the Array sources to BlastProbability
  4. Set the Edge weight conversion to -LOG(value)
  5. Set the Edge weight cutoff to 5.5. If desired, you can look at the histogram of edge weights by clicking on the Edge Histogram button.
  6. Click on Create Clusters to run MCL.
MCL cluster settings
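
To make the settings above more concrete, here is a minimal Python sketch of what they control. It is not clusterMaker's implementation. It assumes that -LOG(value) is a base-10 logarithm, that edges whose converted weight falls below the cutoff are dropped, and that the Granularity parameter plays the role of the MCL inflation exponent. The function names and the toy edge list are hypothetical.

```python
import math

def neg_log10(p, floor=1e-300):
    """-LOG(value) conversion (base 10 is an assumption); floor avoids log(0)."""
    return -math.log10(max(p, floor))

def filter_edges(edges, cutoff=5.5):
    """Convert (a, b, probability) triples and drop weights below the cutoff."""
    kept = {}
    for a, b, p in edges:
        w = neg_log10(p)
        if w >= cutoff:
            kept[(a, b)] = w
    return kept

def normalize_columns(m):
    """Scale every column of a square matrix so that it sums to 1 (in place)."""
    n = len(m)
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        if total > 0:
            for i in range(n):
                m[i][j] /= total
    return m

def mcl(nodes, weights, inflation=2.5, iterations=30, eps=1e-6):
    """Bare-bones MCL: alternate expansion and inflation, then read clusters."""
    n = len(nodes)
    idx = {v: i for i, v in enumerate(nodes)}
    m = [[0.0] * n for _ in range(n)]
    for (a, b), w in weights.items():
        m[idx[a]][idx[b]] = m[idx[b]][idx[a]] = w
    for i in range(n):
        m[i][i] = max(m[i]) or 1.0  # self-loop so every node keeps some flow
    normalize_columns(m)
    for _ in range(iterations):
        # Expansion: square the matrix (flow along two-step walks).
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: raise entries to a power, then renormalize each column.
        m = normalize_columns([[x ** inflation for x in row] for row in m])
    # Each row that still carries flow is an attractor; its nonzero columns
    # are the members of its cluster.
    clusters = {frozenset(nodes[j] for j in range(n) if m[i][j] > eps)
                for i in range(n)}
    clusters.discard(frozenset())
    return sorted((sorted(c) for c in clusters), key=lambda c: c[0])

# Toy data: two tight triangles joined by one weak edge (hypothetical values).
toy_edges = [("A", "B", 1e-20), ("A", "C", 1e-20), ("B", "C", 1e-20),
             ("D", "E", 1e-30), ("D", "F", 1e-30), ("E", "F", 1e-30),
             ("C", "D", 1e-3)]
toy_weights = filter_edges(toy_edges, cutoff=5.5)
toy_clusters = mcl(list("ABCDEF"), toy_weights, inflation=2.5)
print(toy_clusters)
```

Under the base-10 assumption, a cutoff of 5.5 corresponds to a probability of roughly 3e-6, so the weak C-D edge (weight 3) is discarded before clustering and the two triangles end up in separate clusters. Larger inflation values generally produce more, smaller clusters.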

Visualize MCL Clusters

  1. Click on Visualize Clusters to create a new network with the partitioned clusters.
MCL clusters

The session file has preconfigured visual mappings assigning the node color to the SFLD family. Nodes colored red are unclassified. It can be seen from the network above that several of the clusters have only two colors: red and the color assigned to a particular family. These clusters add evidence to support functional assignment of the unclassified proteins to the same family as the other proteins in the cluster.
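
The "two colors" observation can also be checked programmatically once cluster membership and family labels have been exported. The sketch below is a hypothetical helper, not part of clusterMaker: it assumes clusters are available as lists of protein identifiers and that family labels are held in a plain dictionary, with missing entries treated as unclassified. All identifiers and family names in the example are made up.

```python
from collections import Counter

def annotation_candidates(clusters, family_of, unclassified="unclassified"):
    """Find clusters holding exactly one known family plus unclassified nodes.

    Returns {cluster_index: (family, [unclassified members])}.
    """
    suggestions = {}
    for i, members in enumerate(clusters):
        labels = {m: family_of.get(m, unclassified) for m in members}
        counts = Counter(labels.values())
        known = [f for f in counts if f != unclassified]
        if len(known) == 1 and counts[unclassified] > 0:
            pending = sorted(m for m, f in labels.items() if f == unclassified)
            suggestions[i] = (known[0], pending)
    return suggestions

# Hypothetical example data.
example_clusters = [["p1", "p2", "p3"], ["p4", "p5"], ["p6", "p7"]]
example_families = {"p1": "family_A", "p2": "family_A",
                    "p4": "family_A", "p5": "family_B",
                    "p6": "family_B", "p7": "family_B"}
print(annotation_candidates(example_clusters, example_families))
```

In this toy input only the first cluster qualifies: it mixes one known family with the unclassified protein p3. Such a result is supporting evidence for an assignment, not proof of it.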

Perform TransClust Clustering

We can repeat the procedure with Transitivity Clustering (TransClust):

  1. Select the original (unclustered) network, superfamily_6, in the Network Panel
  2. Select Plugins → Cluster → Transitivity Clustering to bring up the TransClust Cluster Settings dialog.
  3. Set the Array sources to BlastProbability
  4. Set the Edge weight conversion to -LOG(value)
  5. Set the Edge weight cutoff to 5.5. If desired, you can look at the histogram of edge weights by clicking on the Edge Histogram button.
  6. Click on Create Clusters to run TransClust.
Transitivity clustering settings

These clusters may be visualized as before by clicking on Visualize Clusters. To compare the clustering results, enable the network selection linkage: Plugins → Cluster → Link network selection. When this is enabled, selecting a group of proteins in one of the clustered views will also select them in all of the others.
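
Linked selection gives a visual comparison. If a rough numeric summary is also wanted, one option outside of clusterMaker is to compare the sets of protein pairs that each algorithm places in the same cluster. The sketch below is a hypothetical helper that assumes each clustering has been exported as a list of lists of protein identifiers. It computes the Jaccard index over co-clustered pairs.

```python
from itertools import combinations

def co_clustered_pairs(clusters):
    """Return every unordered pair of members that share a cluster."""
    return {frozenset(pair)
            for c in clusters
            for pair in combinations(sorted(c), 2)}

def pair_jaccard(clustering_a, clustering_b):
    """Jaccard index of co-clustered pairs: 1.0 means identical groupings."""
    pa = co_clustered_pairs(clustering_a)
    pb = co_clustered_pairs(clustering_b)
    union = pa | pb
    return len(pa & pb) / len(union) if union else 1.0

# Hypothetical results from two algorithms on the same five proteins.
result_one = [["A", "B", "C"], ["D", "E"]]
result_two = [["A", "B"], ["C", "D", "E"]]
print(pair_jaccard(result_one, result_two))
```

Here the two results agree on the pairs A-B and D-E out of six pairs grouped by either one, giving a score of one third. A low score marks the places where switching between the linked views is most informative.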

Perform SCPS Clustering

We can repeat the procedure with SCPS Clustering:

  1. Select the original (unclustered) network, superfamily_6, in the Network Panel
  2. Select Plugins → Cluster → SCPS Clustering to bring up the SCPS Cluster Settings dialog.
  3. Set the Array sources to BlastProbability
  4. Set the Edge weight conversion to -LOG(value)
  5. Set the Edge weight cutoff to 5.5. If desired, you can look at the histogram of edge weights by clicking on the Edge Histogram button.
  6. Click on Create Clusters to run SCPS.
SCPS clustering settings

These may be visualized as before by clicking on Visualize Clusters.

Comparing Clusters

It's sometimes convenient to be able to see several networks arranged on the screen together. Cytoscape provides the capability to tile multiple network views.

  1. Select the parent network: superfamily_6
  2. Right-click on the title in the Network panel to bring up the context menu
  3. Select Destroy View to remove that view, so that only the clustered network views remain.
  4. Select View → Arrange Network Windows → Tiled to tile all of the windows.
  5. To increase the level of detail, select View → Show Graphics Details in each window.
All three clustering results