clusterMaker is a Cytoscape plugin that unifies different clustering techniques and displays in a single interface. Current clustering algorithms include hierarchical, AutoSOME, k-medoid, and k-means for clustering expression or genetic data, and MCL, AutoSOME, TransClust, SCPS, Affinity Propagation, and MCODE for clustering similarity networks to look for protein families. A recent BMC Bioinformatics publication, "clusterMaker: A Multi-algorithm Clustering Plugin for Cytoscape", discussed the capabilities of clusterMaker using three scenarios. The first scenario used clusterMaker to analyze gene expression data, in this case a large mouse gene expression data set downloaded from GEO.
Biological Use Case: Gene expression analysis in a network context.
- Download MouseInteractome.cys.
- Go to File → Open and select the downloaded MouseInteractome.cys file to load the session.
Run Hierarchical Clustering
- Select Plugins → Cluster → Hierarchical cluster.
- Select pairwise centroid-linkage for Linkage.
- Select Uncentered correlation for the Distance Metric.
- In the Source for array data box, select all 182 attributes.
- Select Ignore nodes/edges with no data.
- Deselect Create groups from clusters.
- Click Create Clusters.
- Once the clusters have been created, the Visualize Clusters button should become active. Click Visualize Clusters.
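To make the Uncentered correlation setting more concrete, here is a minimal sketch of the distance it usually denotes: one minus the cosine of the angle between two expression profiles, with no mean subtraction. The function name and the gene profiles are hypothetical and only illustrate the metric; this is not clusterMaker's own code.

```python
import math

def uncentered_correlation_distance(x, y):
    """Return 1 - uncentered correlation, i.e. 1 - cos(angle between x and y)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("undefined for an all-zero profile")
    return 1.0 - dot / (norm_x * norm_y)

# Hypothetical expression profiles across four conditions.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0]        # same direction as gene_a, scaled
gene_c = [-1.0, -2.0, -3.0, -4.0]    # opposite direction to gene_a
gene_d = [11.0, 12.0, 13.0, 14.0]    # gene_a shifted by a constant offset

dist_ab = uncentered_correlation_distance(gene_a, gene_b)  # close to 0
dist_ac = uncentered_correlation_distance(gene_a, gene_c)  # close to 2
dist_ad = uncentered_correlation_distance(gene_a, gene_d)  # small but non-zero
```

Because the means are not subtracted, a constant offset (gene_d) changes the distance, whereas a centered Pearson correlation would treat gene_a and gene_d as perfectly correlated.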
Visualize Results using Treeview
You will now see an Eisen treeview visualization. In the treeview window, explore by clicking on points on the dendrogram. Clicking/selecting a particular column in the heatmap will result in the expression values for that column being overlaid on the network view.
- Use shift-drag to draw a box and see results on network.
- Use shift-click to pick individual columns.
- Select an individual row by clicking on it.
- You can adjust the color scheme and contrast by going to Settings. You can also use the Settings dialog to fit the entire treeview onto the screen by selecting Fill under both the X and Y Global: scale values.
Run AutoSOME Clustering
We can also use AutoSOME to cluster this same data. First, we'll bring up the settings dialog for AutoSOME.
- Select Plugins → Cluster → AutoSOME to bring up the AutoSOME Clustering Settings dialog.
- Select all of the Array Sources as we did above, and change the P-Value Threshold to 0.1.
- Under Data Normalization, set the Sum of Squares=1 setting to Both.
- Finally, we can run the clustering itself by selecting Create Clusters.
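As a rough illustration of what a Sum of Squares=1 normalization does, the sketch below scales each vector so that its squared values sum to 1. It assumes that Both means the rows and the columns of the data matrix, and that rows are scaled before columns; both points are assumptions made for illustration, not a description of AutoSOME's implementation. All function names are hypothetical.

```python
import math

def unit_rows(matrix):
    """Scale each row so its squared values sum to 1 (all-zero rows are left as-is)."""
    result = []
    for row in matrix:
        norm = math.sqrt(sum(v * v for v in row))
        result.append([v / norm for v in row] if norm else list(row))
    return result

def transpose(matrix):
    """Swap rows and columns."""
    return [list(col) for col in zip(*matrix)]

def unit_columns(matrix):
    """Scale each column so its squared values sum to 1."""
    return transpose(unit_rows(transpose(matrix)))

def unit_both(matrix):
    """Assumed order: rows first, then columns."""
    return unit_columns(unit_rows(matrix))

# Hypothetical 2 x 2 expression matrix (rows = genes, columns = conditions).
data = [[3.0, 4.0], [6.0, 8.0]]
rows_normalized = unit_rows(data)   # each row becomes [0.6, 0.8]
both_normalized = unit_both(data)
```

Note that when the two steps are applied one after the other, only the axis normalized last is guaranteed to have a sum of squares of exactly 1.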
Visualizing AutoSOME Cluster Results
- After the clustering has completed, you can display the results as a heatmap by either selecting Visualize Clusters or choosing Heatmap under Choose Visualization in the Data Output section of the dialog.
- To visualize the entire heatmap, select Settings... and set both X and Y to Fill, as we did above.
AutoSOME clusters can also be displayed as a network by dropping all of the inter-cluster edges. To do this:
- Change the visualization to Network under Choose Visualization in the Data Output section.
- Click Display.
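The idea of dropping inter-cluster edges can be sketched in a few lines: an edge is kept only if both of its endpoints were assigned to the same cluster. The node names, the cluster assignments, and the function name below are hypothetical; this shows the concept rather than the plugin's code.

```python
def drop_inter_cluster_edges(edges, cluster_of):
    """Keep only edges whose two endpoints belong to the same cluster."""
    return [
        (source, target)
        for source, target in edges
        if source in cluster_of
        and target in cluster_of
        and cluster_of[source] == cluster_of[target]
    ]

# Hypothetical cluster assignments and network edges.
cluster_of = {"A": 1, "B": 1, "C": 2, "D": 2}
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")]

intra_cluster_edges = drop_inter_cluster_edges(edges, cluster_of)
```

In this example only the A-B and C-D edges survive, so each cluster appears as its own connected component in the resulting network.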
AutoSOME Fuzzy Clustering
AutoSOME can also be used to produce fuzzy clusters that can show the relationship between the clusters. These can be very useful for exploring transcriptome clusters, for example, to show how different clusters of diverse transcriptomes relate to one another.
- Open up the AutoSOME Clustering Settings dialog again: Plugins → Cluster → AutoSOME
- Set the P-Value Threshold to 0.05.
- Set the Sum of Squares=1 to Both under Data Normalization.
- Under Fuzzy Cluster Network Settings:
  - Check Perform Fuzzy Clustering.
  - Set the Distance Metric to Uncentered Correlation.
  - Set the Maximum number of edges to display in fuzzy network to 4000.
- Perform the clustering by clicking on Create Clusters.
AutoSOME Fuzzy Clusters Visualization
- To visualize the fuzzy clusters, select Network under Choose Visualization in the Data Output section.
- Click Display.
In the resulting network, each node represents a condition (attribute value) from the data set. Thicker, more opaque edges indicate that a pair of nodes co-clustered more frequently over all ensemble iterations.
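The edge weights described above can be illustrated with a small sketch that counts how often each pair of conditions lands in the same cluster across an ensemble of clustering runs, then keeps only the strongest edges, in the spirit of the Maximum number of edges setting. The data, the function names, and the ranking rule are assumptions made for illustration and are not taken from AutoSOME itself.

```python
from itertools import combinations

def co_clustering_frequency(ensemble):
    """ensemble: list of dicts mapping item -> cluster label, one dict per run.
    Returns {(item_i, item_j): fraction of runs in which the pair shares a cluster}."""
    items = sorted(ensemble[0])
    counts = {pair: 0 for pair in combinations(items, 2)}
    for labels in ensemble:
        for a, b in counts:
            if labels[a] == labels[b]:
                counts[(a, b)] += 1
    return {pair: n / len(ensemble) for pair, n in counts.items()}

def top_edges(frequencies, max_edges):
    """Return at most max_edges pairs with non-zero frequency, strongest first."""
    ranked = sorted(frequencies.items(), key=lambda kv: kv[1], reverse=True)
    return [(pair, freq) for pair, freq in ranked[:max_edges] if freq > 0]

# Hypothetical ensemble of four clustering runs over three conditions.
ensemble = [
    {"c1": 0, "c2": 0, "c3": 1},
    {"c1": 0, "c2": 0, "c3": 0},
    {"c1": 1, "c2": 1, "c3": 0},
    {"c1": 0, "c2": 1, "c3": 1},
]

frequencies = co_clustering_frequency(ensemble)
strongest = top_edges(frequencies, max_edges=2)
```

Here c1 and c2 share a cluster in three of the four runs, so their edge would be drawn thickest, while the c1-c3 edge (one run in four) is dropped by the two-edge limit.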