Tutorial:Introduction to Cytoscape
Slideshow Introduction to Cytoscape (about 30 minutes)
Handout Introduction_to_Cytoscape.pdf (16 pages)
Tutorial Sources Cytoscape Tutorial (Yeyejide Adeleye)
Tutorial Curators Anna Kuchinsky, Scooter Morris, Alex Pico
Cytoscape is an open source software tool for integrating, visualizing, and analyzing data in the context of networks. This tutorial will cover:
- Navigating Cytoscape
- Visualizing Data on Networks
Part two of this tutorial can be found here
This section will introduce the Cytoscape user interface. First of all we will look at the basic UI of Cytoscape. Then we will show all menu features of Cytoscape and the extended functionality provided by plugins.
Cytoscape Layout and User Interface
Launch Cytoscape. You should see a window that looks like this:
- At the top of the Cytoscape Desktop window is the toolbar, which contains the command buttons. The name of each command button is shown when the mouse pointer hovers over it.
- In the upper right is the Main Network View window, where network data will be displayed. This region is initially blank.
- At left is the Control Panel (the Network Management panel). This lists the available networks by name and provides information on the number of nodes and edges.
- Immediately below the Control Panel is the Network Overview Pane.
- At lower right is the Data Panel, which can be used to display node, edge, and network attribute data.
The Network Management and Data browser panels are dockable tabbed panels known as CytoPanels. You can undock any of these panels by clicking on the Float Window control in the upper-right corner of the CytoPanel. The Data Panel starts off with three tabs: Node Attribute Browser, Edge Attribute Browser, and Network Attribute Browser; the Network Management panel starts off with four tabs: Network, VizMapper, Editor, and Filters. Loaded plugins might add tabs to either of these CytoPanels.
We will briefly run through all the menus available in Cytoscape.
The File menu contains basic file functionality:
- File → Open for opening a Cytoscape session file
- File → New for creating a new network
- File → Save for saving a session file
- File → Import for importing data such as networks and attributes
- File → Export for exporting data and images.
- File → Print allows printing
- File → Quit closes all windows of Cytoscape and exits the program
The Edit menu contains:
- Undo and Redo functions which undo and redo edits made in the Attribute Browser, the Network Editor and the Layout.
- Options for creating and destroying views (graphical representations of a network) and networks
- Options for deleting selected nodes and edges from the current network.
- All deleted nodes and edges can be restored to the network via Edit → Undo.
- Edit → Preferences → Properties to edit preferences for properties and plugins
The View menu allows you to display or hide:
- The network management panel (Control Panel)
- The attribute browser (Data Panel)
- Results Panel
The Select menu contains:
- Options for selecting nodes and edges
- The Select → Use Filters option allows filters to be created for automatic selection of portions of a network whose node or edge attributes meet a filtering criterion (see below for the filters section).
The Layout menu has an array of features for visually organizing the network:
- Rotate, Scale, Align and Distribute are tools for manipulating the network visualization.
- The bottom section of the menu lists a variety of layout algorithms which automatically lay a network out.
The Plugins menu contains options for managing your plugins (install/update/delete) and may have options added by plugins that have already been installed, such as the Agilent Literature Search or Merge Networks.
- Depending on which plugins are loaded, the options you see may differ from those shown here.
The Help menu allows you to launch the help viewer and browse the table of contents for this manual.
- The “About…” option displays information about the running version of Cytoscape.
Loading a Simple Network
- Go to File-> Import -> Network (multiple file types)
- You should see the Import Network File Dialog
- For Data Source Type select Local and then click Select
- Open the sampleData folder and select galFiltered.sif and then click on Open and then Import
You should see the following:
The SIF file format is about as simple as it gets. It consists of 3 columns: source, interaction type, and target. “Source” and “target” are gene/protein identifiers that are used to define nodes, while “interaction type” serves to label the edge connecting each pair of nodes.
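The three-column structure described above can be sketched in a few lines of Python. This is only an illustration, not Cytoscape's own parser: the function name parse_sif and the sample identifiers (geneA, geneB, and so on) are made up for the example, and the handling of extra target columns and single-node lines is an assumption about what SIF files may contain beyond the basic three-column case.

```python
def parse_sif(text):
    """Parse SIF-formatted text into a set of nodes and a list of edges.

    Each edge is a (source, interaction type, target) tuple.
    """
    nodes = set()
    edges = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Use tabs as the delimiter if the line contains any, else whitespace.
        fields = line.split("\t") if "\t" in line else line.split()
        source = fields[0]
        nodes.add(source)
        if len(fields) >= 3:
            interaction = fields[1]
            # Assumption: any columns after the third are additional targets.
            for target in fields[2:]:
                nodes.add(target)
                edges.append((source, interaction, target))
    return nodes, edges


# Hypothetical sample data, not taken from galFiltered.sif.
sample = "geneA pp geneB\ngeneA pd geneC\ngeneD\n"
nodes, edges = parse_sif(sample)
print(sorted(nodes))
print(edges)
```

Here the "interaction type" column (pp, pd) ends up as the label on each edge, and a line holding a single identifier defines a node with no edges.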
Manipulating Your Network
Now that you have a network loaded, you can interact with it in a number of ways:
- Start by clicking on the node at the upper left corner of the network. The node will turn yellow. If you hold your mouse down over the node and drag it around, the node will move on the screen.
- Now add another node to the selection by holding down the Shift key and clicking on a node. Note that both nodes are now selected (yellow). Again, move the nodes around. Note that both nodes will move.
- To select a group of nodes, hold the mouse down in the upper left-hand corner and drag your mouse over a region of the network. Again, a group of nodes will be selected and can be moved around on the screen.
- To zoom in on the selected nodes, click on the icon.
- To move the window around the network, you can either use the middle mouse button, or drag the small window outlined in blue around in the Network Overview Pane.
- Finally, zoom your network out by clicking on the icon.
While useful, hand selecting nodes in dense networks can be error-prone and difficult. However, you can specifically search for a node by name or attribute:
- In the Search: box at the top of the screen, type in ynr050c. This will select that node and zoom the display to focus on it.
The Search: box will also allow you to select nodes by other attributes, but first, we need to import more attributes...
Visualizing Data on Networks
Cytoscape provides a number of features to load arbitrary data and visualize that data by mapping the attribute values to visual styles.
Importing Your Data
Cytoscape can read attribute data from delimited text files or Excel files.
- Go to File-> Import->Attribute from Table (Text/MS Excel)
- You should see the Import Attribute from Table Dialog Box
- Make sure the Node radio button is selected under Attributes in the Data Sources section
- In Data Sources, click on Select File
- Select galExpData.pvals from the sampleData folder and then click on Open
- In the Advanced section, select Show Text File Import Options
- By default, Tab is selected in the Delimiter section. Instead, select Space.
- In the Attribute Names section, select Transfer first line attribute names
- Also in the Advanced section, select Import Everything
The ‘Import Everything’ option tells Cytoscape to load all of your data, not just the records that match currently loaded networks.
If you were to click ‘Import’ now you would see a pop-up message complaining about duplicate attribute names. Notice that columns 6, 7 and 8 have the same names as 3, 4 and 5. The next steps show you how to fix column names without having to start over from Excel.
- Right click on the column header of the first duplicate name (column 6, gal1RG) to open the Set Attribute Name and Type dialog box
- Add the suffix ‘sig’ to the name (e.g., ‘gal1RGsig’) to distinguish the column as containing p-values (significance)
- Repeat these last two steps on columns 7 and 8 (gal4RG and gal80R)
- Now click Import
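If you prefer to fix duplicate column names before importing, the same renaming can be done with a short script. The sketch below is a hypothetical helper, not part of Cytoscape. The names of the first two header columns (GENE and COMMON) are assumed for illustration; only the duplicated names in columns 3 to 8 come from the steps above.

```python
def dedupe_header(names, suffix="sig"):
    """Append a suffix to any column name that repeats an earlier one."""
    seen = set()
    result = []
    for name in names:
        # Keep adding the suffix until the name no longer collides.
        while name in seen:
            name = name + suffix
        seen.add(name)
        result.append(name)
    return result


# Assumed header line: the first two column names are placeholders.
header = "GENE COMMON gal1RG gal4RG gal80R gal1RG gal4RG gal80R".split(" ")
fixed = dedupe_header(header)
print(fixed)
```

Columns 6, 7 and 8 come out as gal1RGsig, gal4RGsig and gal80Rsig, matching the names entered by hand in the dialog.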
You’ve successfully loaded data! Now explore the Data Panel to confirm the mapping of the data to the network.
- Locate the Select Attributes button in the Data Panel
- Select the data attributes: gal1RG, gal4RG, and gal80R
- Return to the Data Panel by clicking away from the attribute list
- Select some nodes in the network (ctrl-A selects all nodes) and see the associated data in the Data Panel
You now have access to your networked data and can begin playing with visualization.
- Go to Layout -> Cytoscape Layouts -> Force-Directed Layout
- Go to the VizMapper tab in the Control Panel and select Sample 1
These are default visual styles. In the next section you will customize a visual style to highlight your data values.
In the previous section, you were able to select a couple of attributes to display in the Data Panel. In this section, we will explore a little more about attributes and the Data Panel.
- First click on the Configure search options icon: .
- Change the Select Attribute: to COMMON. You should see this screen:
- Click on Apply
- Now type mcm1 in the Search: box. This will select the node YMR043W and display the attributes for that node in the Data Panel.
- We're going to add a new attribute for MCM1. Click on the icon and select String Attribute.
- Type in pdb for the name of the attribute. This will define a new string attribute for nodes and add it to the Data Panel.
- Now click into the empty cell for the newly created pdb attribute for YMR043W and type 1mnm, which is the PDB ID for the yeast protein mcm1. You need to hit Return or Tab to enter that data.
- Move the pdb attribute to be the second column by dragging its column header to just after the ID column.
- Finally, select a number of nodes and note that the attributes for all of the nodes are shown in the Data Panel.
- By clicking on the column header, you can sort the columns. Clicking again changes the order of the sort.
Visualizing Data with VizMapper
- Go to File-> Open (click ‘Yes’ to losing current session)
- You should see the Open a Session File Dialog
- Open the sampleData folder and select galFiltered.cys and then click on Open
Notice how the galFiltered Style maps multiple data and annotation values to:
- Edge Color
- Edge Line Style
- Node Border
- Node Color
- Node Label
- Node Size
- Node Tooltip
Modifying a Visual Style
Customising the way you visualize and manipulate networks is a key function of Cytoscape. This is achieved through the use of the VizMapper tool.
- To launch the VizMapper, either select the VizMapper tab on the Control Panel or click on the VizMapper icon at the top of the tool bar
- Find Node Color in the Visual Mapping Browser and expand it by clicking on the triangle icon for expand/collapse
- Click on the value ‘gal4RGexp’ to select an alternative data value to map to node color: select ‘gal80Rexp’. Notice the changes in the network display.
- Click on the gradient color mapping to open the Gradient Editor
- Control the values and colors of the mapping by means of triangular handles and endpoint markers
- Double-click on any handle or marker to change its color: change green to blue and change red to yellow. Notice the immediate changes to the network
- Click-and-drag any handle to slide its value between the min and max
- To save your changes, close the editor
- Find Edge Color in the Visual Mapping Browser and expand it by clicking on the triangle icon
- Notice the discrete mapping of colors to values. Click on any color mapping to change the color.
- Close the editor to save your changes
Creating a Visual Style (Advanced)
- In the Current Visual Style section, start a new visual style by clicking the Options button and selecting “Create new Visual Style”
- Enter a name for your custom visual style when prompted, click on OK
- Find Node Color in the list of Unused Properties and double-click to activate
- For Node Color select the value ‘gal1RGexp’ to map expression fold values. For Mapping Type select ‘Continuous’. For Graphical View click on the gradient to open the dialog:
- Click on Min/Max button and set to -1 and 1, respectively
- Double-click on gradient handles to set colors, e.g.
- below -1.0 = green
- -0.8 = green
- 0.8 = red
- above 1.0 = red
- Click Add to add another gradient handle (added to max by default). Drag to center at 0.0. Leave as white color.
- Close gradient dialog
- Next, activate Node Size. Select ‘gal1RGsig’ to map p-values. For Mapping Type select ‘Continuous’. Click on gradient to open dialog:
- Note: We want smaller p-values (more significant) to show as larger nodes
- Click Min/Max to verify the correct values: 1.0E-8 and 1.0E-3, respectively.
- Double-click on solid red box for p-values below the minimum (far left) to set the maximum node size. Set to 70.0
- Double-click on solid red box for p-values above the maximum (far right) to set the minimum node size. Set to 20.0
- Select handles along the top border to drag the x-position (p-value) of the gradient points. When selected, you can also set the Attribute and Node Size value by editing the Handle Settings field. You can also drag (or double-click) the open red boxes to set the y-position (Node Size).
- Set the left open box to Node Size 70.0 and the right open box to 20.0. This defines the range of the gradient.
- Click 'Add' to add a new gradient handle. Set it to Attribute Value = 1E-4 and Node Size = 40.0.
This creates a pseudo-exponential gradient mapping:
- Click 'OK' in the gradient dialog and explore the visualization you’ve created!
- Zoom by selecting an area and clicking
- Use the bird’s-eye-view panel (bottom-left) to pan around network
- Return to VizMapper and switch mappings to another column of data:
- For Node Color select ‘gal4RG’
- For Node Size select ‘gal4RGsig’
Notice the change in the view of the network! You can reuse the mappings across multiple datasets. Now is a good time to save your Cytoscape session, to save the visual style you’ve created.
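To make the continuous mappings above more concrete, here is a minimal sketch of how such a mapping can be computed. It assumes that values are interpolated linearly between neighbouring handles and that values outside the handle range take the fixed "below" and "above" settings. The function names and the RGB constants are illustrative and are not Cytoscape's API.

```python
def continuous_map(x, handles, below, above):
    """Map an attribute value to a visual value.

    handles: list of (attribute value, visual value) pairs, sorted by
    attribute value. Values outside the handle range get below/above.
    """
    if x < handles[0][0]:
        return below
    if x > handles[-1][0]:
        return above
    for (x0, y0), (x1, y1) in zip(handles, handles[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)  # fractional position between handles
            return y0 + t * (y1 - y0)
    return handles[-1][1]


# Node Size handles from the steps above: smaller p-values give larger nodes.
SIZE_HANDLES = [(1e-8, 70.0), (1e-4, 40.0), (1e-3, 20.0)]

def node_size(p_value):
    return continuous_map(p_value, SIZE_HANDLES, below=70.0, above=20.0)


# Node Color handles: green at -0.8, white at 0.0, red at 0.8 (RGB values assumed).
GREEN, WHITE, RED = (0, 255, 0), (255, 255, 255), (255, 0, 0)
COLOR_HANDLES = [(-0.8, GREEN), (0.0, WHITE), (0.8, RED)]

def node_color(expression):
    # Interpolate each RGB channel separately.
    return tuple(
        round(continuous_map(expression,
                             [(value, color[i]) for value, color in COLOR_HANDLES],
                             GREEN[i], RED[i]))
        for i in range(3)
    )


print(node_size(1e-9), node_size(1e-4), node_size(0.5))
print(node_color(-2.0), node_color(0.0), node_color(2.0))
```

The extra handle at 1E-4 is what bends the size mapping: most of the size range (70 down to 40) is spent on the smallest p-values, which is the pseudo-exponential shape described above.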
Laying Out Your Network
A network layout is a process that positions the nodes and edges of the network. There is a large variety of layouts in Cytoscape, and plugins might add new layouts. All of the layouts will appear under the Layout menu. In this section, we'll explore some of the layouts in the Cytoscape Layouts category, which are the core layouts supported by the Cytoscape team. All of the Cytoscape Layouts can lay out only a portion of the network, and most expose parameters that can be used to tune the layout algorithm.
The simplest layout that Cytoscape provides is the Grid Layout, which simply places all of the nodes in a grid arrangement.
- Using the network you loaded before, select Layout->Cytoscape Layouts->Grid. You should see the image below:
- Now, select Edit->Undo Grid Layout and notice how your network reverts to the previous condition.
- Many (but not all) of Cytoscape's actions may be undone.
- Select a group of nodes in your network and again select Layout->Cytoscape Layouts->Grid.
- Notice that this now has a sub-menu with two options: All Nodes and Selected Nodes Only.
- Choose Selected Nodes Only.
- Note that only the nodes you selected are changed.
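The idea behind a grid layout, including the Selected Nodes Only behaviour, can be sketched as follows. This is an illustration under simple assumptions (a roughly square grid, filled row by row, with a made-up spacing of 80 units), not Cytoscape's actual implementation.

```python
import math


def grid_layout(nodes, spacing=80.0):
    """Place nodes row by row on a roughly square grid."""
    columns = math.ceil(math.sqrt(len(nodes))) if nodes else 0
    return {
        node: ((i % columns) * spacing, (i // columns) * spacing)
        for i, node in enumerate(nodes)
    }


# Lay out all nodes.
all_positions = grid_layout(["n1", "n2", "n3", "n4", "n5"])
print(all_positions)

# Lay out selected nodes only: the other nodes keep their positions.
layout = {"a": (5.0, 5.0), "b": (7.0, 7.0), "c": (9.0, 9.0)}
layout.update(grid_layout(["a", "c"]))
print(layout)
```

In the second example only a and c move onto the grid, while b stays where it was.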
Grid is very fast, but often not very helpful. Alternatives:
- Circular Layout places all of the nodes in a circular arrangement.
- Very quick
- Usually not very informative.
- Partitions the network into disconnected parts and independently lays out those parts.
- Hierarchical Layout forces the nodes into a tree structure.
- Works best when the network is naturally tree-structured
- Also works reasonably well when the network is mostly hierarchical.
Data-Driven Simple Layouts
Often what is desired is to organize the nodes in space to reflect some data property of the nodes themselves. We'll look at three of the simple ones:
- Degree Sorted Circle: Orders the nodes around a circle based on node degree (number of edges)
- Attribute Circle Layout: Orders the nodes around a circle based on the value of some attribute
- Group Attributes Layout: Groups the nodes based on the value of some attribute
Degree Sorted Circle Layout
For example, assume you are interested in hubs (nodes with high degree):
- Select Layout->Degree Sorted Circle Layout.
- Note the highest degree nodes are in the same region of the circle and the degree decreases as you proceed counter-clockwise around the circle.
- Also note that this layout supports partitioning of the network into disconnected components.
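As a rough sketch of what a degree sorted circle does, the code below counts the degree of each node from an edge list, sorts nodes from highest to lowest degree, and spaces them evenly around a circle. It is only an approximation of the idea: the function name, the radius and the tie-breaking by node name are assumptions, and partitioning into disconnected components is left out.

```python
import math
from collections import Counter


def degree_sorted_circle(edges, radius=100.0):
    """Return nodes ordered by decreasing degree and their circle positions."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    # Highest degree first; ties are broken by node name.
    ordered = sorted(degree, key=lambda node: (-degree[node], node))
    step = 2 * math.pi / len(ordered) if ordered else 0.0
    positions = {
        node: (radius * math.cos(i * step), radius * math.sin(i * step))
        for i, node in enumerate(ordered)
    }
    return ordered, positions


# Hypothetical four-node network with one hub ("a").
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
ordered, positions = degree_sorted_circle(edges)
print(ordered)
print(positions)
```

With the y axis pointing up, increasing angles run counter-clockwise, so the degree decreases as you move counter-clockwise from the first node.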
Attribute Circle Layout
- Now try Layout->Attribute Circle Layout->Degree.
- This should give you a very similar layout to the Degree sorted circle layout.
- You should also try using other attributes to lay out the network.
Group Attributes Layout
- Finally, select Layout->Cytoscape Layouts->Group Attributes Layout->Degree.
- Note that this layout organizes all of the nodes with the same degree into a circle and then positions the circles into a grid.
- Partitions the graph before laying it out.
- These layouts are useful to represent data attributes associated with the nodes of a network. However, they don't provide much information about the network structure itself. For that, more complicated layouts are required.
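The grouping behaviour of the Group Attributes Layout described above (one circle per attribute value, circles arranged in a grid) can be approximated like this. The sketch takes a plain dictionary of node to attribute value. The circle radius and grid cell size are arbitrary choices for the example, and none of this is Cytoscape's own code.

```python
import math
from collections import defaultdict


def group_attributes_layout(attribute, circle_radius=50.0, cell=200.0):
    """Place nodes sharing an attribute value on one circle; circles fill a grid."""
    groups = defaultdict(list)
    for node, value in attribute.items():
        groups[value].append(node)
    columns = math.ceil(math.sqrt(len(groups))) if groups else 0
    positions = {}
    for g, value in enumerate(sorted(groups)):
        # Centre of this group's grid cell.
        cx, cy = (g % columns) * cell, (g // columns) * cell
        members = sorted(groups[value])
        for i, node in enumerate(members):
            angle = 2 * math.pi * i / len(members)
            positions[node] = (cx + circle_radius * math.cos(angle),
                               cy + circle_radius * math.sin(angle))
    return positions


# Hypothetical node degrees used as the grouping attribute.
degrees = {"a": 3, "b": 2, "c": 2, "d": 1}
grouped = group_attributes_layout(degrees)
print(grouped)
```

Nodes b and c share degree 2, so they land on the same circle, while a and d each get a circle of their own.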
The force-directed methods for laying out graphs all use some kind of physical simulation that models the nodes as physical objects and the edges as springs connecting those objects together.
- Cytoscape provides 4 force-directed layouts:
- Force-Directed Layout
- Edge-weighted Force directed (BioLayout)
- Edge-weighted Spring Embedded
- Spring Embedded
- In this exercise, we'll use the Force-Directed Layout, which is a port of the layout by the same name in the prefuse package (see http://prefuse.org).
- We'll do all of our work from the Layout settings dialog, so bring up the dialog by Layout->Settings...
- In the Layout Settings dialog:
- Select Force-Directed Layout under Select algorithm to view settings.
- The result should look like the image on the right
- Let's start by creating a layout using the default parameters.
- Select Execute Layout.
- Note how this exposes more of the structure of the network than any of the previous layouts.
- Now, let's see what the parameters do.
- Start by changing the Default Spring Length to 20, and select Execute Layout.
- The default spring length is the length of a spring (edge) with no forces exerted upon it. Essentially, it's the length of an edge connecting two nodes with no other connections.
- Note how the layout has changed.
- The resulting network tends to be more closely packed in the denser regions.
- Finally, let's change Default Spring Length back to 50 and the Default Spring Coefficient to 1E-3.
- The spring coefficient is a measure of the strength of the spring.
- Now click Execute Layout.
- Note that the network remains pretty compact, even though we've reset the default spring length back to 50.
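To get a feel for what the spring length and spring coefficient mean, the toy simulation below relaxes a network in which edges act as springs. It is a deliberately simplified sketch and not the prefuse algorithm: it models only the springs (real force-directed layouts also add repulsion between nodes and damping), and the coefficient here is in arbitrary units chosen so the example converges quickly.

```python
import math


def relax_springs(positions, edges, spring_length=50.0,
                  spring_coefficient=0.1, steps=200):
    """Move nodes so that each edge approaches its resting spring length."""
    pos = {node: list(p) for node, p in positions.items()}
    for _ in range(steps):
        moves = {node: [0.0, 0.0] for node in pos}
        for a, b in edges:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            dist = math.hypot(dx, dy)
            if dist == 0.0:
                continue  # direction undefined for coincident nodes
            # Hooke's law: pull is positive when the edge is stretched.
            pull = spring_coefficient * (dist - spring_length)
            ux, uy = dx / dist, dy / dist
            moves[a][0] += pull * ux
            moves[a][1] += pull * uy
            moves[b][0] -= pull * ux
            moves[b][1] -= pull * uy
        for node in pos:
            pos[node][0] += moves[node][0]
            pos[node][1] += moves[node][1]
    return {node: tuple(p) for node, p in pos.items()}


# Two nodes joined by a single edge settle at the spring length.
start = {"a": (0.0, 0.0), "b": (10.0, 0.0)}
result = relax_springs(start, [("a", "b")], spring_length=50.0)
final_distance = math.dist(result["a"], result["b"])
print(final_distance)
```

This mirrors the description above: with no other forces, two connected nodes end up one spring length apart. A larger coefficient makes each spring pull harder toward that length on every step.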
Force-Directed Layout is also a good layout to use for circumstances when you want to have the length of your edges reflect some numeric weight on each edge.
- Select the attribute containing your edge weights from the menu labeled “The edge attribute that contains the weights”.
- This often requires some tuning of the spring coefficient and the spring length to get an aesthetically pleasing layout.