Network Analysis in R (Example)

In the previous tutorial, we imported a food web network from published ecological data. Let’s go back to it.
As a reminder, a food web is a network of food relationships between predators and prey (think interconnected food chains).

If you have not stored the final network on your computer, I invite you to go through that tutorial again until you get the network in the igraph format. Make sure to get familiar with the vocabulary.

In this tutorial, you’ll learn how to calculate common key metrics to compare networks.

The table of contents is structured as follows:

1) Calculating S and L

2) Deriving secondary metrics

3) More complex metrics and algorithms

4) Video, Further Resources & Summary

Let’s dive into it!

library(tidyverse)
library(glue)
library(igraph)

We have to distinguish between primary and secondary metrics. To put it simply, secondary metrics are calculated from primary metrics. For instance, one metric we will calculate is “link density”, which is derived from the number of links “L” and the number of nodes “S”. The most common, and perhaps simplest, metrics use “L” and “S” to derive properties of the network.

Calculating S and L

As mentioned, the most basic metrics of a network are:
* its number of nodes S,
* its number of links L.

Let’s calculate them both from the matrix and network:

# from the matrix
nrow(adj_matrix) == ncol(adj_matrix) # Check that the adjacency matrix is square
all(rownames(adj_matrix) == colnames(adj_matrix)) # Check that rows and columns list the same nodes in the same order

nodes_mat <- colnames(adj_matrix)
S_mat <- length(nodes_mat) # number of nodes
L_mat <- sum(adj_matrix) # the links are all the ones so the sum of 0s and 1s will give the total number of realized links

# from the network using igraph
nodes_net <- V(network_foodweb)
S_net <-  vcount(network_foodweb) # count the number of vertices or nodes
L_net <- ecount(network_foodweb) # count the number of edges or links
 
print(glue::glue("This graph has {S_net} nodes that share {L_net} links."))
# This graph has 233 nodes that share 2218 links.
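
To see why both routes give the same numbers, here is a minimal sketch on a made-up four-node network. The names `toy_adj` and `toy_net` and the data are purely illustrative and are not part of the food web used in this tutorial.

```r
library(igraph)

# Made-up adjacency matrix: a 1 marks a link from the row node to the column node
species <- c("A", "B", "C", "D")
toy_adj <- matrix(
  c(0, 1, 1, 0,
    0, 0, 1, 0,
    0, 0, 1, 1,   # C --> C is a loop
    0, 0, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(species, species)
)

# Build the directed igraph object from the matrix
toy_net <- graph_from_adjacency_matrix(toy_adj, mode = "directed")

S_toy <- nrow(toy_adj) # 4 nodes
L_toy <- sum(toy_adj)  # 5 links, since every realized link is a 1

S_toy == vcount(toy_net) # TRUE
L_toy == ecount(toy_net) # TRUE
```

Summing the matrix only counts links correctly when it contains nothing but 0s and 1s; a weighted matrix would need `sum(toy_adj > 0)` instead.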

Deriving secondary metrics

From S and L, we can derive several informative secondary metrics. For instance, they can tell us how complete a graph is or how many links a node has on average.

Measures of connectivity

Connectivity measures describe how well connected the graph is. A fully connected graph would be a graph where all nodes are connected to all other nodes, including themselves.

Measuring connectedness can tell you how complex a network you are working with and can reveal whether it might include sub-structures (a.k.a. subgraphs). Several metrics exist; two common ones follow.

Linkage Density

Z_mat <- L_mat / S_mat # linkage density or average number of links per nodes
Z_net <- ecount(network_foodweb) / vcount(network_foodweb)
 
print(glue::glue("Linkage Density: {Z_net}"))
# Linkage Density: 9.51931330472103

Take a look at the code output above. A linkage density of 9.52 means that there are, on average, 9.52 links per node.
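
In a directed graph, every link leaves exactly one node and enters exactly one node, so L / S is also the mean out-degree and the mean in-degree. The sketch below checks this on a made-up graph; the edge list and the name `toy_net` are assumptions for illustration only.

```r
library(igraph)

# Made-up directed edge list with one loop (C --> C)
edges <- data.frame(
  from = c("A", "A", "B", "C", "C"),
  to   = c("B", "C", "C", "C", "D")
)
toy_net <- graph_from_data_frame(edges, directed = TRUE)

Z_toy    <- ecount(toy_net) / vcount(toy_net)     # 5 / 4 = 1.25
mean_out <- mean(degree(toy_net, mode = "out"))   # mean number of outgoing links
mean_in  <- mean(degree(toy_net, mode = "in"))    # mean number of incoming links

all.equal(Z_toy, mean_out) # TRUE
all.equal(Z_toy, mean_in)  # TRUE
```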

Connectance or edge density

Connectance measures how many links are realized out of all possible links. The total number of possible links is S squared, since each node can form a link with any other node, including itself.

C_mat <- L_mat / S_mat ^ 2
C_net <- edge_density(network_foodweb, loops = TRUE) # this network includes loops (A --> A) which represent cannibalism in nature
print(glue::glue("Connectance: {C_net}"))
# Connectance: 0.0408554219086739

A value of 0.041 tells us that only about 4% of the possible links are realized: the network is sparse.
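
As a small sanity check, the sketch below compares the manual formula L / S^2 with `edge_density()` on a made-up directed graph that contains a loop. The edge list is invented for illustration.

```r
library(igraph)

# Made-up directed graph: 4 nodes, 5 links, one of them a loop (C --> C)
edges <- data.frame(
  from = c("A", "A", "B", "C", "C"),
  to   = c("B", "C", "C", "C", "D")
)
toy_net <- graph_from_data_frame(edges, directed = TRUE)

C_manual <- ecount(toy_net) / vcount(toy_net)^2 # 5 / 16 = 0.3125
C_igraph <- edge_density(toy_net, loops = TRUE) # loops count as possible links

all.equal(C_manual, C_igraph) # TRUE
```

With `loops = FALSE`, igraph divides by S * (S - 1) instead, which is the appropriate denominator for networks where self-links cannot occur.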

More complex metrics and algorithms

Graphs were invented to be able to represent paths between objects: paths along the links between the nodes, similar to the roads from your home to the holiday resort you are going to next summer. It is no surprise that we can evaluate those paths.

Distance

The distance (or path length) between two nodes is the length of the shortest path between them. A path is represented by the roads you need to take to reach your destination.

In a path A --> B --> C, we only need to take two roads (represented by the arrows, or links), so the path length is 2. In complex networks, there exist many paths to go from A to E, but only a few of them are the shortest.

distance_net <- distances(network_foodweb, 
          v = V(network_foodweb), # from where are we starting: here any nodes
          to = V(network_foodweb), # to where are we going: here any nodes
          mode = "all")

This function returns a matrix of the same size as the adjacency matrix, but instead of 0s and 1s, each cell holds the length of the shortest path between two nodes. The diagonal contains 0s because a node is at distance 0 from itself.
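
The `mode` argument decides whether link direction matters. The sketch below uses a made-up three-node chain to contrast `mode = "out"` (follow the arrows) with `mode = "all"` (ignore direction, as in the call above).

```r
library(igraph)

# Made-up chain: A --> B --> C
edges <- data.frame(from = c("A", "B"), to = c("B", "C"))
toy_net <- graph_from_data_frame(edges, directed = TRUE)

d_out <- distances(toy_net, mode = "out") # follow link direction
d_all <- distances(toy_net, mode = "all") # treat links as undirected

d_out["A", "C"] # 2: A --> B --> C
d_out["C", "A"] # Inf: no directed path leads from C back to A
d_all["C", "A"] # 2: reachable once direction is ignored
```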

Diameter or longest path

The diameter is the length of the longest of all shortest paths, that is, the greatest distance between any pair of connected nodes.

longest_path <- diameter(network_foodweb, directed = TRUE)
print(glue::glue("Diameter: {longest_path}"))
# Diameter: 7
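
One way to convince yourself of this definition is to recompute the diameter from the distance matrix. The sketch below does so on a made-up chain of four nodes.

```r
library(igraph)

# Made-up chain: A --> B --> C --> D
edges <- data.frame(from = c("A", "B", "C"), to = c("B", "C", "D"))
toy_net <- graph_from_data_frame(edges, directed = TRUE)

# Directed shortest-path lengths; unreachable pairs are Inf
d_out <- distances(toy_net, mode = "out")

diam_manual <- max(d_out[is.finite(d_out)])      # longest finite shortest path
diam_igraph <- diameter(toy_net, directed = TRUE)

diam_manual == diam_igraph # TRUE, both are 3 (A --> B --> C --> D)
```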

Shortest path

You can extract the actual path between any two nodes in a human-readable format: A --> B --> C.

paths_S_to_S <-  shortest_paths(
    network_foodweb,
    from = V(network_foodweb)[1], # shortest_paths() expects a single source node; here the first one
    to = V(network_foodweb),
    mode = "all",
    output = "both")
 
paths_S_to_S 
# $vpath[[27]]
# + 3/233 vertices, named, from 49d27c4:
#   [1] ACARTIA_SPP        BOREOGADUS_SAIDA   APHERUSA_GLACIALIS
......

Above, only a small part of the output is shown for the sake of saving space.
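
To turn one of these vertex sequences into a readable string, something like the following sketch can help. The graph is made up, and `path_label` is just an illustrative name.

```r
library(igraph)

# Made-up graph with two routes from A to C: a short one via B, a longer one via D and E
edges <- data.frame(
  from = c("A", "B", "A", "D", "E"),
  to   = c("B", "C", "D", "E", "C")
)
toy_net <- graph_from_data_frame(edges, directed = TRUE)

# One source node, one target node, following link direction
res <- shortest_paths(toy_net, from = "A", to = "C", mode = "out", output = "vpath")

# as_ids() returns the vertex names of a named graph
path_label <- paste(as_ids(res$vpath[[1]]), collapse = " --> ")
print(path_label)
# [1] "A --> B --> C"
```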

Average path

We can also return the average path length of all shortest paths.

avg_path_length <- mean_distance(
    network_foodweb,
    directed = TRUE,
    unconnected = TRUE # if the graph is disconnected, only the existing paths are considered
)
print(glue::glue("Average distance: {avg_path_length}"))
# Average distance: 2.3070777062858

Despite its complexity, this food web network has a low average path length: on average, only about 2.3 links separate two nodes that are connected by a path.
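
The average can be reproduced by hand from the distance matrix. The sketch below does this on a made-up three-node chain, keeping only the pairs that are actually connected by a directed path.

```r
library(igraph)

# Made-up chain: A --> B --> C
edges <- data.frame(from = c("A", "B"), to = c("B", "C"))
toy_net <- graph_from_data_frame(edges, directed = TRUE)

d_out <- distances(toy_net, mode = "out")

# Keep existing paths between distinct nodes: A-B (1), B-C (1), A-C (2)
existing <- d_out[is.finite(d_out) & d_out > 0]
avg_manual <- mean(existing) # 4 / 3

avg_igraph <- mean_distance(toy_net, directed = TRUE, unconnected = TRUE)

all.equal(avg_manual, avg_igraph) # TRUE
```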

Have you heard of Milgram's small-world experiment? It was an early real-world demonstration that the average path length of a large social network can be surprisingly short.

There are plenty more metrics, algorithms, and concepts to explore with networks, but this should give you solid foundations to dig deeper.

Happy exploring!

Video, Further Resources & Summary

This post has shown how to calculate common metrics to compare networks. In case you have further questions, you may leave a comment below.

Pierre Olivier PhD Statistician & Programmer

This page was created in collaboration with Pierre Olivier. You may have a look at Pierre’s author page to read more about his academic background and the other articles he has written for Statistics Globe.

2 Comments

Khem
August 25, 2023 6:53 am

Thanks for making this tutorial. But what does the graph/network look like? There is no output figure displayed.

- Cansu (Statistics Globe)
  August 25, 2023 7:41 am
  
  Hello Khem,
  
  Our author shows different network layouts in his tutorial Network Visualization in R. That one could be what you are looking for.
  
  Best,
  Cansu
  