Marvellous Social Networks: an applied project

Learning how to describe a social network!

This blog post was originally an assignment I completed for a Social Networks class during my Master’s in International Economics at the Graduate Institute of Geneva (IHEID). This first post mostly illustrates the descriptive part of the social network analysis process by describing properties of a network of Marvel character relationships. Happy reading!

If you’d like to follow along, the dataset is available in the excellent {migraph} package.

Question 1: Centrality

This part focuses on the following centrality-related question: Which characters are the ideal candidates for crossover comics in the Marvel universe, in the global sense?

The Marvel universe comprises multiple franchises, including but not limited to the Avengers, the Fantastic Four and the X-Men. A natural question is which characters are most likely to appear in comics, films or books that span these franchises. Part of the answer comes from looking at the characters with the highest betweenness centrality, since these high-betweenness characters (ego) tend to lie on the shortest paths between pairs of other characters (alters).

Two caveats are worth mentioning before proceeding. First, betweenness centrality considers the global rather than the local structure of the network. Second, the initial ison_marvel_relationships is a signed but unweighted network in which a tie represents either a friendly or a rival relationship between two characters, and the absence of a tie indicates the absence of a relationship. For this first question, we examine the underlying unsigned network, as we are only interested in whether a relationship exists, since it indicates some kind of previous interaction in a comic book.

The results are unsurprising: the top 5 characters by betweenness centrality are (in order) Wolverine, Spider-Man, Invisible Woman, Hulk, and Black Panther. These are some of the older Marvel characters and also some of the characters with the most appearances. None of these top 5 characters is a “mainstream” villain¹, and while three are traditional mainstream characters (Black Panther, Spider-Man, and Hulk), the other two come from the X-Men franchise (Wolverine) and the Fantastic Four franchise (Invisible Woman). This last point leads to the answer to the question of this first part: both Wolverine and the Invisible Woman seem to have enough pre-existing backstory, in the form of past relationships with other characters, to bridge the gap between the different franchises with the most ease.

Table 1: Top 5 Nodes: Betweenness

Name             Gender  Appearances  Degree  Betweenness  Closeness  Eigenvector
Wolverine        Male          12751      43       78.230      0.017        0.232
Spider-Man       Male          11963      41       66.501      0.016        0.218
Invisible Woman  Female         5082      41       42.519      0.015        0.242
Hulk             Male           6062      43       41.241      0.015        0.249
Black Panther    Male           2189      32       39.404      0.015        0.180
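
To make the betweenness measure used in this section more concrete, here is a minimal sketch in base R. It runs on a hypothetical toy network (two triangles joined by a single tie, standing in for two franchises) rather than on the Marvel data, and the function name `betweenness_toy` is my own. It uses Brandes' algorithm for unweighted graphs, which may differ in scaling from what {migraph} reports.

```r
# Hypothetical toy network (not the Marvel data): two triangles,
# A-B-C and D-E-F, joined through the single tie C-D.
edges <- rbind(c("A", "B"), c("A", "C"), c("B", "C"),
               c("C", "D"),
               c("D", "E"), c("D", "F"), c("E", "F"))
nodes <- sort(unique(as.vector(edges)))

# Adjacency list of the undirected graph
adj <- setNames(lapply(nodes, function(v) {
  c(edges[edges[, 1] == v, 2], edges[edges[, 2] == v, 1])
}), nodes)

# Brandes' algorithm for unweighted, undirected graphs
betweenness_toy <- function(adj) {
  nodes <- names(adj)
  n <- length(nodes)
  bc <- setNames(numeric(n), nodes)
  for (s in nodes) {
    d     <- setNames(rep(-1, n), nodes)        # distance from s (-1 = unseen)
    sigma <- setNames(numeric(n), nodes)        # number of shortest paths from s
    delta <- setNames(numeric(n), nodes)        # accumulated dependencies
    preds <- setNames(vector("list", n), nodes) # predecessors on shortest paths
    d[s] <- 0
    sigma[s] <- 1
    queue <- s
    visited <- character(0)
    # Breadth-first search from s
    while (length(queue) > 0) {
      v <- queue[1]
      queue <- queue[-1]
      visited <- c(visited, v)
      for (w in adj[[v]]) {
        if (d[w] < 0) {
          d[w] <- d[v] + 1
          queue <- c(queue, w)
        }
        if (d[w] == d[v] + 1) {
          sigma[w] <- sigma[w] + sigma[v]
          preds[[w]] <- c(preds[[w]], v)
        }
      }
    }
    # Back-propagate dependencies, farthest nodes first
    for (w in rev(visited)) {
      for (v in preds[[w]]) {
        delta[v] <- delta[v] + sigma[v] / sigma[w] * (1 + delta[w])
      }
      if (w != s) bc[w] <- bc[w] + delta[w]
    }
  }
  bc / 2  # every undirected pair was visited from both endpoints
}

bc <- betweenness_toy(adj)
bc
```

In this toy network only C and D, the two characters holding the tie between the triangles, get a non-zero score, which is the same logic that singles out bridging characters in Table 1.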

Question 2: Community

In this subsection, we look at the structural balance of the network and leverage the {signnet} package to examine it further. More specifically, we’ll try to answer the following question: How balanced/cognitively consistent is the Marvel relationship network? This is interesting in our case since the original dataset is signed, meaning that there is additional binary information on the nature of the tie linking two characters (i.e. whether they are allies or foes).

Theorized by the Austrian psychologist Fritz Heider in the mid-20th century (Heider 1958), balance in a social network is driven by cognitive consistency among an ego and two alters, where cognitive consistency is defined as the absence of contradiction in the nature of the ties that the ego has with its two alters. In our case, Hulk, Iron Man and Thanos are an example of a cognitively consistent and therefore balanced triad: Hulk is friends with Iron Man and enemies with Thanos, which is consistent with Iron Man also being an enemy of Thanos.

It is also interesting to examine balance in a dynamic setting, as attitudes between characters may change over time. For example, Hulk becomes an evil character in one of the universes (we’ll assume that he becomes friends with Thanos for this example). To restore cognitive consistency in the triad, one of two things has to happen. Either Iron Man decides that Hulk is no longer his friend, since he is a bad guy who became friends with Thanos (i.e. the friend of my enemy is my enemy), or Thanos and Iron Man become friends and every character becomes a bad guy (i.e. the friend of my friend is my friend, from Thanos’ perspective).
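
The triad logic above can be sketched in a few lines of base R. This is only an illustration: the helper `is_balanced` and the tie labels are hypothetical names of mine, and the rule used is the usual one that a triad is balanced when the product of its three signs is positive.

```r
# A triad is balanced when the product of its three tie signs is positive
# (+1 = friends, -1 = enemies).
is_balanced <- function(signs) prod(signs) > 0

# The triad from the text: Hulk and Iron Man are friends, both oppose Thanos.
triad <- c(hulk_ironman = 1, hulk_thanos = -1, ironman_thanos = -1)
is_balanced(triad)

# Hypothetical twist: Hulk befriends Thanos, which unbalances the triad.
twist <- replace(triad, "hulk_thanos", 1)
is_balanced(twist)

# The two ways of restoring balance described in the text
drop_hulk   <- replace(twist, "hulk_ironman", -1)   # Iron Man turns on Hulk
all_friends <- replace(twist, "ironman_thanos", 1)  # Iron Man befriends Thanos
c(is_balanced(drop_hulk), is_balanced(all_friends))
```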

Type   Count  Balanced?
+++      436  Yes
++-      287  No
+--      944  Yes
---      398  No
Total   2065

Method       Results
Triangles      0.668
Walk           0.071
Frustration    0.858
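
As a quick sanity check, the Triangles score can be reproduced by hand from the triangle counts, assuming it is simply the share of balanced triangles among all triangles. The counts below are copied from the table; the variable names are my own.

```r
# Signed triangle counts from the table above
counts <- c("+++" = 436, "++-" = 287, "+--" = 944, "---" = 398)

# Triangle types treated as balanced (an even number of negative ties)
balanced_types <- c("+++", "+--")

# Share of balanced triangles among all triangles
share_balanced <- sum(counts[balanced_types]) / sum(counts)
round(share_balanced, 3)
```

With these counts the share is 1380 / 2065, which rounds to the 0.668 reported for the Triangles method.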

The {signnet} package allowed us both to count the number of balanced and unbalanced triads and to compute different measures of balance. At first glance, the simple counting table shows more balanced triads (1,380) than unbalanced ones (685). This seems consistent with Heider’s theory, as the engine of cognitive consistency would have shaped the relations between the characters dynamically and may still be in the process of doing so. Let’s push the analysis a little further and look at the three metrics {signnet} computes: Triangles, Walk and Frustration. The Frustration and Triangles measures yield consistent results (0.858 and 0.668), while the Walk measure (0.071) seems to contradict them. While this looks strange at first, it goes back to the different ways the metrics are computed and the way they frame the degree of balance of a network. Without going into too much detail, Estrada’s walk measure takes into account the proximity of imbalanced clusters, while the other two are agnostic to this effect (Estrada 2019). Conceptually, it is straightforward to see that imbalanced clusters that are close together generate more tension than ones that are further apart. Thus, the answer to our question depends on the measure we consider as well as on the underlying theory we use to conceptualize the network.
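
To give a feel for how a walk-based index behaves, here is a small sketch in base R. It assumes the index is the ratio tr(exp(A)) / tr(exp(|A|)), where A is the signed adjacency matrix and |A| its unsigned counterpart; this is my reading of the walk-based measure and not necessarily the exact computation inside {signnet}. The two toy triads and the function name `walk_balance` are hypothetical.

```r
# Walk-based balance sketch: compare closed walks in the signed network
# with closed walks in its unsigned counterpart. tr(exp(A)) equals the sum
# of exp(eigenvalues) for a symmetric matrix A.
walk_balance <- function(A) {
  signed_ev   <- eigen(A, symmetric = TRUE, only.values = TRUE)$values
  unsigned_ev <- eigen(abs(A), symmetric = TRUE, only.values = TRUE)$values
  sum(exp(signed_ev)) / sum(exp(unsigned_ev))
}

# Balanced toy triad: one friendship, two rivalries (+ - -)
balanced_triad <- matrix(c( 0,  1, -1,
                            1,  0, -1,
                           -1, -1,  0), nrow = 3, byrow = TRUE)

# Unbalanced toy triad: three mutual rivalries (- - -)
unbalanced_triad <- matrix(c( 0, -1, -1,
                             -1,  0, -1,
                             -1, -1,  0), nrow = 3, byrow = TRUE)

walk_balance(balanced_triad)    # 1, up to floating point error
walk_balance(unbalanced_triad)  # strictly below 1
```

Because every closed walk that passes through an unbalanced cycle lowers the numerator, a network with many short unbalanced cycles can score far lower on this index than on a simple triangle count.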

In conclusion, while cognitive consistency in a social network seems like a natural law, it should be interpreted carefully, especially in our case, where cognitive inconsistency may also be desirable since it “[sometimes] evoke[s], like other patterns with unsolved ambiguities, powerful aesthetic forces of a tragic or comic nature” (Heider 1958). Having panel measures of these indices would certainly shed light on the dynamics of these social mechanisms and allow us to answer the question at hand in more detail.

Question 3: Position

We hinted in the first part of this paper at the difference between the global nature of betweenness centrality and the local nature of structural holes. This last part therefore tries to answer a slightly different question to illustrate this difference: Which characters are the ideal candidates for crossover comics in the Marvel universe, in the local sense?

This slight difference is consequential, though, as the definition of a crossover comic is no longer the same as in part 1. Since we are now focusing on the local pattern of ties, we’ll think about crossover comics as comics where multiple characters “gang up” to fight each other. Note that we’ll consider both heroes and villains in this case, as we can’t disaggregate the data accordingly. Thus, a character in a structural hole, marked by a low level of constraint, will be the ideal candidate for a movie/comic bridging the universes of multiple other characters within a franchise (e.g. the Avengers). The constraint scores in Table 2 indicate that Spider-Man, Wolverine, Black Panther, Hulk and Iron Man seem to be the ideal candidates for such a role. Importantly, note that this differs from the conclusion we drew in the discussion on centrality in the first part, as betweenness highlights characters that are ideally placed to span multiple franchises across the entire Marvel network.

Finally, an interesting albeit tangential clue from the node constraint measure is that many of the most constrained characters are villains at least some of the time (4 of the top 5 in Table 2 are marked “No” or “Both” in the Hero? column). They seem to sit in a part of the network with dense, strong ties (i.e. a high level of constraint). This pattern may reflect the fact that villains are more likely to gang up against a hero than heroes are against a single enemy.

Table 2: Head and Tail Nodes: Constraint

Name           Gender  Appearances  Constraint  Betweenness  Hero?
Ant-Man        Male            589       0.262        0.894  Yes
Emma Frost     Female         4777       0.250        0.112  Both
Blade          Male            438       0.223        0.585  Both
Kingpin        Male           1323       0.211        2.815  No
Bullseye       Male            898       0.196        1.657  No
...            ...             ...         ...          ...  ...
Iron Man       Male           8579       0.102       32.157  Yes
Hulk           Male           6062       0.100       41.241  Yes
Black Panther  Male           2189       0.097       39.404  Yes
Wolverine      Male          12751       0.094       78.230  Yes
Spider-Man     Male          11963       0.093       66.501  Yes
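
To make the constraint scores in Table 2 more tangible, here is a minimal base R sketch of Burt's constraint on a hypothetical toy network (two triangles joined by one tie). I am assuming the usual formula, in which the constraint of ego i sums, over each contact j, the squared direct plus indirect share of i's ties invested in j. The function name `burt_constraint` is mine, and {migraph} may differ in details such as weighting.

```r
# Hypothetical toy network: triangles A-B-C and D-E-F joined by the tie C-D.
nodes <- c("A", "B", "C", "D", "E", "F")
A <- matrix(0, 6, 6, dimnames = list(nodes, nodes))
ties <- rbind(c("A", "B"), c("A", "C"), c("B", "C"),
              c("C", "D"),
              c("D", "E"), c("D", "F"), c("E", "F"))
A[ties] <- 1          # fill one direction of each tie
A[ties[, 2:1]] <- 1   # and the reverse direction (undirected network)

# Burt's constraint for an unweighted, undirected network without isolates
burt_constraint <- function(A) {
  P <- A / rowSums(A)   # p_ij: share of i's ties that go to j
  indirect <- P %*% P   # sum over q of p_iq * p_qj (the diagonal of P is 0)
  # Sum the squared direct + indirect investment over i's contacts only
  rowSums(((P + indirect)^2) * (A > 0))
}

constraint <- burt_constraint(A)
round(constraint, 3)
```

In this toy example the two bridging nodes, C and D, come out with the lowest constraint, since one of their three contacts is not connected to the other two. This mirrors how the low-constraint characters at the bottom of Table 2 sit between otherwise separate groups.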

Image credits:

Photo by Erik Mclean on Unsplash

Appendix:

This appendix does not contain any additional content. Its only function is to print the code used to handle, measure and visualize the networks in the previous three sections, in an effort to ensure the transparency and replicability of our analysis.

Okay, I lied: it actually does one more thing. It tries to remotivate the reader who made it this far with a little randomly generated quote (courtesy of the {dang} package).

## 'Success is not final, failure is not fatal. It is the courage to 
## continue that counts.'  Winston Churchill

And an additional graph to visualize the signed network for question 2:

# Set initial knitr options
knitr::opts_chunk$set(eval = TRUE, echo = FALSE,
                      fig.align = "center",
                      fig.asp = 0.7,
                      dpi = 300,
                      out.width = "80%",
                      fig.pos = "!H",
                      out.extra = "")
rm(list = ls())   # Clean Environment
#********************* Package Management *************************************#
library(migraph)  # Development version 29.10.2021
library(igraph)     # Latest version
library(ggraph)
library(signnet)
library(tidygraph)
library(tidyverse)  # All things data science
library(dang)       # For fun

#***************************** Load data  *************************************#
relationships <- migraph::ison_marvel_relationships
teams         <- migraph::ison_marvel_teams
# Set both objects to the same class to be more consistent
class(relationships) # "tbl_graph" "igraph"
class(teams)         # "igraph"
relationships <-  migraph::as_igraph(relationships)
teams         <- migraph::as_igraph(teams)
# relationships # Relationship network, Unimodal
# teams         # Teams network, Bipartite: teams and individuals

# Initial visualisations
migraph::autographr(relationships)
migraph::autographr(teams)

# Initial visualisation for relationships dataset shows three (unrealistic)
# isolates. Let's drop them.
# PS: Luke Cage and Iron Fist are friends in the MU
isolates <- which(degree(relationships) == 0)
relationships <- delete.vertices(relationships, isolates)

# Note that the relationships network is signed. We can separate the network
# between friends and enemies.

friends <- migraph::to_unsigned(relationships, "positive")
enemies <- migraph::to_unsigned(relationships, "negative")
#***************************** Question 1  ************************************#

# We'll use the relationships network for this one
# Note that this is a signed network! +1 indicates a friendship tie and -1
# an enemy tie.

relDFnode <- igraph::as_data_frame(relationships, "vertices")
relDFedge <- igraph::as_data_frame(relationships, "edges")

# Create a dataframe with the different centrality measures of every node
centralityFullDF <- dplyr::tibble(
  name         = igraph::get.vertex.attribute(relationships)$name,
  degree       = migraph::node_degree(relationships),
  betweenness  = migraph::node_betweenness(relationships),
  closeness    = migraph::node_closeness(relationships),
  eigenvector  = migraph::node_eigenvector(relationships)) %>%
  arrange(
    desc(betweenness)
  )
# centralityFullDF

# Let's focus on relations rather than friends to explore which characters
# bridge universes
relUnsigned <- igraph::remove.edge.attribute(relationships, "sign")
# For the graph let's revert to a tidygraph object and add node level centrality
relUnsigned <- migraph::as_tidygraph(relUnsigned) %>%
  mutate(degree       = migraph::node_degree(relUnsigned),
         betweenness  = migraph::node_betweenness(relUnsigned),
         closeness    = migraph::node_closeness(relUnsigned),
         eigenvector  = migraph::node_eigenvector(relUnsigned))

# DF for top 5 table output
centralityRelationDF <- dplyr::tibble(
  name         = igraph::get.vertex.attribute(relUnsigned)$name,
  degree       = as.numeric(migraph::node_degree(relUnsigned)),
  betweenness  = as.numeric(migraph::node_betweenness(relUnsigned)),
  closeness    = as.numeric(migraph::node_closeness(relUnsigned)),
  eigenvector  = as.numeric(migraph::node_eigenvector(relUnsigned)))
# Perform inner join on node characteristics and  centrality measures
z <- dplyr::inner_join(igraph::as_data_frame(relUnsigned, "vertices"),
                  centralityRelationDF, "name")
# Get top 5
top5betweenness <-
  head(dplyr::arrange(z, desc(betweenness.y)), 5)[c(1,2,3, 11, 12, 13, 14)]
top5betweenness <- dplyr::rename(top5betweenness,
                                 Name = name,
                                 Degree = degree.x,
                                 Betweenness = betweenness.x,
                                 Closeness = closeness.x,
                                 Eigenvector = eigenvector.x) %>%
  dplyr::mutate(across(where(is.numeric), round, 3))
# Make outtable
kableExtra::kbl(top5betweenness, "html", caption = "Top 5 Nodes: Betweenness") %>%
  kableExtra::kable_styling(position = "center") %>%
  kableExtra::kable_material(full_width = TRUE)
# Set layout
l <- ggraph::create_layout(relUnsigned, layout = "fr")
# Plot graph
ggraph::ggraph(relUnsigned, layout  = l) +
  ggraph::geom_node_point(aes(size  = betweenness,
                              alpha = betweenness,
                              color = Appearances)) +
  ggraph::geom_node_label(aes(filter = betweenness > 39,
                              label  = name),
                          label.padding = 0.15,
                          label.size    = 0,
                          repel         = TRUE) +
  ggraph::geom_edge_link(alpha = 0.05) +
  ggplot2::guides(alpha  = "none") +
  ggplot2::labs(color    = "Appearances",
                size     = "Node Betweenness",
                title    = "Highlighting Betweenness and Appearances",
                subtitle = "What characters are the ideal candidates for crossover comics in the Marvel universe in the global sense?",
                caption  = "Source: {migraph} (2021)") +
  ggplot2::theme_minimal() +
  ggplot2::theme(panel.border     = element_blank(),
                 panel.grid.major = element_blank(),
                 panel.grid.minor = element_blank(),
                 axis.line        = element_blank(),
                 axis.text.x      = element_blank(),
                 axis.ticks.x     = element_blank(),
                 axis.title.y     = element_blank(),
                 axis.text.y      = element_blank(),
                 axis.ticks.y     = element_blank(),
                 axis.title.x     = element_blank(),
                 plot.caption     = element_text(color = "grey20",
                                                 face  = "italic"),
                 plot.title       = element_text(size  = 14),
                 plot.subtitle    = element_text(color = "grey20",
                                                 face  = "italic",
                                                 size  = 9))
#************************* Question 2 *****************************************#
rm(list = ls())
# Back again with the relationships data
relationships <- migraph::ison_marvel_relationships

# Initial visualisation for relationships dataset shows three (unrealistic)
# isolates. Let's drop them.
# PS: Luke Cage and Iron Fist are friends in the MU
isolates <- which(degree(relationships) == 0)
relationships <- delete.vertices(relationships, isolates)
# Set linetypes
edge_linetype <- ifelse(igraph::E(relationships)$sign >= 0,
                        "solid",
                        "dashed")
edge_colour <-   ifelse(igraph::E(relationships)$sign >= 0,
                        "#0072B2",
                        "#E20020")
# Set layout
l <- ggraph::create_layout(relationships, layout = "fr")
# Plot graph
relgraph <- ggraph::ggraph(relationships, layout  = l) +
  ggraph::geom_edge_link0(alpha = 0.4,
                          edge_linetype = edge_linetype,
                          edge_colour   = edge_colour) +
  ggraph::geom_node_point(aes(color = PowerOrigin,
                              alpha = 0.5),
                          size  = 3) +
  ggraph::geom_node_label(aes(filter = name %in% c("Hulk", "Iron Man", "Thanos"),
                              label  = name),
                          label.padding = 0.15,
                          label.size    = 0,
                          repel         = TRUE) +
  ggplot2::guides(alpha  = "none") +
  ggplot2::labs(color    = "Power Origin",
                title    = "Marvel Character Network: Examining the Signs",
                subtitle = "Visualizing network balance",
                caption  = "Source: {migraph} (2021)") +
  ggplot2::theme_minimal() +
  ggplot2::theme(panel.border     = element_blank(),
                 panel.grid.major = element_blank(),
                 panel.grid.minor = element_blank(),
                 axis.line        = element_blank(),
                 axis.text.x      = element_blank(),
                 axis.ticks.x     = element_blank(),
                 axis.title.y     = element_blank(),
                 axis.text.y      = element_blank(),
                 axis.ticks.y     = element_blank(),
                 axis.title.x     = element_blank(),
                 plot.caption     = element_text(color = "grey20",
                                                 face  = "italic"),
                 plot.title       = element_text(size  = 14),
                 plot.subtitle    = element_text(color = "grey20",
                                                 face  = "italic",
                                                 size  = 9))
# We have a network where a +1 tie indicates a friendly relationship
# and a -1 tie an animosity between two characters

# Step 1: Count the number of triangles
countTriangles <- dplyr::tibble(
  Type = c(names(signnet::count_signed_triangles(relationships)), "Total"),
  Count = c(as.numeric(signnet::count_signed_triangles(relationships)),
                     sum(as.numeric(signnet::count_signed_triangles(relationships)))),
  `Balanced?` = c("Yes", "No", "Yes", "No", ""))

t1 <- knitr::kable(countTriangles,
                align = "c",
                format = "html") %>%
  kableExtra::kable_minimal()
t1
# +++ ++- +-- --- 
# 436 287 944 398 
# The network seems to be more balanced than unbalanced with this quick
# comparison

# Degree of balancedness:

Balancedness <- dplyr::tibble(
  Method  = c("Triangles", "Walk", "Frustration"),
  Results = c(signnet::balance_score(relationships, method = "triangles"),
              signnet::balance_score(relationships, method = "walk"),
              signnet::balance_score(relationships, method = "frustration"))) %>%
  dplyr::mutate(across(Results, round, 3))
# Make outtable
t2 <- knitr::kable(Balancedness,
                align = "c",
                format = "html") %>%
  kableExtra::kable_minimal()
t2
# Okay, we get quite a different result with the Walk method based on
# Ernesto Estrada's measure of balancedness. Let's explain why.
#***************************** Question 3 *************************************#
rm(list = ls())
# Back again with the relationships data
relationships <- migraph::ison_marvel_relationships

# Initial visualisation for relationships dataset shows three (unrealistic)
# isolates. Let's drop them.
# PS: Luke Cage and Iron Fist are friends in the MU
isolates <- which(degree(relationships) == 0)
relationships <- delete.vertices(relationships, isolates)

# DF for top 5 table output
constraintRelationDF <- dplyr::tibble(
  name         = igraph::get.vertex.attribute(relationships)$name,
  constraint   = as.numeric(migraph::node_constraint(relationships)),
  betweenness  = as.numeric(migraph::node_betweenness(relationships)))
# Perform inner join on node characteristics and  centrality measures
z <- dplyr::inner_join(igraph::as_data_frame(relationships, "vertices"),
                  constraintRelationDF, "name")
# Get the bottom 5 and top 5 rows by constraint
bottom5constraint <-
  tail(dplyr::arrange(z, desc(constraint)), 5)[c(1,2,3, 11, 12)] %>%
  dplyr::mutate(across(where(is.numeric), round, 3))
top5constraint <- 
  head(dplyr::arrange(z, desc(constraint)), 5)[c(1,2,3, 11, 12)] %>%
  dplyr::mutate(across(where(is.numeric), round, 3))
top5constraint <- rbind(top5constraint, rep("...", 5))
out <- rbind(top5constraint, bottom5constraint)
rownames(out) <- c()
out$`Hero?` <- c("Yes", "Both", "Both", "No", "No", "...",
                "Yes", "Yes", "Yes", "Yes", "Yes")
out <- dplyr::rename(out,
                     Name = name,
                     Betweenness = betweenness,
                     Constraint = constraint) 
# Make outtable
kableExtra::kbl(out,
                format = "html",
                caption = "Head and Tail Nodes: Constraint",
                align = "c") %>%
  kableExtra::kable_material(full_width = FALSE)
# *********************** Motivational Easter Egg *****************************#
dang::motivate(2)
rm(list = ls())
# Back again with the relationships data
relationships <- migraph::ison_marvel_relationships

# Initial visualisation for relationships dataset shows three (unrealistic)
# isolates. Let's drop them.
# PS: Luke Cage and Iron Fist are friends in the MU
isolates <- which(degree(relationships) == 0)
relationships <- delete.vertices(relationships, isolates)
# Set linetypes
edge_linetype <- ifelse(igraph::E(relationships)$sign >= 0,
                        "solid",
                        "dashed")
edge_colour <-   ifelse(igraph::E(relationships)$sign >= 0,
                        "#0072B2",
                        "#E20020")
# Set layout
l <- ggraph::create_layout(relationships, layout = "fr")
# Plot graph
ggraph::ggraph(relationships, layout  = l) +
  ggraph::geom_edge_link0(alpha = 0.4,
                          edge_linetype = edge_linetype,
                          edge_colour   = edge_colour) +
  ggraph::geom_node_point(aes(color = PowerOrigin,
                              alpha = 0.5),
                          size  = 3) +
  ggraph::geom_node_label(aes(filter = name %in% c("Hulk", "Iron Man", "Thanos"),
                              label  = name),
                          label.padding = 0.15,
                          label.size    = 0,
                          repel         = TRUE) +
  ggplot2::guides(alpha  = "none") +
  ggplot2::labs(color    = "Power Origin",
                title    = "Marvel Character Network: Examining the Signs",
                subtitle = "Visualizing network balance",
                caption  = "Source: {migraph} (2021)") +
  ggplot2::theme_minimal() +
  ggplot2::theme(panel.border     = element_blank(),
                 panel.grid.major = element_blank(),
                 panel.grid.minor = element_blank(),
                 axis.line        = element_blank(),
                 axis.text.x      = element_blank(),
                 axis.ticks.x     = element_blank(),
                 axis.title.y     = element_blank(),
                 axis.text.y      = element_blank(),
                 axis.ticks.y     = element_blank(),
                 axis.title.x     = element_blank(),
                 plot.caption     = element_text(color = "grey20",
                                                 face  = "italic"),
                 plot.title       = element_text(size  = 14),
                 plot.subtitle    = element_text(color = "grey20",
                                                 face  = "italic",
                                                 size  = 9))

References:

Estrada, Ernesto. 2019. “Rethinking Structural Balance in Signed Social Networks.” Discrete Applied Mathematics 268: 70–90.
Heider, Fritz. 1958. The Psychology of Interpersonal Relations. Psychology Press.

  1. NB: we consider only the universe where Hulk doesn’t become a bad character.

Bernhard Bieri
Master’s Student in International Economics

My research interests include applied microeconomics, development economics and statistical computing.