Marvellous Social Networks: an applied project
Learning how to describe a social network!
By Bernhard Bieri in Networks Marvel R
December 15, 2021
This blog post was originally an assignment I completed for a Social Networks class during my Master’s in International Economics at the Graduate Institute of Geneva (IHEID). This first post mostly illustrates the descriptive part of the social network analysis process by describing properties of a network of Marvel character friendships. Happy reading!
If you’d like to follow along, the dataset is available in the excellent {migraph} package.
Question 1: Centrality
This part will focus on answering the following centrality-related question: What characters are the ideal candidates for crossover comics in the Marvel universe, in the global sense?
The Marvel universe is comprised of multiple franchises that include but
are not limited to the Avengers, the Fantastic Four and the X-Men.
The question thus naturally arises which characters are most likely to
appear in comics, films or books that span across these franchises. Part
of the answer to this question will be revealed in this section by
looking at the characters with the highest level of betweenness
centrality as these high-betweenness characters (ego) tend to be on the
shortest path between two other characters (alters). Two caveats are worth
mentioning before proceeding. First, betweenness centrality captures the
global rather than the local structure of the network. Second, the original
ison_marvel_relationships network is signed but unweighted: ties represent
either a friendly or a rival relationship between characters, while the
absence of a tie indicates the absence of a relationship. For the purpose of
this first question, we will examine the underlying unsigned network, as we
are only interested in the existence of a relationship, which indicates
some kind of previous interaction in a comic book.
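To make the intuition behind betweenness concrete, here is a minimal sketch on a made-up toy network (not the Marvel data). Two tightly knit groups are joined by a single go-between called X. The vertex names and the graph itself are hypothetical; only `igraph::graph_from_literal()` and `igraph::betweenness()` are used.

```r
library(igraph)

# Hypothetical toy network: two triangles (A-B-C and D-E-F) that are only
# connected through the go-between X.
toy <- igraph::graph_from_literal(A-B, B-C, A-C, C-X, X-D, D-E, E-F, D-F)

# Betweenness counts the shortest paths between other pairs of vertices
# that pass through each vertex.
bt <- igraph::betweenness(toy)

# The vertex sitting between the two groups ranks first.
sort(bt, decreasing = TRUE)
```

In this sketch X has the highest score because every shortest path between the two triangles has to pass through it, which is the same logic used to single out crossover candidates in the Marvel network.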
The results are unsurprising, since the top 5 characters in terms of betweenness centrality are (in order): Wolverine, Spider-Man, Invisible Woman, Hulk, and Black Panther. These are some of the older Marvel characters and are also some of the characters with the highest number of appearances. Out of these top 5 characters, none is a “mainstream” villain [1], and while three are traditional mainstream characters (Black Panther, Spider-Man, and Hulk), the other two come from either the X-Men franchise or the Fantastic Four franchise. This last point leads us to the answer to the question we’re asking in this first part, as both Wolverine and the Invisible Woman seem to have enough preexisting back-story, in the form of past relationships with other characters, to bridge the gap between the different franchises with the most ease.
Name | Gender | Appearances | Degree | Betweenness | Closeness | Eigenvector |
---|---|---|---|---|---|---|
Wolverine | Male | 12751 | 43 | 78.230 | 0.017 | 0.232 |
Spider-Man | Male | 11963 | 41 | 66.501 | 0.016 | 0.218 |
Invisible Woman | Female | 5082 | 41 | 42.519 | 0.015 | 0.242 |
Hulk | Male | 6062 | 43 | 41.241 | 0.015 | 0.249 |
Black Panther | Male | 2189 | 32 | 39.404 | 0.015 | 0.180 |
Question 2: Community
In this subsection, we will look at the structural balance of the network
and leverage the {signnet} package to examine it further. More specifically,
we’ll try to answer the following question:
How balanced/cognitively consistent is the Marvel Relationship network?
This is interesting in our case since the original dataset
is signed, meaning that there is additional binary information on the
nature of the tie linking two characters (i.e. whether they are allies
or foes). Theorized by the Austrian psychologist Fritz Heider in the mid-20th
century (Heider, 1958), balance in a social network is driven
by cognitive consistency among an ego and two alters, where cognitive
consistency is defined as the absence of contradiction in the nature of the
ties that the ego has with its two alters. In our case, Hulk, Iron Man and
Thanos are an example of a cognitively consistent and therefore balanced
triad: Hulk is friends with Iron Man and enemies with Thanos, which is
consistent with Iron Man also being an enemy of Thanos.
Furthermore, it is interesting to examine this concept of balance in a
dynamic setting as attitudes between characters may change over time. For
example, Hulk becomes an evil character in one of the universes
(we’ll assume that he becomes friends with Thanos for this example). To maintain
cognitive consistency in the considered triad, one of two things has to
happen. Either Iron Man decides that Hulk is no longer his friend since he
is a bad guy who became friends with Thanos (i.e. the friend of my enemy is my
enemy) or Thanos and Iron Man become friends and every character becomes a bad
guy (i.e. the friend of my friend is my friend from Thanos’ perspective).
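The triad logic above can be sketched in a few lines of base R. This is only an illustration: the helper `is_balanced_triad()` is a hypothetical name, and it encodes the usual rule that a triad is balanced when the product of its three tie signs (+1 for friends, -1 for enemies) is positive.

```r
# Hypothetical helper: a triad is balanced when the product of its three
# tie signs is positive (+1 = friendly tie, -1 = rival tie).
is_balanced_triad <- function(sign_ab, sign_ac, sign_bc) {
  prod(c(sign_ab, sign_ac, sign_bc)) > 0
}

# Starting point: Hulk and Iron Man are friends, both are enemies of Thanos.
is_balanced_triad(sign_ab = +1, sign_ac = -1, sign_bc = -1)

# Hulk befriends Thanos while Iron Man and Thanos remain enemies.
is_balanced_triad(sign_ab = +1, sign_ac = +1, sign_bc = -1)

# Resolution 1: Iron Man drops Hulk as a friend.
is_balanced_triad(sign_ab = -1, sign_ac = +1, sign_bc = -1)

# Resolution 2: Iron Man and Thanos become friends as well.
is_balanced_triad(sign_ab = +1, sign_ac = +1, sign_bc = +1)
```

The second call is the only one that returns `FALSE`, which is why one of the two resolutions has to happen for the triad to become consistent again.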
Type | Count | Balanced? |
---|---|---|
+++ | 436 | Yes |
++- | 287 | No |
+-- | 944 | Yes |
--- | 398 | No |
Total | 2065 | |
Method | Results |
---|---|
Triangles | 0.668 |
Walk | 0.071 |
Frustration | 0.858 |
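As a quick sanity check, the Triangles score can be recovered from the triad counts alone, assuming it is simply the share of balanced triads among all triads. The object names below are made up for this sketch; the counts are the ones from the first table.

```r
# Triad counts copied from the table above.
triad_counts <- c("+++" = 436, "++-" = 287, "+--" = 944, "---" = 398)

# Triad types with an even number of negative ties are balanced.
balanced_types <- c("+++", "+--")

# Share of balanced triads among all triads.
triangle_index <- sum(triad_counts[balanced_types]) / sum(triad_counts)
round(triangle_index, 3)
```

This gives 1380 / 2065, which rounds to the 0.668 reported for the Triangles method.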
The {signnet} package allowed us both to count the number of
balanced/unbalanced triads and to compute different measures of balance.
At first glance, when considering the simple counting table, we see more
balanced triads than unbalanced ones (1380 versus 685). This seems to be
consistent with Heider’s theory, as the engine of cognitive consistency would
have shaped the relations between the characters dynamically and may still be
in the process of shaping them.
Let’s push our analysis a little further and look at the three metrics
{signnet} computes: Triangles, Walk and Frustration. Looking at the results,
we notice that the Frustration and Triangles algorithms yield consistent
results (0.858 and 0.668) while the Walk algorithm (0.071) seems to contradict
them. While this seems strange at first glance, it comes down to the different
ways the metrics are computed and the way they frame the degree of balance of
a network. Without going into too much detail, Estrada’s walk measure takes
into account the proximity of imbalanced clusters while the other two are
agnostic of this effect (Estrada, 2019). Conceptually, it’s straightforward to
see that imbalanced clusters that are close to each other generate more
tension than ones that are further apart. Thus, the answer to our question
depends on the measure we consider as well as on the underlying theory we use
to conceptualize the network.
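For readers who want a feel for a walk-based score, here is a rough sketch in base R. It assumes the index takes the form K = tr(exp(A)) / tr(exp(|A|)), where A is the signed adjacency matrix, as proposed in Estrada's work on walk-based balance; whether `signnet::balance_score(method = "walk")` computes exactly this quantity is an assumption, and `walk_balance()` is a hypothetical helper rather than part of any package.

```r
# Hypothetical helper: walk-based balance of a symmetric signed adjacency
# matrix. The trace of a matrix exponential equals the sum of exp() of the
# eigenvalues, so no extra package is needed.
walk_balance <- function(A) {
  ev_signed   <- eigen(A, symmetric = TRUE, only.values = TRUE)$values
  ev_unsigned <- eigen(abs(A), symmetric = TRUE, only.values = TRUE)$values
  sum(exp(ev_signed)) / sum(exp(ev_unsigned))
}

# Rows/columns: Hulk, Iron Man, Thanos.
# Balanced triad: Hulk and Iron Man are friends, both are enemies of Thanos.
balanced_triad <- matrix(c( 0,  1, -1,
                            1,  0, -1,
                           -1, -1,  0), nrow = 3, byrow = TRUE)

# Unbalanced triad: Hulk befriends Thanos, Iron Man and Thanos stay enemies.
unbalanced_triad <- matrix(c(0,  1,  1,
                             1,  0, -1,
                             1, -1,  0), nrow = 3, byrow = TRUE)

walk_balance(balanced_triad)
walk_balance(unbalanced_triad)
```

Under this definition a perfectly balanced network scores 1, and every closed walk that passes through an unbalanced cycle pulls the score down, which gives some intuition for why a walk-based score can be far lower than a plain share of balanced triangles.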
In conclusion, while cognitive consistency in a social network seems like a natural law, it should be interpreted carefully, especially in our case where cognitive inconsistency may also be desirable since it "[sometimes] evoke[s], like other patterns with unsolved ambiguities, powerful aesthetic forces of a tragic or comic nature" (Heider, 1958). Having panel measures of these indices would certainly shed light on the dynamics of these social mechanisms and allow us to answer the question at hand in more detail.
Question 3: Position
We have hinted at the difference between the global nature of the concept of betweenness centrality and the local nature of the concept of structural holes in the first part of this paper. This last part will therefore try to answer a slightly different question than in the first part of the paper to illustrate this difference. What characters are the ideal candidates for crossover comics in the Marvel universe, in the local sense?
This slight difference is consequential though, as the definition of a cross-over comic is no longer the same as in part 1. Since we are now focusing on the local nature of patterns of ties, we’ll think about cross-over comics as comics where multiple characters “gang up” to fight each other. Note that we’ll consider both heroes and villains in this case as we can’t disaggregate the data accordingly. Thus, a character in a structural hole, marked by a low level of constraint, will be the ideal candidate for a movie/comic bridging the universe of multiple other characters within a franchise (e.g. the Avengers). By this measure, Spider-Man, Wolverine, Black Panther, Hulk and Iron Man seem to be the ideal candidates for such a role. Importantly, note that this is different from the conclusion that we drew in the discussion on centrality in the first part, as betweenness highlights characters that are ideally placed to span across multiple franchises of the entire Marvel network.
Finally, an interesting albeit tangential clue that we get from examining the node constraint measure is that many villains seem to find themselves in a part of the network with strong ties (i.e. a high level of constraint): 4 of the 5 highest-constraint characters are coded “No” or “Both” in the Hero? column. This pattern may reflect the fact that villains are more likely to gang up against a hero than heroes are to gang up against a single enemy.
Name | Gender | Appearances | Constraint | Betweenness | Hero? |
---|---|---|---|---|---|
Ant-Man | Male | 589 | 0.262 | 0.894 | Yes |
Emma Frost | Female | 4777 | 0.25 | 0.112 | Both |
Blade | Male | 438 | 0.223 | 0.585 | Both |
Kingpin | Male | 1323 | 0.211 | 2.815 | No |
Bullseye | Male | 898 | 0.196 | 1.657 | No |
… | … | … | … | … | … |
Iron Man | Male | 8579 | 0.102 | 32.157 | Yes |
Hulk | Male | 6062 | 0.1 | 41.241 | Yes |
Black Panther | Male | 2189 | 0.097 | 39.404 | Yes |
Wolverine | Male | 12751 | 0.094 | 78.23 | Yes |
Spider-Man | Male | 11963 | 0.093 | 66.501 | Yes |
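To see how constraint separates brokers from tightly embedded characters, here is a minimal sketch on a made-up toy network (not the Marvel data). The vertex names are hypothetical, and the only calls used are `igraph::graph_from_literal()` and `igraph::constraint()`, which computes Burt's constraint.

```r
library(igraph)

# Hypothetical toy network with two components:
# - a hub H whose four contacts (A, B, C, D) do not know each other
# - a closed triangle P-Q-R where everyone knows everyone
toy <- igraph::graph_from_literal(H-A, H-B, H-C, H-D, P-Q, Q-R, P-R)

# Burt's constraint: low values indicate a position spanning structural
# holes, high values indicate redundant, tightly knit contacts.
cons <- igraph::constraint(toy)

round(cons[c("H", "P")], 3)
```

The hub H spans structural holes between its otherwise unconnected contacts and ends up with a low constraint, whereas P, whose contacts are all connected to each other, is far more constrained. This mirrors the contrast between the bottom and the top of the table above.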
Image credits:
Photo by Erik Mclean on Unsplash
Appendix:
This appendix does not contain any additional content. Its only function is to print the echoes of the code used to handle, measure and visualize the networks in the previous three sections in an effort to ensure transparency and replicability of our code.
Okay I lied, it actually does one more thing: it tries to remotivate the
reader who made it that far with a little randomly generated quote
(courtesy of the {dang} package).
## 'Success is not final, failure is not fatal. It is the courage to
## continue that counts.' Winston Churchill
And an additional graph to visualize the signed network for question 2:
# Set initial knitr options
knitr::opts_chunk$set(eval = TRUE, echo = FALSE,
fig.align = "center",
fig.asp = 0.7,
dpi = 300,
out.width = "80%",
fig.pos = "!H",
out.extra = "")
rm(list = ls()) # Clean Environment
#********************* Package Management *************************************#
library(migraph) # Development version 29.10.2021
library(igraph) # Latest version
library(ggraph)
library(signnet)
library(tidygraph)
library(tidyverse) # All things data science
library(dang) # For fun
#***************************** Load data *************************************#
relationships <- migraph::ison_marvel_relationships
teams <- migraph::ison_marvel_teams
# Set both objects to the same class to be more consistent
class(relationships) # "tbl_graph" "igraph"
class(teams) # "igraph"
relationships <- migraph::as_igraph(relationships)
teams <- migraph::as_igraph(teams)
# relationships # Relationship network, Unimodal
# teams # Teams network, Bipartite: teams and individuals
# Initial visualisations
migraph::autographr(relationships)
migraph::autographr(teams)
# Initial visualisation for relationships dataset shows three (unrealistic)
# isolates. Let's drop them.
# PS: Luke Cage and Iron Fist are friends in the MU
isolates <- which(degree(relationships) == 0)
relationships <- delete.vertices(relationships, isolates)
# Note that the relationships network is signed. We can separate the network
# between friends and enemies.
friends <- migraph::to_unsigned(relationships, "positive")
enemies <- migraph::to_unsigned(relationships, "negative")
#***************************** Question 1 ************************************#
# We'll use the relationships network for this one
# Note that this is a signed network! +1 indicates a friendship tie and -1
# an enemy tie.
relDFnode <- igraph::as_data_frame(relationships, "vertices")
relDFedge <- igraph::as_data_frame(relationships, "edges")
# Create a dataframe with the different centrality measures of every node
centralityFullDF <- dplyr::tibble(
name = igraph::get.vertex.attribute(relationships)$name,
degree = migraph::node_degree(relationships),
betweenness = migraph::node_betweenness(relationships),
closeness = migraph::node_closeness(relationships),
eigenvector = migraph::node_eigenvector(relationships)) %>%
arrange(
desc(betweenness)
)
# centralityFullDF
# Let's focus on relations rather than friends to explore which characters
# bridge universes
relUnsigned <- igraph::remove.edge.attribute(relationships, "sign")
# For the graph let's revert to a tidygraph object and add node level centrality
relUnsigned <- migraph::as_tidygraph(relUnsigned) %>%
mutate(degree = migraph::node_degree(relUnsigned),
betweenness = migraph::node_betweenness(relUnsigned),
closeness = migraph::node_closeness(relUnsigned),
eigenvector = migraph::node_eigenvector(relUnsigned))
# DF for top 5 table output
centralityRelationDF <- dplyr::tibble(
name = igraph::get.vertex.attribute(relUnsigned)$name,
degree = as.numeric(migraph::node_degree(relUnsigned)),
betweenness = as.numeric(migraph::node_betweenness(relUnsigned)),
closeness = as.numeric(migraph::node_closeness(relUnsigned)),
eigenvector = as.numeric(migraph::node_eigenvector(relUnsigned)))
# Perform inner join on node characteristics and centrality measures
z <- dplyr::inner_join(igraph::as_data_frame(relUnsigned, "vertices"),
centralityRelationDF, "name")
# Get top 5
top5betweenness <-
head(dplyr::arrange(z, desc(betweenness.y)), 5)[c(1,2,3, 11, 12, 13, 14)]
top5betweenness <- dplyr::rename(top5betweenness,
Name = name,
Degree = degree.x,
Betweenness = betweenness.x,
Closeness = closeness.x,
Eigenvector = eigenvector.x) %>%
dplyr::mutate(across(where(is.numeric), round, 3))
# Make outtable
kableExtra::kbl(top5betweenness, "html", caption = "Top 5 Nodes: Betweenness") %>%
kableExtra::kable_styling(position = "center") %>%
kableExtra::kable_material(full_width = T)
# Set layout
l <- ggraph::create_layout(relUnsigned, layout = "fr")
# Plot graph
ggraph::ggraph(relUnsigned, layout = l) +
ggraph::geom_node_point(aes(size = betweenness,
alpha = betweenness,
color = Appearances)) +
ggraph::geom_node_label(aes(filter = betweenness > 39,
label = name),
label.padding = 0.15,
label.size = 0,
repel = TRUE) +
ggraph::geom_edge_link(alpha = 0.05) +
ggplot2::guides(alpha = "none") +
ggplot2::labs(color = "Appearances",
size = "Node Betweenness",
title = "Highlighting Betweenness and Appearances",
subtitle = "What characters are the ideal candidates for crossover comics in the Marvel universe in the global sense?",
caption = "Source: {migraph} (2021)") +
ggplot2::theme_minimal() +
ggplot2::theme(panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.line = element_blank(),
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.x = element_blank(),
plot.caption = element_text(color = "grey20",
face = "italic"),
plot.title = element_text(size = 14),
plot.subtitle = element_text(color = "grey20",
face = "italic",
size = 9))
#************************* Question 2 *****************************************#
rm(list = ls())
# Back again with the relationships data
relationships <- migraph::ison_marvel_relationships
# Initial visualisation for relationships dataset shows three (unrealistic)
# isolates. Let's drop them.
# PS: Luke Cage and Iron Fist are friends in the MU
isolates <- which(degree(relationships) == 0)
relationships <- delete.vertices(relationships, isolates)
# Set linetypes
edge_linetype <- ifelse(igraph::E(relationships)$sign >= 0,
"solid",
"dashed")
edge_colour <- ifelse(igraph::E(relationships)$sign >= 0,
"#0072B2",
"#E20020")
# Set layout
l <- ggraph::create_layout(relationships, layout = "fr")
# Plot graph
relgraph <- ggraph::ggraph(relationships, layout = l) +
ggraph::geom_edge_link0(alpha = 0.4,
edge_linetype = edge_linetype,
edge_colour = edge_colour) +
ggraph::geom_node_point(aes(color = PowerOrigin,
alpha = 0.5),
size = 3) +
ggraph::geom_node_label(aes(filter = name %in% c("Hulk", "Iron Man", "Thanos"),
label = name),
label.padding = 0.15,
label.size = 0,
repel = TRUE) +
ggplot2::guides(alpha = "none") +
ggplot2::labs(color = "Power Origin",
title = "Marvel Character Network: Examining the Signs",
subtitle = "Visualizing network balance",
caption = "Source: {migraph} (2021)") +
ggplot2::theme_minimal() +
ggplot2::theme(panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.line = element_blank(),
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.x = element_blank(),
plot.caption = element_text(color = "grey20",
face = "italic"),
plot.title = element_text(size = 14),
plot.subtitle = element_text(color = "grey20",
face = "italic",
size = 9))
# We have a network where a +1 tie indicates a friendly relationship
# and a -1 tie an animosity between two characters
# Step 1: Count the number of triangles
countTriangles <- dplyr::tibble(
Type = c(names(signnet::count_signed_triangles(relationships)), "Total"),
Count = c(as.numeric(signnet::count_signed_triangles(relationships)),
sum(as.numeric(signnet::count_signed_triangles(relationships)))),
`Balanced?` = c("Yes", "No", "Yes", "No", ""))
t1 <- knitr::kable(countTriangles,
align = "c",
format = "html") %>%
kableExtra::kable_minimal()
t1
# +++ ++- +-- ---
# 436 287 944 398
# The network seems to be more balanced than unbalanced with this quick
# comparison
# Degree of balancedness:
Balancedness <- dplyr::tibble(
Method = c("Triangles", "Walk", "Frustration"),
Results = c(signnet::balance_score(relationships, method = "triangles"),
signnet::balance_score(relationships, method = "walk"),
signnet::balance_score(relationships, method = "frustration"))) %>%
dplyr::mutate(across(Results, round, 3))
# Make outtable
t2 <- knitr::kable(Balancedness,
align = "c",
format = "html") %>%
kableExtra::kable_minimal()
t2
# Okay, we get quite a different result with the Walk method based on
# Ernesto Estrada's measure of balancedness. Let's explain why.
#***************************** Question 3 *************************************#
rm(list = ls())
# Back again with the relationships data
relationships <- migraph::ison_marvel_relationships
# Initial visualisation for relationships dataset shows three (unrealistic)
# isolates. Let's drop them.
# PS: Luke Cage and Iron Fist are friends in the MU
isolates <- which(degree(relationships) == 0)
relationships <- delete.vertices(relationships, isolates)
# DF for top 5 table output
constraintRelationDF <- dplyr::tibble(
name = igraph::get.vertex.attribute(relationships)$name,
constraint = as.numeric(migraph::node_constraint(relationships)),
betweenness = as.numeric(migraph::node_betweenness(relationships)))
# Perform inner join on node characteristics and centrality measures
z <- dplyr::inner_join(igraph::as_data_frame(relationships, "vertices"),
constraintRelationDF, "name")
# Get top 5 table
bottom5constraint <-
tail(dplyr::arrange(z, desc(constraint)), 5)[c(1,2,3, 11, 12)] %>%
dplyr::mutate(across(where(is.numeric), round, 3))
top5constraint <-
head(dplyr::arrange(z, desc(constraint)), 5)[c(1,2,3, 11, 12)] %>%
dplyr::mutate(across(where(is.numeric), round, 3))
top5constraint <- rbind(top5constraint, rep("...", 5))
out <- rbind(top5constraint, bottom5constraint)
rownames(out) <- c()
out$`Hero?` <- c("Yes", "Both", "Both", "No", "No", "...",
"Yes", "Yes", "Yes", "Yes", "Yes")
out <- dplyr::rename(out,
Name = name,
Betweenness = betweenness,
Constraint = constraint)
# Make outtable
kableExtra::kbl(out,
format = "html",
caption = "Head and Tail Nodes: Constraint",
align = "c") %>%
kableExtra::kable_material(full_width = F)
# *********************** Motivational Easter Egg *****************************#
dang::motivate(2)
rm(list = ls())
# Back again with the relationships data
relationships <- migraph::ison_marvel_relationships
# Initial visualisation for relationships dataset shows three (unrealistic)
# isolates. Let's drop them.
# PS: Luke Cage and Iron Fist are friends in the MU
isolates <- which(degree(relationships) == 0)
relationships <- delete.vertices(relationships, isolates)
# Set linetypes
edge_linetype <- ifelse(igraph::E(relationships)$sign >= 0,
"solid",
"dashed")
edge_colour <- ifelse(igraph::E(relationships)$sign >= 0,
"#0072B2",
"#E20020")
# Set layout
l <- ggraph::create_layout(relationships, layout = "fr")
# Plot graph
ggraph::ggraph(relationships, layout = l) +
ggraph::geom_edge_link0(alpha = 0.4,
edge_linetype = edge_linetype,
edge_colour = edge_colour) +
ggraph::geom_node_point(aes(color = PowerOrigin,
alpha = 0.5),
size = 3) +
ggraph::geom_node_label(aes(filter = name %in% c("Hulk", "Iron Man", "Thanos"),
label = name),
label.padding = 0.15,
label.size = 0,
repel = TRUE) +
ggplot2::guides(alpha = "none") +
ggplot2::labs(color = "Power Origin",
title = "Marvel Character Network: Examining the Signs",
subtitle = "Visualizing network balance",
caption = "Source: {migraph} (2021)") +
ggplot2::theme_minimal() +
ggplot2::theme(panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.line = element_blank(),
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.x = element_blank(),
plot.caption = element_text(color = "grey20",
face = "italic"),
plot.title = element_text(size = 14),
plot.subtitle = element_text(color = "grey20",
face = "italic",
size = 9))
References:
[1] NB: we consider only the universe where Hulk doesn’t become a bad character.