Using Social Network Analysis to Understand Data

Lecturer: Thomas Hills

Modality: In person

Week 1: 12-16 August 2024

 

Workshop Contents and Objectives

Social network analysis is used to understand communities by investigating their structure. How individuals in communities are connected to one another can influence information flows, actor importance, and the overall behavior of the community. More generally, network analysis allows us to identify structure in our data, including key actors, hierarchies of relationships, brokers, groups and local communities, patterns of information flow, and the resilience of the community as a whole.

Networks are made up of nodes connected by edges. Any data in a matrix format can be represented as a network, with rows and columns as nodes and cell values as edges. In social network analysis, the nodes are usually people. But nodes can represent almost anything, such as cities, brands, online communities, scientific articles, political organizations, colors in paintings, emotions, historical events, or words in a language. This means that network analysis can be used to unlock and understand many kinds of data.
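As a rough illustration of how a matrix becomes a network, the sketch below turns a small symmetric matrix into a list of edges. It is not workshop material: the workshop itself uses R, and this uses plain Python with no packages only to stay self-contained. The names, the matrix values, and the helper `matrix_to_edges` are all hypothetical.

```python
# Hypothetical example data: a symmetric matrix of who talks to whom.
# Row i / column j holds 1 if person i and person j are connected, else 0.
names = ["Ana", "Ben", "Cat", "Dev"]
matrix = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]


def matrix_to_edges(labels, adjacency):
    """Return one (node, node) pair per nonzero cell above the diagonal.

    Reading only the upper triangle assumes the matrix is symmetric,
    i.e. the network is undirected.
    """
    edges = []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if adjacency[i][j] != 0:
                edges.append((labels[i], labels[j]))
    return edges


edges = matrix_to_edges(names, matrix)
print(edges)
```

The same idea carries over to any square matrix of pairwise relationships, such as co-citation counts between articles or co-occurrence counts between words. Weighted networks would keep the cell value instead of only testing whether it is nonzero.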

In this workshop, students will learn the basic concepts of social network analysis and extend its use to network analysis more broadly, including data analysis and network visualization. Students will learn the material in a practical hands-on fashion, largely using R.

Students with ongoing projects of their own will be able to investigate them and gain new insights into their research. By the end of the workshop, students will have a vocabulary for network analysis and the background needed to follow most of the network analysis research they are likely to encounter in the social sciences.

Students will learn concepts like small world analysis (how structured is the network?), homophily (do similar nodes cluster together?), network closure (are nodes in the network in harmony with one another?), distance (how far away are objects in the network from one another?), clustering and community detection (what are the communities in my data?), and centrality (are some nodes more important than others?).
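Three of these concepts can be made concrete on a toy network. The sketch below computes degree (a simple centrality measure), shortest-path distance, and a local clustering coefficient. It is an illustrative sketch in plain Python, not the R code used in the workshop. The network and the function names `degree`, `distance`, and `local_clustering` are hypothetical.

```python
from collections import deque

# Hypothetical toy network (undirected), stored as node -> set of neighbors.
edges = [("Ana", "Ben"), ("Ana", "Cat"), ("Ben", "Cat"), ("Cat", "Dev")]
neighbors = {}
for a, b in edges:
    neighbors.setdefault(a, set()).add(b)
    neighbors.setdefault(b, set()).add(a)


def degree(node):
    """Degree centrality in its simplest form: the number of neighbors."""
    return len(neighbors[node])


def distance(source, target):
    """Shortest path length in edges, found by breadth-first search.

    Returns None if the target cannot be reached from the source.
    """
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, steps = queue.popleft()
        if node == target:
            return steps
        for nxt in neighbors[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, steps + 1))
    return None


def local_clustering(node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    nbrs = sorted(neighbors[node])
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention used here: undefined cases count as 0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in neighbors[nbrs[i]]
    )
    return links / (k * (k - 1) / 2)


print(degree("Cat"), distance("Ana", "Dev"), local_clustering("Cat"))
```

In this toy network, Cat has the most connections and sits on every path to Dev, which is the kind of pattern centrality measures are designed to pick up. Degree is only one of several centrality measures. Others, such as betweenness, weight a node by how many shortest paths pass through it.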

 

Workshop design

The course will alternate between lectures and interactive programming using pre-written code in R.

 

Detailed lecture plan (daily schedule)

Day 1.
Intro to network analysis, making and representing networks

Day 2.
Measuring things using networks

Day 3.
Generating networks, null hypotheses

Day 4.
Models and processes on networks

Day 5.
Advanced topics and short presentations from students

 

Class materials

All materials will be provided online.

 

Prerequisites

Students taking this workshop should have some experience with R and RStudio. A number of free or inexpensive online platforms (e.g., DataCamp) offer introductory R courses that are well worth the time and are sufficient preparation for this workshop. A general introductory book on statistics in R will also work (e.g., Dalgaard, P. (2008). Introductory statistics with R). Because I will provide all the code, the workshop can also be a way to improve your R skills.

 

Recommended readings or preliminary material

  • Baronchelli, A., Ferrer-i-Cancho, R., Pastor-Satorras, R., Chater, N., & Christiansen, M. H. (2013). Networks in cognitive science. Trends in cognitive sciences, 17(7), 348-360.
  • Hills, T. (2024). Behavioral network science. Cambridge University Press. (Contact the author if not yet available.)