Latent Class & Factor Analysis

Lecturer: Robin Samuel.

Week 1 (16 - 20 August 2021).

Description of the Workshop

This course introduces factor analysis and latent class analysis. Both techniques allow you to detect concepts or groups in your data that are not directly measurable or observable.

Factor analysis makes it possible to reduce a large number of variables to a small set of factors. As an example, you might have a large set of variables that you assume cover various aspects of attitudes to immigration. Factor analysis will help you to uncover the relationships between these variables and to extract those hidden aspects (i.e., factors). The factors detected can then be used in further analyses. For example, you might want to examine the relationship between social background and attitudes to immigration.

Latent class analysis allows you to categorize many observations (e.g., people) into a few groups. For example, you might want to detect different types of web users (i.e., latent classes) based on a set of variables that measure aspects of web usage (e.g., content preferences or time spent online). The established latent classes will not only allow for a fruitful description of the phenomenon of interest, but also enable follow-up analyses, for example, on how latent class membership (e.g., types of web users) is associated with life satisfaction.

We will cover factor analysis in the first half of the course and latent class analysis in the second half. Both halves follow this structure:

  1. Introduction to factor analysis/latent class analysis
  2. Hands-on session I
  3. Advanced applications of factor analysis/latent class analysis
  4. Hands-on session II
  5. Using factors/latent classes in follow-up analyses

The hands-on sessions will be held as structured and guided computer exercises. You will work through exercises using data provided by the course instructor. However, you are welcome to bring your own data as well. A particular focus of these sessions will be on the interpretation of results.

Upon completion of this course, students will have a good understanding of factor analysis and latent class analysis and will know how to run these models on their own data.

The course is designed to be introductory and can only be attended in full. It is not possible to attend only the first half or only the second half.

Prerequisites

Participants should be familiar with univariate and bivariate statistics. If you have never been exposed to bivariate correlation and chi-square tests (in the context of cross-tabs, also known as contingency tables), this is not the course for you. Ideally, you will have some elementary knowledge of (OLS) regression as well.

We will use the statistical software R, which can run both factor analyses and latent class analyses. A common procedure is to use the obtained factors and latent classes in further statistical analyses, and R is well suited to this: it can run almost any statistical model and is widely used in the social sciences. Some familiarity with R is recommended, but it is not strictly necessary as long as you have some knowledge of working with other syntax-based statistical software packages (e.g., Stata or SPSS) and are willing to learn. Here are some helpful materials for those who are new to R or feel they would benefit from a refresher:

Background Reading

  • Bartholomew, D. J., Steele, F., Galbraith, J., & Moustaki, I. (2008). Analysis of Multivariate Social Science Data. CRC Press.
    • Chapters 1, 7, 8, 9, and 10.