Understanding Correlated Random Effects Models
Background
For decades, panel data analysis has largely revolved around a familiar dichotomy: fixed effects (FE) versus random effects (RE). More recently, generalized fixed effects and difference-in-differences designs have surged in popularity, particularly in causal inference. Yet between FE and RE lies a more general and conceptually illuminating framework: the correlated random effects (CRE) model. Although it receives less attention today, CRE remains a powerful tool for understanding the foundations of panel data methods.
Fixed effects models eliminate all time-invariant unobserved heterogeneity but sacrifice the ability to estimate the effects of time-invariant covariates. Random effects models, by contrast, retain those variables but rely on a strong assumption: that the unobserved individual-specific effects are uncorrelated with the regressors. When this assumption fails, as it often does, the RE estimator becomes biased and inconsistent. The correlated random effects (CRE) model, also known as the hybrid model, relaxes this assumption by explicitly modeling the potential correlation.
In this article, we examine the intuition behind the CRE model, explain how it bridges FE and RE, and show how it decomposes within- and between-unit variation. We conclude with a hands-on implementation in both R and Python to demonstrate how the model works in practice. We focus on the linear versions of these models; extending these ideas to nonlinear models is not always straightforward.
Notation
Let us consider a standard panel data setup where we observe units \(i=1,\dots,N\) over time periods \(t = 1, \dots, T\). The outcome is \(y_{it}\), and \(x_{it}\) is a vector of time-varying covariates.
The linear panel data model is:
\[ y_{it} = x_{it}'\beta + \alpha_i + \varepsilon_{it} \]
where \(\alpha_i\) is the individual-specific effect and \(\varepsilon_{it}\) is the idiosyncratic error term. Our goal is to consistently estimate the causal effect of time-varying regressors (a component of \(x_{it}\)) when unobserved heterogeneity may be correlated with them.
The core differences between FE and RE models lie in the way they handle \(\alpha_i\), and the assumptions they make about the relationship between \(\alpha_i\) and \(x_{it}\).
A Closer Look
Refresher on Fixed and Random Effects
In panel data models, the goal is often to account for unobserved heterogeneity across units (e.g., individuals, firms, regions). Two popular approaches to handle this are fixed effects (FE) and random effects (RE) models. Understanding these two approaches is critical before we dive into correlated random effects.
Fixed Effects (FE) Model
The fixed effects model controls for all time-invariant characteristics of the units by allowing each unit to have its own intercept. The key feature of FE models is that \(\alpha_i\) is treated as a set of unknown parameters to be estimated (or differenced out). Importantly, \(\alpha_i\) is allowed to be correlated with the regressors \(x_{it}\) (i.e., \(\text{Cov}(x_{it}, \alpha_i) \neq 0\)). This addresses endogeneity driven by time-invariant omitted variables, but it does not, by itself, resolve endogeneity arising from time-varying confounding, simultaneity, or reverse causality (which lives in \(\varepsilon_{it}\)).
Fixed effects estimation often proceeds by demeaning the data within each unit (also known as the “within transformation”), removing \(\alpha_i\):
\[ y_{it} - \bar{y}_i = (x_{it} - \bar{x}_i)'\beta + (\varepsilon_{it} - \bar{\varepsilon}_i), \]
where \(\bar{y}_i\) and \(\bar{x}_i\) are the within-unit means. This is convenient but comes at the cost of not being able to estimate the effects of time-invariant covariates, which can be of interest in many applications. Estimating the \(\alpha_i\) parameters directly is usually not feasible either, given the relatively short panels typical of empirical work.
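To make the within transformation concrete, here is a minimal Python sketch (simulated data; all variable names are our own) showing that OLS on demeaned data recovers the true slope of 0.5 even though the unit effect is correlated with the regressor, while pooled OLS is biased:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n, t = 200, 5
ids = np.repeat(np.arange(n), t)

alpha = rng.normal(size=n)                  # unit effect alpha_i
x = alpha[ids] + rng.normal(size=n * t)     # x_it correlated with alpha_i
y = 0.5 * x + alpha[ids] + rng.normal(size=n * t)

df = pd.DataFrame({"id": ids, "x": x, "y": y})

# Within transformation: subtract the unit means
xd = df["x"] - df.groupby("id")["x"].transform("mean")
yd = df["y"] - df.groupby("id")["y"].transform("mean")

beta_fe = (xd @ yd) / (xd @ xd)             # OLS slope on demeaned data
beta_pooled = np.cov(df["x"], df["y"])[0, 1] / df["x"].var()
print(beta_fe, beta_pooled)                 # FE near 0.5; pooled OLS biased upward
```

Note that the demeaned regression has no intercept and no \(\alpha_i\): both are removed by the transformation, which is exactly why the effects of time-invariant covariates cannot be estimated this way.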
Fixed effects are especially popular in causal inference because they remove bias from any time-invariant omitted variables. They can be seen as a generalization of the familiar difference-in-differences (DiD) approach, which is just a special case of FE with two time periods and a treatment indicator. They can also fairly easily be extended to triple difference designs, staggered adoption designs, and other more complex causal inference settings.
An example would be an analysis of the effect of state-level minimum wage changes on employment outcomes. Different states adopted minimum wage changes at different times, so a simple two-group, two-period difference-in-differences analysis would be inappropriate. A fixed effects model, however, can estimate the effect of the minimum wage on employment outcomes while holding constant state-specific time-invariant characteristics (e.g., state-level demographics, permanent economic conditions, policy environment).
Random Effects (RE) Model
In the RE model, \(\alpha_i\) is treated as a random variable drawn from a distribution (usually assumed to be normal):
\[ \alpha_i \sim N(0, \sigma_\alpha^2). \]
The crucial assumption in RE models is:
\[ \text{Cov}(x_{it}, \alpha_i) = 0. \]
Equivalently, RE assumes \(E[\alpha_i \mid X_i] = 0\), where \(X_i = (x_{i1}, \dots, x_{iT})\). This allows for more efficient estimation through Generalized Least Squares (GLS), but if the assumption fails, the RE estimates will be biased and inconsistent. The RE model is not commonly used in causal inference because, unlike the FE model, it rules out correlation between covariates and time-invariant unobserved heterogeneity. In short, the FE model is robust but discards between-unit variation, while the RE model is more efficient but relies on a strong independence assumption between covariates and unobserved heterogeneity. The Hausman test evaluates whether the additional orthogonality restrictions imposed by the random effects model are supported by the data.
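Mundlak (1978) showed how to bridge the two: instead of assuming \(E[\alpha_i \mid X_i] = 0\), model the dependence between \(\alpha_i\) and the covariates through the unit means,
\[ \alpha_i = \bar{x}_i'\gamma + u_i, \qquad E[u_i \mid X_i] = 0. \]
Substituting this into the outcome equation gives
\[ y_{it} = x_{it}'\beta + \bar{x}_i'\gamma + u_i + \varepsilon_{it}, \]
which is simply an RE model augmented with the unit means \(\bar{x}_i\). In a balanced panel, the estimate of \(\beta\) from this augmented model equals the FE (within) estimate, \(\gamma\) captures the gap between the between and within estimates, and testing \(H_0: \gamma = 0\) is a regression-based alternative to the Hausman test.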
An Example
Below we simulate a balanced panel in which the unit effect \(\alpha_i\) is correlated with \(x_{it}\), then fit FE, RE, and CRE models with plm:

library(plm)
library(dplyr)

set.seed(1988)
n <- 100
t <- 5

# Panel skeleton: n units observed over t periods
data <- data.frame(
  id = rep(1:n, each = t),
  time = rep(1:t, n)
)

# Simulate data with Cov(x_it, alpha_i) != 0 via a unit-level latent z_i
data <- data %>%
  group_by(id) %>%
  mutate(
    z = rnorm(1),                    # unit-level latent variable
    x = rnorm(n(), mean = z),        # x correlated with z (hence with alpha)
    alpha = 0.7 * z + rnorm(1),      # unit effect correlated with x
    eps = rnorm(n(), sd = 1),
    y = 1 + 0.5 * x + alpha + eps    # true slope on x is 0.5
  ) %>%
  ungroup()

pdata <- pdata.frame(data, index = c("id", "time"))

fe_model <- plm(y ~ x, data = pdata, model = "within")
re_model <- plm(y ~ x, data = pdata, model = "random")

# CRE (Mundlak): an RE model augmented with the unit mean of x
pdata$mean_x <- ave(pdata$x, pdata$id, FUN = mean)
cre_model <- plm(y ~ x + mean_x, data = pdata, model = "random")

summary(fe_model)
summary(re_model)
summary(cre_model)

The same exercise in Python, using the statsmodels mixed linear model as the random-intercept estimator:

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
np.random.seed(1988)
n, t = 100, 5
df = pd.DataFrame({
    'id': np.repeat(np.arange(1, n+1), t),
    'time': np.tile(np.arange(1, t+1), n)
})
# Induce correlation between x_it and alpha_i via an id-level latent z_i
z = np.random.randn(n)
df['z'] = np.repeat(z, t)
df['x'] = df['z'] + np.random.randn(n*t)
alpha = 0.7 * z + np.random.randn(n)
df['alpha'] = np.repeat(alpha, t)
df['eps'] = np.random.randn(n*t)
df['y'] = 1 + 0.5 * df['x'] + df['alpha'] + df['eps']
df['mean_x'] = df.groupby('id')['x'].transform('mean')
# Random-intercept RE model
model_re = smf.mixedlm("y ~ x", df, groups=df["id"]).fit(reml=False)
# CRE (Mundlak) model: RE with unit means included
model_cre = smf.mixedlm("y ~ x + mean_x", df, groups=df["id"]).fit(reml=False)
print(model_re.summary())
print(model_cre.summary())

Bottom Line
- CRE models relax the strict RE assumptions by modeling the correlation between unit effects and covariates.
- They provide both within and between estimates while allowing the inclusion of time-invariant covariates.
- Appropriate for longitudinal, multilevel, and policy evaluation studies.
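The within-between decomposition in the bullets above can be seen directly with plain OLS (a self-contained sketch with simulated data; all names are our own): splitting \(x_{it}\) into its unit mean and the deviation from it gives one coefficient per source of variation, and the deviation coefficient reproduces the FE (within) estimate.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n, t = 200, 5
ids = np.repeat(np.arange(n), t)

z = rng.normal(size=n)                      # unit-level latent variable
x = z[ids] + rng.normal(size=n * t)         # x correlated with the unit effect
alpha = 0.7 * z + rng.normal(size=n)
y = 1 + 0.5 * x + alpha[ids] + rng.normal(size=n * t)

df = pd.DataFrame({"id": ids, "x": x, "y": y})
df["mean_x"] = df.groupby("id")["x"].transform("mean")
df["dev_x"] = df["x"] - df["mean_x"]

# dev_x and mean_x are orthogonal by construction, so the dev_x
# coefficient equals the within (FE) estimate, while mean_x absorbs
# the contaminated between variation.
fit = smf.ols("y ~ dev_x + mean_x", data=df).fit()
print(fit.params["dev_x"], fit.params["mean_x"])
```

With the simulated correlation between \(x_{it}\) and \(\alpha_i\), the coefficient on dev_x sits near the true value of 0.5, while the coefficient on mean_x is pushed upward by the unit effect.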
Where to Learn More
“Microeconometrics: Methods and Applications” by one of my PhD advisors, Colin Cameron, and his long-time coauthor Trivedi, is a classic textbook on panel data models with which I have spent countless hours. It’s a great starting point for most of the material in my blog. Schunck (2013) provides a comprehensive overview of CRE models, and Mundlak’s (1978) foundational paper is essential for understanding their theoretical basis. Tools like R’s plm and Python’s statsmodels can implement these models with the correct transformations.
References
Cameron, A. C., & Trivedi, P. K. (2005). Microeconometrics: Methods and applications. Cambridge University Press.
Schunck, R. (2013). Within and between estimates in random-effects models: Advantages and drawbacks of correlated random effects and hybrid models. The Stata Journal, 13(1), 65-76.
Mundlak, Y. (1978). On the pooling of time series and cross section data. Econometrica, 46(1), 69–85.