Variability and Hierarchy: Mixed-Effects Intuition

Pooled vs Separate vs Hierarchical Thinking with Theoph

Learning Objectives

By the end of this lesson, you will be able to:

  • Explain why PK data are inherently hierarchical
  • Distinguish pooled, separate, and mixed-effects approaches
  • Understand random intercepts and random slopes conceptually
  • Recognize why repeated-measures data violate independence
  • Connect elimination-rate variability to random slopes

Key Ideas

  • PK concentration–time data contain repeated measures per subject.
  • Pooled models ignore between-subject variability.
  • Separate models ignore shared biological structure.
  • Mixed-effects models combine population structure with subject variability.
  • Random effects represent systematic subject-level deviations from the population parameters, not noise.

Worked Example 1: Visualizing Hierarchy (Theoph)

We return to the Theophylline dataset.

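library(tidyverse)

# Theoph ships with base R (datasets package)
data(Theoph)

# Convert to a plain tibble, then keep samples from 4 h onward
terminal_data <- Theoph %>%
  as_tibble() %>%
  filter(Time >= 4)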

Plot log concentration vs time:

terminal_data %>%
  ggplot(aes(Time, log(conc), group = Subject)) +
  geom_point() +
  geom_line(alpha = 0.6) +
  labs(
    title = "Terminal Phase (Log Scale) — By Subject",
    x = "Time (h)",
    y = "log(Concentration)"
  )

Observe:

  • Approximately linear decline per subject
  • Different slopes across subjects
  • Different intercepts across subjects

This is hierarchy: observations are nested within subjects, and subjects are drawn from a population.


Worked Example 2: Pooled Model (Ignoring Hierarchy)

Suppose we fit:

\[ \log(C_{ij}) = \beta_0 + \beta_1 Time_{ij} + \epsilon_{ij} \]

Where:

  • \(i\) = subject index
  • \(j\) = observation within subject
  • \(\beta_0\) = common intercept
  • \(\beta_1\) = common slope
  • \(\epsilon_{ij}\) = residual error

This assumes:

  • One common slope, and therefore one elimination rate constant (\(k_e = -\beta_1\)) for every subject
  • One common intercept
  • Independent residuals

Conceptually simple, but biologically unrealistic.
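
As a rough illustration of the pooled model above, the sketch below fits a single line to all terminal-phase observations with base R. The object names (`terminal_df`, `pooled_fit`, `pooled_ke`) are introduced here for illustration only, and the `Time >= 4` cutoff simply mirrors Worked Example 1.

```r
# Pooled fit: one intercept and one slope shared by all subjects.
# as.data.frame() drops the groupedData classes so base R tools behave plainly.
terminal_df <- subset(as.data.frame(datasets::Theoph), Time >= 4)

# log() is the natural log, so the slope estimates -ke under log-linear decline
pooled_fit <- lm(log(conc) ~ Time, data = terminal_df)

coef(pooled_fit)                                  # beta_0 and beta_1
pooled_ke <- -unname(coef(pooled_fit)["Time"])    # implied common ke (1/h)
pooled_ke
```

The standard errors reported by `summary(pooled_fit)` rely on the independent-residuals assumption listed above, so they should be read with caution for repeated-measures data.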


Worked Example 3: Separate Models (No Sharing)

Now imagine fitting one linear model per subject:

\[ \log(C_{ij}) = \beta_{0i} + \beta_{1i} Time_{ij} + \epsilon_{ij} \]

This respects subject variability but:

  • Uses only each subject's few terminal-phase observations
  • Produces unstable estimates, especially of the slope
  • Provides no population-level summary or measure of between-subject variability
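
A minimal sketch of the separate-models approach, assuming the same `Time >= 4` terminal-phase cutoff as Worked Example 1. The names `by_subject`, `subject_coefs`, and `subject_ke` are hypothetical helpers, not part of the lesson's own code.

```r
# Separate fits: one log-linear regression per subject, no information shared.
terminal_df <- subset(as.data.frame(datasets::Theoph), Time >= 4)
terminal_df$Subject <- factor(as.character(terminal_df$Subject))

by_subject <- split(terminal_df, terminal_df$Subject)

# One row per subject: that subject's own intercept and slope
subject_coefs <- t(sapply(by_subject, function(d) {
  coef(lm(log(conc) ~ Time, data = d))
}))
colnames(subject_coefs) <- c("intercept", "slope")

subject_ke <- -subject_coefs[, "slope"]   # subject-specific ke estimates (1/h)

sapply(by_subject, nrow)   # observations available to each fit
range(subject_ke)          # spread of the per-subject estimates
```

Each fit rests on only a few points, and nothing in this output describes the population as a whole, which is the gap the hierarchical approach is meant to fill.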

Worked Example 4: Hierarchical (Mixed-Effects) Thinking

Mixed-effects modeling assumes:

Population parameters:

\[ \beta_0, \beta_1 \]

Subject-level deviations:

\[ \beta_{0i} = \beta_0 + b_{0i} \]

\[ \beta_{1i} = \beta_1 + b_{1i} \]

Where:

  • \(b_{0i}\) = random intercept (baseline variability)
  • \(b_{1i}\) = random slope (elimination rate variability)

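Substituting these into the subject-level model gives:

\[ \log(C_{ij}) = (\beta_0 + b_{0i}) + (\beta_1 + b_{1i}) Time_{ij} + \epsilon_{ij} \]

This preserves the shared population structure while modeling subject-level variability.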


Independence vs Correlated Random Effects

Once we allow subjects to have their own intercepts and slopes, we also need to ask whether those subject-level deviations are independent.

In mixed-effects notation, we often write:

\[ \begin{bmatrix} b_{0i} \\ b_{1i} \end{bmatrix} \sim N \left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \Omega \right) \]

Here, \(\Omega\) is the variance-covariance matrix describing how subject-level random effects vary and relate to each other.

A diagonal covariance structure assumes no correlation between the random intercept and random slope:

\[ \Omega = \begin{bmatrix} \omega_0^2 & 0 \\ 0 & \omega_1^2 \end{bmatrix} \]

A full covariance structure allows them to be correlated:

\[ \Omega = \begin{bmatrix} \omega_0^2 & \rho \omega_0 \omega_1 \\ \rho \omega_0 \omega_1 & \omega_1^2 \end{bmatrix} \]

Conceptually:

  • Diagonal \(\Omega\) means subjects with higher intercepts are not assumed to have systematically different slopes.
  • Full \(\Omega\) means subjects with higher intercepts may also tend to have steeper or shallower slopes.

In PK terms, this asks whether subjects with higher apparent initial concentrations also tend to eliminate drug differently.

This is mainly a modeling choice about structure, not about whether variability exists at all.
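
To make the two covariance structures concrete, the sketch below builds a diagonal and a full \(\Omega\) and draws simulated subject-level deviations from each. The values of `omega0`, `omega1`, and `rho` are made up for illustration and are not estimates from the Theoph data; `make_omega()` and `draw_effects()` are hypothetical helper functions.

```r
set.seed(1)

# Hypothetical standard deviations and correlation (illustrative only)
omega0 <- 0.3    # SD of random intercepts
omega1 <- 0.02   # SD of random slopes
rho    <- -0.6   # intercept-slope correlation used for the full structure

# Build the 2 x 2 matrix Omega; rho = 0 gives the diagonal structure
make_omega <- function(omega0, omega1, rho = 0) {
  off <- rho * omega0 * omega1
  matrix(c(omega0^2, off, off, omega1^2), nrow = 2,
         dimnames = list(c("b0", "b1"), c("b0", "b1")))
}

# Draw n pairs (b0, b1) from N(0, Omega) using the Cholesky factor of Omega
draw_effects <- function(n, omega) {
  z <- matrix(rnorm(2 * n), ncol = 2)
  b <- z %*% chol(omega)
  colnames(b) <- colnames(omega)
  b
}

omega_diag <- make_omega(omega0, omega1)
omega_full <- make_omega(omega0, omega1, rho)

b_diag <- draw_effects(5000, omega_diag)
b_full <- draw_effects(5000, omega_full)

cor(b_diag[, "b0"], b_diag[, "b1"])   # close to 0
cor(b_full[, "b0"], b_full[, "b1"])   # close to rho
```

Both structures produce the same marginal variability in intercepts and slopes; they differ only in whether the two deviations move together.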

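In nlme, the default random-effects covariance structure is a general positive-definite matrix, which allows a full variance-covariance matrix: pdLogChol in current releases, with pdSymm as an equivalent general structure under a different parameterization. If we want independent random effects, we can specify pdDiag instead, which we can do in the next lesson when we fit models formally.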


Biological Interpretation

In PK terms:

  • Random intercept → variability in apparent initial concentration
  • Random slope → variability in elimination rate constant (ke); on the natural-log scale the slope equals -ke
  • Residual error → measurement noise or model misspecification

Between-subject variability is not noise; it reflects biological heterogeneity.


Why Independence Fails

Repeated measures per subject create correlated errors.

If ignored:

  • Standard errors are underestimated
  • Confidence intervals are too narrow
  • Inference becomes unreliable

This is why repeated-measures PK data call for mixed-effects models, which account for the within-subject correlation explicitly.
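
One informal way to see the correlation is to look at pooled-model residuals subject by subject. The sketch below assumes the `Time >= 4` cutoff from Worked Example 1; `between_share` is an ad hoc descriptive quantity introduced here, not a formal intraclass correlation.

```r
# Residuals from the pooled fit, examined within each subject.
terminal_df <- subset(as.data.frame(datasets::Theoph), Time >= 4)
terminal_df$Subject <- factor(as.character(terminal_df$Subject))

pooled_fit <- lm(log(conc) ~ Time, data = terminal_df)
terminal_df$pooled_resid <- resid(pooled_fit)

# Independent residuals would scatter around zero within every subject;
# subject means far from zero indicate subject-level structure left unmodeled.
subject_mean_resid <- tapply(terminal_df$pooled_resid, terminal_df$Subject, mean)
n_per_subject      <- tapply(terminal_df$pooled_resid, terminal_df$Subject, length)
round(sort(subject_mean_resid), 2)

# Share of the pooled residual sum of squares that lies between subjects
# (the overall residual mean is zero because the model has an intercept)
between_share <- sum(n_per_subject * subject_mean_resid^2) /
  sum(terminal_df$pooled_resid^2)
between_share
```

A large share suggests that much of what the pooled model treats as independent error is in fact consistent subject-level deviation.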


Strategies

  • Visualize subject-level trajectories.
  • Compare pooled vs subject-specific fits conceptually.
  • Think in layers: population → subject → observation.
  • Translate variability into biological language (e.g., elimination rate differences).

Common Mistakes

  • Treating repeated observations as independent
  • Assuming variability is just random noise
  • Thinking pooled models represent all subjects well
  • Fitting each subject separately without considering shared structure
  • Confusing residual error with between-subject variability
  • Assuming random effects are measurement error
  • Ignoring differences in subject-level slopes
  • Forgetting that hierarchy exists even before formal mixed-effects modeling

Practice Problems

  1. In one sentence, explain why pooled modeling underestimates uncertainty.
  2. Explain why fitting each subject separately wastes information.
  3. What PK parameter corresponds to a random slope in this context?
  4. Why is hierarchy the correct data-generating assumption?

Problem 1
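Because repeated observations within a subject are correlated but are treated as independent, so the data appear to carry more information than they do and standard errors come out too small.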

Problem 2
Because each fit uses only that subject’s limited data and ignores what the other subjects reveal about the shared kinetics, leading to unstable parameter estimates.

Problem 3
The elimination rate constant (ke).

Problem 4
Because PK data arise from population-level biology with subject-level variation and measurement noise.


Summary

  • PK data are hierarchical by nature.
  • Pooled models ignore between-subject variability.
  • Separate models ignore shared structure.
  • Mixed-effects models combine both ideas.
  • This prepares us to fit mixed-effects models formally in the next lesson.

  • Always visualize variability first.
  • Variability reflects biological differences.
  • Random slopes often correspond to elimination differences.
  • Mixed-effects modeling is population thinking with structure.