Case Study

An end-to-end pharmacometrics case study that mirrors a real workflow: inventory → minimal unblock → assemble → visualize → fix → lock → model → simulate.

Tip

How to use this section:
These lessons are designed to be completed after the core Foundations units (wrangling, plotting, modeling).
You’ll reuse familiar tools — but in a more realistic setting where judgment and documentation matter.

What This Section Is (and Is Not)

This section focuses on workflow and defensible decision-making, not new syntax.

It is:

  • end-to-end and narrative-driven
  • based on a realistic PMx-style dataset
  • intentionally messy (by design)
  • focused on why decisions are made, not just how
  • structured like real project work: assemble early, diagnose visually, fix selectively, then model

It is not:

  • a place to introduce brand-new R fundamentals
  • a reference manual
  • a shortcut around earlier units

Note

All tools used here were introduced earlier in the course. If something feels unfamiliar, that’s your cue to revisit the corresponding Foundations lesson.


The Case Study Dataset and Folder Layout

All lessons in this section use the same synthetic study (“pk-study-01”).

Where the raw source files live

These lessons assume your case-study data are stored under the course folder (not the site root):

courses/foundations-r/data/

Relative to the course folder (courses/foundations-r/), we use a strict separation:

  • data/source_clean/ → clean reference version (never edited)
  • data/source_issues/ → messy “as received” version (never edited)
  • data/derived/ → your intermediate exports (structurally cleaned, assembled, locked)
  • results/ → QC figures, logs, and model outputs
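
One way to keep this separation explicit in code is to build every path from a single root and to create only the output folders. The sketch below is a minimal illustration, not the course's own setup code: the helper names `case_paths()` and `ensure_output_dirs()` are hypothetical, and it assumes `results/` sits next to `data/` under the course folder.

```r
# Build all case-study paths from one root (hypothetical helper).
# Assumption: results/ lives beside data/ under the course folder.
case_paths <- function(root = file.path("courses", "foundations-r")) {
  list(
    source_clean  = file.path(root, "data", "source_clean"),   # read-only
    source_issues = file.path(root, "data", "source_issues"),  # read-only
    derived       = file.path(root, "data", "derived"),        # your exports
    results       = file.path(root, "results")                 # figures, logs, models
  )
}

# Create only the folders you are allowed to write to.
# The two source folders are never created or modified here.
ensure_output_dirs <- function(paths) {
  for (d in c(paths$derived, paths$results)) {
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(paths)
}

paths <- case_paths()
# ensure_output_dirs(paths)  # run once, from the site root
```

Reading from `paths$source_issues` and writing only to `paths$derived` or `paths$results` makes it harder to overwrite an "as received" file by accident.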

Files

You will work with three PMx-style inputs:

  • DM — demographics / covariates
  • EX — dosing events
  • PC — PK concentrations

Common real-world issues are intentionally included, such as:

  • missing covariates
  • inconsistent encodings (e.g., sex labels)
  • type problems (e.g., numeric stored as character)
  • duplicated observations
  • time issues and ordering problems
  • a covariate unit mismatch (e.g., one weight in lb instead of kg)

You are not expected to “fix everything automatically.”
You are expected to identify, justify, and document decisions.
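
To make "identify before you fix" concrete, here is a small detection-only sketch in base R. Everything in it is an assumption for illustration: the toy values are invented, the column names (`SEX`, `WT`, `TIME`, `DV`) may not match the course files, and the 30 to 140 kg plausibility window is an arbitrary screening threshold, not a rule from the case study.

```r
# Toy DM-like and PC-like tables; all values are invented for illustration.
dm <- data.frame(
  ID  = 1:4,
  SEX = c("M", "male", "F", "f"),        # inconsistent encodings
  WT  = c("70", "82.5", "154", NA),      # numeric stored as character
  stringsAsFactors = FALSE
)
pc <- data.frame(
  ID   = c(1, 1, 1, 2),
  TIME = c(1, 1, 4, 2),
  DV   = c(4.2, 4.2, 2.9, 3.1)           # row 2 repeats row 1 exactly
)

# 1. Inconsistent encodings: tabulate every distinct label, including NA.
sex_levels <- table(dm$SEX, useNA = "ifany")

# 2. Type problems: coerce, then see which non-missing values failed to parse.
wt_num      <- suppressWarnings(as.numeric(dm$WT))
wt_unparsed <- !is.na(dm$WT) & is.na(wt_num)

# 3. Possible unit mismatch: flag (do not convert) implausible weights.
#    The window below is an assumed screening threshold.
wt_suspect <- !is.na(wt_num) & (wt_num < 30 | wt_num > 140)

# 4. Exact duplicate observations: flag rows identical to an earlier row.
pc_dup <- duplicated(pc)

dm[wt_suspect, ]   # candidates to investigate and document
pc[pc_dup, ]
```

Nothing here changes the data. Each flag is evidence to record, and whether a flagged row is corrected, kept, or excluded is a separate, documented decision.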


Learning Objectives

By the end of this section, you will be able to:

  • Inventory raw PMx files and check design-level expectations.
  • Perform minimal structural QC to unblock assembly (types, keys, core fields).
  • Assemble an event-style dataset and derive TIME/NTIME.
  • Use visualization to discover data problems and prioritize fixes.
  • Apply only defensible corrections and document them in a QC log.
  • Lock a final modeling-ready dataset with disposition summaries and focused EDA.
  • Fit naive pooled and population models (nls(), nlme()) and interpret diagnostics.
  • Use the frozen population model to simulate new dosing scenarios.
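
As a preview of what "assemble an event-style dataset and derive TIME" can look like, the following base R sketch stacks dosing and concentration records and computes time since first dose per subject. It is a simplified illustration under stated assumptions: the toy records are invented, and the column names `DTC`, `EVID`, `AMT`, and `DV` are assumed conventions rather than names confirmed by the course files. NTIME is not derived here because nominal times come from the protocol schedule, which this sketch does not have.

```r
# Toy EX-like and PC-like records; values and column names are assumptions.
ex <- data.frame(
  ID  = c(1, 2),
  DTC = c("2024-01-01 08:00", "2024-01-02 09:00"),
  AMT = c(100, 100),
  stringsAsFactors = FALSE
)
pc <- data.frame(
  ID  = c(1, 1, 2),
  DTC = c("2024-01-01 09:00", "2024-01-01 12:00", "2024-01-02 11:30"),
  DV  = c(4.2, 2.9, 3.1),
  stringsAsFactors = FALSE
)

# Give both tables the same columns: EVID 1 = dose, EVID 0 = observation.
ex$EVID <- 1; ex$DV  <- NA_real_
pc$EVID <- 0; pc$AMT <- NA_real_
cols <- c("ID", "DTC", "EVID", "AMT", "DV")
ev <- rbind(ex[cols], pc[cols])

# Parse datetimes once, in a fixed time zone.
ev$DTTM <- as.POSIXct(ev$DTC, format = "%Y-%m-%d %H:%M", tz = "UTC")

# TIME = hours since each subject's first dose.
is_dose    <- ev$EVID == 1
first_dose <- tapply(as.numeric(ev$DTTM[is_dose]), ev$ID[is_dose], min)
ev$TIME    <- (as.numeric(ev$DTTM) -
               as.numeric(first_dose[as.character(ev$ID)])) / 3600

# Sort within subject by time, with doses ahead of observations at ties.
ev <- ev[order(ev$ID, ev$TIME, -ev$EVID), ]
rownames(ev) <- NULL
ev[, c("ID", "TIME", "EVID", "AMT", "DV")]
```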

Lessons in This Section

Case Study: From Raw Clinical Data to Modeling and Simulation

  1. Case Setup & Data Overview
    Inventory DM/EX/PC, validate expectations, and start a QC decision log.

  2. Structural QC & Minimal Fixes
    Fix only what blocks assembly (datetime parsing, type coercion, BLQ encoding).
    Do not remove duplicates/spikes yet.

  3. Preliminary Dataset Assembly
    Assemble first (including pivot_wider() on DM), derive TIME/NTIME, build an event-style dataset, and export a preliminary dataset for QC.

  4. Visualization-Driven QC on the Preliminary Dataset
    Let plots reveal duplicates, spikes, negative values, time issues, and covariate outliers (lb vs kg).
    No fixing yet — evidence only.

  5. Fixing Data Issues
    Apply defensible fixes (e.g., remove exact duplicates, convert lb → kg, drop rows with missing TIME) and document every decision.

  6. Final Dataset Lock & Exploratory Data Analysis
    Lock the final event-style dataset, create a disposition table, summarize covariates/BLQ, and export the modeling-ready analysis dataset.

  7. Naive Pooled Modeling with nls()
    Fit a single structural model to all subjects (excluding placebo), then use diagnostics to motivate a hierarchical (mixed-effects) model.

  8. Population Modeling with nlme()
    Fit a mixed-effects model, interpret fixed/random effects, run core diagnostics, and freeze the fitted model object for simulation.

  9. Simulation & Scenario Exploration
    Use the frozen model to simulate new doses/regimens and compute exposure metrics (Cmax, AUC) for decision-oriented interpretation.
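
To show what "compute exposure metrics (Cmax, AUC)" means in practice, here is a small sketch that is independent of the case-study model. It assumes a one-compartment IV bolus profile with made-up parameters (`cl = 5`, `v = 50`) and hypothetical helper names. In the actual lesson the concentrations would come from the frozen nlme() fit rather than from this closed-form stand-in.

```r
# Stand-in concentration profile: one-compartment IV bolus.
# Parameters are invented; they are not estimates from the case study.
conc_1cmt_iv <- function(t, dose, cl, v) {
  (dose / v) * exp(-(cl / v) * t)
}

# Linear trapezoidal AUC over the sampled time grid.
auc_trapz <- function(t, conc) {
  sum(diff(t) * (head(conc, -1) + tail(conc, -1)) / 2)
}

times <- seq(0, 24, by = 0.1)   # hours
doses <- c(50, 100, 200)        # candidate dose levels

# One row of exposure metrics per scenario.
metrics <- do.call(rbind, lapply(doses, function(d) {
  conc <- conc_1cmt_iv(times, dose = d, cl = 5, v = 50)
  data.frame(dose = d, cmax = max(conc), auc_0_24 = auc_trapz(times, conc))
}))
metrics
```

Because this stand-in model is linear, Cmax and AUC scale in proportion to dose. Comparing such a table across scenarios is the kind of summary that supports decision-oriented interpretation.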


How to Work Through the Case Study

Recommended approach:

  1. Read the narrative first (don’t rush to “fix”).
  2. Run the code locally if possible.
  3. Pause at decision points and ask:
    • What exactly is the evidence?
    • What would I do in a real project?
    • What assumptions am I making?
  4. Record decisions in your QC log before changing data.
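
A QC log does not need special tooling. The sketch below shows one possible shape for it as a plain data frame with an append helper. The column set, the function names `new_qc_log()` and `log_decision()`, and the example entry are all assumptions for illustration, so adapt them to whatever the lessons or your team actually use.

```r
# Start an empty QC decision log (hypothetical column set).
new_qc_log <- function() {
  data.frame(
    logged_on = character(), dataset  = character(),
    issue     = character(), evidence = character(),
    decision  = character(), rationale = character(),
    stringsAsFactors = FALSE
  )
}

# Append one decision; returns the updated log and never edits the data itself.
log_decision <- function(log, dataset, issue, evidence, decision, rationale) {
  entry <- data.frame(
    logged_on = as.character(Sys.Date()), dataset  = dataset,
    issue     = issue,                    evidence = evidence,
    decision  = decision,                 rationale = rationale,
    stringsAsFactors = FALSE
  )
  rbind(log, entry)
}

qc_log <- new_qc_log()
qc_log <- log_decision(
  qc_log,
  dataset   = "DM",
  issue     = "One weight looks like lb, not kg",
  evidence  = "Single outlier in the weight distribution plot",
  decision  = "Pending",
  rationale = "Confirm against source before converting"
)

# Optional export; the file name is an assumption.
# write.csv(qc_log, file.path("results", "qc_log.csv"), row.names = FALSE)
```

Writing the entry first, with the decision still marked as pending, keeps the evidence and the eventual change in the same traceable record.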

There is rarely one “correct” answer — but there are defensible and indefensible ones.


A Note on Professional Judgment

In real PMx work:

  • data are imperfect
  • time is limited
  • decisions must be explained to others

This section is designed to model that reality.

If you can explain why you made a decision — with evidence — you are practicing PMx, not just R.

After completing this section, you should feel comfortable:

  • taking raw study data from collaborators
  • performing initial QC independently
  • preparing datasets for modeling teams
  • explaining your choices clearly and reproducibly
  • using models for simulation and decision support

These are core professional skills — and they matter as much as any model.