Learn how to understand, inspect, and prepare population PK datasets before model fitting.
Tip
Module goal: Before fitting models, learn how to understand the dataset, identify dosing and observation records, check data quality, and visualize PK profiles.
Module Overview
Population modeling starts with data.
Before writing an nlmixr2 model, we need to understand what the dataset contains, how dosing and observation records are encoded, and whether the data are suitable for modeling.
This module introduces the structure of pharmacometric modeling datasets and uses the theophylline dataset as the main teaching example.
We focus on practical questions:
What does each row represent?
Which rows are observations?
Which rows are dosing events?
Are IDs, times, doses, and concentrations reasonable?
What do the concentration-time profiles look like?
Are there obvious issues before modeling begins?
The goal is to build a clean, defensible starting point for structural PK modeling.
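As a minimal sketch of how these questions translate into code, the example below builds a small hypothetical dataset in base R and separates dosing rows from observation rows. All values are invented for illustration and are not taken from the course data. It assumes the common convention that EVID = 1 marks a dose and EVID = 0 marks an observation; real datasets may use other codes, so check the coding before relying on it.

```r
# Hypothetical two-subject dataset; all values are invented for illustration.
pk <- data.frame(
  ID   = c(1, 1, 1, 2, 2, 2),
  TIME = c(0, 1, 4, 0, 1, 4),
  DV   = c(NA, 8.2, 5.1, NA, 6.9, 4.4),
  AMT  = c(320, 0, 0, 300, 0, 0),
  EVID = c(1, 0, 0, 1, 0, 0)
)

# Assumed convention: EVID == 1 is a dosing record, EVID == 0 an observation.
doses <- pk[pk$EVID == 1, ]
obs   <- pk[pk$EVID == 0, ]

# How many records of each type does every subject have?
record_counts <- table(pk$ID, pk$EVID, dnn = c("ID", "EVID"))
print(record_counts)

# Basic plausibility checks on times, doses, and concentrations.
checks <- c(
  times_nonnegative   = all(pk$TIME >= 0),
  doses_positive      = all(doses$AMT > 0),
  obs_have_dv         = !anyNA(obs$DV),
  every_id_has_a_dose = all(unique(pk$ID) %in% doses$ID)
)
print(checks)
```

Each named entry in `checks` maps to one of the questions above, which makes it easy to see which check fails when a dataset has a problem.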
Learning Objectives
By the end of this module, you will be able to:
Describe the structure of a population modeling dataset.
Distinguish observation records from dosing records.
Interpret common modeling columns such as ID, TIME, DV, AMT, EVID, and MDV.
Load and inspect the course dataset.
Create subject-level concentration-time plots.
Use log-scale visualization to inspect PK profiles.
Identify common data issues before model fitting.
Prepare an analysis-ready dataset for later modeling lessons.
Lessons in This Module
Lesson 1: Understanding Modeling Datasets
This lesson introduces the basic structure of population modeling datasets, including observation rows, dosing rows, and common pharmacometric data columns.
Lesson 2: Loading and Exploring the Course Dataset
This lesson introduces the theophylline dataset used throughout the early course modules and teaches how to inspect its structure.
Lesson 3: Preparing Analysis-Ready Data
This lesson focuses on basic data cleaning and QC before modeling, including missing values, duplicates, dose records, and observation records.
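The kind of QC this lesson covers can be sketched in base R as follows. The dataset is hypothetical and contains two deliberate problems (one exact duplicate row and one observation with a missing concentration). The EVID coding (1 = dose, 0 = observation) is an assumption, and whether flagged rows should be dropped, corrected, or kept is a study-specific decision rather than a rule.

```r
# Hypothetical dataset with two deliberate problems; values are invented.
pk <- data.frame(
  ID   = c(1, 1, 1, 1, 2, 2, 2),
  TIME = c(0, 1, 1, 4, 0, 1, 4),
  DV   = c(NA, 8.2, 8.2, 5.1, NA, NA, 4.4),
  AMT  = c(320, 0, 0, 0, 300, 0, 0),
  EVID = c(1, 0, 0, 0, 1, 0, 0)
)

# Exact duplicate rows (identical in every column).
is_dup   <- duplicated(pk)
dup_rows <- pk[is_dup, ]

# Observation rows with a missing concentration
# (assumed convention: EVID == 0 is an observation).
is_missing_dv <- pk$EVID == 0 & is.na(pk$DV)
missing_dv    <- pk[is_missing_dv, ]

# Subjects that have no dosing record at all.
ids_without_dose <- setdiff(unique(pk$ID), pk$ID[pk$EVID == 1])

# One possible cleaning rule: drop the flagged rows and keep everything else.
pk_clean <- pk[!is_dup & !is_missing_dv, ]

print(dup_rows)
print(missing_dv)
print(ids_without_dose)
```

Keeping the flagged rows in their own objects before filtering leaves a record of what was removed and why.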
Lesson 4: Exploratory PK Visualization
This lesson uses concentration-time profiles to understand subject-level behavior, dose patterns, and potential modeling challenges.
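A minimal sketch of a subject-level concentration-time plot on a log scale is shown below. It uses base R graphics so that it does not depend on any particular plotting package; the course itself may use a different one. The data values and the axis units are hypothetical.

```r
# Hypothetical observation records for two subjects; values are invented.
pk_obs <- data.frame(
  ID   = rep(1:2, each = 5),
  TIME = rep(c(0.5, 1, 2, 6, 12), times = 2),
  DV   = c(6.1, 8.3, 7.4, 4.9, 2.6, 4.8, 7.0, 6.5, 4.1, 2.0)
)

# A log axis cannot display zero or negative values, so keep DV > 0 only.
plot_data <- pk_obs[pk_obs$DV > 0, ]
ids <- unique(plot_data$ID)

# Empty frame with a log-scaled y axis; units in the labels are assumed.
plot(plot_data$TIME, plot_data$DV, type = "n", log = "y",
     xlab = "Time (h)", ylab = "Concentration (mg/L)")

# One line per subject, with points sorted by time.
for (i in seq_along(ids)) {
  one <- plot_data[plot_data$ID == ids[i], ]
  one <- one[order(one$TIME), ]
  lines(one$TIME, one$DV, type = "b", col = i)
}

legend("topright", legend = paste("ID", ids),
       col = seq_along(ids), lty = 1, pch = 1)
```

On a log scale, a roughly straight terminal decline is consistent with first-order elimination, which is why this view is useful before choosing a structural model.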
Lesson 5: From Data to Modeling Dataset
This lesson finalizes the prepared dataset and creates a reproducible modeling-ready object for the next module.
Dataset Used
The main dataset in this module is the theophylline dataset provided by the nlmixr2data package.
The purpose of using theophylline early is continuity. The dataset is small, interpretable, and rich enough to teach key PK modeling concepts without overwhelming the learner.
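A first look at the dataset might be sketched as follows. This assumes the nlmixr2data package is installed and that the dataset is available under the name `theo_sd`; the code only inspects the data and does not assume particular EVID codes, since the coding should be read from the data itself.

```r
# Assumption: nlmixr2data is installed and provides the dataset as `theo_sd`.
if (requireNamespace("nlmixr2data", quietly = TRUE)) {
  data("theo_sd", package = "nlmixr2data")

  str(theo_sd)                       # column names and types
  print(head(theo_sd))               # first few records
  print(table(theo_sd$EVID))         # which event codes occur, and how often
  print(length(unique(theo_sd$ID)))  # number of subjects
} else {
  message("Install nlmixr2data first: install.packages(\"nlmixr2data\")")
}
```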
Software Used
This module relies mainly on data wrangling and visualization packages.