Mapping and Iterating Across Lists

Use purrr::map() and related helpers to iterate safely across lists, subjects, datasets, and models in PMx workflows.

Tip

What you’ll build today: safe, readable iteration patterns that use map() to apply functions across subjects, datasets, and files without writing loops.

Learning Objectives

By the end of this lesson, you will be able to:

Understand why lists are central to iteration workflows.
Use map() to apply a function across elements of a list.
Use type-stable variants (map_dbl(), map_chr(), etc.).
Combine iteration with group_split() and nest()-style workflows.
Recognize when iteration is more appropriate than mutate() or grouped summaries.

Setup

library(tidyverse)

Key Ideas

Everything we have done so far operates on columns or grouped tables.

purrr operates on lists.

Iteration becomes necessary when:

You have multiple datasets.
You want per-subject models.
You want to read multiple files.
You want to apply the same function repeatedly.
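As a rough sketch of the "read multiple files" case, the block below writes two small CSV files to a temporary folder and reads them back in one pass. The folder name, file names, and the source_file column are made up for illustration, and list_rbind() assumes purrr 1.0.0 or later.

```r
library(purrr)
library(readr)

# Hypothetical setup: write two small per-subject files to a temp folder
tmp_dir <- file.path(tempdir(), "pk_files_demo")
dir.create(tmp_dir, showWarnings = FALSE)
write_csv(data.frame(ID = 1, TIME = c(0.5, 1.0), DV = c(2.1, 3.8)),
          file.path(tmp_dir, "subject_1.csv"))
write_csv(data.frame(ID = 2, TIME = c(0.5, 1.0), DV = c(1.6, 2.9)),
          file.path(tmp_dir, "subject_2.csv"))

# Collect the file paths and name each path after its file
files <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)
names(files) <- basename(files)

# Read every file with the same call, then stack the results;
# names_to records which file each row came from
pk_files <- map(files, ~ read_csv(.x, show_col_types = FALSE))
all_pk <- list_rbind(pk_files, names_to = "source_file")
all_pk
```

Naming the paths before mapping is a design choice: map() carries the names through, so the origin of every row stays traceable after the files are combined.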

Warning

If you find yourself copying and pasting the same code for multiple objects, you likely need iteration.
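To make the warning concrete, here is a small sketch contrasting the copy-paste pattern with a single map_dbl() call. The three arm_* vectors are invented values used only for this illustration.

```r
library(purrr)

# Hypothetical concentration vectors, one per study arm
arm_a <- c(2.1, 3.8, 3.0)
arm_b <- c(1.6, 2.9, 2.2)
arm_c <- c(2.4, 4.1, 3.5)

# Copy-paste version: one line per object, easy to mistype
max_a <- max(arm_a)
max_b <- max(arm_b)
max_c <- max(arm_c)

# Iteration version: collect the objects in a named list, then map once
arms <- list(a = arm_a, b = arm_b, c = arm_c)
cmax <- map_dbl(arms, max)
cmax
```

Adding a fourth arm to the second version means adding one list element, not another copied line.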

Example Dataset

We’ll use a simple PK-style dataset.

pk <- tibble::tribble(
  ~ID, ~TIME, ~DV,
    1,  0.5,  2.1,
    1,  1.0,  3.8,
    2,  0.5,  1.6,
    2,  1.0,  2.9
)

pk

# A tibble: 4 × 3
     ID  TIME    DV
  <dbl> <dbl> <dbl>
1     1   0.5   2.1
2     1   1     3.8
3     2   0.5   1.6
4     2   1     2.9

Lists: The Foundation of map()

A list can store multiple objects.

lst <- list(a = 1:3, b = 4:6)
lst

$a
[1] 1 2 3

$b
[1] 4 5 6

Access elements:

lst$a

[1] 1 2 3

map() applies a function to each element of a list.

Worked Example 1: Simple map()

map(lst, mean)

$a
[1] 2

$b
[1] 5

This returns a list of results.

Worked Example 2: Type-Stable Mapping

Often we want a numeric vector instead of a list.

map_dbl(lst, mean)

a b 
2 5

Common variants:

map_dbl() → numeric
map_chr() → character
map_lgl() → logical
map_df() / map_dfr() → data frame (superseded since purrr 1.0.0, which recommends map() followed by list_rbind(); both still work)
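The variants can be compared side by side on the same list. This is a minimal sketch; the result names (means, labels, and so on) are chosen here for illustration, and the last step assumes purrr 1.0.0 or later for list_rbind().

```r
library(purrr)

lst <- list(a = 1:3, b = 4:6)

means  <- map_dbl(lst, mean)                         # double vector
labels <- map_chr(lst, ~ paste(.x, collapse = "-"))  # character vector
all_gt <- map_lgl(lst, ~ all(.x > 3))                # logical vector
sizes  <- map_int(lst, length)                       # integer vector

# Data frame output: build one small data frame per element, then row-bind
stats <- map(lst, ~ data.frame(mean = mean(.x), n = length(.x)))
stats_df <- list_rbind(stats, names_to = "name")
stats_df
```

Each typed variant raises an error if the function returns something of the wrong type or length, which surfaces problems at the point where they occur.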

Worked Example 3: Per-Subject Summaries

Split the dataset into a list with one tibble per subject:

pk_list <- pk %>% group_split(ID)
pk_list

<list_of<
  tbl_df<
    ID  : double
    TIME: double
    DV  : double
  >
>[2]>
[[1]]
# A tibble: 2 × 3
     ID  TIME    DV
  <dbl> <dbl> <dbl>
1     1   0.5   2.1
2     1   1     3.8

[[2]]
# A tibble: 2 × 3
     ID  TIME    DV
  <dbl> <dbl> <dbl>
1     2   0.5   1.6
2     2   1     2.9

Compute mean DV per subject:

map_dbl(pk_list, ~ mean(.x$DV))

[1] 2.95 2.25
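The result above is unnamed because group_split() returns an unnamed list. One possible alternative, sketched here rather than taken from the lesson, is base R's split(), which names each element after its ID so the names carry through to the output.

```r
library(purrr)
library(tibble)

pk <- tribble(
  ~ID, ~TIME, ~DV,
    1,  0.5,  2.1,
    1,  1.0,  3.8,
    2,  0.5,  1.6,
    2,  1.0,  2.9
)

# split() names each list element after the value of ID
pk_by_id <- split(pk, pk$ID)

# map_dbl() keeps those names, so each mean is labelled with its subject
mean_dv <- map_dbl(pk_by_id, ~ mean(.x$DV))
mean_dv
```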

Worked Example 4: Returning Data Frames

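Create subject-level summaries as a data frame. The result has one row per subject but no ID column, because summarise() returns only the columns it is asked to compute: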

map_dfr(pk_list, ~ summarise(.x, mean_DV = mean(DV)))

# A tibble: 2 × 1
  mean_DV
    <dbl>
1    2.95
2    2.25
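The summary above does not say which row belongs to which subject. A minimal sketch of one way to keep the identifier is to compute it inside summarise(); it also uses map() followed by list_rbind(), which assumes purrr 1.0.0 or later. The name subject_summary is illustrative.

```r
library(dplyr)
library(purrr)

pk <- tibble::tribble(
  ~ID, ~TIME, ~DV,
    1,  0.5,  2.1,
    1,  1.0,  3.8,
    2,  0.5,  1.6,
    2,  1.0,  2.9
)

pk_list <- group_split(pk, ID)

# Carry ID into each one-row summary, then stack the rows
subject_summary <- pk_list %>%
  map(~ summarise(.x, ID = first(ID), mean_DV = mean(DV))) %>%
  list_rbind()

subject_summary
```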

Worked Example 5: Combining map() with mutate()

You can also nest each subject's rows into a list-column and map over that column inside mutate():

nested <- pk %>% group_nest(ID)

nested %>%
  mutate(mean_DV = map_dbl(data, ~ mean(.x$DV)))

# A tibble: 2 × 3
     ID               data mean_DV
  <dbl> <list<tibble[,2]>>   <dbl>
1     1            [2 × 2]    2.95
2     2            [2 × 2]    2.25
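The same nested layout extends to the per-subject models mentioned under Key Ideas. The sketch below fits a straight line with lm() to each subject purely as a placeholder; it is not a pharmacokinetic model, and the fit and slope column names are chosen here for illustration.

```r
library(dplyr)
library(purrr)

pk <- tibble::tribble(
  ~ID, ~TIME, ~DV,
    1,  0.5,  2.1,
    1,  1.0,  3.8,
    2,  0.5,  1.6,
    2,  1.0,  2.9
)

fits <- pk %>%
  group_nest(ID) %>%
  mutate(
    # One fitted model object per subject, stored in a list-column
    fit = map(data, ~ lm(DV ~ TIME, data = .x)),
    # Extract a single number from each model with a type-stable variant
    slope = map_dbl(fit, ~ coef(.x)[["TIME"]])
  )

select(fits, ID, slope)
```

Keeping the model objects in a list-column next to ID means later steps (predictions, diagnostics) can map over the same column without refitting.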

Strategies

Use map() when working with lists.
Use group_by() + summarise() when working within a single table.
Prefer type-stable versions (map_dbl(), etc.).
Start simple before nesting workflows.
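For comparison, the per-subject mean from the worked examples can be computed within a single table, with no list involved. This sketch shows the group_by() + summarise() route; the name grouped_summary is illustrative.

```r
library(dplyr)

pk <- tibble::tribble(
  ~ID, ~TIME, ~DV,
    1,  0.5,  2.1,
    1,  1.0,  3.8,
    2,  0.5,  1.6,
    2,  1.0,  2.9
)

# One row per subject; the grouping column ID is kept automatically
grouped_summary <- pk %>%
  group_by(ID) %>%
  summarise(mean_DV = mean(DV))

grouped_summary
```

When every result is a single value per group, this form is usually shorter and keeps the identifier without extra work.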

Common Mistakes

Forgetting that map() returns a list.
Using map() when group_by() + summarise() would be simpler.
Reaching for map() when a type-stable variant like map_dbl() would return the vector you need.
Misreading .x inside anonymous functions: it stands for the current list element, not the whole list.
Creating list-columns without checking what each element contains.
Making nested workflows more complicated than needed.
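To clarify what .x refers to, the sketch below writes the same computation three ways. In the formula shorthand, .x is the current element; the other two forms name that argument explicitly (v is an arbitrary name, and the \(v) syntax assumes R 4.1 or later).

```r
library(purrr)

lst <- list(a = 1:3, b = 4:6)

# Formula shorthand: .x is the element currently being processed
via_formula <- map_dbl(lst, ~ mean(.x))

# Anonymous-function shorthand with an explicit argument name
via_lambda <- map_dbl(lst, \(v) mean(v))

# Full function syntax, equivalent to the two forms above
via_function <- map_dbl(lst, function(v) mean(v))

identical(via_formula, via_lambda)
identical(via_formula, via_function)
```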

Practice Problems

Create a list of numeric vectors and compute their means using map_dbl().
Split pk by ID and compute max DV per subject.
Return a data frame with subject-level min and max DV.
Create a nested dataset using group_nest().
Explain when map() is preferable to grouped summarise().

Step-by-Step Solutions

# 1
lst2 <- list(a = 1:5, b = 6:10)
map_dbl(lst2, mean)

a b 
3 8

# 2
pk %>%
  group_split(ID) %>%
  map_dbl(~ max(.x$DV))

[1] 3.8 2.9

# 3
pk %>%
  group_split(ID) %>%
  map_dfr(~ summarise(.x, min_DV = min(DV), max_DV = max(DV)))

# A tibble: 2 × 2
  min_DV max_DV
   <dbl>  <dbl>
1    2.1    3.8
2    1.6    2.9

# 4
pk %>% group_nest(ID)

# A tibble: 2 × 2
     ID               data
  <dbl> <list<tibble[,2]>>
1     1            [2 × 2]
2     2            [2 × 2]

# 5
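# map() is preferable when the inputs are lists or multiple separate objects
# (files, datasets, fitted models), or when each result is not a single
# summary value, such as one model fit per subject.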

Summary

In this lesson you learned how to:

Apply functions repeatedly using map().
Use type-stable variants for predictable output.
Combine iteration with grouped workflows.
Recognize when iteration simplifies repeated code.

Iteration unlocks scalable PMx workflows, especially for modeling and simulations.

Quick Tips

map() returns a list; use map_dbl() etc. for vectors.
Use iteration for repeated operations across objects.
Keep anonymous functions readable (~ mean(.x)).
Avoid iteration when grouped summarise is sufficient.