Packages and the Tidyverse

Understand how R packages work, why the tidyverse is the foundation for modern, readable PMx workflows, and how renv helps keep that work reproducible.

Tip

Big idea: Packages extend R.
The tidyverse gives you a shared grammar for data work in PMx.
renv protects that work by locking package versions.

Learning Objectives

By the end of this lesson, you will be able to:

  • Explain what an R package is.
  • Install and load packages correctly.
  • Understand what the tidyverse includes (and why).
  • Handle namespace conflicts safely.
  • Adopt package best practices for reproducible PMx work.
  • Understand when and how to use renv.

What Is an R Package?

An R package is a collection of:

  • functions
  • documentation
  • (sometimes) data

R ships with a set of base and recommended packages (for example stats, utils, and graphics), but most PMx workflows rely on additional packages that you install yourself.

Examples:

  • ggplot2 → visualization
  • dplyr → data manipulation
  • readr → reading/writing data
  • nlmixr2 → nonlinear mixed-effects modeling

Installing vs Loading Packages

These are not the same thing.

Install (once per computer)

install.packages("tidyverse")

You usually run this once per R installation (and again in a fresh project library, if you use one), not in every script.


Load (every session)

library(tidyverse)

You must load packages every time you start a new R session.

Warning

Never put install.packages() inside analysis scripts. Installation is environment setup, not part of reproducible analysis code: a script that installs packages can pull in a newer version every time it runs.
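One way to follow this rule is to have a script check for its dependencies and stop with a clear message instead of installing anything. The sketch below is illustrative only: check_packages() is a hypothetical helper name, not a function from any package, and stats and utils stand in for whatever your analysis needs.

```r
# Hypothetical helper: stop early if a required package is not installed.
# It never installs anything; that stays a manual setup step.
check_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  not_installed <- pkgs[!installed]
  if (length(not_installed) > 0) {
    stop(
      "Install these packages before running this script: ",
      paste(not_installed, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Stand-in package names; replace with the packages your script uses
check_packages(c("stats", "utils"))
```

Called at the top of a script, a check like this turns a confusing mid-run failure into an immediate, readable error.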


What Is the Tidyverse?

The tidyverse is a collection of R packages designed for data science workflows.
You can learn more at: https://www.tidyverse.org/

Key members you’ll use constantly:

  • dplyr → filtering, mutating, summarizing
  • ggplot2 → plotting
  • tidyr → reshaping data
  • readr → reading/writing files
  • tibble → modern data frames
  • purrr → functional helpers
  • stringr → string manipulation
  • lubridate → date-time manipulation

When you run:

library(tidyverse)

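all of these are loaded and attached at once, along with forcats (for factors). This describes tidyverse 2.0.0 and later; earlier versions did not attach lubridate.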
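To check which of these packages are actually attached in your own session, you can compare their names against the search path. This is a small sketch that assumes the tidyverse is installed; key_members is just an illustrative variable name, and the result depends on your tidyverse version.

```r
library(tidyverse)

# The key members named in the list above
key_members <- c("dplyr", "ggplot2", "tidyr", "readr",
                 "tibble", "purrr", "stringr", "lubridate")

# TRUE for each package that is currently attached
attached <- paste0("package:", key_members) %in% search()
setNames(attached, key_members)
```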


Why Tidyverse for PMx?

The tidyverse emphasizes:

  • readable code
  • explicit data transformations
  • consistent verbs
  • fewer silent surprises

Compare:

pk[pk$DV > 2 & pk$TIME <= 1, ]

vs

pk %>% dplyr::filter(DV > 2, TIME <= 1)

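The second reads closer to plain English: conditions refer to columns by name, and the comma means "and". It also drops rows where a condition evaluates to NA, whereas the base R version returns such rows filled with NA. In longer pipelines it chains cleanly with other verbs.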
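The handling of missing values is one place where the two forms differ in practice. The following is a minimal sketch, assuming dplyr is installed and using a small made-up pk data frame (the values are invented for illustration, not real PK data).

```r
library(dplyr)

# Hypothetical toy data; DV is missing for one record
pk <- data.frame(
  ID   = c(1, 1, 2, 2),
  TIME = c(0.5, 1, 0.5, 1),
  DV   = c(3.1, NA, 1.2, 4.0)
)

# Base R: a condition that evaluates to NA yields a row filled with NA
base_result <- pk[pk$DV > 2 & pk$TIME <= 1, ]

# dplyr: rows where the condition is NA are dropped
dplyr_result <- pk %>% dplyr::filter(DV > 2, TIME <= 1)

nrow(base_result)   # 3 (two matches plus one all-NA row)
nrow(dplyr_result)  # 2
```

If you need the base R form, wrapping the condition in which() avoids the extra NA row.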


Namespace Conflicts (Important)

Different packages can define functions with the same name. When that happens, the package attached most recently masks the others.

Classic example:

filter()

Exists in both:

  • dplyr
  • stats

If there’s ambiguity, be explicit:

dplyr::filter(pk, DV > 2)

Note

Being explicit avoids confusion and makes scripts more robust.
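The two filter() functions do unrelated things, which is why the conflict can bite. A brief sketch, assuming dplyr is installed; x and pk are made-up objects for illustration.

```r
# stats::filter() applies a linear filter to a series,
# here a centred 3-point moving average
x <- c(1, 2, 3, 4, 5)
ma <- stats::filter(x, rep(1 / 3, 3))
as.numeric(ma)   # NA at both ends, averages in between

# dplyr::filter() keeps the rows of a data frame that match a condition
pk <- data.frame(ID = c(1, 1, 2), DV = c(3.1, 1.5, 4.0))
high_dv <- dplyr::filter(pk, DV > 2)
high_dv
```

Because both calls are namespaced, this code behaves the same whether or not dplyr has been attached with library().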


Checking What’s Loaded

search()
 [1] ".GlobalEnv"        "package:lubridate" "package:forcats"  
 [4] "package:stringr"   "package:dplyr"     "package:purrr"    
 [7] "package:readr"     "package:tidyr"     "package:tibble"   
[10] "package:ggplot2"   "package:tidyverse" "package:stats"    
[13] "package:graphics"  "package:grDevices" "package:datasets" 
[16] "renv:shims"        "package:utils"     "package:methods"  
[19] "Autoloads"         "package:base"     

To see which attached packages define a function (the first one listed is the one R calls):

find("filter")
[1] "package:dplyr" "package:stats"
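find() reports every match. To confirm which version a bare call would use right now, you can ask for the environment of the function object. A small sketch using only base R:

```r
# Every attached package that defines filter, in search-path order
find("filter")

# The namespace that a bare filter() call currently resolves to
environmentName(environment(filter))
```

In a fresh session the second call returns "stats"; once dplyr is attached (for example through library(tidyverse)) it returns "dplyr".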

Package Versions and Reproducibility

Package updates can change behavior. In collaborative or regulated PMx work, this matters.

Best practices:

  • record package versions for important analyses
  • avoid unnecessary updates mid-project
  • use projects to isolate workflows
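Recording versions can be as simple as saving a few values alongside your results. The sketch below uses only base R; stats and utils are stand-in names, so substitute the packages your analysis depends on.

```r
# Stand-in package names; replace with the packages your analysis uses
pkgs <- c("stats", "utils")

# Named character vector of installed versions
versions <- vapply(
  pkgs,
  function(p) as.character(utils::packageVersion(p)),
  character(1)
)

R.version.string   # the R version itself
versions

# Full record of the session: R version, platform, attached packages
utils::sessionInfo()
```

Writing this output to a log file at the end of an analysis gives you a lightweight record even in projects that do not use renv.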

Using renv for Reproducible Projects

renv creates a project-specific package library and records exact package versions in a file called renv.lock.

Initialize once per project:

renv::init()

Record versions after installing packages:

renv::snapshot()

Restore exact versions later or on another machine:

renv::restore()

What to commit to Git:

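  • renv.lock (always)
  • .Rprofile and renv/activate.R, so renv is activated when a collaborator opens the project
  • not renv/library/, which holds the installed packages and is excluded by the renv/.gitignore file that renv creates

Tip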

For learning exercises, we’ll use install.packages() normally.
For real PMx projects, renv is strongly recommended.


A Minimal PMx Script Header (Packages)

A clean pattern:

library(tidyverse)

# other packages only if needed
# library(readxl)
# library(nlmixr2)

Load packages at the top, so dependencies are obvious.


Strategies

  • Install once, load every session.
  • Use library(tidyverse) as a default for PMx work.
  • Be explicit with package::function() when needed.
  • Load only packages you actually use.
  • Use renv for serious or collaborative projects.
  • Keep package loading at the top of scripts.

Practice Problems

  1. Install the tidyverse (if not already installed).
  2. Load the tidyverse.
  3. Use dplyr::filter() explicitly.
  4. Check which packages are attached.
  5. Initialize renv in a project and explain what renv.lock does.

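# 1. Install once, from the console (not from an analysis script)
install.packages("tidyverse")

# 2. Load at the start of each session
library(tidyverse)

# 3. Explicit namespace; assumes a data frame pk with a DV column
dplyr::filter(pk, DV > 2)

# 4. List the attached packages
search()

# 5. Set up renv, then record exact package versions in renv.lock
renv::init()
renv::snapshot()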

Summary

You now understand:

  • what packages are and why they matter
  • the difference between installing and loading
  • what the tidyverse includes
  • how to avoid namespace conflicts
  • why package versions affect reproducibility
  • how renv helps lock versions for PMx projects

The tidyverse will be the backbone of everything you do next.
renv helps keep your work reproducible.


  • Don’t install packages inside scripts.
  • Load packages at the top.
  • Prefer tidyverse tools for clarity.
  • Be explicit when conflicts arise.
  • Use renv for any serious PMx project.