Curated CRAN Sources#

Enhanced Advanced

This section provides an overview of what curated-cran sources are, why they are useful, and how to use them. If you are already familiar with curated-cran sources, see the Quick Start section on creating your first curated subset of CRAN.

Overview#

Curated CRAN sources are based on a current mirror of CRAN. It can be useful to only include certain CRAN packages and versions within a source. This is especially helpful in secure, regulated environments, where only verified sets of packages are allowed.

Creating a Curated CRAN Source#

Terminal
$ rspm create source --name=subset --type=curated-cran
<< Source 'subset':
<<  Type:  Curated CRAN - 2024-04-17

Curated CRAN sources don't need to be pinned to a specific snapshot date at the time of creation; any date can be picked when adding packages with rspm update (described below). Once the source has been created, be sure to subscribe a repository to the source to make the packages available to users:

Terminal
# Create a repository:
$ rspm create repo --name=cran --type=r --description='Access Curated CRAN packages'
<< Repository: cran - Access Curated CRAN packages - R

# Subscribe a repository to the curated-cran source:
$ rspm subscribe --repo=cran --source=subset
<< Repository: cran
<< Sources:
<< --subset (Curated CRAN - 2024-04-17)

Including Packages in a Curated CRAN Source#

Packages are included in a curated-cran source by uploading a requirements.txt definition with rspm update. This section explains how a requirements file is defined and also discusses how to use a requirements file to include packages in a curated-cran source.

Requirements Files#

Package Manager reads each line of a requirements.txt file in the following format:

Curated CRAN Requirements Format
[package name] [optionally: version constraints] [[optionally: extras]]

The extras block allows you to specify which types of related packages should be included along with the package itself.

The extras block is additive to the --include flag that the source was created with. See the section on source-level included packages to learn more.

The following values are valid for the extras block:

  • depends
  • imports
  • linking-to
  • suggests
  • all

Suggested dependency inclusions

Suggested dependencies, when included, are included only at a single level, not recursively; all other types are included recursively.

As an example, a requirements.txt file could look like:

requirements.txt
A3 >= 0.9.2 [all]
ggplot2 == 3.4.4
plumber [suggests]

-r requirements2.txt

This includes the following in the source:

  • A3 versions greater than or equal to 0.9.2, along with all related packages
  • Only ggplot2 version 3.4.4
  • All available versions of plumber, along with its suggested packages
  • All packages from requirements2.txt
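As a rough illustration of the line format above, the following Python sketch splits a requirements line into its package name, version constraints, and extras. It is not Package Manager's parser: the regular expression and the `parse_line` helper are assumptions made for this example, and the sketch handles only the elements this section describes (names, constraints, extras, and `-r` file references).

```python
import re

# Assumed grammar for one line: name, optional constraints, optional [extras].
LINE_RE = re.compile(
    r"^(?P<name>[A-Za-z][A-Za-z0-9.]*)"      # R package names: letters, digits, dots
    r"\s*(?P<constraints>[^\[\]]*?)"          # e.g. ">= 0.9.2" or ">= 1.1.0, < 1.3.0"
    r"\s*(?:\[(?P<extras>[^\]]*)\])?\s*$"     # e.g. "[all]" or "[depends, imports]"
)


def parse_line(line):
    """Describe one requirements.txt line as a dict; return None for blank lines."""
    line = line.strip()
    if not line:
        return None
    if line.startswith("-r "):
        # Recursive reference to another requirements file.
        return {"include_file": line[3:].strip()}
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    constraints = [c.strip() for c in match.group("constraints").split(",") if c.strip()]
    extras = [e.strip() for e in (match.group("extras") or "").split(",") if e.strip()]
    return {"name": match.group("name"), "constraints": constraints, "extras": extras}


for example in ["A3 >= 0.9.2 [all]", "ggplot2 == 3.4.4", "plumber [suggests]", "-r requirements2.txt"]:
    print(parse_line(example))
```

A package with no constraints yields an empty `constraints` list, which corresponds to "all available versions" in the example above.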

Filtering by version is not supported for:

  • Curated CRAN sources created prior to Package Manager version 2024.04
  • Sources created using the --strict flag

By default, the depends, imports, and linking-to dependencies for each package are included. The default types of related packages to include can be configured per source.

See the section on source-level included packages to learn more.

As shown in the example above, a package doesn't need to have any version constraints defined. It can also have as many version constraints as needed. The versions made available to Package Manager depend on what is available at the snapshot date specified when updating the source.

All version parsing and matching criteria are based on PEP-440. Refer to the PEP-440 documentation for information on version formatting and constraints. For more information on the Requirements File Format, refer to pip's documentation.

requirements.txt limitations

Not everything defined in the Requirements File Format specification is supported in Package Manager.

The curated-cran source only parses package names, version ranges, and recursive file references. Any other definitions (e.g., option flags, environment markers) within an uploaded requirements.txt file are ignored.

The requirements.txt file also supports declaring multiple references of the same package with different version constraints:

requirements.txt
tidyr == 1.3.0
...
tidyr == 1.3.1

This is treated as an OR operator: the curated-cran source evaluates the defined version constraints as "tidyr == 1.3.0 or tidyr == 1.3.1".

In this example, Package Manager pulls in version 1.3.0 and version 1.3.1.

Excluding a package using multiple references

Use caution when referencing a package multiple times when using a != constraint. As an example:

requirements.txt
tidyr >= 1.1.0, < 1.3.0
...
tidyr != 1.2.1

This still includes version 1.2.1 because the two lines are evaluated as "tidyr >= 1.1.0, < 1.3.0 or tidyr != 1.2.1".

To guarantee that version 1.2.1 is excluded, include all version constraints on a single line so Package Manager evaluates all constraints together:

requirements.txt
tidyr >= 1.1.0, < 1.3.0, != 1.2.1
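The difference between the two forms can be checked with a small Python sketch of the evaluation rule described here: constraints on one line must all hold (AND), while separate lines for the same package are alternatives (OR). This is a simplified model, not Package Manager's implementation. The helper names are hypothetical, and versions are compared as plain integer tuples rather than with full PEP-440 rules.

```python
import re

# Comparison operators handled by this sketch.
OPS = {
    "==": lambda v, b: v == b,
    "!=": lambda v, b: v != b,
    ">=": lambda v, b: v >= b,
    "<=": lambda v, b: v <= b,
    ">": lambda v, b: v > b,
    "<": lambda v, b: v < b,
}


def parse_version(text):
    """Split a version such as '1.2.1' or '0.1-3' into a tuple of integers."""
    return tuple(int(part) for part in re.split(r"[.-]", text.strip()))


def line_matches(version, constraint_line):
    """Every comma-separated clause on a single line must hold (AND)."""
    parsed = parse_version(version)
    for clause in constraint_line.split(","):
        op, bound = re.match(r"\s*(==|!=|>=|<=|>|<)\s*(\S+)\s*$", clause).groups()
        if not OPS[op](parsed, parse_version(bound)):
            return False
    return True


def included(version, constraint_lines):
    """A version is included when any line for the package matches (OR)."""
    return any(line_matches(version, line) for line in constraint_lines)


split_lines = [">= 1.1.0, < 1.3.0", "!= 1.2.1"]   # constraints on two lines
single_line = [">= 1.1.0, < 1.3.0, != 1.2.1"]     # constraints on one line

print(included("1.2.1", split_lines))   # True: the first line alone admits 1.2.1
print(included("1.2.1", single_line))   # False: the != clause is evaluated with the range
print(included("2.0.0", split_lines))   # True: "!= 1.2.1" alone admits any other version
```

Under this reading, the two-line form admits every version of the package, since any version other than 1.2.1 satisfies the second line on its own.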

Filtering out package versions can break the R package graph, so it must be done with care.

Updating a Curated CRAN Source#

To make packages available in a Curated CRAN source, all that is necessary is to run rspm update with a requirements file for a specific CRAN snapshot date. Package Manager allows running a dry-run before committing the changes to the source:

Terminal
# Do a dry run to preview the changes to the source before committing them
$ rspm update --source=subset --file-in=/path/to/requirements.txt --snapshot=2024-04-17

Updating to the latest snapshot

To use the most recent snapshot available, omit the --snapshot flag from the dry-run command.

A preview of the changes is presented:

Output
rspm update --source=subset --file-in=requirements.txt --snapshot=2024-04-17

Packages from 'requirements.txt' to update source 'subset' at CRAN snapshot date '2024-04-17':

Name         Version              Action
A3           1.0.0                add
arrow        15.0.1               add
assertthat   0.2.1                add
base64enc    0.1-3                add
<truncated>

If the output above looks correct, execute this command again with the --commit and --snapshot=2024-04-17 flags to update the source with the new set of packages.

To commit the changes, repeat the command, adding the --commit flag:

Terminal
# Now commit the changes to the source:
$ rspm update --source=subset --file-in=/path/to/requirements.txt --snapshot=2024-04-17 --commit

The finalized contents of the source are then printed:

Output
rspm update --source=subset --file-in=requirements.txt --snapshot=2024-04-17 --commit

Successfully updated source 'subset' at CRAN snapshot date '2024-04-17' with the following packages from 'requirements.txt':

Name         Version              Action
A3           1.0.0                add
arrow        15.0.1               add
assertthat   0.2.1                add
base64enc    0.1-3                add
<truncated>

rspm update overwrite behavior

Running rspm update on a Curated CRAN source overwrites the source with only the packages defined in your requirements.txt file. However, previous snapshots of the source are still available with a pinned repo URL.
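The dry-run and commit steps can be scripted so that the commit always reuses the snapshot date that was previewed. The Python sketch below is one possible wrapper, not part of Package Manager. The function names and the `confirm` and `run` parameters are hypothetical, and the only rspm flags used are those shown in this section.

```python
import subprocess


def build_update_cmd(source, requirements, snapshot=None, commit=False):
    """Build the rspm update argument list using the flags shown in this section."""
    cmd = ["rspm", "update", f"--source={source}", f"--file-in={requirements}"]
    if snapshot:
        cmd.append(f"--snapshot={snapshot}")
    if commit:
        cmd.append("--commit")
    return cmd


def preview_then_commit(source, requirements, snapshot, confirm=input, run=subprocess.run):
    """Run a dry run, ask for confirmation, then repeat the command with --commit."""
    # Dry run: without --commit, the proposed changes are only printed.
    run(build_update_cmd(source, requirements, snapshot), check=True)
    answer = confirm("Commit these changes to the source? [y/N] ")
    if answer.strip().lower() != "y":
        return False
    # Same source, file, and snapshot as the preview, plus --commit.
    run(build_update_cmd(source, requirements, snapshot, commit=True), check=True)
    return True


# Example call (requires rspm on the PATH):
# preview_then_commit("subset", "/path/to/requirements.txt", "2024-04-17")
```

Because the update overwrites the source with the contents of the requirements file, reviewing the dry-run output before confirming is the main safeguard in this workflow.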

To update the source to a different snapshot date, use the update command again:

Terminal
# Update packages in a curated-cran source:
$ rspm update --source=subset --file-in=/path/to/requirements.txt --snapshot=2024-04-18 --commit

Curated CRAN sources can be pinned to any date for which Posit has a CRAN snapshot (typically, once per weekday). Curated CRAN sources also support using any date, regardless of the previously used snapshot dates. If the source was initially set to 2021-02-03, it can then be set to a later date with --snapshot=2022-06-01. If later you would like to pin it back to the original date used, that can be done by running rspm update again with --snapshot=2021-02-03.

Curated CRAN snapshots

This allows you to set the Curated CRAN source to any date where a CRAN snapshot has been taken on our servers. To pin to a version of a package that doesn't exist on CRAN anymore, pin to a date when the version of the package existed.

The snapshot date for non-strict sources can be moved both forwards and backwards in time.

Source-Level Included Packages#

A Curated CRAN source will automatically include the related depends, imports, and linking-to packages for each requested package.

You can change the types of related packages that are included when creating a source. This can be done by passing in the --include flag. For example:

Terminal
# Equivalent to the default setting
$ rspm create source --name=subset --type=curated-cran --include=depends,imports,linking-to

# Include the defaults plus `suggests`
$ rspm create source --name=subset --type=curated-cran --include=depends,imports,linking-to,suggests

# Don't include any related packages
$ rspm create source --name=subset --type=curated-cran --include=none

# Include all related packages:
$ rspm create source --name=subset --type=curated-cran --include=all

Editing --include

The value of --include cannot be changed after source creation.

The following values are valid for the --include flag:

  • depends
  • imports
  • linking-to
  • suggests
  • none
  • all

Suggested dependency inclusions

Suggested dependencies, when included, are included only at a single level, not recursively; all other types are included recursively.

Included related packages can also be configured per-package by adding an extras block to your packages in the requirements.txt file. The extras block is additive to the --include flag that the source was created with. For example:

requirements.txt
A3 >= 0.9.2 [all]
ggplot2 == 3.4.4
plumber [depends, imports, linking-to, suggests]

This includes:

  • A3 with versions greater than or equal to 0.9.2 plus all related packages
  • ggplot2 version 3.4.4 using the default inclusions
  • All versions of plumber plus depends, imports, linking-to, and suggests packages
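One way to picture the additive relationship between the source-level --include value and a per-package extras block is as a set union. The Python sketch below models it that way. It is an assumption about the semantics rather than Package Manager's actual logic, and `expand` and `effective_includes` are hypothetical helper names.

```python
RELATION_TYPES = ("depends", "imports", "linking-to", "suggests")


def expand(values):
    """Expand 'all' and 'none' into an explicit set of relation types."""
    selected = set()
    for value in values:
        if value == "all":
            selected.update(RELATION_TYPES)
        elif value != "none":
            selected.add(value)
    return selected


def effective_includes(source_include, extras):
    """Union of the source-level --include value and one package's extras block."""
    combined = expand(source_include) | expand(extras)
    # Return the types in a stable, documented order.
    return [t for t in RELATION_TYPES if t in combined]


default_include = ["depends", "imports", "linking-to"]
print(effective_includes(default_include, ["suggests"]))  # all four relation types
print(effective_includes(["none"], ["suggests"]))         # ['suggests']
print(effective_includes(default_include, []))            # the defaults only
```

In this model an extras block can only widen what the source includes. It cannot remove a relation type that the source was created with.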

Strict Curated CRAN#

Older Curated CRAN sources (those created before Package Manager version 2024.04) use a less permissive package version snapshot behavior. You can still create Curated CRAN sources using the previous behavior by using the --strict option when creating the source:

Terminal
$ rspm create source --name=subset --type=curated-cran --strict
<< Source 'subset':
<<  Type:  Curated CRAN - 2024-04-18 - Strict

Editing --strict

The value of --strict cannot be changed after source creation.

Strict vs Default Sources#

When a source uses the new default archive policy, updates to the source incorporate all versions of the affected package to match the associated CRAN snapshot. The package versions associated with a curated snapshot will match the versions available for the associated CRAN snapshot, including current and archived packages.

When a source uses the --strict archive policy, any updates to the source incorporate only the current versions of the affected packages. Any previous package versions associated with the source at the previous transaction are recorded as archived packages.

Strict sources are useful for those who need more control over available versions without the ability to add version constraints. Non-strict sources are significantly more flexible.

rspm add vs rspm update#

The rspm add command is still supported for adding packages to Curated CRAN sources; however, rspm update is recommended instead. rspm add may be deprecated in a future release of Package Manager.

rspm add can be used to add individual packages to a Curated CRAN source:

Terminal
# Specify the top-level packages you want to add:
$ rspm add --packages=ggplot2,shiny --source=subset

The output lists all the packages that will be added. The proposal can be saved to a CSV file using the --csv-out flag. The required dependencies for the named packages are automatically discovered and included. Optionally, use the --include-suggests flag to also discover and add suggested packages.

Output
Packages to update source 'subset' at CRAN snapshot date '2024-04-22':

Name         Version              Action
base64enc    0.1-3                add
bslib        0.7.0                add
cachem       1.0.8                add
cli          3.6.2                add
<truncated>

If the output above looks correct, execute this command again with the --commit and --snapshot=2024-04-22 flags to update the source with the new set of packages.

To commit the changes, repeat the command, adding the --commit flag.

Terminal
# Commit the top-level packages you want to add:
$ rspm add --packages=ggplot2,shiny --source=subset --snapshot=2024-04-22 --commit

rspm add also supports adding a large number of packages that are specified in a file. To do this, create a file containing one package name per line. For example, /tmp/packages.csv:

/tmp/packages.csv
plumber
shiny
ISLR

Then use the add command, this time using the --file-in flag:

Terminal
$ rspm add --file-in='/tmp/packages.csv' --source=subset
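Since a requirements.txt line may consist of a package name alone, a package list like the one above can be converted into a requirements file for rspm update. The following Python sketch shows one way to do that. The `to_requirements` helper is hypothetical, and treating a `[suggests]` extras block as a stand-in for --include-suggests is an assumption, not a documented equivalence.

```python
def to_requirements(package_names, extras=None):
    """Turn package names into requirements.txt lines, optionally with an extras block."""
    suffix = f" [{', '.join(extras)}]" if extras else ""
    lines = []
    for name in package_names:
        name = name.strip()
        if name:  # skip blank lines from the input file
            lines.append(f"{name}{suffix}")
    return "".join(line + "\n" for line in lines)


# Package names as they appear in the example file, one per line.
packages = ["plumber", "shiny", "ISLR"]
print(to_requirements(packages), end="")
print(to_requirements(packages, extras=["suggests"]), end="")
```

Keep in mind that rspm update overwrites the source with only the packages defined in the requirements file, so the generated file needs to list every package the source should contain, not only the new ones.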