Package Ecosystem

The R package ecosystem has three key components: packages, repositories, and libraries.

Packages

Packages are the primary extension mechanism for R and Python. They are used to share functions, datasets, and documentation. A package can exist in one of several states (source, bundle, binary, or installed), each described in the sections below.

Source

A package is composed of a series of directories and files. The source of a package is just a top-level directory containing the components of the package. Package authors work with source packages during development. Git(Hub) repositories store source packages.

Bundle

A bundled package is a source package that has been compressed into a single file. By convention, package bundles in R use the .tar.gz extension. In Python, source distributions also use .tar.gz, while the .whl (wheel) extension denotes a built distribution.
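To make the idea of a bundle concrete, the following minimal Python sketch compresses a source directory into a single .tar.gz file. The package name `mypkg`, its two files, and the helper `bundle_source` are hypothetical and exist only for illustration. R's own tool for this step is `R CMD build`, which does more than plain compression, so treat this as a sketch of the concept rather than a replacement.

```python
import tarfile
import tempfile
from pathlib import Path


def bundle_source(source_dir: Path, out_dir: Path, version: str) -> Path:
    """Compress a source package directory into a single .tar.gz bundle."""
    bundle_path = out_dir / f"{source_dir.name}_{version}.tar.gz"
    with tarfile.open(bundle_path, "w:gz") as archive:
        # Store all files under a top-level directory named after the package.
        archive.add(source_dir, arcname=source_dir.name)
    return bundle_path


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical minimal source package: one metadata file and one code file.
    source = Path(tmp) / "mypkg"
    (source / "R").mkdir(parents=True)
    (source / "DESCRIPTION").write_text("Package: mypkg\nVersion: 0.1.0\n")
    (source / "R" / "hello.R").write_text('hello <- function() "hi"\n')

    bundle = bundle_source(source, Path(tmp), "0.1.0")
    with tarfile.open(bundle) as archive:
        bundle_name = bundle.name
        members = sorted(archive.getnames())
```

The resulting file follows the `name_version.tar.gz` naming pattern, and the source tree is preserved inside the archive under a single top-level directory.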

Binary

A binary package is the result of building a source package for a specific operating system. Binary packages are single files that are ready for installation on their specific operating systems.

Installed

An installed package is a binary package that has been decompressed into a package library and is ready for use by R.

Repositories

Repositories organize packages for distribution to end users. A repository contains package bundles and binaries laid out in a specific structure so that users can install packages from it with R's install.packages function. CRAN and Bioconductor are examples of R repositories.
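Part of that "specific structure" in a CRAN-like repository is a plain-text index file named PACKAGES (for source bundles it lives under src/contrib), with one stanza of `Field: value` lines per package. The Python sketch below parses such an index and derives the bundle file name for one entry. The sample entries `alpha` and `beta` and the helper `parse_packages_index` are made up for illustration, and the parser handles only the simple cases shown here.

```python
from typing import Dict


def parse_packages_index(text: str) -> Dict[str, Dict[str, str]]:
    """Parse a PACKAGES-style index into {package name: {field: value}}."""
    index: Dict[str, Dict[str, str]] = {}
    for stanza in text.strip().split("\n\n"):
        fields: Dict[str, str] = {}
        key = None
        for line in stanza.splitlines():
            if line[:1] in (" ", "\t") and key is not None:
                # Continuation line: append it to the previous field.
                fields[key] += " " + line.strip()
            elif ":" in line:
                key, _, value = line.partition(":")
                fields[key] = value.strip()
        if "Package" in fields:
            index[fields["Package"]] = fields
    return index


# Made-up sample index with two package entries (not real packages).
SAMPLE_INDEX = """\
Package: alpha
Version: 1.2.0
Depends: R (>= 3.5.0)

Package: beta
Version: 0.3.1
Imports: alpha,
        utils
"""

index = parse_packages_index(SAMPLE_INDEX)
# Bundles in the repository are named <Package>_<Version>.tar.gz.
bundle_file = f"{index['beta']['Package']}_{index['beta']['Version']}.tar.gz"
```

An index like this is what lets a client resolve a package name to a concrete file and its dependencies before downloading anything.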

Git(Hub)

Many R package sources are stored in version-controlled directories. A popular version control tool is Git, and GitHub, a hosting service for Git repositories, houses many package sources. The devtools R package includes convenience functions for installing packages directly from source held in a Git repository, including repositories on GitHub. Used in this manner, Git repositories and GitHub are one way to distribute R packages, but they are not R package repositories.

Libraries

End users of R typically interact with installed packages that live in libraries. Package libraries are just directories containing installed packages. When a package is requested by R, R searches the different library directories to find the installed package.
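The lookup described above can be sketched in a few lines of Python. In R, the ordered list of library directories is reported by `.libPaths()`; here the two library directories, the packages `alpha` and `beta`, and the helper `find_installed` are hypothetical stand-ins. The sketch assumes that an installed package is a directory containing a DESCRIPTION file and that the first matching library wins.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def find_installed(package: str, library_paths: List[Path]) -> Optional[Path]:
    """Return the first installed copy of a package found in the libraries."""
    for library in library_paths:
        candidate = library / package
        # Treat a directory with a DESCRIPTION file as an installed package.
        if (candidate / "DESCRIPTION").is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical setup: a user library and a system library.
    user_lib = Path(tmp) / "user-library"
    system_lib = Path(tmp) / "system-library"
    contents = ((user_lib, ["alpha"]), (system_lib, ["alpha", "beta"]))
    for library, packages in contents:
        for name in packages:
            (library / name).mkdir(parents=True)
            (library / name / "DESCRIPTION").write_text(f"Package: {name}\n")

    # Libraries are searched in order, so earlier entries shadow later ones.
    search_order = [user_lib, system_lib]
    alpha_from = find_installed("alpha", search_order).parent.name
    beta_from = find_installed("beta", search_order).parent.name
    missing = find_installed("gamma", search_order)
```

Because `alpha` is installed in both libraries, the copy in the first library on the search path is the one that gets used, which is why the order of library directories matters.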

R libraries are very flexible. R users have traditionally set up either libraries for specific projects or a single system-wide library shared across multiple projects. On multi-tenant servers, it has been common to have both a system library shared by all users and user-specific libraries.

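A best practice is to set up per-project libraries alongside a package cache: each project gets its own isolated set of package versions, while the cache avoids installing the same package version more than once.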