pharmaR/pharmapkgs

Overview

pharmapkgs serves a dual purpose:

  1. Repository for R Packages: A curated repository of R packages that is structured similarly to CRAN but with additional metadata.

  2. R Package for Installation: An R package that allows users to install packages from this repository using specialized filtering options.

Repository Functionality

pharmapkgs provides a repository that users can point to without installing the pharmapkgs R package. This can be achieved by setting the repository to the pharmapkgs repository hosted on GitHub.

Setting the Repository

options(
    repos = "https://raw.githubusercontent.com/pharmaR/pharmapkgs/refs/heads/main/inst/repos",
    pkgType = "source"
)
available_packages <- available.packages()
nrow(available_packages)
#> [1] 7336

Once set, calling available.packages() lists the packages available in the pharmapkgs repository. The repository currently contains around 7,000 packages, a subset of the more than 20,000 packages on CRAN.

Installation Limitations

The pharmapkgs repository differs from a traditional CRAN repository. It does not store package tarballs but instead relies on a download_url field that links to the CRAN mirror on GitHub. Base R’s install.packages() expects package tarballs within the same repository, so it does not work natively with pharmapkgs. Instead, users should use pak to install packages from this repository.

Installation via pak

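# Install pak from CRAN first: install.packages() cannot fetch tarballs from pharmapkgs.
install.packages("pak", repos = "https://cloud.r-project.org")

# Point R at the pharmapkgs repository, then let pak follow download_url.
options(
    repos = "https://raw.githubusercontent.com/pharmaR/pharmapkgs/refs/heads/main/inst/repos",
    pkgType = "source"
)
pak::pkg_install("<package_name>")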

The pak package understands the download_url field and fetches the package source accordingly.
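To see what such an extra field looks like from R, the sketch below builds a throwaway local repository whose PACKAGES file carries a download_url column and reads it back with available.packages(). The package names, versions, and example.org URLs are invented for illustration, and the real pharmapkgs index may lay out its fields differently.

```r
# Hypothetical two-package index; names, versions, and URLs are invented.
repo_dir <- file.path(tempdir(), "demo_repo")
contrib_dir <- file.path(repo_dir, "src", "contrib")
dir.create(contrib_dir, recursive = TRUE, showWarnings = FALSE)

index <- data.frame(
    Package = c("pkgA", "pkgB"),
    Version = c("1.0.0", "0.2.1"),
    download_url = c(
        "https://example.org/pkgA/1.0.0.tar.gz",
        "https://example.org/pkgB/0.2.1.tar.gz"
    ),
    stringsAsFactors = FALSE
)
write.dcf(index, file.path(contrib_dir, "PACKAGES"))

# Build a file:// URL that works on Unix-like systems and on Windows.
repo_path <- normalizePath(repo_dir, winslash = "/")
repo_url <- paste0(
    if (startsWith(repo_path, "/")) "file://" else "file:///",
    repo_path
)

# Extra PACKAGES fields are only returned when requested via `fields`.
demo_packages <- available.packages(
    contriburl = contrib.url(repo_url, type = "source"),
    fields = "download_url"
)
demo_packages[, c("Version", "download_url")]
```

No tarball exists anywhere in this demo repository, which is why install.packages() would have nothing to download from it; a client has to read download_url instead.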

Filtering Packages

pharmapkgs enhances the standard PACKAGES file by adding additional metadata. Users can define custom filters for selecting packages.

Example: Filtering by Dependencies

my_filters <- pharmapkgs::risk_filter(
    dependencies > 0.6
)
#> Pharmapkgs log level:DEBUG
#> Pharmapkgs: config values loaded
#> {
#>   "excluded_riskmetric_assessments": ["assess_covr_coverage", "assess_r_cmd_check"],
#>   "local_packages": "/tmp/RtmpdQIMZX/temp_libpath1354e8a6171/pharmapkgs/repos/src/contrib/PACKAGES",
#>   "project_path": "/workspaces/pharmapkgs",
#>   "local_base": "/tmp/RtmpdQIMZX/temp_libpath1354e8a6171/pharmapkgs/repos",
#>   "remote_base": "https://cloud.r-project.org/",
#>   "limit": 5
#> }
filtered_packages <- available.packages(filters = my_filters, fields = "dependencies")
nrow(filtered_packages)
#> [1] 3070

This call keeps only the packages whose dependencies field is greater than 0.6.

Note that filtered_packages has fewer rows than available_packages (3,070 versus 7,336).
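available.packages() also accepts plain functions as filters: each one receives the package matrix and returns the rows to keep. The sketch below is a hand-written stand-in for a threshold filter, applied to a made-up matrix. It is not how pharmapkgs::risk_filter() is implemented, and min_dependencies_filter() is a hypothetical helper introduced only for this example.

```r
# Hypothetical helper: keep rows whose `dependencies` value exceeds a threshold.
min_dependencies_filter <- function(threshold) {
    function(db) {
        score <- suppressWarnings(as.numeric(db[, "dependencies"]))
        # Rows with a missing or non-numeric value are dropped.
        db[!is.na(score) & score > threshold, , drop = FALSE]
    }
}

# Made-up package matrix in the shape returned by available.packages().
db <- cbind(
    Package = c("pkgA", "pkgB", "pkgC"),
    Version = c("1.0.0", "0.2.1", "2.3.0"),
    dependencies = c("0.9", "0.3", NA)
)
rownames(db) <- db[, "Package"]

keep_high <- min_dependencies_filter(0.6)
rownames(keep_high(db))
```

A function like keep_high could then be passed as available.packages(filters = list(keep_high), fields = "dependencies"); the fields argument is needed so that the column exists when the filter runs.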

Handling Dependencies

Since the pharmapkgs repository is incomplete, some dependencies may not be available. To mitigate this, users can specify multiple repositories so that missing packages are fetched from CRAN.

options(
    repos = c(
        Pharma = "https://raw.githubusercontent.com/pharmaR/pharmapkgs/refs/heads/main/inst/repos",
        CRAN = "https://cran.r-project.org"
    )
)

This setup allows pak to first check pharmapkgs, then fall back to CRAN if a package is missing.
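As a quick check that both repositories are registered in the intended order, the named vector can be inspected after setting it. This sketch only reads R's options and does not contact either repository; it restores the previous setting at the end.

```r
old_options <- options(repos = c(
    Pharma = "https://raw.githubusercontent.com/pharmaR/pharmapkgs/refs/heads/main/inst/repos",
    CRAN = "https://cran.r-project.org"
))

# The order of the names is the order in which the repositories are listed.
configured_repos <- getOption("repos")
names(configured_repos)

# contrib.url() shows where each repository's source PACKAGES index is expected.
index_locations <- contrib.url(configured_repos, type = "source")
index_locations

options(old_options)  # restore the previous repos setting
```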

Technical Details

Repository Structure

  • The repository contains a PACKAGES file with metadata.
  • Instead of hosting package tarballs, the download_url field points to the GitHub CRAN mirror.

Why Not Use CRAN Directly?

CRAN moves a package's tarball into its Archive directory when a newer version is published, so direct links to a specific version break over time. The GitHub CRAN mirror is more stable because each package version is stored as a tagged release.
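The difference is easier to see with the URL patterns written out. The helper names below are hypothetical, and the GitHub pattern is the standard tag-archive URL for a repository under github.com/cran; it is an assumption here, not taken from the download_url values in the pharmapkgs index.

```r
# CRAN URL while the version is the current one.
cran_current_url <- function(pkg, version) {
    sprintf("https://cran.r-project.org/src/contrib/%s_%s.tar.gz", pkg, version)
}

# CRAN URL once the version has been superseded and archived.
cran_archive_url <- function(pkg, version) {
    sprintf(
        "https://cran.r-project.org/src/contrib/Archive/%s/%s_%s.tar.gz",
        pkg, pkg, version
    )
}

# Assumed GitHub mirror URL: one tag per version, so the address does not move.
github_mirror_url <- function(pkg, version) {
    sprintf("https://github.com/cran/%s/archive/refs/tags/%s.tar.gz", pkg, version)
}

# "examplepkg" is a placeholder package name.
cran_current_url("examplepkg", "1.2.3")
cran_archive_url("examplepkg", "1.2.3")
github_mirror_url("examplepkg", "1.2.3")
```

The same version has two possible CRAN addresses depending on whether it is still current, while the tag-based address stays the same.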

GitHub Actions

A GitHub Actions pipeline updates the repository and generates reports. However, report generation is secondary to package validation and is prone to failures. Incremental site builds in Quarto are not yet supported, requiring full site rebuilds.

Known Issues and Future Work

  • Package installation only works with pak, not install.packages().
  • The repository is incomplete and being updated slowly.
  • The GitHub Actions pipeline occasionally fails due to report generation issues.
  • Future work includes optimizing package validation and improving repository completeness.

Development

This repository is configured for use with a devcontainer. You can still open it normally in RStudio or any other IDE, but the devcontainer provides a consistent development environment for all contributors.

To get started with the devcontainer, open the repository in Visual Studio Code and click on the “Reopen in Container” button that appears in the bottom right corner of the window.

Once the container has finished building, it is ready to use. If you prefer RStudio, start RStudio Server by running rserver in the terminal, then open http://localhost:8787 in a web browser.

The devcontainer also makes it easy to open the project in GitHub Codespaces.

About

Work in progress - demo of integrations with a pharma repository

License

MIT (LICENSE.md). The repository also contains a LICENSE file whose license type was not detected.
