jentre

jentre is a client for the NCBI’s Entrez API.

The Entrez API has many quirks. jentre attempts to deal with those while presenting a convenient interface. It is designed for bulk metadata fetching and link traversal, though also provides full access to other parts of Entrez, albeit with fewer helpers.

Features of jentre:

Provides objects representing sets of Entrez identifiers to avoid mixing them up
Batches requests behind the scenes when needed
Based on {httr2} and {xml2}

The {rentrez} package is more mature and might suit a broader set of applications.

Installation

You can install the development version of jentre like so:

# development version
install.packages('jentre', repos = c('https://cidm-ph.r-universe.dev', 'https://cloud.r-project.org'))

Example

library(jentre)

# Searches by default use the Entrez history server for efficiency:
results <- esearch("Corynebacterium diphtheriae[orgn]", "biosample")
#> ℹ eSearch query "\"Corynebacterium diphtheriae\"[Organism]" has 4124 results
results
#> <entrez@/biosample[1]>
#> [1] MCID_68491.#1[4124]

# The returned object keeps track of which database the UIDs belong to,
# and whether the UIDs are local or on the history server. This makes
# them easier and less error-prone to use:
links <- elink(results, "sra")
links
#> # A tibble: 1 × 3
#>   from            linkname      to         
#>   <list>          <chr>         <list>     
#> 1 <idlst [4,124]> biosample_sra <wbhst [1]>

# You can pull a list of UIDs from the history server:
ids <- as_id_list(links$to[[1]])

# This is a vector with some extra metadata attached, but can be
# subsetted normally:
head(ids, n = 10)
#> <entrez/sra[10]>
#>  [1] 38889263 38768719 38768704 38641044 38501020 38428896 38428225 38427762
#>  [9] 38401944 38401943
as.character(ids[4:8])
#> [1] "38641044" "38501020" "38428896" "38428225" "38427762"

# For endpoints with richer data, you can provide an function to
# extract data you care about from the XML document. The output is
# combined if multiple API requests are needed, so you can end up
# with a single combined data frame, list, or vector:
esummary(
  ids[1:20],
  .process = function(doc) {
    xml2::xml_find_all(doc, "//DocumentSummary/CreateDate") |> xml2::xml_text()
  }
)
#>  [1] "2025/05/29" "2025/05/21" "2025/05/21" "2025/05/14" "2025/05/05"
#>  [6] "2025/04/30" "2025/04/30" "2025/04/30" "2025/04/29" "2025/04/29"
#> [11] "2025/04/29" "2025/04/29" "2025/04/29" "2025/04/29" "2025/04/29"
#> [16] "2025/04/29" "2025/04/29" "2025/04/21" "2025/04/21" "2025/04/21"

# If needed, you can construct an arbitrary request:
req <- entrez_request("esearch.fcgi", db = "nucleotide", term = "biomol+trna[prop]")
# You'll need to execute and parse it yourself:
httr2::req_perform(req) |> httr2::resp_body_xml()
#> {xml_document}
#> <eSearchResult>
#> [1] <Count>1012</Count>
#> [2] <RetMax>20</RetMax>
#> [3] <RetStart>0</RetStart>
#> [4] <IdList>\n  <Id>2737963026</Id>\n  <Id>2586967820</Id>\n  <Id>2274792564< ...
#> [5] <TranslationSet/>
#> [6] <TranslationStack>\n  <TermSet>\n    <Term>biomol+trna[prop]</Term>\n     ...
#> [7] <QueryTranslation>biomol+trna[prop]</QueryTranslation>

Name		Name	Last commit message	Last commit date
Latest commit History 45 Commits
.github		.github
R		R
man		man
tests		tests
.Rbuildignore		.Rbuildignore
.gitignore		.gitignore
.lintr		.lintr
DESCRIPTION		DESCRIPTION
LICENSE		LICENSE
LICENSE.md		LICENSE.md
NAMESPACE		NAMESPACE
README.Rmd		README.Rmd
README.md		README.md
_pkgdown.yml		_pkgdown.yml
jentre.Rproj		jentre.Rproj

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Licenses found

Uh oh!

Repository files navigation

jentre

Installation

Example

About

Licenses found

Uh oh!

Releases

Languages

License

Licenses found

cidm-ph/jentre

Folders and files

Latest commit

History

Repository files navigation

jentre

Installation

Example

About

Topics

Resources

License

Licenses found

Uh oh!

Stars

Watchers

Forks

Releases

Languages