Fails to install when there is a parent library path and do not recognize parent libsPath #732

Open
latot opened this issue Jan 7, 2025 · 4 comments
Labels
feature a feature request or enhancement

Comments

@latot

latot commented Jan 7, 2025

Hi, I'm using Ubuntu Noble, and I found that pak fails to install packages when there are packages in /usr/local/lib/R/site-library.

Example:

As root:

sudo bash
R
pak::pkg_install("dplyr")

Now as a normal user:

R
pak::pkg_install("dplyr?source")
                                                                           
→ Will install 1 package.
→ The package (1.21 MB) is cached.
+ dplyr   1.1.4 [bld][cmp]
  
ℹ No downloads are needed, 1 pkg (1.21 MB) is cached
ℹ Building dplyr 1.1.4
✔ Built dplyr 1.1.4 (13s)                                       
Error:                                                            
! error in pak subprocess
Caused by error in `verify_extracted_package(filename, pkg_cache)`:
! /tmp/RtmpNBsw5r/file5b8627a64e601/dplyr_1.1.4_R_x86_64-pc-linux-gnu.tar.gz is not a valid R package, it is an empty archive.
Type .Last.error to see the more details.
> .Last.error
<callr_error/rlib_error_3_0/rlib_error/error>
Error: 
! error in pak subprocess
Caused by error in `verify_extracted_package(filename, pkg_cache)`:
! /tmp/RtmpNBsw5r/file5b8627a64e601/dplyr_1.1.4_R_x86_64-pc-linux-gnu.tar.gz is not a valid R package, it is an empty archive.
---
Backtrace:
1. pak::pkg_install("dplyr?source")
2. pak:::remote(function(...) get("pkg_install_do_plan", asNamespace("pak"))(...), …
3. err$throw(res$error)
---
Subprocess backtrace:
 1. base::withCallingHandlers(cli_message = function(msg) { …
 2. get("pkg_install_do_plan", asNamespace("pak"))(...)
 3. proposal$install()
 4. pkgdepends::install_package_plan(plan, lib = private$library, num_workers = nw, …
 5. base::withCallingHandlers({ …
 6. pkgdepends:::handle_events(state, events)
 7. pkgdepends:::handle_event(state, i)
 8. proc$get_result()
 9. processx:::process_get_result(self, private)
10. private$post_process()
11. pkgdepends:::install_extracted_binary(filename, lib_cache, pkg_cache, lib, …
12. pkgdepends:::verify_extracted_package(filename, pkg_cache)
13. base::throw(pkg_error("{.path {filename}} is not a valid R package, it is an empty archive.", …
14. | base::signalCondition(cond)
15. global (function (e) …

This is actually only half of the problem: pak should check permissions and, if it does not have write access, offer to use a local directory instead.
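A minimal sketch of the kind of permission check described above, using only base R. The helper name `first_writable_lib` is hypothetical and is not part of pak; it only illustrates how a writable library could be picked from `.libPaths()` before installing.

```r
# Hypothetical helper, not part of pak: return the first writable library
# in the given search path, or NULL if none of them is writable.
first_writable_lib <- function(libs = .libPaths()) {
  # file.access(mode = 2) returns 0 for paths the current user can write to
  writable <- file.access(libs, mode = 2) == 0
  if (any(writable)) libs[writable][1] else NULL
}

lib <- first_writable_lib()
if (is.null(lib)) {
  message("No writable library found; a personal library would be needed.")
} else {
  message("Would install into: ", lib)
}
```

The result could then be passed explicitly as the `lib` argument of `pak::pkg_install()`, instead of relying on the first entry of `.libPaths()`.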

The second half shows up once we have a new library path. We can create a local one just by trying to install dplyr normally:

install.packages("dplyr")
Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
Warning in install.packages("dplyr") :
  'lib = "/usr/local/lib/R/site-library"' is not writable
Would you like to use a personal library instead? (yes/No/cancel) yes
Would you like to create a personal library
‘/home/cit_16/R/x86_64-pc-linux-gnu-library/4.4’
to install packages into? (yes/No/cancel) yes
## CANCEL NOW

Now that the local folder exists, we can see the second issue: when there are multiple paths in .libPaths(), pak ignores all of the others. Installing a package as root stores it in /usr/local/lib/R/site-library, which also holds other packages, but if we try to install again:

pak::pkg_install("dplyr")
                                                                          
→ Will install 16 packages.
→ All 16 packages (6.16 MB) are cached.
+ cli          3.6.3  [bld][cmp]
+ dplyr        1.1.4  [bld][cmp]
+ fansi        1.0.6  [bld][cmp]
+ generics     0.1.3  [bld]
+ glue         1.8.0  [bld][cmp]
+ lifecycle    1.0.4  [bld]
+ magrittr     2.0.3  [bld][cmp]
+ pillar       1.10.1 [bld]
+ pkgconfig    2.0.3  [bld]
+ R6           2.5.1  [bld]
+ rlang        1.1.4  [bld][cmp]
+ tibble       3.2.1  [bld][cmp]
+ tidyselect   1.2.1  [bld][cmp]
+ utf8         1.2.4  [bld][cmp]
+ vctrs        0.6.5  [bld][cmp]
+ withr        3.0.2  [bld]
  
ℹ No downloads are needed, 16 pkgs (6.16 MB) are cached
ℹ Building cli 3.6.3
ℹ Building fansi 1.0.6
ℹ Building generics 0.1.3
ℹ Building glue 1.8.0
ℹ Building magrittr 2.0.3
ℹ Building pkgconfig 2.0.3
ℹ Building R6 2.5.1
ℹ Building rlang 1.1.4
^Cstalling...

pak reinstalls everything, even the packages that are already provided by a parent library path; it ignores all the other installed libraries. The library paths are:

[1] "/home/cit_16/R/x86_64-pc-linux-gnu-library/4.4"
[2] "/usr/local/lib/R/site-library"                 
[3] "/usr/lib/R/site-library"                       
[4] "/usr/lib/R/library"

Tested with pak installed from git.

sessionInfo()
R version 4.4.2 (2024-10-31)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.1 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0

locale:
 [1] LC_CTYPE=es_ES.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=es_ES.UTF-8        LC_COLLATE=es_ES.UTF-8    
 [5] LC_MONETARY=es_ES.UTF-8    LC_MESSAGES=es_ES.UTF-8   
 [7] LC_PAPER=es_ES.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C       

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] processx_3.8.4 compiler_4.4.2 R6_2.5.1       cli_3.6.3      tools_4.4.2   
[6] callr_3.7.6    ps_1.8.1       pak_0.8.0 

Thx!

@gaborcsardi
Member

Indeed, pak always installs everything into a single library, that is by design. AFAIR there is an issue to consider packages in other libraries as well.

@gaborcsardi gaborcsardi added the feature a feature request or enhancement label May 9, 2025
@latot
Author

latot commented May 9, 2025

The reason we split this is that some libraries are used by a lot of people, not all of them technical, and some libraries are not always easy to install. So we install them in a shared place, where we can upgrade them once and keep them available to everyone.

@gaborcsardi
Member

I don't question your reasons. OTOH I am not super eager to support this use case because it is pretty error prone and makes it hard to follow what is installed where. E.g. pak::pkg_install(upgrade = TRUE) installs the latest versions. If it updates a package where should the update go? I guess it should install a new version into the "installation library", even if there is an older version in "another library." But then if "another library" comes first in the library path, then the one that pak just installed will never be used.
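The ordering concern above can be illustrated with a small base R sketch. The helper `where_installed` is hypothetical (not part of pak or pkgdepends), and it assumes that a directory containing a `DESCRIPTION` file counts as an installed package.

```r
# Hypothetical helper: list every library in the search path that contains
# `pkg`, in search order. R loads the package from the first match.
where_installed <- function(pkg, libs = .libPaths()) {
  found <- file.exists(file.path(libs, pkg, "DESCRIPTION"))
  libs[found]
}

hits <- where_installed("stats")
print(hits)
# The copy in hits[1] shadows any later ones, even if a later one is newer.
```

If a package shows up in more than one library, only the first hit is used by `library()`, which is why an update installed into a later library would have no visible effect.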

Sharing a package library among users also leads to unusual errors, e.g. if you update a package in the shared library, that probably breaks everybody's active R session. If you are on Windows, then you might not be able to update a package in the shared library if a user is using that package, etc.

If you decide that it still makes sense to create a shared library, that's fine, and we'll support that at some point, but it is not very high priority for me, because I am worried that people don't realize the problems with this setup.

@latot
Author

latot commented May 9, 2025

I was just trying to point out use cases, so don't worry.

I agree with your points, and I use this setup with those considerations in mind. Whoever uses it needs to take care not to break everything; in my case I upgrade the system and the R packages at a specific time, when no one is using them.

At the same time, in institutions most people must use the same package versions, so managing them from a central place is ideal. Only technical people should be able to install packages and mix versions; non-technical users should neither upgrade nor install packages.

With this in mind, maybe the ideal approach is to support this, but throw an error when there is a higher install path, with a parameter to force the installation if we know what we are doing.
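One way to read this proposal, sketched in base R. Everything here is an assumption made for illustration: the function name `check_shadowing`, the `force` argument, and the interpretation of "higher install path" as a library that comes earlier in `.libPaths()` and already contains the package. None of it is pak's API.

```r
# Hypothetical guard, not pak's API: refuse to install `pkg` into `lib` when
# a copy in an earlier library would shadow it, unless force = TRUE.
check_shadowing <- function(pkg, lib, libs = .libPaths(), force = FALSE) {
  pos <- match(lib, libs)
  if (is.na(pos)) stop("'lib' is not on the library path")
  # Libraries searched before `lib`; empty when `lib` is the first entry
  earlier <- libs[seq_len(pos - 1)]
  shadowing <- earlier[file.exists(file.path(earlier, pkg, "DESCRIPTION"))]
  if (length(shadowing) > 0 && !force) {
    stop(sprintf("'%s' in %s would shadow the new install; use force = TRUE",
                 pkg, shadowing[1]))
  }
  # Return the shadowing libraries invisibly so callers can inspect them
  invisible(shadowing)
}
```

With this shape, a plain call fails loudly when the new install would never be loaded, and `force = TRUE` lets an administrator proceed deliberately.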
