@blaked8619 and I are using Jobflow-Remote, which SSHs into a remote machine to run calculations. The base Anaconda environment on the remote machine is clean and has neither Jobflow nor Jobflow-Remote installed; however, the cms environment on the remote machine does have all the necessary packages.
When I run jf project check --errors from the local machine where the daemon runs, it reports back:
Worker stellar_vasp warning
Note: inconsistencies may be due to the proper python environment not being correctly loaded.
Missing packages: jobflow - 0.2.0, jobflow-remote - 0.1.7.
This warning is resolved once Jobflow and Jobflow-Remote are installed in the base environment on the remote machine. However, this caused a fair bit of confusion for us because the conda environment (cms) we specify in the pre_run had the right packages installed the whole time.
I'm not sure of the best way to proceed. Either:
- If the base environment accessible upon SSH needs Jobflow and Jobflow-Remote, this should be specified.
- If the above is not the case, then the package check should not necessarily be done in the base environment.
Should we be specifying our Anaconda environment somewhere else in the worker definition?
Here is the relevant section of the YAML:
stellar_vasp:
  type: remote
  host: stellar.princeton.edu
  user: asrosen
  scheduler_type: slurm
  work_dir: /scratch/gpfs/ROSENGROUP/asrosen/test
  max_jobs: 1
  pre_run: |
    source ~/.bashrc
    module load anaconda3/2024.10
    conda activate cms
    module load vasp/6.5.1
    export QUACC_VASP_PARALLEL_CMD="srun -N 1 --ntasks-per-node 96"
    export QUACC_WORKFLOW_ENGINE=jobflow
    export QUACC_CREATE_UNIQUE_DIR=False
  timeout_execute: 60
  resources:
    account: rosengroup
    time: 04:00:00
    nodes: 1
    ntasks_per_node: 96
    cpus_per_task: 1
    mem: 900G
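For anyone trying to reproduce this, here is a rough diagnostic sketch in Python for comparing what a bare SSH command sees against what it sees after the environment-setup lines from pre_run. It is not how Jobflow-Remote performs its check; the helper names `build_version_check` and `build_ssh_argv` are hypothetical, and it assumes `python` is on the PATH of the remote shell in both cases.

```python
import shlex

# Environment-setup lines copied from the worker's pre_run above
# (only the ones that select the Python environment).
PRE_RUN = """\
source ~/.bashrc
module load anaconda3/2024.10
conda activate cms
"""


def build_version_check(packages, pre_run=""):
    """Build a shell command that prints 'name==version' for each package.

    Hypothetical helper. If a package is missing, the remote Python raises
    importlib.metadata.PackageNotFoundError, which is itself the diagnosis.
    """
    snippet = (
        "import importlib.metadata as md; "
        f"[print(p + '==' + md.version(p)) for p in {list(packages)!r}]"
    )
    check = f"python -c {shlex.quote(snippet)}"
    # Drop blank lines and chain the setup steps so a failing step stops the check.
    setup = [line.strip() for line in pre_run.splitlines() if line.strip()]
    return " && ".join([*setup, check])


def build_ssh_argv(host, user, remote_command):
    """Return an argv list suitable for subprocess.run (hypothetical helper)."""
    return ["ssh", f"{user}@{host}", remote_command]


if __name__ == "__main__":
    pkgs = ["jobflow", "jobflow-remote"]
    # Without pre_run: whatever environment a plain SSH command lands in.
    print(build_ssh_argv("stellar.princeton.edu", "asrosen", build_version_check(pkgs)))
    # With pre_run: the cms environment should be active before the check.
    print(build_ssh_argv("stellar.princeton.edu", "asrosen", build_version_check(pkgs, PRE_RUN)))
```

Passing each printed list to `subprocess.run` (or pasting the last element after `ssh user@host`) and comparing the two outputs should show whether the missing-package warning comes from the base environment rather than from cms.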