
Can't call loops on SLURM #178

@wkc1986

Description


Describe the bug
call-hiccups_input_hic failed, apparently because GPU resources were not requested in the sbatch call. The same happens for call-delta.

OS/Platform

  • OS/Platform: Red Hat Enterprise Linux 8.6
  • Conda version: used Singularity
  • Pipeline version: v1.15.1
  • Caper version: 2.2.3

Caper configuration file

backend=slurm

# SLURM partition. DEFINE ONLY IF REQUIRED BY YOUR CLUSTER'S POLICY.
# You must define it for Stanford Sherlock.
#slurm-partition=large-mem
slurm-partition=gpu

# SLURM account. DEFINE ONLY IF REQUIRED BY YOUR CLUSTER'S POLICY.
# You must define it for Stanford SCG.
slurm-account=

# Local directory for localized files and Cromwell's intermediate files.
# If not defined then Caper will make .caper_tmp/ on CWD or `local-out-dir`.
# /tmp is not recommended since Caper stores localized data files here.
local-loc-dir=

cromwell=/gs/gsfs0/users/kuchang/.caper/cromwell_jar/cromwell-82.jar
womtool=/gs/gsfs0/users/kuchang/.caper/womtool_jar/womtool-82.jar

# following parts added by me
#
# SLURM resource parameters
slurm-leader-job-resource-param=-t 48:00:00 --mem 4G

# This parameter defines resource parameters for submitting WDL task to job engine.
# It is for HPC backends only (slurm, sge, pbs and lsf).
# It is not recommended to change it unless your cluster has custom resource settings.
# See https://github.com/ENCODE-DCC/caper/blob/master/docs/resource_param.md for details.
slurm-resource-param=-n 1 --ntasks-per-node=1 --cpus-per-task=${cpu} ${if defined(memory_mb) then "--mem=" else ""}${memory_mb}${if defined(memory_mb) then "M" else ""} ${if defined(time) then "--time=" else ""}${time*60} ${if defined(gpu) then "--gres=gpu:" else ""}${gpu} --time=28-0
#slurm-resource-param=-n 1 --ntasks-per-node=1 --cpus-per-task=1 --mem=10000M
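The conditional placeholders in slurm-resource-param only emit a flag when the corresponding task runtime attribute is defined; in particular, --gres=gpu:N appears only if the task exposes gpu. A minimal Python sketch of that expansion logic (my own re-implementation for illustration, not Caper's actual code):

```python
# Sketch (not Caper's code) of how the conditional placeholders in
# slurm-resource-param expand for a given task's runtime attributes.
# If `gpu` is undefined, the --gres flag is dropped entirely.

def expand(runtime):
    cpu = runtime.get("cpu", 1)
    memory_mb = runtime.get("memory_mb")
    time = runtime.get("time")  # hours, per the ${time*60} placeholder
    gpu = runtime.get("gpu")

    parts = ["-n 1", "--ntasks-per-node=1", f"--cpus-per-task={cpu}"]
    if memory_mb is not None:
        parts.append(f"--mem={memory_mb}M")
    if time is not None:
        parts.append(f"--time={time * 60}")
    if gpu is not None:
        parts.append(f"--gres=gpu:{gpu}")
    parts.append("--time=28-0")  # hardcoded at the end of the template above
    return " ".join(parts)

# Without a gpu attribute, no --gres flag is generated:
print(expand({"cpu": 4, "memory_mb": 16000, "time": 24}))
# With gpu defined, --gres=gpu:1 appears:
print(expand({"cpu": 4, "memory_mb": 16000, "time": 24, "gpu": 1}))
```

So if the hiccups/delta tasks don't define a gpu runtime attribute in a form this template sees, the generated sbatch call will never request a GPU regardless of partition.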

Input JSON file

{
  "hic.assembly_name": "mm10",
  "hic.chrsz": "../data/mm10/encode/mm10_no_alt.chrom.sizes.tsv",
  "hic.input_hic": "hic/70f45f73-c0c0-42a4-95e0-8242ca9eef03/call-add_norm/shard-1/execution/inter_30.hic",
  "hic.reference_index": "/gs/gsfs0/user/kuchang/data/mm10/encode/ENCFF018NEO.tar.gz",
  "hic.restriction_enzymes": [
    "none"
  ],
  "hic.restriction_sites": "/gs/gsfs0/user/kuchang/data/mm10/ftp-arimagenomics.sdsc.edu/pub/JUICER_CUTSITE_FILES/mm10_GATC_GANTC.txt.gz",
  "hic.create_accessibility_track_ram_gb": 64
}

call-hiccups_input_hic/execution/stderr ends with

GPU/CUDA Installation Not Detected
Exiting HiCCUPS

Looking at call-hiccups_input_hic/execution/script.submit, the sbatch call doesn't include --gres=gpu:1, which I'm guessing is necessary. The same is true of call-delta/execution/script.submit. The slurm-partition specified does in fact have GPUs.
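To reproduce the check, one can grep the generated submit script for a GPU request. A sketch (the script contents below are faked to mirror the reported sbatch line; on a real run, point $submit at your own execution directory):

```shell
# Inspect the generated sbatch call for a GPU request.
submit=call-hiccups_input_hic/execution/script.submit

# For illustration only: fake a script.submit like the one reported
# (no --gres flag). On a real run, skip these two lines.
mkdir -p "$(dirname "$submit")"
printf 'sbatch -n 1 --ntasks-per-node=1 --cpus-per-task=1 --mem=16000M script.sh\n' > "$submit"

if grep -q -- '--gres=gpu' "$submit"; then
    echo "GPU requested"
else
    echo "no --gres flag: HiCCUPS exits with 'GPU/CUDA Installation Not Detected'"
fi
```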

In addition, call-delta/execution/stderr contains

/usr/bin/python: can't find '__main__' module in ''
