Difference between revisions of "Slurm"

From Systems Group

Revision as of 18:37, 4 August 2022

Slurm is an open-source job scheduler for Linux and Unix-like kernels.

SRUN

srun submits a job for execution in real time. It is also used to create job steps within an existing job.


srun example

srun --pty /bin/bash                                                   # interactive shell on a compute node / default account is used when not specified
srun -p slurm-general-01 --account=slurmgeneral --pty /bin/bash        # interactive shell on a compute node / specifying which partition and account (applicable if user is assigned multiple accounts)
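
To illustrate the job-step use of srun mentioned above, here is a minimal sketch that writes a small batch script in which each srun line creates one job step. The file name steps.sh, the job name, and the commands it runs are placeholders chosen for this example, not part of the cluster's setup. The sketch only syntax-checks the file and does not submit it.

```shell
#!/bin/bash
# Write a small batch script whose srun lines each create one job step.
# steps.sh and the commands inside it are placeholders for illustration.
workdir=$(mktemp -d)
cat > "$workdir/steps.sh" <<'EOF'
#!/bin/bash
#SBATCH --job-name=steps
#SBATCH --ntasks=2

srun hostname             # step 0: runs once per task (twice here)
srun --ntasks=1 date      # step 1: runs as a single task
EOF

# Check the script for shell syntax errors without running it.
bash -n "$workdir/steps.sh" && echo "syntax ok: $workdir/steps.sh"
# Submit with: sbatch "$workdir/steps.sh"
```

Once such a job is submitted with sbatch, each srun line runs inside the job's allocation as its own numbered step.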

SBATCH

sbatch is the command used to submit batch scripts to SLURM as jobs.

SLURM Directives

SLURM directives are job flags, written inside the batch script, that constrain the job to the conditions specified. Each directive is a line of the form "#SBATCH <flag>".
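
Because directives are ordinary comment lines to the shell, it can help to list the ones sbatch will actually read. sbatch stops processing "#SBATCH" lines once it reaches the first line that is neither a comment nor blank. The helper below is a rough sketch of that rule; the function name list_directives is made up for this example and is not a SLURM tool.

```shell
#!/bin/bash
# list_directives FILE: print the #SBATCH lines that appear before the
# first executable command in FILE (hypothetical helper, not part of SLURM).
list_directives() {
    awk '
        /^#SBATCH[ \t]/     { print; next }   # directive line
        /^#/ || /^[ \t]*$/  { next }          # other comment or blank line
        { exit }                              # first command: stop scanning
    ' "$1"
}
```

Running list_directives myprogram.sh before submitting is a quick way to spot a directive that was placed after a command by mistake and would therefore be ignored.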

{| class="wikitable"
|+ Flags
|-
! Resource !! Syntax !! Example !! Description
|-
| Account || --account=<account> || --account=slurmgeneral || entity that resources are charged to (see Compute Resources for available accounts)
|-
| Partition || --partition=<partition> || --partition=slurm-general-01 || where job resources are allocated (see Compute Resources for available partitions)
|-
| Job Name || --job-name=<name> || --job-name=testprogram || name of job to be queued
|-
| Task || --ntasks=<number> || --ntasks=2 || number of tasks; useful for commands to be run in parallel
|-
| Memory || --mem=<size>[units] || --mem=1gb || memory to be allocated for job
|-
| Output || --output=<filename> || --output=testprogram.log || name of job output file
|-
| Time || --time=<hh:mm:ss> || --time=01:00:00 || time limit for job
|}
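
The Time example uses the hours:minutes:seconds form. When a limit is known in seconds, a small conversion helper can produce that form. This is only a sketch; the function name to_slurm_time is an assumption made for this example.

```shell
#!/bin/bash
# to_slurm_time SECONDS: print SECONDS as HH:MM:SS for use with --time
# (hypothetical helper, not part of SLURM).
to_slurm_time() {
    local s=$1
    printf '%02d:%02d:%02d\n' $((s / 3600)) $((s % 3600 / 60)) $((s % 60))
}

to_slurm_time 3600    # prints 01:00:00
to_slurm_time 5400    # prints 01:30:00
```

The result can be passed on the command line, for example sbatch --time="$(to_slurm_time 3600)" myprogram.sh.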


batch script example

#!/bin/bash -l
# -l starts a login shell (required for lmod); comments are kept off the #! line because the rest of that line is passed to bash as an argument
#SBATCH --job-name=testprogram             # job name
#SBATCH --partition=slurm-general-01       # specifying which partition to run job on
#SBATCH --account=slurmgeneral             # only applicable if user is assigned multiple accounts
#SBATCH --ntasks=1                         # number of tasks (commands to run in parallel)
#SBATCH --mem=1gb                          # request 1gb of memory
#SBATCH --output=testprogram.log           # output and error log

date
sleep 10
module use /mnt/lmod_modules/Linux/
module load miniconda3
someProgram.py
date


submitting a job using sbatch

sbatch myprogram.sh                                                  # queue job using a batch script 
sbatch -p slurm-general-01 --account=slurmgeneral myprogram.sh       # batch script specifying which partition and account when not specified using a slurm directive within the script
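
When a job is submitted from another script, it is often handy to capture the job ID. The sketch below relies on sbatch's --parsable option, which prints only the job ID (followed by ";" and the cluster name on some setups), and on squeue -j to show the job's state. The function name submit_job is made up for this example, and the guard simply skips the submission on machines without SLURM or without myprogram.sh.

```shell
#!/bin/bash
# submit_job ARGS...: run sbatch with ARGS and print only the job ID
# (hypothetical wrapper, not part of SLURM).
submit_job() {
    local out
    out=$(sbatch --parsable "$@") || return 1
    printf '%s\n' "${out%%;*}"    # drop the ";cluster" suffix if present
}

# Only attempt a real submission where SLURM is installed and the script exists.
if command -v sbatch >/dev/null 2>&1 && [ -f myprogram.sh ]; then
    jobid=$(submit_job -p slurm-general-01 --account=slurmgeneral myprogram.sh) &&
        squeue -j "$jobid"        # show the job's state in the queue
fi
```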


Compute Resources

The ODU CS department HPC cluster is divided into multiple partitions to which users can submit jobs. A partition can only be accessed by users who are assigned to that partition's account, so not every user can access every partition.


{| class="wikitable"
|+ Resources
|-
! Cluster !! Partition !! Account
|-
| slurm-cluster || slurm-general-01 || slurmgeneral
|-
| slurm-cluster || slurm-general-02 || slurmgeneral
|-
| slurm-cluster || haoresearch || shaoresearch
|-
| slurm-cluster || lusiliresearch || lliresearch
|-
| slurm-cluster || wangresearch || fwangresearch
|}
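
Since each partition is paired with one account, the pairing above can be encoded in a small lookup so that submission scripts do not mix them up. This is a sketch based only on the Resources table as of this revision; the function name account_for_partition is an assumption, and the table would need to be kept in sync by hand.

```shell
#!/bin/bash
# account_for_partition PARTITION: print the account paired with PARTITION
# in the Resources table; return 1 for an unknown partition.
# (Hypothetical helper, not part of SLURM.)
account_for_partition() {
    case "$1" in
        slurm-general-01|slurm-general-02) echo slurmgeneral ;;
        haoresearch)                       echo shaoresearch ;;
        lusiliresearch)                    echo lliresearch ;;
        wangresearch)                      echo fwangresearch ;;
        *) echo "unknown partition: $1" >&2; return 1 ;;
    esac
}

# Example: build (and only print) a matching sbatch command line.
partition=wangresearch
echo "sbatch -p $partition --account=$(account_for_partition "$partition") myprogram.sh"
```

On the cluster itself, sacctmgr show associations user=$USER should list the accounts a user is actually assigned to, which remains the authoritative answer if it disagrees with this table.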