Slurm Job Scheduling System
The resource manager allocates compute cores to your jobs according to the resources you request.
You interact with the resource manager using the following commands:
- sbatch: submit a job to a queue (called a partition in Slurm)
- scancel: cancel a job
- squeue: view running or pending jobs
- sinfo: view queue/partition status
- srun: run a command interactively; it starts as soon as the requested resources are available
How to Launch a Job
To launch a job, create a script where you request resources and call your program. Example:
#!/bin/bash
#SBATCH -J test # Job name
#SBATCH -o job.%j.out # Stdout output file (%j = job ID)
#SBATCH -N 2 # Number of nodes requested
#SBATCH -n 16 # Total number of MPI tasks (across all nodes)
#SBATCH -t 01:30:00 # Maximum run time (hh:mm:ss)
# Launch the MPI-based executable
prun ./a.out
Submit the script with the sbatch command:
[test@sms ~]$ sbatch job.mpi
Submitted batch job 339
sbatch prints the job ID (339 in this example), which you can pass to commands such as scancel or squeue.
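If you want to reuse the job ID in a script, one option is to extract it from the confirmation line that sbatch prints. The following is a minimal sketch, assuming the output has the form "Submitted batch job <number>" as in the example above; parse_job_id is a hypothetical helper name, not a Slurm command.

```shell
#!/bin/bash
# parse_job_id: print the numeric job ID found in an sbatch confirmation
# line such as "Submitted batch job 339". Returns non-zero if the line
# does not have that form. (Hypothetical helper, not part of Slurm.)
parse_job_id() {
    _pj_line=$1
    case "$_pj_line" in
        "Submitted batch job "*) _pj_id=${_pj_line#"Submitted batch job "} ;;
        *) return 1 ;;
    esac
    # Keep only the first word after the prefix.
    _pj_id=${_pj_id%% *}
    # Reject empty or non-numeric values.
    case "$_pj_id" in
        ''|*[!0-9]*) return 1 ;;
    esac
    printf '%s\n' "$_pj_id"
}

# Example usage (commented out so the file can be sourced without Slurm):
# jobid=$(parse_job_id "$(sbatch job.mpi)") && squeue -j "$jobid"
```

sbatch also has a --parsable option that prints only the job ID (followed by the cluster name when one applies), which may remove the need for parsing on your system.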
Monitoring Job Status
You can monitor the status of your job using:
- List all active jobs: squeue
- Show detailed status of a specific job: scontrol show job <jobID>
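The detailed status printed by scontrol is fairly long, so a script often needs only the job state. The sketch below assumes the scontrol output contains a field of the form JobState=<STATE> (for example JobState=RUNNING); job_state is a hypothetical helper name, not a Slurm command.

```shell
#!/bin/bash
# job_state: read `scontrol show job <jobID>` output on stdin and print
# the value of its JobState field (for example PENDING or RUNNING).
# Prints nothing if no JobState field is found.
# (Hypothetical helper, not part of Slurm.)
job_state() {
    # -n suppresses default printing; the p flag prints only lines where
    # the substitution matched. head keeps the first match.
    sed -n 's/.*JobState=\([A-Z_]*\).*/\1/p' | head -n 1
}

# Example usage (commented out so the file can be sourced without Slurm):
# scontrol show job 339 | job_state
```

Reading from stdin keeps the helper independent of how the status text was obtained, so it can be tried on saved output before being used against a live scheduler.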
