Run Kaldi on Slurm via Apptainer
Because it was necessary
Originally run interactively with Docker:
docker run -it --shm-size=8g --gpus all \
-v"/shared/joregan/kaldi_swe/data:/opt/kaldi/egs/sprakbanken_swe/s5/data" \
-v"/shared/joregan/kaldi_swe/exp:/opt/kaldi/egs/sprakbanken_swe/s5/exp" \
-v"/shared/joregan/kaldi_swe/mfcc:/opt/kaldi/egs/sprakbanken_swe/s5/mfcc" kaldiasr/kaldi
(Failed) attempt at running the same Docker container through Slurm:
#!/bin/bash
#SBATCH --job-name=kaldi_docker
#SBATCH --output=kaldi_docker_%j.out
#SBATCH --error=kaldi_docker_%j.err
#SBATCH --nodelist=deepspeech
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=16
#SBATCH --mem=1M # workaround due to misreported memory
#SBATCH --time=2-00:00:00 # 2 days max runtime
echo "Starting job on $SLURMD_NODENAME at $(date)"
echo "Launching Kaldi Docker container..."
docker run --shm-size=8g --gpus all \
-v "/shared/joregan/kaldi_swe/data:/opt/kaldi/egs/sprakbanken_swe/s5/data" \
-v "/shared/joregan/kaldi_swe/exp:/opt/kaldi/egs/sprakbanken_swe/s5/exp" \
-v "/shared/joregan/kaldi_swe/mfcc:/opt/kaldi/egs/sprakbanken_swe/s5/mfcc" \
kaldiasr/kaldi bash -c "cd /opt/kaldi/egs/sprakbanken_swe/s5 && ./run.sh"
echo "Job completed at $(date)"
$ cat kaldi_docker_683.err
docker: permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/create": dial unix /var/run/docker.sock: connect: permission denied.
See 'docker run --help'.
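The error means the user the job runs as cannot write to the Docker daemon socket on the compute node. A minimal sketch for checking this before submitting; the helper name can_use_docker_socket is made up for illustration, and the default path is the one from the error message:

```shell
#!/bin/bash
# Hypothetical helper (not part of Docker or Slurm): succeed only if the given
# path is a Unix socket that the current user may write to, which is what the
# Docker CLI needs in order to talk to the daemon.
can_use_docker_socket() {
    local sock="${1:-/var/run/docker.sock}"
    [[ -S "$sock" && -w "$sock" ]]
}

# Report the result for the current user on the current node.
if can_use_docker_socket; then
    echo "docker socket is accessible"
else
    echo "no access to docker socket (user: $(id -un), groups: $(id -Gn))"
fi
```

Run it on the compute node itself (for example through srun), since socket permissions and group membership may differ from the login node.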
Convert for Apptainer, naming the output file explicitly so that it matches the job script below:
singularity pull kaldi_swe.sif docker://jimregan/kaldi_swe
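As far as I know, when no output filename is given, pull names the SIF after the image and its tag, so docker://jimregan/kaldi_swe would end up as kaldi_swe_latest.sif, while the job script below expects kaldi_swe.sif. A small sketch of that naming rule, useful for checking which file to look for; default_sif_name is a hypothetical helper, and the rule itself is an assumption about the tool's default behaviour:

```shell
#!/bin/bash
# Hypothetical helper: guess the file name that `pull` would choose for a
# docker:// URI when no output name is given.
# Assumed rule: <image>_<tag>.sif, with the tag defaulting to "latest".
default_sif_name() {
    local uri="${1#docker://}"   # strip the scheme
    local image="${uri##*/}"     # keep only the last path component
    local tag="latest"
    if [[ "$image" == *:* ]]; then
        tag="${image##*:}"       # text after the colon is the tag
        image="${image%%:*}"     # text before the colon is the image name
    fi
    printf '%s_%s.sif\n' "$image" "$tag"
}
```

For example, default_sif_name docker://jimregan/kaldi_swe prints kaldi_swe_latest.sif. If that is the file on disk, either rename it or change the name used in the job script.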
Slurm job script using Apptainer:
#!/bin/bash
#SBATCH --job-name=kaldi_apptainer
#SBATCH --output=kaldi_apptainer_%j.out
#SBATCH --error=kaldi_apptainer_%j.err
#SBATCH --nodelist=deepspeech
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=16
#SBATCH --mem=64G
#SBATCH --time=2-00:00:00
# Optional: load Apptainer module if needed on your system
module load apptainer # or singularity if that's the older name
echo "Job started on $SLURMD_NODENAME at $(date)"
echo "Using GPU(s): ${SLURM_STEP_GPUS:-$SLURM_JOB_GPUS}"
# Run the container with GPU access and mounted volumes
apptainer exec --nv \
--bind /shared/joregan/kaldi_swe/data:/opt/kaldi/egs/sprakbanken_swe/s5/data \
--bind /shared/joregan/kaldi_swe/exp:/opt/kaldi/egs/sprakbanken_swe/s5/exp \
--bind /shared/joregan/kaldi_swe/mfcc:/opt/kaldi/egs/sprakbanken_swe/s5/mfcc \
kaldi_swe.sif \
bash -c "cd /opt/kaldi/egs/sprakbanken_swe/s5 && ./run.sh"
echo "Job finished at $(date)"
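Unlike docker run -v, which creates missing host directories, Apptainer (as far as I know) refuses to bind a source path that does not exist. A sketch of a pre-flight check that builds the three --bind arguments used above and fails early if a host directory is missing; build_bind_args is a hypothetical helper, not an Apptainer feature:

```shell
#!/bin/bash
# Hypothetical helper: print "--bind" and "host:container" on alternating lines
# for the data, exp and mfcc directories, or fail if a host directory is missing.
build_bind_args() {
    local host_base="$1" container_base="$2" d
    for d in data exp mfcc; do
        if [[ ! -d "$host_base/$d" ]]; then
            echo "missing host directory: $host_base/$d" >&2
            return 1
        fi
        printf -- '--bind\n%s:%s\n' "$host_base/$d" "$container_base/$d"
    done
}

# Possible use inside the job script (paths as in the script above):
#   binds=$(build_bind_args /shared/joregan/kaldi_swe /opt/kaldi/egs/sprakbanken_swe/s5) || exit 1
#   mapfile -t bind_args <<< "$binds"
#   apptainer exec --nv "${bind_args[@]}" kaldi_swe.sif \
#       bash -c "cd /opt/kaldi/egs/sprakbanken_swe/s5 && ./run.sh"
```

Failing before apptainer exec starts means a missing directory shows up as one clear line in the .err file instead of a container start-up error.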