You can transcribe audio on Research Computing (RC) supercomputers with OpenAI’s “Whisper” speech-to-text model in just a few commands. The example below transcribes a sample WAV file with GPU acceleration (required for this mamba environment):

Sample Example

$ salloc -G 1 -p htc --mem=20G -t 240
$ module load mamba/latest
$ source activate openai-whisper
$ wget https://voiceage.com/wbsamples/in_mono/Conference.wav
$ whisper Conference.wav 
[00:00.000 --> 00:01.000]  This is Peter.
[00:01.000 --> 00:02.000]  This is Johnny.
[00:02.000 --> 00:03.000]  Kenny.
[00:03.000 --> 00:04.000]  Good job.
[00:04.000 --> 00:04.880]  We just wanted to take a minute to thank you.

Using SBATCH

$ cat whisp.sh 
#!/bin/bash

#SBATCH -G 1
#SBATCH -p htc
#SBATCH --mem=20G
#SBATCH -t 4

module load mamba/latest
source activate openai-whisper
# Quote "$1" so file names containing spaces are handled correctly
whisper --language=en "$1" > "${1}_transcription.txt"

Then submit the job, passing the audio file as the first argument:

sbatch whisp.sh Conference.wav
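
To transcribe several files, the same script can be submitted once per file. The following is a minimal sketch, not part of the original instructions: the function name submit_all and the SUBMIT override variable are hypothetical and exist only so that the loop can be dry-run with echo before anything is actually queued.

```shell
#!/bin/bash
# Hypothetical helper: submit one whisp.sh job per audio file.
# SUBMIT is an assumed override; it defaults to sbatch.
submit_all() {
    local submit="${SUBMIT:-sbatch}"
    local f
    for f in "$@"; do
        # Skip anything that is not a regular file (e.g. an unmatched glob).
        [ -f "$f" ] || continue
        # $submit is left unquoted on purpose so it may carry extra options.
        $submit whisp.sh "$f"
    done
}

# Dry run (prints the arguments instead of submitting):
#   SUBMIT=echo submit_all *.wav
# Real submission:
#   submit_all *.wav
```

Each file becomes its own job, so each transcription lands in its own <file>_transcription.txt, as in the single-file case.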

Final result:

$ cat Conference.wav_transcription.txt 
[00:00.000 --> 00:01.000]  This is Peter.
[00:01.000 --> 00:02.000]  This is Johnny.
[00:02.000 --> 00:03.000]  Kenny.
[00:03.000 --> 00:04.000]  Good job.
[00:04.000 --> 00:04.880]  We just wanted to take a minute to thank you.
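
If only the spoken text is needed, the timestamp prefixes can be removed afterwards. This is a small sketch that assumes every line follows the "[start --> end]  text" layout shown above; the function name strip_timestamps is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: drop the leading "[start --> end]" timestamp from each line.
# Reads the files given as arguments, or standard input when none are given.
strip_timestamps() {
    sed -E 's/^\[[0-9:.]+ --> [0-9:.]+\][[:space:]]*//' "$@"
}

# Example:
#   strip_timestamps Conference.wav_transcription.txt > Conference_plain.txt
```

Lines that do not start with a timestamp are passed through unchanged.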
