OpenAI Whisper Transcription

You can transcribe audio on RC supercomputers with OpenAI’s “Whisper” speech-to-text model using only a few commands. The example below requests an interactive session with one GPU, loads the mamba environment, and transcribes a sample WAV file. GPU acceleration is required for this mamba env:

Interactive Example

    $ salloc -G 1 -p htc --mem=20G -t 240
    $ module load mamba/latest
    $ source activate openai-whisper
    $ wget https://voiceage.com/wbsamples/in_mono/Conference.wav
    $ whisper Conference.wav
    [00:00.000 --> 00:01.000] This is Peter.
    [00:01.000 --> 00:02.000] This is Johnny.
    [00:02.000 --> 00:03.000] Kenny.
    [00:03.000 --> 00:04.000] Good job.
    [00:04.000 --> 00:04.880] We just wanted to take a minute to thank you.

Using SBATCH

    $ cat whisp.sh
    #!/bin/bash
    #SBATCH -G 1
    #SBATCH -p htc
    #SBATCH --mem=20G
    #SBATCH -t 4
    module load mamba/latest
    source activate openai-whisper
    # $1 is the audio file passed to sbatch; quoted so paths with spaces work
    whisper --language=en "$1" > "${1}_transcription.txt"

And then to submit:

    $ sbatch whisp.sh Conference.wav
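
The same script can be submitted once per audio file when there are several recordings to transcribe. The following is a minimal sketch, not part of the original instructions: submit_all is a hypothetical helper name, and SBATCH_CMD is a hypothetical variable added here so the loop can be dry-run with echo before any jobs are actually queued. It assumes whisp.sh is in the current directory.

```shell
#!/bin/bash
# Sketch: submit one whisp.sh job per audio file given as an argument.
# SBATCH_CMD is a hypothetical override (not a Slurm feature); it defaults
# to "sbatch" and can be set to "echo" to preview the submissions.
submit_all() {
    local submit="${SBATCH_CMD:-sbatch}"
    local f
    for f in "$@"; do
        # Skip anything that is not an existing regular file,
        # for example an unmatched *.wav glob.
        [ -f "$f" ] || continue
        $submit whisp.sh "$f"
    done
}
```

For example, after sourcing the function, running submit_all *.wav would queue one job per WAV file in the current directory, and each job would write its own <file>_transcription.txt. Note that the 4 minute limit in whisp.sh may need raising for longer recordings.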

Final result of the Conference.wav job:

    $ cat Conference.wav_transcription.txt
    [00:00.000 --> 00:01.000] This is Peter.
    [00:01.000 --> 00:02.000] This is Johnny.
    [00:02.000 --> 00:03.000] Kenny.
    [00:03.000 --> 00:04.000] Good job.
    [00:04.000 --> 00:04.880] We just wanted to take a minute to thank you.
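
If only the spoken text is needed, the bracketed time ranges can be removed afterwards. The following is a small sketch, assuming every line starts with a stamp in the format shown above; strip_timestamps is a hypothetical helper name, not something provided by Whisper or the mamba env.

```shell
#!/bin/bash
# Sketch: read transcript lines on stdin and drop the leading
# "[start --> end]" stamp plus any whitespace that follows it.
# Lines without a leading stamp pass through unchanged.
strip_timestamps() {
    sed -E 's/^\[[0-9:.]+ --> [0-9:.]+\][[:space:]]*//'
}
```

For example, strip_timestamps < Conference.wav_transcription.txt > Conference.txt would write the plain text to a new file (Conference.txt is an arbitrary name chosen for illustration) and leave the timestamped transcript untouched.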