OpenAI Whisper Transcription
You can transcribe audio on RC supercomputers with OpenAI’s “Whisper” speech-to-text model in a few commands. The example below transcribes a sample WAV file with GPU acceleration, which this mamba environment requires:
Interactive Example
$ salloc -G 1 -p htc --mem=20G -t 240
$ module load mamba/latest
$ source activate openai-whisper
$ wget https://voiceage.com/wbsamples/in_mono/Conference.wav
$ whisper Conference.wav
[00:00.000 --> 00:01.000] This is Peter.
[00:01.000 --> 00:02.000] This is Johnny.
[00:02.000 --> 00:03.000] Kenny.
[00:03.000 --> 00:04.000] Good job.
[00:04.000 --> 00:04.880] We just wanted to take a minute to thank you.
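If only the spoken text is needed, the timestamp prefix on each line of the output above can be removed with a small filter. The following is a minimal sketch, not part of the original instructions: `strip_timestamps` is a hypothetical helper name, and the pattern assumes every line starts with a `[start --> end]` prefix like the ones shown.

```shell
#!/bin/bash
# Hypothetical helper (not part of Whisper): read transcript lines on stdin
# and drop a leading "[start --> end]" prefix from each one.
strip_timestamps() {
    # [0-9:.]+ matches stamps such as 00:04.880 (and longer h:mm:ss.mmm forms)
    sed -E 's/^\[[0-9:.]+ --> [0-9:.]+\] *//'
}
```

For example, `strip_timestamps < Conference.wav_transcription.txt` would print the text of a saved transcript without the timestamps.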
Using SBATCH
$ cat whisp.sh
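#!/bin/bash
# Request one GPU on the htc partition with 20 GB of memory for 4 minutes
#SBATCH -G 1
#SBATCH -p htc
#SBATCH --mem=20G
#SBATCH -t 4
module load mamba/latest
source activate openai-whisper
# Transcribe the file passed as the first argument and save the output beside it
whisper --language=en "$1" > "${1}_transcription.txt"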
Then submit the job, passing the audio file as the argument:
$ sbatch whisp.sh Conference.wav
Final result:
$ cat Conference.wav_transcription.txt
[00:00.000 --> 00:01.000] This is Peter.
[00:01.000 --> 00:02.000] This is Johnny.
[00:02.000 --> 00:03.000] Kenny.
[00:03.000 --> 00:04.000] Good job.
[00:04.000 --> 00:04.880] We just wanted to take a minute to thank you.
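To transcribe several files, the same script can be submitted once per file. The sketch below assumes `whisp.sh` sits in the current directory; `submit_all` and the `SUBMIT` variable are hypothetical names introduced here for illustration, with `SUBMIT` defaulting to `sbatch` and allowing a dry run that only prints what would be submitted.

```shell
#!/bin/bash
# Hypothetical helper: submit one whisp.sh job per audio file given as an argument.
# SUBMIT defaults to sbatch; set SUBMIT=echo to print the jobs instead of submitting.
submit_all() {
    local submit="${SUBMIT:-sbatch}"
    local f
    for f in "$@"; do
        # Each job writes its transcript to <file>_transcription.txt, as in whisp.sh
        $submit whisp.sh "$f"
    done
}
```

A dry run such as `SUBMIT=echo submit_all *.wav` lists the jobs first; `submit_all *.wav` then submits them. Longer recordings may need more than the 4 minutes requested in `whisp.sh`.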