asr

Automatic Speech Recognition

This filter uses PocketSphinx for speech recognition. To enable compilation of this filter, you need to configure FFmpeg with --enable-pocketsphinx.

It accepts the following options:

rate: Set sampling rate of input audio. Default is 16000. This needs to match the speech models; otherwise recognition results will be poor.
hmm: Set directory containing acoustic model files.
dict: Set pronunciation dictionary.
lm: Set language model file.
lmctl: Set language model set.
lmname: Set which language model to use.
logfn: Set output for log messages.
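
As a rough sketch of how these options combine, the following Python helper assembles an asr filter argument string from the option names listed above. The helper name build_asr_filter and the model paths are hypothetical, and the sketch assumes option values contain no characters that need filtergraph escaping (such as ':' or ','); it rejects such values instead of escaping them.

```python
def build_asr_filter(rate=16000, **options):
    """Assemble an 'asr=...' filter description (illustrative helper, not part of FFmpeg)."""
    allowed = ("hmm", "dict", "lm", "lmctl", "lmname", "logfn")
    parts = [f"rate={rate}"]
    for name in allowed:
        value = options.pop(name, None)
        if value is None:
            continue
        # Assumption: values are plain paths or names; special characters are refused.
        if any(ch in value for ch in ":,;[]'\\"):
            raise ValueError(f"value for {name!r} needs filtergraph escaping: {value!r}")
        parts.append(f"{name}={value}")
    if options:
        # Anything left over is not an option documented for this filter.
        raise ValueError(f"unknown asr options: {sorted(options)}")
    return "asr=" + ":".join(parts)


# Hypothetical model paths, for illustration only.
print(build_asr_filter(hmm="model/en-us",
                       dict="model/cmudict-en-us.dict",
                       lm="model/en-us.lm.bin"))
```

The resulting string can be passed to ffmpeg with -af, provided the input audio is at the sampling rate given by rate.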

The filter exports recognized speech as frame metadata under the key lavfi.asr.text.
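
One way to read this metadata is to chain the ametadata filter in print mode after asr (for example ametadata=mode=print:key=lavfi.asr.text:file=-) and collect the printed key=value lines. The sketch below assumes that output layout; the function name extract_asr_text is hypothetical, and the sample text is made up for illustration rather than taken from a real run.

```python
def extract_asr_text(printed):
    """Collect non-empty lavfi.asr.text values from ametadata print-mode output."""
    prefix = "lavfi.asr.text="
    texts = []
    for line in printed.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            text = line[len(prefix):]
            if text:  # skip frames where nothing was recognized
                texts.append(text)
    return texts


# Illustrative sample only; not captured from an actual ffmpeg invocation.
sample = """frame:0    pts:0       pts_time:0
lavfi.asr.text=
frame:1    pts:1024    pts_time:0.064
lavfi.asr.text=hello world
"""
print(extract_asr_text(sample))
```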
