# MPAI-MMC Automatic Speech Recognition

This code implements MMC-ASR, as described in the [AIM specification](https://mpai.community/standards/mpai-mmc/v2-2/ai-modules/automatic-speech-recognition/).

The code takes Speech Objects from MMC-AUS and generates Text Segments (here called text transcripts). It uses the whisper-large-v3 model to convert each input Speech Object (one speaker's turn) into a Text Segment. Disfluencies (e.g., repetitions, repairs, filled pauses) are often omitted from the transcript. The Whisper reference document is available.

The MMC-ASR Reference Software is available at the MPAI GitLab site. This AI Module is intended for developers who are familiar with Python, Docker, RabbitMQ, and downloading models from Hugging Face. The Reference Software contains:

1. src: a folder with the Python code implementing the AIM
2. Dockerfile: a Docker file containing only the libraries required to build the Docker image and run the container
3. requirements.txt: dependencies installed in the Docker image
4. README.md: commands for cloning https://huggingface.co/openai/whisper-large-v3

Library: https://github.com/linto-ai/whisper-timestamped
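As an illustration of how a Speech Object might be turned into a text transcript with this library, here is a minimal sketch. It is not the code in `src`. The functions `transcribe_speech` and `to_text_segment`, the default model name `"large-v3"`, and the shape of the returned dictionary are assumptions made for this example. The calls `load_audio`, `load_model` and `transcribe` follow the usage shown in the whisper-timestamped README.

```python
from typing import Any


def transcribe_speech(audio_path: str, model_name: str = "large-v3",
                      device: str = "cpu") -> dict[str, Any]:
    """Transcribe one audio file with whisper-timestamped.

    The import is done lazily because the library pulls in heavy
    dependencies. `model_name` is an assumption; adjust it to the
    model you actually use.
    """
    import whisper_timestamped as whisper

    audio = whisper.load_audio(audio_path)
    model = whisper.load_model(model_name, device=device)
    return whisper.transcribe(model, audio)


def to_text_segment(result: dict[str, Any]) -> dict[str, Any]:
    """Reduce a transcription result to a transcript plus segment timings.

    The output layout is hypothetical; it is not the Text Segment
    format defined by the MPAI-MMC specification.
    """
    segments = [
        {"start": seg["start"], "end": seg["end"], "text": seg["text"].strip()}
        for seg in result.get("segments", [])
    ]
    return {"text": result.get("text", "").strip(), "segments": segments}
```

A call such as `to_text_segment(transcribe_speech("turn_001.wav"))` (the file name is a placeholder) would then yield the transcript of one speaker's turn together with the start and end time of each segment.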

To download the whisper-large-v3 model:
```
# Create the model folder under the shared path (no error if it already exists)
cd "$PATH_SHARED"
mkdir -p models/mmc_asr
cd models/mmc_asr
git lfs install
git clone https://huggingface.co/openai/whisper-large-v3
```
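After running the commands above, it can be useful to confirm that the model landed where expected. The following sketch assumes the folder layout created by those commands and reads the shared root from a `PATH_SHARED` environment variable. The helper names `whisper_model_dir` and `looks_downloaded` are hypothetical. The check only looks for `config.json`, so it does not prove that Git LFS fetched the weight files.

```python
import os
from pathlib import Path


def whisper_model_dir(shared_root: str) -> Path:
    """Return the folder into which the commands above clone the model."""
    return Path(shared_root) / "models" / "mmc_asr" / "whisper-large-v3"


def looks_downloaded(model_dir: Path) -> bool:
    """Heuristic check: the folder contains the model's config.json."""
    return (model_dir / "config.json").is_file()


if __name__ == "__main__":
    # Fall back to the current directory if PATH_SHARED is not set.
    root = os.environ.get("PATH_SHARED", ".")
    model_dir = whisper_model_dir(root)
    status = "ready" if looks_downloaded(model_dir) else "missing"
    print(f"{model_dir}: {status}")
```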

