# MPAI-MMC Answer to Multimodal Question

This code implements MMC-AMQ, as described in the [AIW](https://mpai.community/standards/mpai-mmc/v2-2/ai-workflows/answer-to-multimodal-question/) specification.

## Guide to the AMQ code

The code:

1. Manages the input files and parameters: Speech Object, Visual Object, Text Object.
2. Executes the AIW to perform Answer to Multimodal Question on each individual pair of Speech/Text and Visual Object.
3. Outputs the answer as a Speech Object and a Text Object.

The MMC-AMQ Reference Software is found at the NNW GitLab site. It contains:

1. The Python code implementing the AIW.
2. The required libraries: pytorch, transformers (HuggingFace), datasets (HuggingFace), soundfile, and pillow.

## Installation

The code was designed and tested on Ubuntu 20.04 using Anaconda 23.7.2 and Python 3.9. An environment with all the necessary libraries can be created with the following command, where `<env_name>` is a name of your choice:

```bash
conda create --name <env_name> --file requirements.txt
```
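
Once the environment is created and activated, it can help to confirm that the libraries listed above are importable before running the AIW. The following is a minimal sketch, not part of the Reference Software: the helper `find_missing` and the mapping `REQUIRED_MODULES` are hypothetical names introduced here for illustration. The only assumption about the libraries is their usual import names (`torch` for pytorch, `PIL` for pillow).

```python
"""Check that the libraries listed in this README can be imported."""
from importlib.util import find_spec

# Package name -> import name. Two entries differ:
# pytorch is imported as `torch`, pillow as `PIL`.
REQUIRED_MODULES = {
    "pytorch": "torch",
    "transformers": "transformers",
    "datasets": "datasets",
    "soundfile": "soundfile",
    "pillow": "PIL",
}


def find_missing(modules):
    """Return the package names whose import module cannot be located."""
    return [pkg for pkg, mod in modules.items() if find_spec(mod) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Saved under a name of your choice (for example `check_env.py`, a hypothetical file name), the script can be run inside the activated environment with `python check_env.py`. It only locates the modules without importing them, so it runs quickly and does not load any model.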