MemeMQA: Multimodal Question Answering for Memes via Rationale-Based Inferencing

Published in Findings of ACL 2024

Recommended citation: Siddhant Agarwal, Shivam Sharma, Preslav Nakov, and Tanmoy Chakraborty. 2024. MemeMQA: Multimodal Question Answering for Memes via Rationale-Based Inferencing. In Findings of the Association for Computational Linguistics: ACL 2024, Bangkok, Thailand. Association for Computational Linguistics. https://arxiv.org/abs/2405.11215

Memes are widely used for humor and propaganda, and their potential for harm warrants closer scrutiny. Previous studies have focused on detecting harmful memes and explaining that harm in closed settings. We introduce MemeMQA, a multimodal question-answering framework designed to produce accurate answers and coherent explanations for structured questions about memes. To support it, we present MemeMQACorpus, a dataset of 1,880 questions about 1,122 memes, each paired with an answer and an explanation. Our proposed framework, ARSENAL, leverages LLMs and outperforms baselines by ~18% in answer-prediction accuracy while generating text with better lexical and semantic alignment. We further evaluate ARSENAL's robustness through diverse question sets and modality-specific assessments, deepening our understanding of meme interpretation in multimodal communication.