MemeMQA: Multimodal Question Answering for Memes via Rationale-Based Inferencing
Published in Findings of ACL, 2024
Recommended citation: Siddhant Agarwal, Shivam Sharma, Preslav Nakov, and Tanmoy Chakraborty. 2024. MemeMQA: Multimodal Question Answering for Memes via Rationale-Based Inferencing. In Findings of the Association for Computational Linguistics: ACL 2024, Bangkok, Thailand. Association for Computational Linguistics. https://arxiv.org/abs/2405.11215
Memes, widely used for humor and propaganda, warrant scrutiny for their potential to cause harm. Previous studies have focused on detecting harmful memes and generating explanations, but only in closed settings. We introduce MemeMQA, a multimodal question-answering framework designed to provide accurate answers and coherent explanations for structured questions about memes. To support it, we present MemeMQACorpus, a dataset of 1,880 questions over 1,122 memes with corresponding answer-explanation pairs. Our proposed framework, ARSENAL, leverages LLMs to outperform competitive baselines by ~18% in answer-prediction accuracy and generates text with superior lexical and semantic alignment. We further evaluate ARSENAL's robustness through diverse question sets and modality-specific assessments, deepening our understanding of meme interpretation in multimodal communication.