Noise-powered Multi-modal Knowledge Graph Representation Framework, by Zhuo Chen and 7 other authors
Abstract: The rise of Multi-modal Pre-training highlights the necessity for a unified Multi-Modal Knowledge Graph (MMKG) representation learning framework. Such a framework is essential for embedding structured knowledge into multi-modal Large Language Models effectively, alleviating issues like knowledge misconceptions and multi-modal hallucinations. In this work, we explore the efficacy of models in accurately embedding entities within MMKGs through two pivotal tasks: Multi-modal Knowledge Graph Completion (MKGC) and Multi-modal Entity Alignment (MMEA). Building on this foundation, we propose a novel SNAG method that utilizes a Transformer-based architecture equipped with modality-level noise masking to robustly integrate multi-modal entity features in KGs. By incorporating specific training objectives for both MKGC and MMEA, our approach achieves SOTA performance across a total of ten datasets, demonstrating its versatility. Moreover, SNAG can not only function as a standalone model but also enhance other existing methods, providing stable performance improvements. Code and data are available at this https URL.
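To make the idea of modality-level noise masking concrete, the following is a minimal, hypothetical sketch (not the paper's actual implementation): each entity carries one feature vector per modality (e.g., graph structure, image, text), and during training whole modality vectors are randomly zeroed out before fusion, forcing the fused representation to stay robust when a modality is missing or noisy. All function and variable names here are illustrative assumptions.

```python
import numpy as np

def mask_modalities(features, mask_prob=0.3, rng=None):
    """Randomly zero out entire modality vectors (illustrative sketch only).

    features: dict mapping modality name -> 1-D feature vector
    mask_prob: probability of dropping each modality independently
    """
    rng = rng or np.random.default_rng(0)
    masked = {}
    for name, vec in features.items():
        if rng.random() < mask_prob:
            masked[name] = np.zeros_like(vec)  # simulate a noisy/missing modality
        else:
            masked[name] = vec
    return masked

# Toy entity with three modalities; a simple mean stands in for the
# Transformer-based fusion described in the abstract.
feats = {"graph": np.ones(4), "image": np.full(4, 2.0), "text": np.full(4, 3.0)}
masked = mask_modalities(feats, mask_prob=0.5)
fused = np.mean(np.stack(list(masked.values())), axis=0)
```

In a real model the zeroed slots would typically be replaced by a learnable mask token and fused by attention rather than a mean; this sketch only illustrates the masking step itself.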
Submission history
From: Zhuo Chen [view email]
[v1] Mon, 11 Mar 2024 15:48:43 UTC (469 KB)
[v2] Wed, 20 Mar 2024 10:02:54 UTC (470 KB)
[v3] Sat, 30 Nov 2024 04:53:04 UTC (441 KB)