FLEX-CLIP: Feature-Level GEneration Network Enhanced CLIP for X-shot Cross-modal Retrieval

arXiv:2411.17454v1 Announce Type: cross
Abstract: Given a query from one modality, few-shot cross-modal retrieval (CMR) retrieves semantically similar instances in another modality, where the target domain includes classes that are disjoint from the source domain. Compared with classical few-shot CMR methods, vision-language pretraining methods like CLIP have shown great few-shot or zero-shot learning performance. However, they still face challenges due to (1) the feature degradation encountered in the target domain and (2) the extreme data imbalance. To tackle these issues, we propose FLEX-CLIP, a novel Feature-level Generation Network Enhanced CLIP. FLEX-CLIP includes two training stages. In multimodal feature generation, we propose a composite multimodal VAE-GAN network to capture real feature distribution patterns and generate pseudo samples based on CLIP features, addressing data imbalance. For common space projection, we develop a gate residual network to fuse CLIP features with projected features, reducing feature degradation in X-shot scenarios. Experimental results on four benchmark datasets show a 7%-15% improvement over state-of-the-art methods, with ablation studies demonstrating enhancement of CLIP features.
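The abstract gives no equations for the gate residual network, so the following is only a minimal sketch of one common way to fuse an original feature with its projection through a learned gate. The function name `gated_residual_fusion`, the sigmoid gate over the concatenated features, and the convex-combination form are all assumptions made for illustration, not the authors' formulation.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, vec):
    """Compute W @ vec + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def gated_residual_fusion(clip_feat, proj_w, proj_b, gate_w, gate_b):
    """Hypothetical gated fusion of a CLIP feature with its projection.

    clip_feat: list of d floats (the frozen CLIP feature).
    proj_w, proj_b: d x d weights and d biases of the projection layer.
    gate_w, gate_b: d x 2d weights and d biases of the gate layer.
    """
    # Project the CLIP feature into the (assumed) common space.
    projected = linear(proj_w, proj_b, clip_feat)
    # The gate looks at both views (list concatenation, length 2d).
    gate = [sigmoid(g) for g in linear(gate_w, gate_b, clip_feat + projected)]
    # Per-dimension convex combination: gate near 0 keeps the CLIP feature,
    # gate near 1 keeps the projected feature.
    return [g * p + (1.0 - g) * c
            for g, p, c in zip(gate, projected, clip_feat)]


if __name__ == "__main__":
    clip = [0.5, -1.0, 2.0]
    proj_w = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
    proj_b = [0.0, 0.0, 0.0]
    gate_w = [[0.0] * 6 for _ in range(3)]  # gate driven by bias only
    print(gated_residual_fusion(clip, proj_w, proj_b, gate_w, [0.0] * 3))
```

In this sketch, a gate that stays close to zero passes the CLIP feature through almost unchanged, which is one plausible way a residual path could limit feature degradation when only a few labeled samples are available to train the projection.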

