
$\phi$-DPO: Fairness Direct Preference Optimization Approach to Continual Learning in Large Multimodal Models

arXiv · [Submitted on 26 Feb 2026 (v1), last revised 30 Mar 2026 (this version, v2)] · March 31, 2026

Authors: Thanh-Dat Truong, Huu-Thien Tran, Jackson Cothren, Bhiksha Raj, Khoa Luu


Abstract: Fairness in Continual Learning for Large Multimodal Models (LMMs) is an emerging yet underexplored challenge, particularly in the presence of imbalanced data distributions that can lead to biased model updates and suboptimal performance across tasks. While recent continual learning studies have made progress in addressing catastrophic forgetting, the fairness problem caused by imbalanced data remains largely underexplored. This paper presents a novel Fairness Direct Preference Optimization (FaiDPO or $\phi$-DPO) framework for continual learning in LMMs. In particular, we first propose a new continual learning paradigm based on Direct Preference Optimization (DPO) to mitigate catastrophic forgetting by aligning learning with pairwise preference signals. Then, we identify the limitations of conventional DPO on imbalanced data and present a new $\phi$-DPO loss that explicitly addresses distributional biases. We provide a comprehensive theoretical analysis demonstrating that our approach addresses both forgetting and data imbalance. Additionally, to enable $\phi$-DPO-based continual learning, we construct pairwise preference annotations for existing benchmarks in the context of continual learning. Extensive experiments and ablation studies show that the proposed $\phi$-DPO achieves state-of-the-art performance across multiple benchmarks, outperforming prior continual learning methods for LMMs.
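For readers unfamiliar with the baseline the abstract builds on, the following is a minimal sketch of the conventional DPO loss for a single preference pair, written in plain Python. It is not the paper's $\phi$-DPO loss, whose form the abstract does not give. The function names, the `beta` default, and the idea of treating a frozen earlier model as the reference are illustrative assumptions, not details taken from the paper.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # hypothetical default; the abstract gives no value
) -> float:
    """Conventional DPO loss for one (chosen, rejected) response pair.

    Inputs are sequence log-probabilities under the trainable policy and
    under a frozen reference model (assumed here, for illustration, to be
    a copy of the model from before the current task).
    """
    # Implicit reward margins: how far the policy has moved from the reference.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    # The loss falls as the policy favors the chosen response more than the
    # reference does, relative to the rejected response.
    return -log_sigmoid(beta * (chosen_margin - rejected_margin))
```

When the policy equals the reference, both margins are zero and the loss is $\log 2$. Every pair contributes to this loss on the same footing, regardless of how often its task or class appears in the data; the abstract states that the $\phi$-DPO loss is designed to address the resulting distributional bias.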

Comments: Accepted to CVPR'26

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2602.22601 [cs.LG]

(or arXiv:2602.22601v2 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2602.22601

arXiv-issued DOI via DataCite

Submission history

From: Thanh-Dat Truong
[v1] Thu, 26 Feb 2026 04:14:33 UTC (5,971 KB)
[v2] Mon, 30 Mar 2026 15:16:05 UTC (5,962 KB)
