
PLOT: Enhancing Preference Learning via Optimal Transport

arXiv cs.CL · by Liang Zhu, Yuelin Bai, Xiankun Ren, Jiaxi Yang, Lei Zhang, Feiteng Fang, Hamid Alinejad-Rokny, Minghuan Tan, Min Yang · April 4, 2026


Abstract: Preference learning in Large Language Models (LLMs) has advanced significantly, yet existing methods remain limited by modest performance gains, high computational costs, hyperparameter sensitivity, and insufficient modeling of global token-level relationships. We introduce PLOT, which enhances Preference Learning in fine-tuning-based alignment through a token-level loss derived from Optimal Transport. By formulating preference learning as an Optimal Transport Problem, PLOT aligns model outputs with human preferences while preserving the original distribution of LLMs, ensuring stability and robustness. Furthermore, PLOT leverages token embeddings to capture semantic relationships, enabling globally informed optimization. Experiments across two preference categories (Human Values and Logic & Problem Solving), spanning seven subpreferences, demonstrate that PLOT consistently improves alignment performance while maintaining fluency and coherence. These results substantiate optimal transport as a principled methodology for preference learning, establishing a theoretically grounded framework that provides new insights for preference learning of LLMs.
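The abstract does not spell out the PLOT loss itself, so the following is only a minimal sketch of the general idea it names: comparing two token distributions with an optimal transport cost whose ground metric comes from token embeddings. It uses standard entropy-regularized OT (Sinkhorn iterations), which is a swapped-in textbook technique and not necessarily what the authors use. The four-token vocabulary, the 2-D embeddings, the distributions, and the names `sinkhorn`, `reg`, and `n_iters` are all hypothetical, chosen for illustration.

```python
import math

def sinkhorn(a, b, cost, reg=0.1, n_iters=200):
    """Entropy-regularized OT between histograms a and b (illustrative only).

    Returns the transport cost sum(plan * cost) and the transport plan.
    """
    n, m = len(a), len(b)
    # Gibbs kernel derived from the ground cost.
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale columns and rows to match the two marginals.
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    total = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return total, plan

# Hypothetical embeddings: tokens 0 and 1 are near-synonyms, as are 2 and 3.
embeddings = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.0, 0.9]]

# Ground cost: squared Euclidean distance between token embeddings.
cost = [[sum((x - y) ** 2 for x, y in zip(e1, e2)) for e2 in embeddings]
        for e1 in embeddings]

preferred = [0.97, 0.01, 0.01, 0.01]    # target mass on token 0
model_near = [0.01, 0.97, 0.01, 0.01]   # mass on token 1, close to token 0
model_far = [0.01, 0.01, 0.97, 0.01]    # mass on token 2, far from token 0

near_cost, near_plan = sinkhorn(model_near, preferred, cost)
far_cost, far_plan = sinkhorn(model_far, preferred, cost)
print(f"near-synonym prediction: {near_cost:.4f}")
print(f"distant-token prediction: {far_cost:.4f}")
```

Both model distributions place almost no probability on the preferred token, so a purely per-token comparison would treat them as similarly wrong. The transport cost separates them because moving mass between nearby embeddings is cheap, which is one way to read the abstract's claim that token embeddings enable globally informed optimization.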

Subjects: Computation and Language (cs.CL)

Cite as: arXiv:2604.01837 [cs.CL]

(or arXiv:2604.01837v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2604.01837

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Liang Zhu
[v1] Thu, 2 Apr 2026 09:51:56 UTC (79 KB)
