
Pioneering Perceptual Video Fluency Assessment: A Novel Task with Benchmark Dataset and Baseline

arXiv, March 30, 2026

Authors: Qizhi Xie, Kun Yuan, Yunpeng Qu, Ming Sun, Chao Zhou, Jihong Zhu


Abstract: Accurately estimating humans' subjective feedback on video fluency, e.g., motion consistency and frame continuity, is crucial for applications such as streaming and gaming. Yet it has long been overlooked: prior work has addressed it only within the video quality assessment (VQA) task, as a sub-dimension of overall quality. In this work, we conduct pilot experiments and reveal that current VQA predictions largely underrepresent fluency, which limits their applicability. To this end, we pioneer Video Fluency Assessment (VFA) as a standalone perceptual task focused on the temporal dimension. To advance VFA research, 1) we construct a fluency-oriented dataset, FluVid, comprising 4,606 in-the-wild videos with a balanced fluency distribution, featuring the first-ever scoring criteria and human study for VFA. 2) We develop a large-scale benchmark of 23 methods on FluVid, the most comprehensive one thus far, gathering insights for VFA-tailored model designs. 3) We propose a baseline model called FluNet, which deploys temporal permuted self-attention (T-PSA) to enrich input fluency information and enhance long-range inter-frame interactions. Our work not only achieves state-of-the-art performance but, more importantly, offers the community a roadmap for exploring solutions to VFA.

Comments: 14 pages, 6 figures. Accepted by CVPR 2026 findings track

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2603.26055 [cs.CV] (or arXiv:2603.26055v1 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2603.26055

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Qizhi Xie
[v1] Fri, 27 Mar 2026 03:47:41 UTC (9,684 KB)
