"It's trained by non-disabled people": Evaluating How Image Quality Affects Product Captioning with Vision-Language Models

arXiv cs.HC, by Kapil Garg, Xinru Tang, Jimin Heo, Dwayne R. Morgan, Darren Gergle, Erik B. Sudderth, Anne Marie Piper. April 1, 2026

Abstract: Vision-Language Models (VLMs) are increasingly used by blind and low-vision (BLV) people to identify and understand products in their everyday lives, such as food, personal care items, and household goods. Despite their prevalence, we lack an empirical understanding of how common image quality issues--such as blur, misframing, and rotation--affect the accuracy of VLM-generated captions and whether the resulting captions meet BLV people's information needs. Based on a survey of 86 BLV participants, we develop an annotated dataset of 1,859 product images from BLV people to systematically evaluate how image quality issues affect VLM-generated captions. While the best VLM achieves 98% accuracy on images with no quality issues, accuracy drops to 75% overall when quality issues are present, worsening considerably as issues compound. We discuss the need for model evaluations that center on disabled people's experiences throughout the process and offer concrete recommendations for HCI and ML researchers to make VLMs more reliable for BLV people.

Comments: Published at CHI 2026; Honorable Mention for Best Paper (Top 5%). Dataset available at: this https URL

Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2511.08917 [cs.HC] (or arXiv:2511.08917v3 [cs.HC] for this version)

DOI: https://doi.org/10.48550/arXiv.2511.08917 (arXiv-issued DOI via DataCite)

Related DOI: https://doi.org/10.1145/3772318.3791309

Submission history

From: Kapil Garg
[v1] Wed, 12 Nov 2025 02:54:13 UTC (29,971 KB)
[v2] Sat, 22 Nov 2025 22:58:28 UTC (11,977 KB)
[v3] Tue, 31 Mar 2026 11:56:00 UTC (12,339 KB)
