Live
Black Hat USADark ReadingBlack Hat AsiaAI Business5 AI-powered consulting startups to watchBusiness InsiderWhat Teens Are Doing With Those Role-Playing Chatbots - The New York TimesGoogle News: AIOCSF explained: The shared data language security teams have been missingVentureBeat AIdark ilanlesswrong.comMicrosoft Is Going Multi-Model with Copilot. Does the Enterprise King Win Again? - The Motley FoolGNews AI MicrosoftShow HN: Running local OpenClaw together with remote agents in an open networkHacker NewsA folk musician became a target for AI fakes and a copyright trollThe Verge AIWhat Teens Are Doing With Those Role-Playing ChatbotsNYT TechnologyChicken-Free Egg Whiteslesswrong.comDesktop Canary v2.1.48-canary.35LobeChat ReleasesPlease someone recommend me a good model for Linux Mint + 12 GB RAM + 3 GB VRAM + GTX 1050 setup.Reddit r/LocalLLaMABest Artificial Intelligence Stocks To Add to Your Watchlist - April 4th - MarketBeatGoogle News: AIBlack Hat USADark ReadingBlack Hat AsiaAI Business5 AI-powered consulting startups to watchBusiness InsiderWhat Teens Are Doing With Those Role-Playing Chatbots - The New York TimesGoogle News: AIOCSF explained: The shared data language security teams have been missingVentureBeat AIdark ilanlesswrong.comMicrosoft Is Going Multi-Model with Copilot. Does the Enterprise King Win Again? - The Motley FoolGNews AI MicrosoftShow HN: Running local OpenClaw together with remote agents in an open networkHacker NewsA folk musician became a target for AI fakes and a copyright trollThe Verge AIWhat Teens Are Doing With Those Role-Playing ChatbotsNYT TechnologyChicken-Free Egg Whiteslesswrong.comDesktop Canary v2.1.48-canary.35LobeChat ReleasesPlease someone recommend me a good model for Linux Mint + 12 GB RAM + 3 GB VRAM + GTX 1050 setup.Reddit r/LocalLLaMABest Artificial Intelligence Stocks To Add to Your Watchlist - April 4th - MarketBeatGoogle News: AI
AI NEWS HUBbyEIGENVECTOREigenvector

VueBuds: Visual Intelligence with Wireless Earbuds

arXiv cs.HCby Maruchi Kim, Rasya Fawwaz, Zhi Yang Lim, Brinda Moudgalya, Hexi Wang, Yuanhao Zeng, Shyamnath GollakotaApril 1, 20261 min read0 views
Source Quiz

arXiv:2603.29095v1 Announce Type: new Abstract: Despite their ubiquity, wireless earbuds remain audio-centric due to size and power constraints. We present VueBuds, the first camera-integrated wireless earbuds for egocentric vision, capable of operating within stringent power and form-factor limits. Each VueBud embeds a camera into a Sony WF-1000XM3 to stream visual data over Bluetooth to a host device for on-device vision language model (VLM) processing. We show analytically and empirically that while each camera's field of view is partially occluded by the face, the combined binocular perspective provides comprehensive forward coverage. By integrating VueBuds with VLMs, we build an end-to-end system for real-time scene understanding, translation, visual reasoning, and text reading; all f

View PDF

Abstract:Despite their ubiquity, wireless earbuds remain audio-centric due to size and power constraints. We present VueBuds, the first camera-integrated wireless earbuds for egocentric vision, capable of operating within stringent power and form-factor limits. Each VueBud embeds a camera into a Sony WF-1000XM3 to stream visual data over Bluetooth to a host device for on-device vision language model (VLM) processing. We show analytically and empirically that while each camera's field of view is partially occluded by the face, the combined binocular perspective provides comprehensive forward coverage. By integrating VueBuds with VLMs, we build an end-to-end system for real-time scene understanding, translation, visual reasoning, and text reading; all from low-resolution monochrome cameras drawing under 5mW through on-demand activation. Through online and in-person user studies with 90 participants, we compare VueBuds against smart glasses across 17 visual question-answering tasks, and show that our system achieves response quality on par with Ray-Ban Meta. Our work establishes low-power camera-equipped earbuds as a compelling platform for visual intelligence, bringing rapidly advancing VLM capabilities to one of the most ubiquitous wearable form factors.

Comments: CHI 2026

Subjects:

Human-Computer Interaction (cs.HC)

Cite as: arXiv:2603.29095 [cs.HC]

(or arXiv:2603.29095v1 [cs.HC] for this version)

https://doi.org/10.48550/arXiv.2603.29095

arXiv-issued DOI via DataCite (pending registration)

Related DOI:

https://doi.org/10.1145/3772318.3791322

DOI(s) linking to related resources

Submission history

From: Maruchi Kim [view email] [v1] Tue, 31 Mar 2026 00:29:33 UTC (19,793 KB)

Was this article helpful?

Sign in to highlight and annotate this article

AI
Ask AI about this article
Powered by Eigenvector · full article context loaded
Ready

Conversation starters

Ask anything about this article…

Daily AI Digest

Get the top 5 AI stories delivered to your inbox every morning.

More about

modellanguage modelannounce

Knowledge Map

Knowledge Map
TopicsEntitiesSource
VueBuds: Vi…modellanguage mo…announceplatformperspectivereasoningarXiv cs.HC

Connected Articles — Knowledge Graph

This article is connected to other articles through shared AI topics and tags.

Knowledge Graph100 articles · 148 connections
Scroll to zoom · drag to pan · click to open

Discussion

Sign in to join the discussion

No comments yet — be the first to share your thoughts!