JEPA-MSAC: A Joint-Embedding Predictive Architecture for Multimodal Sensing-Assisted Communications
arXiv:2603.29796v1 Announce Type: new
Abstract: Future wireless systems increasingly require predictive and transferable representations that can support multiple physical-layer (PHY) tasks under dynamic environments. However, most existing supervised learning-based methods are designed for a single task, which leads to high adaptation cost. To address this issue, we propose a joint-embedding predictive architecture for multimodal sensing-assisted communications (JEPA-MSAC), a self-supervised multimodal predictive representation learning framework for wireless environments. The proposed framework first maps multimodal sensing and communication measurements into a unified token space, and then pretrains a shared backbone using temporal block-masked JEPA to learn a predictive latent space that captures environment dynamics and cross-modal dependencies. After pretraining, the backbone is frozen and reused as a general future-feature generator, on top of which lightweight task heads are trained for localization, beam prediction, and received signal strength indicator (RSSI) prediction. Extensive experiments show the latent state supports accurate multi-task prediction with low adaptation cost. Additionally, ablation studies reveal its scaling behavior and the impact of key pretraining setups.
Comments: 13 pages, 10 figures
Subjects:
Signal Processing (eess.SP)
Cite as: arXiv:2603.29796 [eess.SP]
(or arXiv:2603.29796v1 [eess.SP] for this version)
https://doi.org/10.48550/arXiv.2603.29796
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Can Zheng [view email] [v1] Tue, 31 Mar 2026 14:29:42 UTC (2,203 KB)
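To make the pretraining idea concrete, below is a minimal NumPy sketch of one temporal block-masked JEPA update as described in the abstract: a contiguous block of time steps is hidden, a context encoder embeds the visible tokens, an EMA target encoder embeds the masked ones, and a predictor is trained to match the target latents. The linear encoders, single-vector context summary, sizes, and learning rates are illustrative assumptions for exposition, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

T, D, H = 16, 8, 8    # time steps, token dim, latent dim (illustrative sizes)
lr, ema = 0.05, 0.99  # hypothetical learning rate and EMA decay

# Unified token sequence standing in for fused multimodal sensing/comm measurements.
tokens = rng.normal(size=(T, D))

W_ctx = rng.normal(scale=0.1, size=(D, H))  # context encoder (trained)
W_tgt = W_ctx.copy()                        # target encoder (EMA copy, no gradient)
W_pred = np.eye(H) + rng.normal(scale=0.01, size=(H, H))  # predictor head

def jepa_step(tokens, W_ctx, W_tgt, W_pred):
    """One temporal block-masked JEPA update (linear encoders for clarity)."""
    # Hide a contiguous temporal block of B steps: [start, start + B).
    B = 4
    start = rng.integers(0, T - B)
    masked = np.zeros(T, dtype=bool)
    masked[start:start + B] = True

    ctx = tokens[~masked] @ W_ctx   # context latents from visible steps
    tgt = tokens[masked] @ W_tgt    # target latents (treated as stop-gradient)

    # Predict masked-step latents from a pooled context state.
    state = ctx.mean(axis=0)
    pred = np.tile(state @ W_pred, (B, 1))

    err = pred - tgt                # latent-space prediction error
    loss = float((err ** 2).mean())

    # Scaled gradient step on the predictor only (context-encoder grad omitted).
    W_pred -= lr * 2.0 * np.outer(state, err.mean(axis=0))

    # EMA update of the target encoder, as in standard JEPA-style pretraining.
    W_tgt[:] = ema * W_tgt + (1 - ema) * W_ctx
    return loss

losses = [jepa_step(tokens, W_ctx, W_tgt, W_pred) for _ in range(50)]
```

After pretraining, the abstract's recipe would freeze the context encoder (here `W_ctx`) and fit only lightweight heads on its latents for localization, beam prediction, or RSSI prediction, which is what keeps the per-task adaptation cost low.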