Efficient Bilevel Optimization with KFAC-Based Hypergradients

arXiv cs.LG | by Disen Liao, Felix Dangel, Yaoliang Yu | April 1, 2026

arXiv:2603.29108v1 Announce Type: new

Abstract: Bilevel optimization (BO) is widely applicable to many machine learning problems. Scaling BO, however, requires repeatedly computing hypergradients, which involves solving inverse Hessian-vector products (IHVPs). In practice, these operations are often approximated using crude surrogates such as one-step gradient unrolling or identity/short Neumann expansions, which discard curvature information. We build on implicit function theorem-based algorithms and propose to incorporate Kronecker-factored approximate curvature (KFAC), yielding curvature-aware hypergradients with a better performance-efficiency trade-off than Conjugate Gradient (CG) or Neumann methods and consistently outperforming unrolling. We evaluate this approach across diverse tasks, including meta-learning and AI safety problems. On models up to BERT, we show that curvature information is valuable at scale, and KFAC can provide it with only modest memory and runtime overhead. Our implementation is available at this https URL.
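To make the terms in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy bilevel problem chosen purely for illustration: ridge regression whose regularization strength `lam` is the outer variable. The inner Hessian is then H = X^T X + lam * I, and the implicit function theorem gives the hypergradient -w^T H^{-1} grad_w L_val. The sketch compares three ways of computing the IHVP (an exact solve, a truncated Neumann series, and CG), and separately shows the Kronecker identity that makes a factored curvature block A ⊗ G cheap to invert, which is the structure KFAC relies on. All function names, data, and sizes here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
X_tr, y_tr = rng.normal(size=(40, 5)), rng.normal(size=40)    # toy training data
X_val, y_val = rng.normal(size=(20, 5)), rng.normal(size=20)  # toy validation data

def inner_solution(lam):
    """Minimizer of 0.5*||X w - y||^2 + 0.5*lam*||w||^2 and its Hessian H."""
    H = X_tr.T @ X_tr + lam * np.eye(X_tr.shape[1])
    return np.linalg.solve(H, X_tr.T @ y_tr), H

def val_loss(w):
    r = X_val @ w - y_val
    return 0.5 * r @ r

def neumann_ihvp(H, v, steps, alpha):
    """H^{-1} v ~= alpha * sum_{k=0}^{steps} (I - alpha*H)^k v (needs 0 < alpha < 2/lambda_max)."""
    term, total = v.copy(), v.copy()
    for _ in range(steps):
        term = term - alpha * (H @ term)
        total = total + term
    return alpha * total

def cg_ihvp(H, v, steps, tol=1e-12):
    """Conjugate gradient for H x = v with symmetric positive definite H."""
    x, r = np.zeros_like(v), v.copy()
    p, rs = r.copy(), r @ r
    for _ in range(steps):
        if rs < tol:
            break
        Hp = H @ p
        a = rs / (p @ Hp)
        x = x + a * p
        r = r - a * Hp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def hypergradient(lam, ihvp):
    """IFT hypergradient dL_val/dlam = -w^T H^{-1} grad_w L_val for the toy problem."""
    w, H = inner_solution(lam)
    v = X_val.T @ (X_val @ w - y_val)  # gradient of the validation loss w.r.t. w
    return -w @ ihvp(H, v)             # d/dlam of the inner gradient is w itself

def kron_inverse_apply(A, G, V):
    """(A kron G)^{-1} vec(V) = vec(G^{-1} V A^{-T}), with column-major vec."""
    return np.linalg.solve(A, np.linalg.solve(G, V).T).T

lam = 1.0
_, H = inner_solution(lam)
alpha = 1.0 / np.linalg.eigvalsh(H).max()
g_exact = hypergradient(lam, lambda H, v: np.linalg.solve(H, v))
g_cg = hypergradient(lam, lambda H, v: cg_ihvp(H, v, steps=5))
g_neumann_short = hypergradient(lam, lambda H, v: neumann_ihvp(H, v, 5, alpha))
g_neumann_long = hypergradient(lam, lambda H, v: neumann_ihvp(H, v, 500, alpha))

# Central finite difference through the exact inner solution, as a reference.
eps = 1e-5
g_fd = (val_loss(inner_solution(lam + eps)[0])
        - val_loss(inner_solution(lam - eps)[0])) / (2 * eps)

print("exact:", g_exact, "finite diff:", g_fd)
print("CG (5 steps):", g_cg)
print("Neumann (5 terms):", g_neumann_short, "Neumann (500 terms):", g_neumann_long)
```

In this toy setting the full Hessian is small enough to invert directly, so the exact solve serves as ground truth. In KFAC, a layer's curvature block for a weight matrix is approximated as a Kronecker product of two small factors (one built from layer inputs, one from output gradients), so `kron_inverse_apply` only ever solves against the two factors rather than the full block.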

Comments: 25 pages, AISTATS 2026

Subjects: Machine Learning (cs.LG)

Cite as: arXiv:2603.29108 [cs.LG]

(or arXiv:2603.29108v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.29108

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Disen Liao

[v1] Tue, 31 Mar 2026 00:54:31 UTC (2,262 KB)

Original source: arXiv cs.LG, https://arxiv.org/abs/2603.29108
