trunk/f2ebf8ce7c86a061a181e8da9e5c0a6150955e0a: [xpu][fix] Fix DeviceOpOverrides registered incorrectly (#178959)

PyTorch Releases, by pytorch, April 3, 2026

# Motivation

The current initialization logic for `DeviceOpOverrides` relies on checking whether `device_op_overrides_dict` is empty:

```python
def get_device_op_overrides(device: str) -> DeviceOpOverrides:
    assert isinstance(device, str), type(device)

    if not device_op_overrides_dict:
        from . import (  # noqa: F401
            cpu_device_op_overrides,
            mps_device_op_overrides,
        )
        from .cuda import device_op_overrides  # noqa: F401
        from .mtia import device_op_overrides as mtia_op_overrides  # noqa: F401
        from .xpu import device_op_overrides as xpu_op_overrides  # noqa: F401

    if device not in device_op_overrides_dict:
        # For backends like TPU that only need no-op overrides (Pallas handles codegen)
        from .cpu_device_op_overrides import CpuDeviceOpOverrides

        register_device_op_overrides(device, CpuDeviceOpOverrides())

    return device_op_overrides_dict[device]
```


This approach is fragile because it assumes that no overrides are registered before `get_device_op_overrides` is first called. However, if `register_device_op_overrides` is invoked independently (e.g., in tests or other modules), the dictionary may become partially populated before full initialization occurs.

In such cases, the lazy initialization block is skipped, and some backends (e.g., XPU) never register their corresponding `DeviceOpOverrides`. As a result, the system silently falls back to `CpuDeviceOpOverrides`, leading to incorrect behavior.

This issue has already caused multiple failures in XPU CI, particularly in `test_gpu_cpp_wrapper.py`. For example, PR #175385 unintentionally registers `CUDADeviceOpOverrides` early, making `device_op_overrides_dict` non-empty and preventing XPU overrides from being registered.

The silent fallback to CPU overrides makes these issues difficult to detect and debug.
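
The failure mode can be reproduced in isolation with a toy registry. This is only a minimal sketch: the override classes are empty stand-ins, and `_register_builtin_backends` is a hypothetical helper that takes the place of the side-effecting imports in the real function. None of it is the actual PyTorch code.

```python
# Toy model of the registry. The classes and _register_builtin_backends are
# hypothetical stand-ins used for illustration, not PyTorch code.
class DeviceOpOverrides:
    pass


class CpuDeviceOpOverrides(DeviceOpOverrides):
    pass


class CUDADeviceOpOverrides(DeviceOpOverrides):
    pass


class XPUDeviceOpOverrides(DeviceOpOverrides):
    pass


device_op_overrides_dict: dict[str, DeviceOpOverrides] = {}


def register_device_op_overrides(device: str, overrides: DeviceOpOverrides) -> None:
    device_op_overrides_dict[device] = overrides


def _register_builtin_backends() -> None:
    # Stands in for the imports whose side effect is to register each backend.
    register_device_op_overrides("cpu", CpuDeviceOpOverrides())
    register_device_op_overrides("cuda", CUDADeviceOpOverrides())
    register_device_op_overrides("xpu", XPUDeviceOpOverrides())


def get_device_op_overrides(device: str) -> DeviceOpOverrides:
    assert isinstance(device, str), type(device)
    if not device_op_overrides_dict:  # the fragile emptiness check
        _register_builtin_backends()
    if device not in device_op_overrides_dict:
        register_device_op_overrides(device, CpuDeviceOpOverrides())
    return device_op_overrides_dict[device]


# An early, independent registration makes the dictionary non-empty...
register_device_op_overrides("cuda", CUDADeviceOpOverrides())

# ...so the initialization block is skipped and "xpu" gets the CPU fallback.
result = get_device_op_overrides("xpu")
print(type(result).__name__)  # CpuDeviceOpOverrides
```

No exception is raised at any point, which is why the wrong overrides only surface later as incorrect behavior.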

# Solution

To make initialization robust and deterministic, introduce a dedicated flag `_device_op_overrides_initialized` to explicitly track whether all device overrides have been fully registered.

# Additional Context

Fixes https://github.com/pytorch/pytorch/issues/178857, https://github.com/pytorch/pytorch/issues/178761, https://github.com/pytorch/pytorch/issues/178753, and https://github.com/pytorch/pytorch/issues/178855, among others.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/178959
Approved by: https://github.com/jansel

Original source

PyTorch Releases

https://github.com/pytorch/pytorch/releases/tag/trunk%2Ff2ebf8ce7c86a061a181e8da9e5c0a6150955e0a
