
Neocloud Pioneer CoreWeave All In on Inference

AI Business | By Shaun Sutner | April 2, 2026

After making a name for itself as a GPU-as-a-service vendor, CoreWeave is evolving -- again.

Inference is everything.

That aphorism, and the view of AI infrastructure behind it, has been appearing frequently in AI circles lately.

Now CoreWeave, the cryptocurrency startup turned major neocloud player that has a close relationship with AI chip giant Nvidia, has started to pivot toward one of the fastest-growing trends in AI -- inference.

The vendor operates some 40 AI data centers -- largely populated by Nvidia GPUs -- and serves dozens of major customers, including generative AI vendors OpenAI, Cohere and ElevenLabs; enterprises and tech vendors such as Siemens, Mercado Libre, Salesforce and Databricks; and AI platforms Perplexity, Cursor and Runway.

Putting Inference to Use

"Inference is the way to monetize AI," Chen Goldberg, executive vice president of product and engineering at CoreWeave, said during an online media roundtable earlier this week. "We are seeing that with our customer base, no matter if it's enterprise AI, AI labs or AI platforms, customers are looking for different methods to run inference. That's what we've been doing."

Propelling the demand for inference is the dramatic surge in agentic AI interest. Many AI users are interested in using autonomous agents that lean heavily on the reasoning capabilities of large language models. And reasoning largely relies on inference, with agents drawing new conclusions and acting independently rather than regurgitating information from huge, pretrained LLMs.

"Instead of a single query … we have a new category of agents, which [do] a long-running task. [Agents] can complete more complicated tasks, maybe with multiple queries," Goldberg said.

Applications that are increasingly using agentic AI and inference include coding, engineering, physical AI, call centers and drug discovery, she noted.

Speed and Older GPUs

Meanwhile, CoreWeave is touting recent top speed results on the independent MLPerf Training benchmark suite from the MLCommons consortium. The vendor used Nvidia Grace Blackwell architectures to run two popular, powerful reasoning models: DeepSeek-R1 and OpenAI's smaller open-weight gpt-oss-120b.

That speed is important for extracting the most performance from earlier-generation GPUs, said Shadi Saba, senior director of AI/ML infrastructure at CoreWeave, during the roundtable.

With Nvidia and other chip vendors rapidly releasing newer generations of GPUs, industry observers have raised financial concerns about depreciating GPUs as faster, more capable chips arrive on the market.

"Compared with older generations, the same model will squeeze the most from whatever Nvidia is giving between generations," Saba said, noting that CoreWeave uses its own software stack to optimize performance from GPUs and CPUs, which are becoming more popular for inference tasks.

CoreWeave's strategy of wringing usable production from older GPUs, while also upgrading to the latest chips, is effective, said Steven Dickens, an analyst at HyperFrame Research.

"You've got to look at it as a sort of portfolio construction, in the same way you do your stock portfolio. You want some things that earn you money from dividends, and then you want some high growth stocks," Dickens said, adding that the vendor can provide reliable inference compute with older chips. "The same thing with CoreWeave. They have some H100 chips that are probably three or four years old. Those are still in the portfolio and still earning money."

The strategy, however, isn't unique. Neocloud competitors including Nebius, Lambda, OVH and QumulusAI also employ it.

The Neocloud Market

Dickens said neocloud vendors specialize in using their software stacks to optimize the performance of older chips and in moving workloads to the most cost-effective GPUs and other chips.

"That's the secret sauce of a neocloud, their ability to portfolio manage their GPU fleet and then be able to move workloads to optimize," he said. "Everybody's going to say they want their stuff to run on the latest and greatest. Very few workloads actually need to work on the latest and greatest."

As for the neocloud market landscape, Dickens said it is starting to shake out to a handful of major players.

There were some 150 neocloud startups 18 months ago, he said, and he expects that number to winnow down to 10 or so dominant players over the next five years.

"Winner-takes-most is how I see this industry panning out, not winner-takes-all," Dickens said. "It's not going to be that there's no more business for Lambda, Nebius and OVH. There's obviously going to be business for those guys, and CoreWeave is going to be one of those names."

About the Author

Senior News Director, AI Business

Shaun Sutner, a journalist with more than 25 years of daily newspaper experience and 11 years at Informa TechTarget as an editor and writer, directs news coverage for AI Business. He was previously a senior news and features writer covering health IT and HR software at TechTarget and a senior news director overseeing coverage of AI, business analytics, data management and government tech regulation.

Sutner's newspaper career included investigative reporting and covering the Massachusetts State House and politics for the Worcester Telegram & Gazette. He has written about snow sports as a T&G columnist and correspondent for 20 years. Sutner's interests also include tennis, standup paddleboarding, cooking and popular music.
