
QuitoBench: A High-Quality Open Time Series Forecasting Benchmark

arXiv, March 30, 2026

arXiv:2603.26017v1 (announce type: new)

Authors: Siqiao Xue, Zhaoyang Zhu, Wei Zhang, Rongyao Cai, Rui Wang, Yixiang Mu, Fan Zhou, Jianguo Li, Peng Di, Hang Yu

Abstract: Time series forecasting is critical across finance, healthcare, and cloud computing, yet progress is constrained by a fundamental bottleneck: the scarcity of large-scale, high-quality benchmarks. To address this gap, we introduce QuitoBench, a regime-balanced benchmark for time series forecasting with coverage across eight trend $\times$ seasonality $\times$ forecastability (TSF) regimes, designed to capture forecasting-relevant properties rather than application-defined domain labels. The benchmark is built upon Quito, a billion-scale time series corpus of application traffic from Alipay spanning nine business domains. Benchmarking 10 models from deep learning, foundation models, and statistical baselines across 232,200 evaluation instances, we report four key findings: (i) a context-length crossover where deep learning models lead at short context ($L = 96$) but foundation models dominate at long context ($L \ge 576$); (ii) forecastability is the dominant difficulty driver, producing a $3.64\times$ MAE gap across regimes; (iii) deep learning models match or surpass foundation models at $59\times$ fewer parameters; and (iv) scaling the amount of training data provides substantially greater benefit than scaling model size for both model families. These findings are validated by strong cross-benchmark and cross-metric consistency. Our open-source release enables reproducible, regime-aware evaluation for time series forecasting research.
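To make the regime and MAE-gap terminology concrete, here is a minimal Python sketch. It rests on assumptions not stated in the abstract: that the eight TSF regimes arise from splitting each of the three axes into two levels (2 × 2 × 2), and that the reported gap is the ratio of the largest to the smallest per-regime MAE. The level names and the helper functions `regimes`, `mae`, and `mae_gap` are hypothetical and are not taken from the QuitoBench release.

```python
from itertools import product
from statistics import mean

# Assumption: each axis is binarized, so 2 * 2 * 2 = 8 regimes.
# The labels "low"/"high" are illustrative, not the paper's terminology.
AXES = ("trend", "seasonality", "forecastability")
LEVELS = ("low", "high")


def regimes():
    """Enumerate every combination of levels across the three axes."""
    return [dict(zip(AXES, combo)) for combo in product(LEVELS, repeat=len(AXES))]


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length, non-empty sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def mae_gap(per_regime_mae):
    """Ratio of the hardest regime's MAE to the easiest regime's MAE.

    Assumption: this is how a cross-regime gap such as the 3.64x figure
    would be computed; the paper may define it differently.
    """
    values = list(per_regime_mae.values())
    return max(values) / min(values)


if __name__ == "__main__":
    print(len(regimes()))  # 8
    # Toy numbers for illustration only, not results from the paper.
    toy = {"easy": mae([1.0, 2.0], [1.5, 2.5]), "hard": mae([1.0, 2.0], [3.0, 4.0])}
    print(mae_gap(toy))
```

Reporting the error per regime rather than as a single pooled average is what lets a gap of this kind surface; a pooled MAE would hide which regime drives the difficulty.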

Comments: project site: this https URL

Subjects: Machine Learning (cs.LG)

Cite as: arXiv:2603.26017 [cs.LG] (or arXiv:2603.26017v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.26017

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Siqiao Xue. [v1] Fri, 27 Mar 2026 02:24:34 UTC (448 KB)
