On computing and the complexity of computing higher-order $U$-statistics, exactly

arXiv stat.ML | by Xingyu Chen, Ruiqi Zhang, Lin Liu | April 1, 2026

Abstract: Higher-order $U$-statistics abound in fields such as statistics, machine learning, and computer science, but are known to be highly time-consuming to compute in practice. Despite their widespread appearance, a comprehensive study of their computational complexity is surprisingly lacking. This paper aims to fill this gap by presenting several results related to the computational aspect of $U$-statistics. First, we derive a useful decomposition from an $m$-th order $U$-statistic to a linear combination of $V$-statistics with orders not exceeding $m$, which are generally more feasible to compute. Second, we explore the connection between exactly computing $V$-statistics and Einstein summation, a tool often used in computational mathematics and quantum computing to accelerate tensor computations. Third, we provide an optimistic estimate of the time complexity for exactly computing $U$-statistics, based on the treewidth of a particular graph associated with the $U$-statistic kernel. The above ingredients lead to (1) a new, much more runtime-efficient algorithm to exactly compute general higher-order $U$-statistics, and (2) a more streamlined characterization of runtime complexity of computing $U$-statistics. We develop an accompanying open-source package called \texttt{u-stats} in both Python (this https URL) and R (this https URL). We demonstrate through three examples in statistics that \texttt{u-stats} achieves impressive runtime performance compared to existing benchmarks. This paper also aspires to achieve two goals: (1) to capture the interest of researchers in both statistics and other related areas to further advance the algorithmic development of $U$-statistics and (2) to lift the burden of implementing higher-order $U$-statistics from practitioners.
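To make the first two ingredients of the abstract concrete, here is a minimal sketch of the general idea, not the paper's algorithm and not the API of the \texttt{u-stats} package. It assumes a hypothetical third-order kernel that factors as $h(x_i, x_j, x_k) = A_{ij} B_{jk}$, where `A` and `B` are precomputed $n \times n$ matrices. The sum over distinct index triples is rewritten, by inclusion-exclusion over coinciding indices, as a linear combination of unrestricted ($V$-statistic style, here unnormalized) sums, and each of those sums is written as a NumPy `einsum` expression. The function names are illustrative only.

```python
import itertools

import numpy as np


def u_stat_bruteforce(A, B):
    """Order-3 U-statistic for the assumed kernel h(i, j, k) = A[i, j] * B[j, k].

    Sums over all ordered triples of distinct indices: O(n^3) Python loop.
    """
    n = A.shape[0]
    total = sum(
        A[i, j] * B[j, k]
        for i, j, k in itertools.permutations(range(n), 3)
    )
    return total / (n * (n - 1) * (n - 2))


def u_stat_einsum(A, B):
    """Same quantity via unrestricted sums, each expressed with einsum.

    Inclusion-exclusion over which indices coincide:
      distinct = all - (i=j) - (j=k) - (i=k) + 2 * (i=j=k)
    """
    n = A.shape[0]
    v_all = np.einsum("ij,jk->", A, B)   # no restriction on i, j, k
    v_ij = np.einsum("ii,ik->", A, B)    # i = j
    v_jk = np.einsum("ij,jj->", A, B)    # j = k
    v_ik = np.einsum("ij,ji->", A, B)    # i = k
    v_ijk = np.einsum("ii,ii->", A, B)   # i = j = k
    distinct_sum = v_all - (v_ij + v_jk + v_ik) + 2.0 * v_ijk
    return distinct_sum / (n * (n - 1) * (n - 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.normal(size=(8, 8))
    B = rng.normal(size=(8, 8))
    print(u_stat_bruteforce(A, B), u_stat_einsum(A, B))
```

In this toy case every unrestricted sum involves at most pairwise factors, so it can in principle be contracted one index at a time rather than by looping over all triples. How cheaply such contractions can be ordered for a general kernel is the kind of question the abstract ties to the treewidth of a graph associated with the kernel.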

Comments: Comments are welcome! 71 pages, 8 tables, 5 figures. An accompanying Python package is available at: this https URL or this https URL

Subjects: Machine Learning (stat.ML); Data Structures and Algorithms (cs.DS); Numerical Analysis (math.NA); Computation (stat.CO); Methodology (stat.ME)

Cite as: arXiv:2508.12627 [stat.ML]

(or arXiv:2508.12627v2 [stat.ML] for this version)

https://doi.org/10.48550/arXiv.2508.12627

arXiv-issued DOI via DataCite

Submission history

From: Ruiqi Zhang
[v1] Mon, 18 Aug 2025 05:01:10 UTC (49 KB)
[v2] Tue, 31 Mar 2026 06:15:41 UTC (68 KB)

Original source

arXiv stat.ML

https://arxiv.org/abs/2508.12627
