
Differentially Private Linear Regression and Synthetic Data Generation with Statistical Guarantees

arXiv, March 31, 2026

arXiv:2510.16974v3. Announce type: replace. Authors: Shurong Lin, Aleksandra Slavković, Deekshith Reddy Bhoomireddy


Abstract: In the social sciences, small- to medium-scale datasets are common, and linear regression is canonical. In privacy-aware settings, much work has focused on differentially private (DP) linear regression, but mostly on point estimation with limited attention to uncertainty quantification. Meanwhile, synthetic data generation (SDG) is increasingly important for reproducibility studies, yet current DP linear regression methods do not readily support it. Mainstream DP-SDG approaches are either tailored to discrete or discretized data, making them less suitable for analyses involving continuous variables, or rely on deep learning models that require large datasets, limiting their use for the smaller-scale data typical in social science. We propose a method for linear regression with valid inference under Gaussian DP. It includes a bias-corrected estimator with asymptotic confidence intervals (CIs) and a general SDG procedure such that the corresponding regression on the synthetic data matches our DP linear regression procedure. Our approach is effective in small- to moderate-dimensional settings. Experiments show that our method (1) improves accuracy over existing methods for DP linear regression, (2) provides valid CIs, and (3) produces more reliable synthetic data for downstream statistical and machine learning tasks than current DP synthesizers.
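To make the setting concrete, here is a minimal sketch of one generic way to fit a linear regression under Gaussian DP: perturb the sufficient statistics with the Gaussian mechanism, then solve the normal equations. This is not the authors' procedure. The abstract does not specify their estimator, and the sketch includes no bias correction, confidence intervals, or synthetic data step. The function names (`clip_rows`, `dp_linear_regression`), the row-clipping bound, the add/remove-one-record neighboring relation, and the small ridge term are all assumptions made for illustration. The one standard fact it relies on is that adding N(0, (Δ/μ)²) noise to a statistic with L2 sensitivity Δ satisfies μ-Gaussian DP.

```python
import numpy as np


def clip_rows(Z, bound=1.0):
    """Rescale each row of Z so its L2 norm is at most `bound`."""
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    scale = np.minimum(1.0, bound / np.maximum(norms, 1e-12))
    return Z * scale


def dp_linear_regression(X, y, mu, rng, ridge=1e-3):
    """Illustrative mu-GDP regression via noisy sufficient statistics.

    Assumptions (not from the paper): no intercept, add/remove-one-record
    neighbors, and joint rows (x_i, y_i) clipped to the unit L2 ball.
    """
    # Stack features and response, then clip so each row has norm <= 1.
    Z = clip_rows(np.column_stack([X, y]), bound=1.0)
    d = Z.shape[1]

    # One record contributes z z^T with Frobenius norm ||z||^2 <= 1, so the
    # upper triangle of Z^T Z has L2 sensitivity at most 1.
    gram = Z.T @ Z

    # Gaussian mechanism: sigma = sensitivity / mu gives mu-GDP.
    sigma = 1.0 / mu
    upper = np.triu(rng.normal(0.0, sigma, size=(d, d)))
    noise = upper + np.triu(upper, k=1).T  # mirror to keep the matrix symmetric

    noisy = gram + noise
    xtx = noisy[:-1, :-1]  # noisy X^T X
    xty = noisy[:-1, -1]   # noisy X^T y

    # The small ridge term is a hypothetical numerical stabilizer in case the
    # noisy X^T X is ill-conditioned; solving is post-processing and costs no
    # additional privacy.
    return np.linalg.solve(xtx + ridge * np.eye(d - 1), xty)
```

Calling `dp_linear_regression(X, y, mu=1.0, rng=np.random.default_rng(0))` returns a coefficient vector with one entry per column of `X`. With small datasets the injected noise can dominate the Gram matrix, which is the regime where the bias correction and valid CIs described in the abstract matter most.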

Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Cite as: arXiv:2510.16974 [cs.LG]

(or arXiv:2510.16974v3 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2510.16974

arXiv-issued DOI via DataCite

Submission history

From: Shurong Lin
[v1] Sun, 19 Oct 2025 19:30:41 UTC (61 KB)
[v2] Sun, 8 Feb 2026 02:26:54 UTC (61 KB)
[v3] Sat, 28 Mar 2026 20:26:46 UTC (61 KB)
