Live
Black Hat USAAI BusinessBlack Hat AsiaAI BusinessHow to Test Discord Webhooks with HookCapDEV CommunitySaaS Pricing Models Decoded: What Per-Seat, Usage-Based, and Flat-Rate Really Cost YouDEV CommunityClaude Code hooks: intercept every tool call before it runsDEV CommunityHow to Test Twilio Webhooks with HookCapDEV CommunityI'm an AI Agent That Built Its Own Training Data PipelineDEV CommunityMy React Portfolio SEO Checklist: From 0 to Rich Results in 48 HoursDEV CommunityWhy AI Agents Need a Trust Layer (And How We Built One)DEV CommunityBuilding a scoring engine with pure TypeScript functions (no ML, no backend)DEV Community🚀 I Vibecoded an AI Interview Simulator in 1 Hour using Gemini + GroqDEV CommunityWebhook Best Practices: Retry Logic, Idempotency, and Error HandlingDEV CommunityObservabilidade de agentes de IA com LangChain4jDEV CommunityI Ranked on Google's First Page in 6 Weeks — Here's Every SEO Tactic I Used (Part 2)DEV CommunityBlack Hat USAAI BusinessBlack Hat AsiaAI BusinessHow to Test Discord Webhooks with HookCapDEV CommunitySaaS Pricing Models Decoded: What Per-Seat, Usage-Based, and Flat-Rate Really Cost YouDEV CommunityClaude Code hooks: intercept every tool call before it runsDEV CommunityHow to Test Twilio Webhooks with HookCapDEV CommunityI'm an AI Agent That Built Its Own Training Data PipelineDEV CommunityMy React Portfolio SEO Checklist: From 0 to Rich Results in 48 HoursDEV CommunityWhy AI Agents Need a Trust Layer (And How We Built One)DEV CommunityBuilding a scoring engine with pure TypeScript functions (no ML, no backend)DEV Community🚀 I Vibecoded an AI Interview Simulator in 1 Hour using Gemini + GroqDEV CommunityWebhook Best Practices: Retry Logic, Idempotency, and Error HandlingDEV CommunityObservabilidade de agentes de IA com LangChain4jDEV CommunityI Ranked on Google's First Page in 6 Weeks — Here's Every SEO Tactic I Used (Part 2)DEV Community

Identifiable Deep Latent Variable Models for MNAR Data

arXivby [Submitted on 25 Mar 2026 (v1), last revised 27 Mar 2026 (this version, v2)]March 30, 20262 min read2 views
Source Quiz

arXiv:2603.24771v2 Announce Type: replace-cross Abstract: Missing data is a ubiquitous challenge in data analysis, often leading to biased and inaccurate results. Traditional imputation methods usually assume that the missingness mechanism is missing-at-random (MAR), where the missingness is independent of the missing values themselves. This assumption is frequently violated in real-world scenarios, prompted by recent advances in imputation methods using deep learning to address this challenge. However, these methods neglect the crucial issue of nonparametric identifiability in missing-not-at- — Huiming Xie, Fei Xue, Xiao Wang

View PDF HTML (experimental)

Abstract:Missing data is a ubiquitous challenge in data analysis, often leading to biased and inaccurate results. Traditional imputation methods usually assume that the missingness mechanism is missing-at-random (MAR), where the missingness is independent of the missing values themselves. This assumption is frequently violated in real-world scenarios, prompted by recent advances in imputation methods using deep learning to address this challenge. However, these methods neglect the crucial issue of nonparametric identifiability in missing-not-at-random (MNAR) data, which can lead to biased and unreliable results. This paper seeks to bridge this gap by proposing a novel framework based on deep latent variable models for MNAR data. Building on the assumption of conditional no self-censoring given latent variables, we establish the identifiability of the data distribution. This crucial theoretical result guarantees the feasibility of our approach. To effectively estimate unknown parameters, we develop an efficient algorithm utilizing importance-weighted autoencoders. We demonstrate, both theoretically and empirically, that our estimation process accurately recovers the ground-truth joint distribution under specific regularity conditions. Extensive simulation studies and real-world data experiments showcase the advantages of our proposed method compared to various classical and state-of-the-art approaches to missing data imputation.

Subjects:

Methodology (stat.ME); Machine Learning (stat.ML)

Cite as: arXiv:2603.24771 [stat.ME]

(or arXiv:2603.24771v2 [stat.ME] for this version)

https://doi.org/10.48550/arXiv.2603.24771

arXiv-issued DOI via DataCite

Submission history

From: Huiming Xie [view email] [v1] Wed, 25 Mar 2026 19:42:28 UTC (2,652 KB) [v2] Fri, 27 Mar 2026 02:53:11 UTC (2,652 KB)

Was this article helpful?

Sign in to highlight and annotate this article

AI
Ask AI about this article
Powered by AI News Hub · full article context loaded
Ready

Conversation starters

Ask anything about this article…

Daily AI Digest

Get the top 5 AI stories delivered to your inbox every morning.

More about

researchpaperarxiv

Knowledge Map

Knowledge Map
TopicsEntitiesSource
Identifiabl…researchpaperarxivstatisticsmachine-lea…arXiv

Connected Articles — Knowledge Graph

This article is connected to other articles through shared AI topics and tags.

Knowledge Graph100 articles · 203 connections
Scroll to zoom · drag to pan · click to open

Discussion

Sign in to join the discussion

No comments yet — be the first to share your thoughts!

More in Research Papers