Skullptor: High Fidelity 3D Head Reconstruction in Seconds with Multi-View Normal Prediction
arXiv:2602.21100v3 Announce Type: replace
Authors: Noé Artru, Rukhshanda Hussain, Emeline Got, Alexandre Messier, David B. Lindell, Abdallah Dib
Abstract: Reconstructing high-fidelity 3D head geometry from images is critical for a wide range of applications, yet existing methods face fundamental limitations. Traditional photogrammetry achieves exceptional detail but requires extensive camera arrays (25-200+ views), substantial computation, and manual cleanup in challenging areas like facial hair. Recent alternatives present a fundamental trade-off: foundation models enable efficient single-image reconstruction but lack fine geometric detail, while optimization-based methods achieve higher fidelity but require dense views and expensive computation. We bridge this gap with a hybrid approach that combines the strengths of both paradigms. Our method introduces a multi-view surface normal prediction model that extends monocular foundation models with cross-view attention to produce geometrically consistent normals in a feed-forward pass. We then leverage these predictions as strong geometric priors within an inverse rendering optimization framework to recover high-frequency surface details. Our approach outperforms state-of-the-art single-image and multi-view methods, achieving high-fidelity reconstruction on par with dense-view photogrammetry while reducing camera requirements and computational cost.
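The abstract does not say how the predicted normals enter the inverse rendering optimization. As a rough illustration only, and not the authors' formulation, one common way to use a normal map as a geometric prior is a masked cosine-distance term between the normals rendered from the current surface estimate and the predicted normals. The function names `normalize` and `normal_prior_loss`, the array shapes, and the choice of cosine distance are all assumptions made for this sketch.

```python
import numpy as np


def normalize(v, eps=1e-8):
    """Scale each 3-vector along the last axis to unit length."""
    return v / np.maximum(np.linalg.norm(v, axis=-1, keepdims=True), eps)


def normal_prior_loss(rendered, predicted, mask):
    """Mean cosine distance (1 - cos) between two normal maps over masked pixels.

    Hypothetical helper, not taken from the paper.
    rendered, predicted: (H, W, 3) arrays of surface normals.
    mask: (H, W) boolean array selecting the pixels to compare.
    """
    cos = np.sum(normalize(rendered) * normalize(predicted), axis=-1)
    valid = mask.astype(bool)
    if not valid.any():
        # No pixels to compare, so the term contributes nothing.
        return 0.0
    return float(np.mean(1.0 - cos[valid]))


# Usage: compare a map of +z normals with itself and with a map of +x normals.
h, w = 4, 4
z_normals = np.zeros((h, w, 3))
z_normals[..., 2] = 1.0
x_normals = np.zeros((h, w, 3))
x_normals[..., 0] = 1.0
mask = np.ones((h, w), dtype=bool)

print(normal_prior_loss(z_normals, z_normals, mask))
print(normal_prior_loss(z_normals, x_normals, mask))
```

In an optimization setting such a term would typically be summed over views and added to a photometric loss, with a differentiable renderer supplying `rendered`; the NumPy version here only evaluates the value and carries no gradients.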
Comments: For our project page, see this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
MSC classes: 68U05 (Primary) 68T45 (Secondary)
ACM classes: I.2.10; I.3.5; I.4.5; I.5.1; I.5.4
Cite as: arXiv:2602.21100 [cs.CV]
(or arXiv:2602.21100v3 [cs.CV] for this version)
https://doi.org/10.48550/arXiv.2602.21100
arXiv-issued DOI via DataCite
Submission history
From: Noé Artru
[v1] Tue, 24 Feb 2026 17:02:11 UTC (23,646 KB)
[v2] Wed, 4 Mar 2026 18:04:31 UTC (28,226 KB)
[v3] Fri, 27 Mar 2026 15:07:45 UTC (28,223 KB)