Skullptor: High Fidelity 3D Head Reconstruction in Seconds with Multi-View Normal Prediction

arXiv, March 30, 2026

[Submitted on 24 Feb 2026 (v1), last revised 27 Mar 2026 (this version, v3)]

arXiv:2602.21100v3 Announce Type: replace

Authors: Noé Artru, Rukhshanda Hussain, Emeline Got, Alexandre Messier, David B. Lindell, Abdallah Dib

Abstract: Reconstructing high-fidelity 3D head geometry from images is critical for a wide range of applications, yet existing methods face fundamental limitations. Traditional photogrammetry achieves exceptional detail but requires extensive camera arrays (25-200+ views), substantial computation, and manual cleanup in challenging areas like facial hair. Recent alternatives present a fundamental trade-off: foundation models enable efficient single-image reconstruction but lack fine geometric detail, while optimization-based methods achieve higher fidelity but require dense views and expensive computation. We bridge this gap with a hybrid approach that combines the strengths of both paradigms. Our method introduces a multi-view surface normal prediction model that extends monocular foundation models with cross-view attention to produce geometrically consistent normals in a feed-forward pass. We then leverage these predictions as strong geometric priors within an inverse rendering optimization framework to recover high-frequency surface details. Our approach outperforms state-of-the-art single-image and multi-view methods, achieving high-fidelity reconstruction on par with dense-view photogrammetry while reducing camera requirements and computational cost.
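To make the idea of using predicted normals as a geometric prior more concrete, here is a minimal sketch of one common form such a term can take: the mean of (1 - cosine similarity) between predicted normals and normals rendered from the current mesh estimate. This is an illustrative assumption, not the loss used in the paper; the abstract does not specify the formulation, and the names `normalize` and `normal_prior_loss` are hypothetical.

```python
import math


def normalize(v):
    """Return v scaled to unit length (v is a 3-tuple of floats)."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return tuple(c / norm for c in v)


def normal_prior_loss(predicted, rendered):
    """Mean (1 - cosine similarity) over corresponding pixels.

    predicted: normals from a hypothetical feed-forward predictor
    rendered:  normals rendered from the current mesh estimate
    Both are equal-length, non-empty lists of 3-tuples for the same pixels.
    This is an assumed, generic prior term, not the paper's actual loss.
    """
    if len(predicted) != len(rendered) or not predicted:
        raise ValueError("inputs must be non-empty and the same length")
    total = 0.0
    for p, r in zip(predicted, rendered):
        p, r = normalize(p), normalize(r)
        cosine = sum(a * b for a, b in zip(p, r))
        total += 1.0 - cosine
    return total / len(predicted)
```

The term is 0 when every rendered normal agrees with its prediction and grows to 2 when they point in opposite directions. In an inverse rendering loop, a term like this would typically be weighted and added to the other objectives.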

Comments: For our project page, see this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

MSC classes: 68U05 (Primary) 68T45 (Secondary)

ACM classes: I.2.10; I.3.5; I.4.5; I.5.1; I.5.4

Cite as: arXiv:2602.21100 [cs.CV]

(or arXiv:2602.21100v3 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2602.21100

arXiv-issued DOI via DataCite

Submission history

From: Noé Artru

[v1] Tue, 24 Feb 2026 17:02:11 UTC (23,646 KB)

[v2] Wed, 4 Mar 2026 18:04:31 UTC (28,226 KB)

[v3] Fri, 27 Mar 2026 15:07:45 UTC (28,223 KB)

Original source

arXiv

https://arxiv.org/abs/2602.21100
