
The Effective Depth Paradox: Evaluating the Relationship between Architectural Topology and Trainability in Deep CNNs

arXiv, March 30, 2026

arXiv:2602.13298v2 Announce Type: replace-cross

Authors: Manfred M. Fischer, Joshua Pitts


Abstract: This paper investigates the relationship between convolutional neural network (CNN) architecture and image recognition performance through a comparative study of the VGG, ResNet and GoogLeNet architectural families. By evaluating these models under a unified experimental framework on upscaled CIFAR-10 data, we isolate the effects of depth from confounding implementation variables. We introduce a formal distinction between nominal depth ($D_{\mathrm{nom}}$), the total count of weight-bearing layers, and effective depth ($D_{\mathrm{eff}}$), an operational metric representing the expected number of sequential transformations encountered along all feasible forward paths. As derived in Section 3, $D_{\mathrm{eff}}$ is computed through topology-specific proxies: the total sequential count for plain networks, the arithmetic mean of minimum and maximum path lengths for residual structures, and the sum of average branch depths for multi-branch modules. Our empirical results demonstrate that while sequential architectures such as VGG suffer from diminishing returns and severe gradient attenuation as $D_{\mathrm{nom}}$ increases, architectures with identity shortcuts or branching modules maintain optimization stability. This stability is achieved by decoupling $D_{\mathrm{eff}}$ from $D_{\mathrm{nom}}$, thus ensuring a manageable functional depth for gradient propagation. We conclude that effective depth serves as a superior predictor of a network's scaling potential and practical trainability compared to traditional layer counts, providing a principled framework for future architectural innovation.
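The three topology-specific proxies named in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the input conventions, and the example layer counts are hypothetical and not taken from the paper, and the multi-branch proxy is read here as "sum over modules of the mean branch depth", which Section 3 of the paper may define more precisely.

```python
from statistics import mean
from typing import Sequence


def d_eff_plain(num_layers: int) -> float:
    """Plain (VGG-style) stack: every layer lies on the single forward path."""
    return float(num_layers)


def d_eff_residual(min_path: int, max_path: int) -> float:
    """Residual network: arithmetic mean of shortest and longest path lengths."""
    if min_path > max_path:
        raise ValueError("min_path must not exceed max_path")
    return (min_path + max_path) / 2


def d_eff_multibranch(modules: Sequence[Sequence[int]]) -> float:
    """Multi-branch network: sum, over modules, of the mean branch depth.

    Assumption: each inner sequence lists the layer count of every
    parallel branch inside one module.
    """
    return float(sum(mean(branches) for branches in modules))


if __name__ == "__main__":
    # Hypothetical layer counts, chosen only to exercise the formulas.
    print(d_eff_plain(16))                                   # 16.0
    print(d_eff_residual(min_path=2, max_path=34))           # 18.0
    print(d_eff_multibranch([[1, 2, 3, 2], [1, 2, 3, 2]]))   # 4.0
```

In the residual example, a nominal depth of 34 maps to an effective depth of 18 under the assumed shortest path of 2 layers, which illustrates the kind of decoupling between $D_{\mathrm{eff}}$ and $D_{\mathrm{nom}}$ the abstract describes.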

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

Cite as: arXiv:2602.13298 [cs.CV]

(or arXiv:2602.13298v2 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2602.13298

arXiv-issued DOI via DataCite

Submission history

From: Manfred M. Fischer

[v1] Mon, 9 Feb 2026 10:14:15 UTC (161 KB)

[v2] Fri, 27 Mar 2026 09:02:37 UTC (152 KB)

Original source

arXiv

https://arxiv.org/abs/2602.13298
