
SkinGPT-X: A Self-Evolving Collaborative Multi-Agent System for Transparent and Trustworthy Dermatological Diagnosis

arXiv | Submitted on 27 Mar 2026 | March 30, 2026

arXiv:2603.26122v1 Announce Type: cross

Authors: Zhangtianyi Chen, Yuhao Shen, Florensia Widjaja, Yan Xu, Liyuan Sun, Zijian Wang, Hongyi Chen, Wufei Dai, Juexiao Zhou


Abstract: While recent advancements in Large Language Models have significantly advanced dermatological diagnosis, monolithic LLMs frequently struggle with fine-grained, large-scale multi-class diagnostic tasks and rare skin disease diagnosis owing to training data sparsity, while also lacking the interpretability and traceability essential for clinical reasoning. Although multi-agent systems can offer more transparent and explainable diagnostics, existing frameworks are primarily concentrated on Visual Question Answering and conversational tasks, and their heavy reliance on static knowledge bases restricts adaptability in complex real-world clinical settings. Here, we present SkinGPT-X, a multimodal collaborative multi-agent system for dermatological diagnosis integrated with a self-evolving dermatological memory mechanism. By simulating the diagnostic workflow of dermatologists and enabling continuous memory evolution, SkinGPT-X delivers transparent and trustworthy diagnostics for the management of complex and rare dermatological cases. To validate the robustness of SkinGPT-X, we design a three-tier comparative experiment. First, we benchmark SkinGPT-X against four state-of-the-art LLMs across four public datasets, demonstrating its state-of-the-art performance with a +9.6% accuracy improvement on DDI31 and a +13% weighted F1 gain on Dermnet over the state-of-the-art model. Second, we construct a large-scale multi-class dataset covering 498 distinct dermatological categories to evaluate its fine-grained classification capabilities. Finally, we curate the rare skin disease dataset, the first benchmark to address the scarcity of clinical rare skin diseases, which contains 564 clinical samples covering eight rare dermatological diseases. On this dataset, SkinGPT-X achieves a +9.8% accuracy improvement, a +7.1% weighted F1 improvement, and a +10% Cohen's Kappa improvement.
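The abstract reports gains in accuracy, weighted F1, and Cohen's Kappa. For readers unfamiliar with these metrics, the following is a minimal sketch of their standard definitions for multi-class labels. It is not the authors' evaluation code; the function names and the toy labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's true support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n
    return total / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement between labels and predictions, corrected for chance agreement."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_expected == 1.0:
        # Degenerate case (a single class everywhere); returning 1.0 is an
        # assumption of this sketch, since kappa is undefined here.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical toy labels, not data from the paper.
    y_true = ["a", "a", "b", "b", "c", "c"]
    y_pred = ["a", "b", "b", "b", "c", "a"]
    print(accuracy(y_true, y_pred))
    print(weighted_f1(y_true, y_pred))
    print(cohen_kappa(y_true, y_pred))
```

Weighted F1 is a natural choice for a benchmark with many imbalanced categories because each class contributes in proportion to its sample count, while Cohen's Kappa discounts the agreement a classifier would reach by chance.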

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

Cite as: arXiv:2603.26122 [cs.CV]

(or arXiv:2603.26122v1 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2603.26122

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Zhangtianyi Chen
[v1] Fri, 27 Mar 2026 07:14:05 UTC (5,820 KB)
