
AutoPK: Leveraging LLMs and a Hybrid Similarity Metric for Advanced Retrieval of Pharmacokinetic Data from Complex Tables and Documents

arXiv cs.DB | Submitted on 26 Sep 2025 (v1), last revised 2 Apr 2026 (this version, v2) | April 3, 2026


Abstract: Pharmacokinetics (PK) plays a critical role in drug development and regulatory decision-making for human and veterinary medicine, directly affecting public health through drug safety and efficacy assessments. However, PK data are often embedded in complex, heterogeneous tables with variable structures and inconsistent terminologies, posing significant challenges for automated PK data retrieval and standardization. AutoPK is a novel two-stage framework for accurate and scalable extraction of PK data from complex scientific tables. In the first stage, AutoPK identifies and extracts PK parameter variants using large language models (LLMs), a hybrid similarity metric, and LLM-based validation. The second stage filters relevant rows, converts the table into a key-value text format, and uses an LLM to reconstruct a standardized table. Evaluated on a real-world dataset of 605 PK tables, including captions and footnotes, AutoPK shows significant improvements in precision and recall over direct LLM baselines. For instance, AutoPK with LLaMA 3.1-70B achieved an F1-score of 0.92 on half-life and 0.91 on clearance parameters, outperforming direct use of LLaMA 3.1-70B by margins of 0.10 and 0.21, respectively. Smaller models such as Gemma 3-27B and Phi 3-12B with AutoPK achieved 2- to 7-fold F1 gains over their direct use, with Gemma's hallucination rates reduced from 60-95% to 8-14%. Notably, AutoPK enabled open-source models like Gemma 3-27B to outperform commercial systems such as GPT-4o Mini on several PK parameters. AutoPK enables scalable, high-confidence PK data extraction, making it well-suited for critical applications in veterinary pharmacology, drug safety monitoring, and public health decision-making, while addressing heterogeneous table structures and terminology and demonstrating generalizability across key PK parameters. Code and data: this https URL
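The abstract names two mechanical steps that are easy to picture in code: scoring table headers against a PK parameter name with a hybrid similarity metric, and flattening a table row into key-value text before it is handed to an LLM. The sketch below illustrates both under loud assumptions. The abstract does not say what the hybrid metric combines, so the mix of character-level similarity and token overlap used here, the equal weighting, the threshold, and every function name are hypothetical choices for illustration, not AutoPK's actual method. The LLM extraction and validation steps are omitted entirely.

```python
from difflib import SequenceMatcher


def char_ratio(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of whitespace-separated tokens, case-insensitive."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def hybrid_similarity(a: str, b: str, weight: float = 0.5) -> float:
    """Weighted blend of the two scores (an assumed form of 'hybrid')."""
    return weight * char_ratio(a, b) + (1.0 - weight) * token_jaccard(a, b)


def match_variants(headers, canonical, threshold=0.4):
    """Return headers whose hybrid score against `canonical` meets the threshold."""
    return [h for h in headers if hybrid_similarity(h, canonical) >= threshold]


def row_to_key_value(headers, row):
    """Flatten one table row into 'header: value' pairs joined by '; '."""
    return "; ".join(f"{h}: {v}" for h, v in zip(headers, row))


if __name__ == "__main__":
    # Hypothetical headers and values, not taken from the AutoPK dataset.
    headers = ["Elimination half-life (h)", "Clearance (mL/min/kg)", "Half-life"]
    print(match_variants(headers, "half-life"))
    print(row_to_key_value(["Drug", "Half-life (h)"], ["Drug A", "12.3"]))
```

In a pipeline of the kind the abstract describes, candidates surviving the similarity filter would still go through LLM-based validation, and the key-value text would be the input from which an LLM reconstructs the standardized table.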

Comments: Published in IEEE ICTAI 2025

Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)

Cite as: arXiv:2510.00039 [cs.DB] (or arXiv:2510.00039v2 [cs.DB] for this version)

DOI: https://doi.org/10.48550/arXiv.2510.00039 (arXiv-issued DOI via DataCite)

Submission history

From: Hossein Sholehrasa
[v1] Fri, 26 Sep 2025 22:05:32 UTC (709 KB)
[v2] Thu, 2 Apr 2026 17:48:52 UTC (706 KB)

Original source: arXiv cs.DB, https://arxiv.org/abs/2510.00039
