
Performance Evaluation of LLMs in Automated RDF Knowledge Graph Generation

arXiv cs.IR | Submitted on 6 Feb 2026 | April 1, 2026


Abstract: Cloud systems generate large, heterogeneous log data containing critical infrastructure, application, and security information. Transforming these logs into RDF triples enables their integration into knowledge graphs, improving interpretability, root-cause analysis, and cross-service reasoning beyond what raw logs allow. Large Language Models (LLMs) offer a promising approach to automating RDF knowledge graph generation; however, their effectiveness on complex cloud logs remains largely unexplored. In this paper, we evaluate multiple LLM architectures and prompting strategies for automated RDF extraction using a controlled framework with two pipelines for systematically processing semi-structured log data. The extraction pipeline integrates multiple LLMs to identify relevant entities and relationships, automatically generating subject-predicate-object triples. These outputs are evaluated by a dedicated validation pipeline that uses both syntactic and semantic metrics to assess accuracy, completeness, and quality. Because no public ground-truth datasets exist, we created a reference Log-to-KG dataset from OpenStack logs using manual annotation and ontology-driven methods, enabling an objective baseline. Our analysis shows that Few-Shot learning is the most effective strategy: Llama achieves a 99.35% F1 score and 100% valid RDF output, and Qwen, NuExtract, and Gemma also perform well under Few-Shot prompting. Chain-of-Thought approaches maintain similar accuracy. One-Shot prompting offers a lighter but effective alternative, while Zero-Shot and advanced strategies such as Tree-of-Thought, Self-Critique, and Generate-Multiple perform substantially worse. These results highlight the importance of contextual examples and prompt design for accurate RDF extraction and reveal model-specific limitations across LLM architectures.
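To make the F1 figure concrete, here is a minimal sketch of how extracted triples might be scored against a reference set. It assumes exact-match comparison of (subject, predicate, object) tuples; the abstract does not specify the paper's actual syntactic and semantic metrics, so this should not be read as the authors' method. The function name `triple_f1` and the `ex:` identifiers are hypothetical, chosen only for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

# A triple is modeled as a plain (subject, predicate, object) tuple of strings.
Triple = Tuple[str, str, str]


def triple_f1(predicted: Iterable[Triple], reference: Iterable[Triple]) -> Dict[str, float]:
    """Exact-match precision, recall, and F1 over two sets of triples.

    Assumption: a predicted triple counts as correct only if all three
    components match a reference triple exactly.
    """
    pred: Set[Triple] = set(predicted)
    ref: Set[Triple] = set(reference)
    true_positives = len(pred & ref)

    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical reference triples for one log line (illustrative names only).
reference = {
    ("ex:instance-42", "ex:hostedOn", "ex:compute-node-1"),
    ("ex:instance-42", "ex:hasStatus", "ex:ACTIVE"),
    ("ex:request-7", "ex:targets", "ex:instance-42"),
}

# Hypothetical model output: two triples correct, one with a wrong object.
predicted = {
    ("ex:instance-42", "ex:hostedOn", "ex:compute-node-1"),
    ("ex:instance-42", "ex:hasStatus", "ex:ACTIVE"),
    ("ex:request-7", "ex:targets", "ex:instance-99"),
}

scores = triple_f1(predicted, reference)
print(scores)
```

In this toy case two of three predicted triples match, so precision, recall, and F1 all equal 2/3. Exact matching is the strictest choice; a semantic metric would presumably also credit triples that are equivalent under the ontology but written differently.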

Comments: submitted to journal

Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC)

Cite as: arXiv:2603.29878 [cs.IR] (or arXiv:2603.29878v1 [cs.IR] for this version)

https://doi.org/10.48550/arXiv.2603.29878 (arXiv-issued DOI via DataCite, pending registration)

Submission history

From: Ionut Anghel

[v1] Fri, 6 Feb 2026 06:30:35 UTC (1,170 KB)

Original source

arXiv cs.IR

https://arxiv.org/abs/2603.29878



