
Methods for Knowledge Graph Construction from Text Collections: Development and Applications

Vanni Zavarella

[Submitted on 26 Mar 2026]


Abstract: Virtually every sector of society is experiencing dramatic growth in the volume of unstructured textual data that is generated and published, from news and social media interactions, through open access scholarly communications, to observational data in the form of digital health records and online drug reviews. The volume and variety of data across this range of domains have created both unprecedented opportunities and pressing challenges for extracting actionable knowledge in several application scenarios. However, the extraction of rich semantic knowledge demands scalable and flexible automatic methods that adapt across text genres and schema specifications. Moreover, the full potential of these data can only be unlocked by coupling information extraction methods with Semantic Web techniques to construct full-fledged Knowledge Graphs that are semantically transparent, explainable by design and interoperable. In this thesis, we experiment with the application of Natural Language Processing, Machine Learning and Generative AI methods, powered by Semantic Web best practices, to the automatic construction of Knowledge Graphs from large text corpora, in three use case applications: the analysis of the Digital Transformation discourse in global news and social media platforms; the mapping and trend analysis of recent research in the Architecture, Engineering, Construction and Operations domain from a large corpus of publications; and the generation of causal relation graphs of biomedical entities from electronic health records and patient-authored drug reviews. The contributions of this thesis to the research community are benchmark evaluation results, the design of customized algorithms, and the creation of data resources in the form of Knowledge Graphs, together with data analysis results built on top of them.
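As a rough illustration of what coupling information extraction with Semantic Web output can look like, the sketch below pulls causal statements out of review-like sentences and serializes them as N-Triples. It is not the method used in the thesis: the regular expression is a toy stand-in for the NLP, Machine Learning and Generative AI extractors the abstract mentions, and the `http://example.org/kg/` namespace, the `causes` predicate and the sample sentences are assumptions made up for this example.

```python
import re
from urllib.parse import quote

# Hypothetical namespace, for illustration only.
EX = "http://example.org/kg/"

# Toy surface pattern: "<X> causes <Y>" or "<X> caused <Y>".
# A real pipeline would use trained extractors instead of a regex.
CAUSAL = re.compile(
    r"(?P<subj>[\w ]+?)\s+(?:causes|caused)\s+(?P<obj>[\w ]+)",
    re.IGNORECASE,
)


def iri(label: str) -> str:
    """Mint an IRI from a label: lower-case it and join words with underscores."""
    slug = "_".join(label.lower().split())
    return f"<{EX}{quote(slug)}>"


def extract_triples(sentences):
    """Return a sorted list of unique (subject, predicate, object) triples."""
    triples = set()
    for sentence in sentences:
        match = CAUSAL.search(sentence.strip().rstrip("."))
        if match:
            subj = match.group("subj").strip().lower()
            obj = match.group("obj").strip().lower()
            triples.add((subj, "causes", obj))
    return sorted(triples)


def to_ntriples(triples) -> str:
    """Serialize triples as N-Triples, one statement per line."""
    return "\n".join(f"{iri(s)} {iri(p)} {iri(o)} ." for s, p, o in triples)


# Made-up sentences in the style of patient-authored drug reviews.
reviews = [
    "Ibuprofen caused stomach pain.",
    "The packaging was hard to open.",
    "ibuprofen causes stomach pain",
]

triples = extract_triples(reviews)
print(to_ntriples(triples))
```

Normalizing labels before minting IRIs is what lets the two differently worded mentions collapse into a single graph edge; in practice that step would be handled by entity linking against a shared vocabulary rather than by lower-casing.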

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

Cite as: arXiv:2603.25862 [cs.CL]

(or arXiv:2603.25862v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2603.25862

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Vanni Zavarella
[v1] Thu, 26 Mar 2026 19:36:00 UTC (9,351 KB)

Original source

arXiv

https://arxiv.org/abs/2603.25862
