Detecting Toxic Language: Ontology and BERT-based Approaches for Bulgarian Text
Abstract: Toxic content detection in online communication remains a significant challenge, with current solutions often inadvertently blocking valuable information, including medical terms and text related to minority groups. This paper presents a more nuanced approach to identifying toxicity in Bulgarian text while preserving access to essential information. The research explores two distinct methodologies for detecting toxic content, with potential applications across diverse online platforms and content moderation systems. First, we propose an ontology that models potentially toxic words in the Bulgarian language. Then, we compose a dataset of 4,384 manually annotated sentences from Bulgarian online forums across four categories: toxic language, medical terminology, non-toxic language, and terms related to minority communities. We then train a BERT-based model for toxic language classification, which reaches a 0.89 macro F1 score. The trained model is directly applicable in a real environment and can be integrated as a component of toxic content detection systems.
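As a concrete illustration of the second methodology, the sketch below shows how a BERT-style encoder could be fine-tuned for the paper's four-way sentence classification task and scored with macro F1 (the abstract's headline metric, 0.89). The abstract does not name the checkpoint, hyperparameters, label encoding, or data format, so the bert-base-multilingual-cased checkpoint, the label names, and the toy sentences below are illustrative assumptions, not the authors' setup.

# A minimal sketch, assuming a HuggingFace-style fine-tuning pipeline.
# Checkpoint, labels, and example data are placeholders; the real corpus
# is 4,384 manually annotated sentences from Bulgarian online forums.
import numpy as np
from datasets import Dataset
from sklearn.metrics import f1_score
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

LABELS = ["toxic", "medical", "non_toxic", "minority_related"]  # assumed names

train_data = Dataset.from_dict({
    "text": ["Пример за изречение.", "Друг пример."],  # toy Bulgarian sentences
    "label": [2, 0],
})

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=len(LABELS))

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

train_data = train_data.map(tokenize, batched=True)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    # Macro F1 averages per-class F1, so minority classes count equally.
    return {"f1_macro": f1_score(labels, preds, average="macro")}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="toxic-bg", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=train_data,
    eval_dataset=train_data,  # placeholder; use a held-out split in practice
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())

Macro averaging is the natural choice here: the four categories are unlikely to be balanced in forum data, and the paper's stated goal of not over-blocking medical or minority-related text makes per-class performance matter as much as overall accuracy.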
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2604.01745 [cs.CL]
(or arXiv:2604.01745v1 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.2604.01745
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Melania Berbatova
[v1] Thu, 2 Apr 2026 08:06:26 UTC (600 KB)