
OnCoCo 1.0: A Public Dataset for Fine-Grained Message Classification in Online Counseling Conversations

arXiv, March 31, 2026

arXiv:2512.09804v2 Announce Type: replace-cross

Authors: Jens Albrecht, Robert Lehmann, Aleksandra Poltermann, Eric Rudolph, Philipp Steigerwald, Mara Stieler


Abstract: This paper presents OnCoCo 1.0, a new public dataset for fine-grained message classification in online counseling. It is based on a new, integrative system of categories designed to improve the automated analysis of psychosocial online counseling conversations. Existing category systems, predominantly based on Motivational Interviewing (MI), are limited by their narrow focus and their dependence on datasets derived mainly from face-to-face counseling, which restricts the detailed examination of textual counseling conversations. In response, we developed a comprehensive new coding scheme that differentiates between 38 types of counselor utterances and 28 types of client utterances, and created a labeled dataset of about 2,800 messages from counseling conversations. We fine-tuned several models on our dataset to demonstrate its applicability. The data and models are publicly available to researchers and practitioners. Our work thus contributes a new type of fine-grained conversational resource to the language resources community, extending existing datasets for social and mental-health dialogue analysis.
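As a rough illustration of the label space the abstract describes, the sketch below builds a role-aware label index with 38 counselor and 28 client categories. Only the two counts come from the abstract. The category names (`counselor_00`, `client_00`, ...), the `LabeledMessage` type, and the `is_valid` and `encode` helpers are hypothetical placeholders, not the dataset's actual labels, file format, or the authors' code.

```python
from dataclasses import dataclass

# Category counts taken from the abstract; the names below are placeholders,
# not the real OnCoCo category labels.
N_COUNSELOR = 38
N_CLIENT = 28

COUNSELOR_LABELS = [f"counselor_{i:02d}" for i in range(N_COUNSELOR)]
CLIENT_LABELS = [f"client_{i:02d}" for i in range(N_CLIENT)]

# One flat label space, e.g. for a single classification head over both roles.
LABEL2ID = {name: idx for idx, name in enumerate(COUNSELOR_LABELS + CLIENT_LABELS)}


@dataclass(frozen=True)
class LabeledMessage:
    role: str   # "counselor" or "client"
    text: str
    label: str


def is_valid(msg: LabeledMessage) -> bool:
    """Check that a message's label belongs to the label set of its role."""
    allowed = COUNSELOR_LABELS if msg.role == "counselor" else CLIENT_LABELS
    return msg.role in ("counselor", "client") and msg.label in allowed


def encode(msg: LabeledMessage) -> int:
    """Map a validated message to its integer class id."""
    if not is_valid(msg):
        raise ValueError(f"label {msg.label!r} not allowed for role {msg.role!r}")
    return LABEL2ID[msg.label]
```

Keeping the two role-specific label sets separate and validating against the speaker role is one possible design choice: it catches annotation or preprocessing errors in which a client message receives a counselor category, or the reverse.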

Comments: Accepted at SoCon-NLPSI@LREC 2026

Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

Cite as: arXiv:2512.09804 [cs.CL]

(or arXiv:2512.09804v2 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2512.09804

arXiv-issued DOI via DataCite

Submission history

From: Jens Albrecht
[v1] Wed, 10 Dec 2025 16:18:20 UTC (69 KB)
[v2] Sun, 29 Mar 2026 13:07:02 UTC (71 KB)
