Enhancing Online Support Group Formation Using Topic Modeling Techniques
Pronob Kumar Barman, Tera L. Reynolds, James Foulds
Abstract: Online health communities (OHCs) are vital for fostering peer support and improving health outcomes. Support groups within these platforms can provide more personalized and cohesive peer support, yet traditional support group formation methods face challenges related to scalability, static categorization, and insufficient personalization. To overcome these limitations, we propose two novel machine learning models for automated support group formation: the Group specific Dirichlet Multinomial Regression (gDMR) and the Group specific Structured Topic Model (gSTM). These models integrate user-generated textual content, demographic profiles, and interaction data represented through node embeddings derived from user networks to automate the formation of personalized, semantically coherent support groups. We evaluate the models on a large-scale dataset from this http URL, comprising over 2 million user posts. Both models substantially outperform baseline methods, including LDA, DMR, and STM, in predictive accuracy (held-out log-likelihood), semantic coherence (UMass metric), and internal group consistency. The gDMR model yields group covariates that facilitate practical implementation by leveraging relational patterns from network structures and demographic data. In contrast, gSTM emphasizes sparsity constraints to generate more distinct and thematically specific groups. Qualitative analysis further validates the alignment between model-generated groups and manually coded themes, showing the practical relevance of the models in informing groups that address diverse health concerns such as chronic illness management, diagnostic uncertainty, and mental health. By reducing reliance on manual curation, these frameworks provide scalable solutions that enhance peer interactions within OHCs, with implications for patient engagement, community resilience, and health outcomes.
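The abstract reports semantic coherence using the UMass metric. As a rough illustration of what that metric measures, the sketch below computes the UMass score for a single topic as defined by Mimno et al. (2011): the sum over ordered pairs of top words of log((D(w_m, w_l) + 1) / D(w_l)), where D counts the documents containing the given words. This is not the authors' evaluation code; the function name `umass_coherence`, the toy documents, and the choice to skip words that never occur in the corpus are all assumptions made for illustration.

```python
import math


def umass_coherence(top_words, documents):
    """UMass coherence for one topic (illustrative sketch).

    top_words: the topic's top words, most probable first.
    documents: iterable of token lists.
    Scores are <= 0 when every word pair co-occurs no more often than
    the conditioning word occurs; values closer to 0 mean more coherent.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            d_l = doc_freq(top_words[l])
            if d_l == 0:
                # Assumption: skip conditioning words absent from the corpus.
                continue
            d_ml = doc_freq(top_words[m], top_words[l])
            score += math.log((d_ml + 1) / d_l)
    return score


# Hypothetical toy corpus, not data from the paper.
toy_docs = [
    ["pain", "sleep", "fatigue"],
    ["pain", "sleep"],
    ["anxiety", "stress"],
    ["pain", "anxiety"],
]
coherent = umass_coherence(["pain", "sleep"], toy_docs)
incoherent = umass_coherence(["pain", "stress"], toy_docs)
```

On this toy corpus the pair that co-occurs in two documents ("pain", "sleep") scores higher than the pair that never co-occurs ("pain", "stress"), which is the behavior the metric is meant to capture.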
Subjects:
Information Retrieval (cs.IR); Machine Learning (stat.ML)
Cite as: arXiv:2603.24765 [cs.IR]
(or arXiv:2603.24765v1 [cs.IR] for this version)
https://doi.org/10.48550/arXiv.2603.24765
arXiv-issued DOI via DataCite
Submission history
From: Pronob Kumar Barman
[v1] Wed, 25 Mar 2026 19:37:23 UTC (2,473 KB)
[v2] Sat, 28 Mar 2026 02:00:34 UTC (2,473 KB)