Towards Privacy-Preserving Mental Health Support with Large Language Models
Dong Xue, Jicheng Tu, Ming Wang, Xin Yan, Fangzhou Liu, Jie Hu
Published on arXiv
2601.01993
Membership Inference Attack
OWASP ML Top 10 — ML04
Key Finding
MindChat, fine-tuned with federated learning and differential privacy (FL+DP), achieves competitive counseling performance under both LLM-judge and human evaluation while showing measurably lower membership inference attack success than standard fine-tuning baselines.
MindChat
Novel technique introduced
Large language models (LLMs) have shown promise for mental health support, yet training such models is constrained by the scarcity and sensitivity of real counseling dialogues. In this article, we present MindChat, a privacy-preserving LLM for mental health support, together with MindCorpus, a synthetic multi-turn counseling dataset constructed via a multi-agent role-playing framework. To synthesize high-quality counseling data, the developed dialogue-construction framework employs a dual closed-loop feedback design to integrate psychological expertise and counseling techniques through role-playing: (i) turn-level critique-and-revision to improve coherence and counseling appropriateness within a session, and (ii) session-level strategy refinement to progressively enrich counselor behaviors across sessions. To mitigate privacy risks under decentralized data ownership, we fine-tune the base model using federated learning with parameter-efficient LoRA adapters and incorporate differentially private optimization to reduce membership and memorization risks. Experiments on synthetic-data quality assessment and counseling capability evaluation show that MindCorpus improves training effectiveness and that MindChat is competitive with existing general and counseling-oriented LLM baselines under both automatic LLM-judge and human evaluation protocols, while exhibiting reduced privacy leakage under membership inference attacks.
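The abstract's "differentially private optimization" refers to the standard DP-SGD recipe: clip each example's gradient to a fixed norm, average, and add calibrated Gaussian noise so no single counseling transcript can dominate an update. The paper does not publish code; the sketch below illustrates that recipe with NumPy, and the function name and parameters are illustrative, not the authors' implementation.

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """One DP-SGD update: clip each per-example gradient to `clip_norm`,
    average the clipped gradients, then add Gaussian noise whose scale is
    noise_multiplier * clip_norm / batch_size. Clipping bounds any single
    example's influence; the noise provides the differential privacy guarantee."""
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    avg = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm / len(per_example_grads),
                       size=avg.shape)
    return avg + noise

# Illustrative usage: the first gradient (norm 5.0) is clipped to norm 1.0,
# the second (norm 0.5) passes through unchanged.
grads = [np.array([3.0, 4.0]), np.array([0.3, 0.4])]
noisy_update = dp_sgd_step(grads, clip_norm=1.0, noise_multiplier=0.5)
```

In practice the privacy budget (epsilon, delta) is tracked with a privacy accountant over all steps; the noise multiplier and clip norm are the two knobs that trade counseling quality against membership/memorization risk.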
Key Contributions
- MindCorpus: a synthetic multi-turn counseling dataset constructed via a multi-agent role-playing framework with dual closed-loop feedback (turn-level critique-and-revision and session-level strategy refinement)
- MindChat: a mental health LLM fine-tuned via federated LoRA adapters with differentially private optimization to defend against membership inference attacks on sensitive counseling data
- Demonstrated reduced MIA success on the privacy-preserved model while maintaining competitive counseling capability against general and domain-specific LLM baselines
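The federated LoRA setup described above keeps the frozen base model fixed and exchanges only low-rank adapter weights between clients and the server. A minimal FedAvg-style sketch of that aggregation is below; the dict-of-arrays representation, function name, and size-weighted averaging of adapter factors (a common heuristic, not exactly equivalent to averaging the full low-rank products) are assumptions, not the paper's code.

```python
import numpy as np

def fedavg_lora(client_adapters, client_sizes):
    """Server-side aggregation of LoRA adapters: each client fine-tunes only
    its low-rank adapter matrices on local counseling data, and the server
    averages them weighted by local dataset size. Raw dialogues and the
    base model weights never move between parties."""
    total = sum(client_sizes)
    weights = [n / total for n in client_sizes]
    agg = {}
    for name in client_adapters[0]:
        agg[name] = sum(w * c[name] for w, c in zip(weights, client_adapters))
    return agg

# Illustrative usage: two clients, the second holding 3x as much data.
clients = [{"lora_A": np.array([[1.0]])}, {"lora_A": np.array([[3.0]])}]
merged = fedavg_lora(clients, client_sizes=[1, 3])
```

Combining this with DP optimization on each client's local steps gives the FL+DP pipeline the contribution list refers to.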
🛡️ Threat Analysis
The paper uses differentially private optimization explicitly to "reduce membership and memorization risks" and evaluates the resulting model against membership inference attacks, demonstrating lower privacy leakage than the baselines: a direct, empirically evaluated defense against MIA in LLM training.
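The canonical membership inference attack that such evaluations measure is a loss-threshold test: because models fit training data more tightly, an example with unusually low loss is flagged as a likely training member. A minimal sketch of that attack and its balanced accuracy is below; the function name and threshold-based scoring are illustrative, not the paper's specific attack protocol.

```python
import numpy as np

def loss_threshold_mia(member_losses, nonmember_losses, threshold):
    """Loss-threshold membership inference: predict 'member' when the model's
    loss on an example falls below `threshold`. Returns balanced attack
    accuracy, where 0.5 is chance (no leakage) and 1.0 is total leakage."""
    member_losses = np.asarray(member_losses)
    nonmember_losses = np.asarray(nonmember_losses)
    tpr = np.mean(member_losses < threshold)      # members correctly flagged
    tnr = np.mean(nonmember_losses >= threshold)  # non-members correctly rejected
    return 0.5 * (tpr + tnr)
```

A well-calibrated DP defense pushes this score toward 0.5, which is the kind of reduction the paper reports for MindChat relative to non-private fine-tuning.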