Communications of International Proceedings

Natural Language Processing in Healthcare: A Case Study on Depression Detection

Artificial Intelligence, Data Analytics, and Intelligent Systems: 45AI 2025

Mudasir Ahmad WANI¹, Kashish Ara SHAKIL², Talal Abdulmohsen S ALDHABAAN³, Muhammad ASIM¹, Ogerta ELEZAJ⁴ and Mohammed ELAFFENDI¹

1 EIAS Data Science Lab, College of Computer and Information Sciences, Prince Sultan University, Riyadh, Saudi Arabia

2 Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia

3 College of Computer and Information Sciences, Prince Sultan University, Riyadh, Saudi Arabia

4 School of Computing and Digital Technology, University of Birmingham, Birmingham, United Kingdom

Volume 2025 (20), Article ID 4541325, Artificial Intelligence, Data Analytics, and Intelligent Systems: 45AI 2025

https://doi.org/10.5171/2025.4541325

Abstract

With the rapid growth of electronic health records (EHRs), clinical notes, and physician summaries, healthcare systems are generating vast amounts of unstructured textual data. Unlocking meaningful insights from this information, especially to support early detection of mental health conditions such as depression, remains a significant challenge. While Natural Language Processing (NLP) offers powerful tools to address this, its effectiveness in real-world clinical contexts still needs to be explored.

In this study, we apply a range of NLP techniques to clinical text to detect early signs of depression. Our pipeline includes domain-specific preprocessing steps such as tokenization, lemmatization, and lexical normalization. We use TF-IDF and contextual embeddings for feature extraction, followed by classification with traditional models (Logistic Regression, Random Forest) and deep learning approaches (LSTM, ClinicalBERT).
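To make the TF-IDF feature-extraction step concrete, the following is a minimal sketch in plain Python. It is not the pipeline used in this study: the regular-expression tokenizer stands in for the preprocessing described above, the two example notes are invented for illustration, and the smoothed IDF weighting is an assumed variant (the abstract does not specify which one was used).

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphanumeric runs (a simple stand-in tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf(docs):
    """Return one sparse TF-IDF vector (a dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents in which each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    # Smoothed IDF (assumed variant): ln((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Term frequency is the count normalised by document length.
        vectors.append(
            {t: (c / total) * idf[t] for t, c in counts.items()} if total else {}
        )
    return vectors, idf


# Hypothetical clinical notes, invented for this example only.
notes = [
    "Patient reports low mood and poor sleep.",
    "Patient reports knee pain after a fall.",
]
vectors, idf = tfidf(notes)

# Terms shared by both notes receive a lower weight than note-specific terms.
print(idf["patient"] < idf["mood"])
```

The resulting sparse vectors could then be passed to a linear classifier such as Logistic Regression; contextual embeddings would replace this step entirely for the LSTM and ClinicalBERT models.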

The results are promising. The Logistic Regression and LSTM models each achieved a perfect ROC-AUC score of 1.000 and an F1-score of 0.800, reflecting a strong balance between precision and recall. ClinicalBERT achieved high precision (1.000) but struggled with recall (0.400), resulting in a lower F1-score of 0.571. Random Forest, by contrast, performed poorly across most metrics. These findings show the potential of combining classic and modern NLP methods for early depression detection and suggest that even simpler models can deliver strong results with well-engineered features. We hope this work supports further efforts in building intelligent, interpretable clinical decision-support tools in mental health care.
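The reported metrics can be checked with a short calculation. The sketch below first recomputes the ClinicalBERT F1-score from the stated precision and recall, then shows one way a model can reach a ROC-AUC of 1.000 while its F1-score is 0.800: ROC-AUC depends only on how the scores rank the cases, whereas F1 depends on the decision threshold. The labels, scores, and the 0.5 threshold are hypothetical values chosen for illustration; they are not data from this study and are not claimed to explain its results.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall(labels, scores, threshold=0.5):
    """Precision and recall when scores >= threshold are predicted positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# ClinicalBERT figures from the abstract: precision 1.000, recall 0.400.
print(round(f1_score(1.000, 0.400), 3))  # 0.571

# Hypothetical scores: every positive outranks every negative,
# but one positive (0.45) falls below the 0.5 threshold.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.45, 0.4, 0.3, 0.2, 0.2, 0.1]

print(roc_auc(labels, scores))  # 1.0
p, r = precision_recall(labels, scores)
print(round(f1_score(p, r), 3))  # 0.8
```

In this toy case, lowering the threshold to 0.45 would recover the missed positive, which is why threshold selection matters when ranking quality is already high.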

Keywords: Natural Language Processing (NLP), Medical Text Analysis, Deep Learning, Artificial Intelligence, ClinicalBERT.
