PREDICTING SUICIDE RISK THROUGH MACHINE LEARNING–BASED ANALYSIS OF PATIENT NARRATIVES AND DIGITAL BEHAVIORAL MARKERS IN CLINICAL PSYCHOLOGY SETTINGS

Authors

  • Md. Akbar Hossain, Master of Science in Clinical Psychology, University of Dhaka, Dhaka, Bangladesh
  • Farzana Ahmed, Masters in School Psychology; PhD candidate, Department of School Psychology, Special Education and Sociology, Indiana University of Pennsylvania, USA

DOI:

https://doi.org/10.63125/mqty9n77

Keywords:

Suicide Risk Stratification, Machine Learning, Logistic Regression, Explainable AI (SHAP), Psychometric Likert Constructs

Abstract

Conventional suicide risk screening and clinician judgment often struggle to achieve operationally useful precision, especially when risk signals are multidimensional and class base rates are imbalanced. This study therefore developed and benchmarked an explainable, data-driven risk stratification approach that integrates psychometric and clinical indicators to classify individuals into high versus low-or-moderate suicide-risk tiers, using a quantitative, cross-sectional, case-based design within a bounded case context. The sample comprised 320 clinical cases drawn from an enterprise service setting that included both outpatient (61.3%) and emergency or acute contacts (38.7%), with a mean age of 29.8 years (SD = 8.7) and a high-risk prevalence of 27.8% (n = 89). Key variables included five-point Likert composite predictors capturing distress severity, hopelessness, perceived burdensomeness, psychosocial strain, sleep disturbance, impulsivity, perceived social support, and coping capacity, alongside binary clinical indicators for prior suicide attempt history and substance-use concern. Descriptively, the cohort showed elevated distress and cognitive burden (for example, distress M = 3.62, SD = 0.78; hopelessness M = 3.41, SD = 0.83) with comparatively lower protective resources (social support M = 2.64, SD = 0.92; coping M = 2.71, SD = 0.88). All Likert constructs demonstrated acceptable to high reliability (Cronbach’s α range 0.80 to 0.89). Risk classification correlated positively with distress (r = .49) and hopelessness (r = .44) and negatively with social support (r = −.40) and coping (r = −.35), supporting coherent construct behavior prior to modeling. In the logistic regression model, distress (OR = 2.18, p < .001) and hopelessness (OR = 1.71, p < .001) increased the odds of high-risk classification, while social support reduced the odds (OR = 0.63, p < .001), establishing a strong interpretable benchmark (Nagelkerke R² = .41; AUC = .82).
Gradient boosting outperformed the baseline, with AUC = .88, sensitivity = .81, specificity = .78, precision = .56, F1 = .66, and improved calibration (Brier score = .14). Notably, the top 10% of cases ranked by predicted risk score contained 46.1% of all high-risk cases, indicating high-yield triage potential. Explainability analysis highlighted clinically interpretable drivers, led by distress severity and hopelessness, with protective deficits such as low social support also ranking highly. These findings imply that enterprise clinical workflows can benefit from a transparent, construct-grounded stratification pipeline that improves sensitivity and concentrates risk into actionable high-risk quantiles while preserving interpretability for decision support and resource-aware triage planning.
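
The performance measures reported in the abstract (sensitivity, specificity, precision, F1, Brier score, and the share of high-risk cases captured in the top score band) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the 0.5 decision threshold, and the ten-case toy data are all assumptions made for illustration, and the toy values are unrelated to the study sample.

```python
# Illustrative sketch only: names, threshold, and data are hypothetical.

def classification_metrics(y_true, y_score, threshold=0.5):
    """Sensitivity, specificity, precision, and F1 at a given score threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, y_score):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


def brier_score(y_true, y_score):
    """Mean squared difference between predicted probability and outcome."""
    return sum((s - y) ** 2 for y, s in zip(y_true, y_score)) / len(y_true)


def top_band_capture(y_true, y_score, fraction=0.10):
    """Share of all positive cases found in the top `fraction` of scores."""
    n_top = max(1, int(round(fraction * len(y_true))))
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    captured = sum(y for _, y in ranked[:n_top])
    total = sum(y_true)
    return captured / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical labels (1 = high risk) and predicted probabilities.
    y_true = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    y_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.3, 0.2, 0.1, 0.1]
    print(classification_metrics(y_true, y_score))
    print("Brier score:", brier_score(y_true, y_score))
    print("Top-10% capture:", top_band_capture(y_true, y_score, 0.10))
```

A lower Brier score indicates better-calibrated probabilities, and the capture rate describes how strongly a ranked score list concentrates high-risk cases near the top, which is the property the abstract links to triage yield.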

Published

2023-12-25

How to Cite

Md. Akbar Hossain, & Farzana Ahmed. (2023). PREDICTING SUICIDE RISK THROUGH MACHINE LEARNING–BASED ANALYSIS OF PATIENT NARRATIVES AND DIGITAL BEHAVIORAL MARKERS IN CLINICAL PSYCHOLOGY SETTINGS. Review of Applied Science and Technology, 2(04), 158–193. https://doi.org/10.63125/mqty9n77
