Validating Machine Learning Algorithms for Twitter Data Against Established Measures of Suicidality

Scott R Braithwaite, Christophe Giraud-Carrier, Josh West, Michael D Barnes, Carl Lee Hanson

Abstract

Background: Suicide is one of the leading causes of death in the United States (US), and new methods of assessment are needed to track its risk in real time.

Objective: Our objective is to validate the use of machine learning algorithms for Twitter data against empirically validated measures of suicidality in the US population.

Methods: We used a machine learning algorithm to compare the Twitter feeds of 135 Mechanical Turk (MTurk) participants with validated, self-report measures of suicide risk.

Results: Our findings show that machine learning algorithms can differentiate people who are at high suicidal risk from those who are not, correctly identifying clinically significant suicidality in 92% of cases (sensitivity: 53%, specificity: 97%, positive predictive value: 75%, negative predictive value: 93%).
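
The five percentages in the Results all derive from a single 2x2 confusion matrix. The sketch below shows how each is computed. The abstract does not report the underlying counts, so the values used here (TP=9, FN=8, FP=3, TN=115) are hypothetical: they were chosen only because they sum to the 135 participants and round to the reported figures, and they should not be read as the study's actual data. The function name `screening_metrics` is likewise illustrative.

```python
def screening_metrics(tp, fn, fp, tn):
    """Return standard screening metrics from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,      # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),      # share of at-risk cases that were flagged
        "specificity": tn / (tn + fp),      # share of not-at-risk cases that were cleared
        "ppv": tp / (tp + fp),              # share of flagged cases truly at risk
        "npv": tn / (tn + fn),              # share of cleared cases truly not at risk
    }


# Hypothetical counts (NOT reported in the abstract); see the note above.
metrics = screening_metrics(tp=9, fn=8, fp=3, tn=115)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Under these assumed counts, overall accuracy is high mainly because the not-at-risk group is much larger than the at-risk group, which is how 92% accuracy can coexist with 53% sensitivity.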

Conclusions: Machine learning algorithms are efficient in differentiating people who are at suicidal risk from those who are not. Evidence for suicidality can be measured in nonclinical populations using social media data.

Keywords: machine learning; social media; suicide; Twitter.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1
Result from Decision Tree Learning Algorithm.

Source: PubMed
