Evaluating the Validity of Simplified Chinese Version of LIWC in Detecting Psychological Expressions in Short Texts on Social Network Services

Nan Zhao, Dongdong Jiao, Shuotian Bai, Tingshao Zhu

Abstract

The growing need for automated analysis of web texts, especially short texts on Social Network Services (SNS), places new demands on computerized text analysis instruments. The extensive use of such instruments, including the Linguistic Inquiry and Word Count (LIWC), rests on their psychometric properties. In this study, Sina Weibo statuses were analyzed by rater coding and by the Simplified Chinese version of LIWC (SCLIWC) to evaluate the validity of SCLIWC in detecting psychological expressions in Weibo statuses (n = 60) and in identifying the psychological meaning of a single Weibo status (n = 11). Significant correlations between human ratings and SCLIWC scores, together with high sensitivity in capturing single statuses that raters identified as containing certain expressions, supported the validity of SCLIWC in detecting psychological expressions. The results also suggested that SCLIWC detects psychological expressions in SNS short texts more efficiently with a status count scoring method than with the word count method commonly used with LIWC. However, SCLIWC may not perform well in identifying the psychological meaning of a single SNS short text, because it over-identifies target expressions. This study provides preliminary evidence for the validity of SCLIWC and indicates how to use it efficiently on SNS short texts.
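
The contrast between word count scoring and status count scoring, and the sensitivity measure, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that word count scoring is the percentage of all tokens that fall in a category, that status count scoring is the percentage of statuses containing at least one category token, and that statuses are already segmented into tokens. The lexicon, category names, and function names are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical toy lexicon; the real SCLIWC dictionary is far larger.
LEXICON: Dict[str, Set[str]] = {
    "posemo": {"开心", "喜欢"},
    "negemo": {"难过", "讨厌"},
}


def word_count_score(statuses: List[List[str]], category: str) -> float:
    """Percentage of all tokens (across statuses) that belong to the category."""
    words = LEXICON[category]
    total = sum(len(status) for status in statuses)
    hits = sum(1 for status in statuses for token in status if token in words)
    return 100.0 * hits / total if total else 0.0


def status_count_score(statuses: List[List[str]], category: str) -> float:
    """Percentage of statuses that contain at least one category token."""
    words = LEXICON[category]
    if not statuses:
        return 0.0
    flagged = sum(1 for status in statuses if any(t in words for t in status))
    return 100.0 * flagged / len(statuses)


def sensitivity(rater_flags: List[bool], tool_flags: List[bool]) -> float:
    """Share of rater-identified statuses that the tool also flags."""
    true_pos = sum(1 for r, t in zip(rater_flags, tool_flags) if r and t)
    positives = sum(1 for r in rater_flags if r)
    return true_pos / positives if positives else float("nan")


# Four pre-segmented example statuses (assumed segmentation).
statuses = [
    ["今天", "很", "开心"],
    ["我", "喜欢", "喜欢", "这个"],
    ["下雨", "了"],
    ["有点", "难过"],
]

wc = word_count_score(statuses, "posemo")    # 3 hits out of 11 tokens
sc = status_count_score(statuses, "posemo")  # 2 of 4 statuses flagged
sens = sensitivity([True, True, False, True], [True, True, False, False])
```

Under these assumptions, a repeated category word within one status raises the word count score but not the status count score, which is one way the two methods can diverge on short texts.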

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Source: PubMed
