Publications

2016
Carpenter, Jordan, Daniel Preoţiuc-Pietro, Lucie Flekova, Salvatore Giorgi, Courtney Hagan, Margaret Kern, Anneke Buffone, Lyle Ungar, and Martin Seligman. "Real Men don’t say 'cute': Using Automatic Language Analysis to Isolate Inaccurate Aspects of Stereotypes." Social Psychological and Personality Science (2016).

People associate certain behaviors with certain social groups. These stereotypical beliefs consist of both accurate and inaccurate associations. Using large-scale, data-driven methods with social media as a context, we isolate stereotypes by using verbal expression. Across four social categories - gender, age, education level, and political orientation - we identify words and phrases that lead people to incorrectly guess the social category of the writer. Although raters often correctly categorize authors, they overestimate the importance of some stereotype-congruent signal. Findings suggest that data-driven approaches might be a valuable and ecologically valid tool for identifying even subtle aspects of stereotypes and highlighting the facets that are exaggerated or misapplied.

2015
Preoţiuc-Pietro, Daniel, Johannes Eichstaedt, Gregory Park, Maarten Sap, Laura Smith, Victoria Tobolsky, Andrew H. Schwartz, and Lyle Ungar. "The Role of Personality, Age and Gender in Tweeting about Mental Illnesses." In Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality (CLPsych). NAACL, 2015.

Mental illnesses, such as depression and post-traumatic stress disorder (PTSD), are highly underdiagnosed globally. Populations sharing similar demographics and personality traits are known to be more at risk than others. In this study, we characterise the language use of users disclosing their mental illness on Twitter. Language-derived personality and demographic estimates show surprisingly strong performance in distinguishing users that tweet a diagnosis of depression or PTSD from random controls, reaching an area under the receiver operating characteristic curve (AUC) of around .8 in all our binary classification tasks. In fact, when distinguishing users disclosing depression from those disclosing PTSD, the single feature of estimated age shows nearly as strong performance (AUC = .806) as using thousands of topics (AUC = .819) or tens of thousands of n-grams (AUC = .812). We also find that differential language analyses, controlled for demographics, recover many symptoms associated with the mental illnesses in the clinical literature.

2012
Samangooei, Sina, Daniel Preoţiuc-Pietro, Jing Li, Mahesan Niranjan, and Nick Gibbins. "Regression models of trends in streaming data." Public Deliverable for Trendminer Project, 2012.