Preoţiuc-Pietro, Daniel, Jordan Carpenter, and Lyle Ungar. Personality Driven Differences in Paraphrase Preference. In Workshop on Natural Language Processing and Computational Social Science (NLP+CSS). ACL, 2017.

Personality plays a decisive role in how people behave in different scenarios, including online social media. Researchers have used such data to study how personality can be predicted from language use. In this paper, we study phrase choice as a particular stylistic linguistic difference, as opposed to the mostly topical differences identified previously. Building on previous work on demographic preferences, we quantify differences in paraphrase choice from a massive Facebook data set with posts from over 115,000 users. We quantify the predictive power of phrase choice in user profiling and use phrase choice to study psycholinguistic hypotheses. This work is relevant to future applications that aim to personalize text generation to specific personality types.

Preoţiuc-Pietro, Daniel, Wei Xu, and Lyle Ungar. Discovering User Attribute Stylistic Differences via Paraphrasing. In AAAI, 2016.

User attribute prediction from social media text has proven successful and useful for downstream tasks. In previous studies, user trait differences have been limited primarily to the presence or absence of words that indicate topical preferences. In this study, we aim to find linguistic style distinctions across three different user attributes: gender, age and occupational class. By combining paraphrases with a simple yet effective method, we capture a wide set of stylistic differences that are exempt from topic bias. We show their predictive power in user profiling, conformity with human perception and psycholinguistic hypotheses, and potential use in generating natural language tailored to specific user traits.