I'm a PhD student in Computer Science at Stanford University, advised by Jure Leskovec and Michael Bernstein. My work lies at the intersection of data science, computational social science, and social computing. I'm supported by a Microsoft Research PhD Fellowship and a Stanford Graduate Fellowship.
In online communities, antisocial behavior such as trolling disrupts constructive discussion. While prior work suggests that trolling behavior is confined to a vocal and antisocial minority, we demonstrate that ordinary people can engage in such behavior as well. We propose two primary trigger mechanisms: the individual's mood, and the surrounding context of a discussion (e.g., exposure to prior trolling behavior). Through an experiment simulating an online discussion, we find that both negative mood and seeing troll posts by others significantly increase the probability of a user trolling, and together double this probability. To support and extend these results, we study how these same mechanisms play out in the wild via a data-driven, longitudinal analysis of a large online news discussion community. This analysis reveals temporal mood effects, and explores long-range patterns of repeated exposure to trolling. A predictive model of trolling behavior shows that mood and discussion context together can explain trolling behavior better than an individual's history of trolling. These results combine to suggest that ordinary people can, under the right circumstances, behave like trolls.
Press: Stanford research shows that anyone can become an Internet troll (Stanford News), There's a Troll Inside All of Us, Researchers Say (Technology Review), The awkward truth about trolls: any of us could become one (The Guardian)
Cheng, J., Bernstein, M.S., Danescu-Niculescu-Mizil, C., & Leskovec, J. (2017). Anyone Can Become a Troll: Causes of Trolling Behavior in Online Discussions. To appear at CSCW 2017.
Cascades of information-sharing are a primary mechanism by which content reaches its audience on social media, and an active line of research has studied how such cascades, which form as content is reshared from person to person, develop and subside. In this paper, we perform a large-scale analysis of cascades on Facebook over significantly longer time scales, and find that a more complex picture emerges, in which many large cascades recur, exhibiting multiple bursts of popularity with periods of quiescence in between. We characterize recurrence by measuring the time elapsed between bursts, their overlap and proximity in the social network, and the diversity in the demographics of individuals participating in each peak. We discover that content virality, as revealed by its initial popularity, is a main driver of recurrence, with the availability of multiple copies of that content helping to spark new bursts. Still, beyond a certain popularity of content, the rate of recurrence drops as cascades start exhausting the population of interested individuals. We reproduce these observed patterns in a simple model of content recurrence simulated on a real social network. Using only characteristics of a cascade's initial burst, we demonstrate strong performance in predicting whether it will recur in the future.
Cheng, J., Adamic, L.A., Kleinberg, J. & Leskovec, J. (2016). Do Cascades Recur? WWW 2016.
User contributions in the form of posts, comments, and votes are essential to the success of online communities. However, allowing user participation also invites undesirable behavior such as trolling. In this paper, we characterize antisocial behavior in three large online discussion communities by analyzing users who were banned from these communities. We find that such users tend to concentrate their efforts in a small number of threads, are more likely to post irrelevantly, and are more successful at garnering responses from other users. Studying the evolution of these users from the moment they join a community up to when they get banned, we find that not only do they write worse than other users over time, but they also become increasingly less tolerated by the community. Further, we discover that antisocial behavior is exacerbated when community feedback is overly harsh. Our analysis also reveals distinct groups of users with different levels of antisocial behavior that can change over time. We use these insights to identify antisocial users early on, a task of high practical importance to community maintainers.
Press: Proactive policing (Economist), Algorithm 'identifies future trolls from just five posts' (Guardian), Researchers Develop a Troll-Hunting Algorithm (Popular Science), How a Troll-Spotting Algorithm Learned Its Anti-antisocial Trade (Technology Review), Science Says You Should Ignore Internet Trolls (Time), Scientists have figured out how to tell when someone is an online troll (Washington Post), 'Troll hunting' algorithm could make web a better place (Wired).
Cheng, J., Danescu-Niculescu-Mizil, C. & Leskovec, J. (2015). Antisocial Behavior in Online Discussion Communities. ICWSM 2015.
Crowdsourcing systems lack effective measures of the effort required to complete each task. Without knowing how much time workers need to execute a task well, requesters struggle to accurately structure and price their work. Objective measures of effort could better help workers identify tasks that are worth their time. We propose a data-driven effort metric, ETA (error-time area), that can be used to determine a task's fair price. It empirically models the relationship between time and error rate by manipulating the time that workers have to complete a task. ETA reports the area under the error-time curve as a continuous metric of worker effort. The curve's 10th percentile is also interpretable as the minimum time most workers require to complete the task without error, which can be used to price the task. We validate the ETA metric on ten common crowdsourcing tasks, including tagging, transcription, and search, and find that ETA closely tracks how workers would rank these tasks by effort. We also demonstrate how ETA allows requesters to rapidly iterate on task designs and measure whether the changes improve worker efficiency. Our findings can facilitate the process of designing, pricing, and allocating crowdsourcing tasks.
Cheng, J., Teevan, J. & Bernstein, M.S. (2015). Measuring Crowdsourcing Effort with Error-Time Curves. CHI 2015.
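As a rough illustration of the "area under the error-time curve" idea described above, the sketch below integrates measured error rates over the time limits given to workers using the trapezoidal rule. It is not the paper's implementation: the function name `error_time_area`, the piecewise-linear treatment of the curve, and the sample numbers are all assumptions made for this example.

```python
def error_time_area(time_limits, error_rates):
    """Trapezoidal area under an error-versus-time curve.

    Assumes the curve is piecewise linear between measured points;
    how the curve is actually fitted is not specified here.
    """
    if len(time_limits) != len(error_rates) or len(time_limits) < 2:
        raise ValueError("need at least two (time, error) points of equal length")
    # Sort by time so the integration runs left to right.
    points = sorted(zip(time_limits, error_rates))
    area = 0.0
    for (t0, e0), (t1, e1) in zip(points, points[1:]):
        area += (t1 - t0) * (e0 + e1) / 2.0
    return area


# Hypothetical measurements: time allowed per task (seconds) and the
# fraction of responses that were wrong at that time limit.
time_limits = [1.0, 2.0, 4.0, 8.0]
error_rates = [0.9, 0.5, 0.2, 0.1]

eta = error_time_area(time_limits, error_rates)
print(f"ETA for this task: {eta:.2f}")
```

Under this reading, a task whose error rate falls quickly as workers get more time has a smaller area than one that stays error-prone, so a larger value would indicate a more effortful task.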
A large, seemingly overwhelming task can sometimes be transformed into a set of smaller, more manageable microtasks that can each be accomplished independently. In crowdsourcing systems, microtasking enables unskilled workers with limited commitment to work together to complete tasks they would not be able to do individually. We explore the costs and benefits of decomposing macrotasks into microtasks for three task categories: arithmetic, sorting, and transcription. We find that breaking these tasks into microtasks results in longer overall task completion times, but higher quality outcomes and a better experience that may be more resilient to interruptions. These results suggest that microtasks can help people complete high quality work in interruption-driven environments.
Cheng, J., Teevan, J., Iqbal, S. T. & Bernstein, M.S. (2015). Break It Down: A Comparison of Macro- and Microtasks. CHI 2015.
Hybrid crowd-machine learning classifiers are classification models that start with a written description of a learning goal, use the crowd to suggest predictive features and label data, and then weigh these features using machine learning to produce models that are accurate and use human-understandable features. These hybrid classifiers enable fast prototyping of machine learning models that can improve on both algorithm performance and human judgment, and accomplish tasks where automated feature extraction is not yet feasible. Flock, an interactive machine learning platform, instantiates this approach.
Cheng, J. & Bernstein, M.S. (2015). Flock: Hybrid Crowd-Machine Learning Classifiers. CSCW 2015.
Social media systems rely on user feedback and rating mechanisms for personalization, ranking, and content filtering. However, when users evaluate content contributed by fellow users (e.g., by liking a post or voting on a comment), these evaluations create complex social feedback effects. We investigate how ratings on a piece of content affect its author’s future behavior. By studying four large comment-based news communities, we find that negative feedback leads to significant behavioral changes that are detrimental to the community. Not only do authors of negatively-evaluated content contribute more, but also their future posts are of lower quality, and are perceived by the community as such. Moreover, these authors are more likely to subsequently evaluate their fellow users negatively, percolating these effects through the community. In contrast, positive feedback does not carry similar effects, and neither encourages rewarded authors to write more, nor improves the quality of their posts. Interestingly, the authors that receive no feedback are most likely to leave a community. Furthermore, a structural analysis of the voter network reveals that evaluations polarize the community the most when positive and negative votes are equally split.
Press: Data Mining Reveals How The "Down-Vote" Leads To A Vicious Circle Of Negative Feedback (Physics arXiv Blog), and comments, both positive and negative, on Reddit (2), Slashdot, and Hacker News.
Cheng, J., Danescu-Niculescu-Mizil, C. & Leskovec, J. (2014). How Community Feedback Shapes User Behavior. ICWSM 2014.
On many social networking web sites such as Facebook and Twitter, resharing or reposting functionality allows users to share others' content with their own friends or followers. As content is reshared from user to user, large cascades of reshares can form. In this work, we develop a framework for addressing cascade prediction problems. On a large sample of photo reshare cascades on Facebook, we find strong performance in predicting whether a cascade will continue to grow in the future. We find that the relative growth of a cascade becomes more predictable as we observe more of its reshares, that temporal and structural features are key predictors of cascade size, and that initially, breadth, rather than depth, in a cascade is a better indicator of larger cascades. This prediction performance is robust in the sense that multiple distinct classes of features all achieve similar performance. Observing independent cascades of the same content, we find that while these cascades differ greatly in size, we are still able to predict which ends up the largest.
Press: Can cascades be predicted? (Facebook), Can an algorithm predict which popular content will become viral content? (NiemanLab), Computer scientists learn to predict when photos will ‘go viral’ on Facebook (Stanford News), The Curious Nature of Sharing Cascades on Facebook (Technology Review).
Cheng, J., Adamic, L. A., Dow, P. A., Kleinberg, J. & Leskovec, J. (2014). Can Cascades be Predicted? WWW 2014.
Activation thresholds generalize the crowdfunding concept of calling in donations when a collective monetary goal is reached into commitments that are conditioned on others’ participation. With activation thresholds, supporters only need to show up for an event if enough other people commit as well. Catalyst is a platform that introduces activation thresholds for on-demand events.
Cheng, J. & Bernstein, M. (2014). Catalyst: Triggering Collective Action with Thresholds. CSCW 2014.
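The activation-threshold mechanism described above can be sketched in a few lines. This is a minimal illustration only, not Catalyst's actual code: the class `ThresholdEvent`, its fields, and the supporter names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ThresholdEvent:
    """An event that only 'activates' once enough people have committed."""
    threshold: int
    committed: set = field(default_factory=set)

    def commit(self, supporter: str) -> bool:
        """Record a conditional commitment; return whether the event is now active."""
        self.committed.add(supporter)  # a set ignores repeat commitments
        return self.is_activated()

    def is_activated(self) -> bool:
        return len(self.committed) >= self.threshold


# Hypothetical usage: the event needs three distinct supporters.
event = ThresholdEvent(threshold=3)
results = [event.commit(name) for name in ["ana", "ben", "ben", "cy"]]
print(results)
```

Until the threshold is met, nobody is obliged to show up; the third distinct commitment is what triggers the event.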
Ensemble is a platform for online collaborative storywriting. Motivated by the idea that individual creative leaders and the crowd have complementary creative strengths, in Ensemble, a leader directs the high-level vision for a story and articulates creative constraints for the crowd.
Kim, J., Cheng, J. & Bernstein, M. (2014). Exploring Complementary Strengths of Leaders and Crowds in Creative Collaboration. CSCW 2014.
We study how people use and respond to three different annotation styles: single-word tags, multi-word tags, and comments. We find significant differences in how annotation styles influence the objectivity, descriptiveness, and interestingness of annotations.
Cheng, J. & Cosley, D. (2013). How Annotation Styles Influence Content and Preferences. HYPERTEXT 2013.
Storeys is a graph-based visualization tool designed for collaborative story writing that represents stories in a branching tree of individual sentences. The fine-grained, branching structure supports collaboration by reducing contribution cost, conflict over text ownership, and production blocking. Storeys is also designed to be ludic and playful; in initial evaluations, it was seen as a fun tool for creativity that balanced the exploration and elaboration of ideas.
Cheng, J., Kang, L. & Cosley, D. (2013). Storeys – Designing Collaborative Storytelling Interfaces. CHI 2013 Extended Abstracts (Interactivity).
We describe two diagnostic tools to predict which students are at risk of dropping out of an online class. Experiments on a large, online HCI class suggest that the tools we introduce can help identify students who will not complete assignments, with F1 scores of 0.46 and 0.73 three days before the assignment due date.
Cheng, J., Kulkarni, C. & Klemmer, S. (2013). Tools for Predicting Drop-off in Large Online Classes. CSCW 2013 Companion.
Understanding the ways in which information achieves widespread public awareness is a research question of significant interest. We consider whether, and how, the way in which the information is phrased - the choice of words and sentence structure - can affect this process.
Danescu-Niculescu-Mizil, C., Cheng, J., Kleinberg, J. & Lee, L. (2012). You had me at hello: How phrasing affects memorability. Proceedings of ACL 2012.
When looking at how people interact on Twitter, how can network factors help us predict which interactions are reciprocal (i.e., both parties participating), and which aren't (i.e., one user pestering another)? What factors are best at predicting reciprocity?
Cheng, J., Romero, D., Meeder, B., & Kleinberg, J. (2011). Predicting Reciprocity in Social Networks. 2011 IEEE Second International Conference on Social Computing (SocialCom).
GoSlow is an iPhone application that helps users think and reflect on their day using daily suggestions. GoSlow is designed to be reflective rather than persuasive, meaning that the application doesn't aim to modify user behavior but rather to encourage introspection.
Cheng, J., Bapat, A., Thomas, G., Tse, K., Nawathe, N., Crockett, J. & Leshed, G. (2011). GoSlow: designing for slowness, reflection and solitude. CHI 2011 Extended Abstracts (alt.chi).
Have you ever wondered whether tagging could actually be...fun? How to use color (among other things) to design playful and useful tagging interfaces.
Cheng, J. & Cosley, D. (2010). kultagg: ludic design for tagging interfaces. Proceedings of GROUP 2010.
An Android client that displays traffic congestion, using data from a server that relies on probes to determine congestion information.
Ng, W.S. & Cheng, J. (2009). Delivering visual pertinent information services for commuters. Proceedings of IEEE APSCC, 325-331.