Justin Cheng is a PhD student in Computer Science at Stanford University, advised by Jure Leskovec and Michael Bernstein. He's broadly interested in data mining, social computing, and social networks. His work is supported by a Microsoft Research PhD Fellowship and a Stanford Graduate Fellowship.
Crowdsourcing systems lack effective measures of the effort required to complete each task. Without knowing how much time workers need to execute a task well, requesters struggle to accurately structure and price their work. Objective measures of effort could better help workers identify tasks that are worth their time. We propose a data-driven effort metric, ETA (error-time area), that can be used to determine a task's fair price. It empirically models the relationship between time and error rate by manipulating the time that workers have to complete a task. ETA reports the area under the error-time curve as a continuous metric of worker effort. The curve's 10th percentile is also interpretable as the minimum time most workers require to complete the task without error, which can be used to price the task. We validate the ETA metric on ten common crowdsourcing tasks, including tagging, transcription, and search, and find that ETA closely tracks how workers would rank these tasks by effort. We also demonstrate how ETA allows requesters to rapidly iterate on task designs and measure whether the changes improve worker efficiency. Our findings can facilitate the process of designing, pricing, and allocating crowdsourcing tasks. Honorable Mention PDF
Cheng, J., Teevan, J. & Bernstein, M.S. (2015). Measuring Crowdsourcing Effort with Error-Time Curves. To appear at CHI 2015.
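As a rough illustration of the error-time area idea in the abstract above, the sketch below computes the area under a set of (time limit, error rate) points. It is only a sketch under stated assumptions: the trapezoidal rule stands in for whatever curve-fitting the paper actually uses, and the function name `error_time_area` and the sample data are hypothetical.

```python
def error_time_area(times, error_rates):
    """Approximate the area under an error-time curve.

    Assumption: the trapezoidal rule is used here as a simple stand-in;
    it is not claimed to be the paper's estimation procedure.
    """
    # Sort the observations by time limit so that segments are contiguous.
    points = sorted(zip(times, error_rates))
    area = 0.0
    for (t0, e0), (t1, e1) in zip(points, points[1:]):
        # Area of one trapezoid between two neighbouring time limits.
        area += (t1 - t0) * (e0 + e1) / 2.0
    return area


# Hypothetical data: time limits (seconds) and the error rate observed at each.
times = [1, 2, 4, 8]
errors = [0.9, 0.5, 0.2, 0.0]
eta = error_time_area(times, errors)
```

Under this sketch, a task whose error rate falls slowly as the time limit grows yields a larger area, which is consistent with reading the area as a measure of effort.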
A large, seemingly overwhelming task can sometimes be transformed into a set of smaller, more manageable microtasks that can each be accomplished independently. In crowdsourcing systems, microtasking enables unskilled workers with limited commitment to work together to complete tasks they would not be able to do individually. We explore the costs and benefits of decomposing macrotasks into microtasks for three task categories: arithmetic, sorting, and transcription. We find that breaking these tasks into microtasks results in longer overall task completion times, but higher quality outcomes and a better experience that may be more resilient to interruptions. These results suggest that microtasks can help people complete high quality work in interruption-driven environments. Honorable Mention PDF
Cheng, J., Teevan, J., Iqbal, S. T. & Bernstein, M.S. (2015). Break It Down: A Comparison of Macro- and Microtasks. To appear at CHI 2015.
Hybrid crowd-machine learning classifiers are classification models that start with a written description of a learning goal, use the crowd to suggest predictive features and label data, and then weigh these features using machine learning to produce models that are accurate and use human-understandable features. These hybrid classifiers enable fast prototyping of machine learning models that can improve on both algorithm performance and human judgment, and accomplish tasks where automated feature extraction is not yet feasible. Flock, an interactive machine learning platform, instantiates this approach. Honorable Mention PDF
Cheng, J. & Bernstein, M.S. (2015). Flock: Hybrid Crowd-Machine Learning Classifiers. To appear at CSCW 2015.
Social media systems rely on user feedback and rating mechanisms for personalization, ranking, and content filtering. However, when users evaluate content contributed by fellow users (e.g., by liking a post or voting on a comment), these evaluations create complex social feedback effects. We investigate how ratings on a piece of content affect its author’s future behavior. By studying four large comment-based news communities, we find that negative feedback leads to significant behavioral changes that are detrimental to the community. Not only do authors of negatively-evaluated content contribute more, but also their future posts are of lower quality, and are perceived by the community as such. Moreover, these authors are more likely to subsequently evaluate their fellow users negatively, percolating these effects through the community. In contrast, positive feedback does not carry similar effects, and neither encourages rewarded authors to write more, nor improves the quality of their posts. Interestingly, the authors that receive no feedback are most likely to leave a community. Furthermore, a structural analysis of the voter network reveals that evaluations polarize the community the most when positive and negative votes are equally split. PDF Slides
Press: Data Mining Reveals How The "Down-Vote" Leads To A Vicious Circle Of Negative Feedback (Physics arXiv Blog), and comments, both positive and negative, on Reddit (2), Slashdot, and Hacker News.
Cheng, J., Danescu-Niculescu-Mizil, C. & Leskovec, J. (2014). How Community Feedback Shapes User Behavior. ICWSM 2014.
On many social networking web sites such as Facebook and Twitter, resharing or reposting functionality allows users to share others' content with their own friends or followers. As content is reshared from user to user, large cascades of reshares can form. In this work, we develop a framework for addressing cascade prediction problems. On a large sample of photo reshare cascades on Facebook, we find strong performance in predicting whether a cascade will continue to grow in the future. We find that the relative growth of a cascade becomes more predictable as we observe more of its reshares, that temporal and structural features are key predictors of cascade size, and that initially, breadth, rather than depth in a cascade is a better indicator of larger cascades. This prediction performance is robust in the sense that multiple distinct classes of features all achieve similar performance. Observing independent cascades of the same content, we find that while these cascades differ greatly in size, we are still able to predict which ends up the largest. PDF Slides
Press: Computer scientists learn to predict when photos will ‘go viral’ on Facebook (Stanford News), The Curious Nature of Sharing Cascades on Facebook (Technology Review), Can cascades be predicted? (Facebook Data Science)
Cheng, J., Adamic, L. A., Dow, P. A., Kleinberg, J. & Leskovec, J. (2014). Can Cascades be Predicted? WWW 2014.
Activation thresholds generalize the crowdfunding concept of calling in donations when a collective monetary goal is reached. With activation thresholds, commitments are conditioned on others’ participation, and supporters only need to show up for an event if enough other people commit as well. Catalyst is a platform that introduces activation thresholds for on-demand events. PDF Slides
Cheng, J. & Bernstein, M. (2014). Catalyst: Triggering Collective Action with Thresholds. CSCW 2014.
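The activation-threshold mechanism described above can be sketched in a few lines. This is a minimal illustration, not Catalyst's implementation: the class name `ThresholdEvent` and its methods are hypothetical.

```python
class ThresholdEvent:
    """An event that activates only once enough supporters have committed.

    Hypothetical sketch; names and behaviour are assumptions, not Catalyst's API.
    """

    def __init__(self, threshold):
        self.threshold = threshold  # number of commitments needed to activate
        self.supporters = set()     # distinct people who have committed so far

    def commit(self, name):
        """Record a conditional commitment and report whether the event is on."""
        self.supporters.add(name)
        return self.is_activated()

    def is_activated(self):
        # Commitments become binding only once the threshold is reached.
        return len(self.supporters) >= self.threshold


event = ThresholdEvent(threshold=3)
first = event.commit("ann")   # below the threshold
second = event.commit("bob")  # still below the threshold
third = event.commit("cat")   # threshold reached
```

Storing supporters in a set means a repeated commitment from the same person does not count twice toward the threshold.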
Ensemble is a platform for online collaborative storywriting. Motivated by the idea that individual creative leaders and the crowd have complementary creative strengths, in Ensemble, a leader directs the high-level vision for a story and articulates creative constraints for the crowd. PDF
Kim, J., Cheng, J. & Bernstein, M. (2014). Exploring Complementary Strengths of Leaders and Crowds in Creative Collaboration. CSCW 2014.
We study how people use and respond to three different annotation styles: single-word tags, multi-word tags, and comments. We find significant differences in how annotation styles influence the objectivity, descriptiveness, and interestingness of annotations. PDF
Cheng, J. & Cosley, D. (2013). How Annotation Styles Influence Content and Preferences. HYPERTEXT 2013.
Storeys is a graph-based visualization tool designed for collaborative story writing that represents stories in a branching tree of individual sentences. The fine-grained, branching structure supports collaboration by reducing contribution cost, conflict over text ownership, and production blocking. Also designed to be ludic and playful, in initial evaluations Storeys was seen as a fun tool for creativity that balanced the exploration and elaboration of ideas. PDF
Cheng, J., Kang, L. & Cosley, D. (2013). Storeys – Designing Collaborative Storytelling Interfaces. CHI 2013 Extended Abstracts (Interactivity).
We describe two diagnostic tools to predict which students are at risk of dropping out of an online class. Experiments on a large, online HCI class suggest that the tools we introduce can help identify students who will not complete assignments, with F1 scores of 0.46 and 0.73 three days before the assignment due date. PDF
Cheng, J., Kulkarni, C. & Klemmer, S. (2013). Tools for Predicting Drop-off in Large Online Classes. CSCW 2013 Companion.
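For readers unfamiliar with the F1 score reported above, it is the harmonic mean of precision and recall. The sketch below computes it from prediction counts; the function name `f1_score` and the counts are hypothetical and are not taken from the paper's data.

```python
def f1_score(true_pos, false_pos, false_neg):
    """Return F1, the harmonic mean of precision and recall."""
    if true_pos == 0:
        # No correct positive predictions: F1 is taken to be zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 6 at-risk students correctly flagged,
# 2 flagged in error, and 2 at-risk students missed.
score = f1_score(true_pos=6, false_pos=2, false_neg=2)
```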
Understanding the ways in which information achieves widespread public awareness is a research question of significant interest. We consider whether, and how, the way in which the information is phrased (the choice of words and sentence structure) can affect this process. PDF
Danescu-Niculescu-Mizil, C., Cheng, J., Kleinberg, J. & Lee, L. (2012). You had me at hello: How phrasing affects memorability. Proceedings of ACL 2012.
When looking at how people interact on Twitter, how can network factors help us predict which interactions are reciprocal (i.e. both parties participating), and which aren't (i.e. one user pestering another)? What factors are best in predicting reciprocity? PDF Slides
Cheng, J., Romero, D., Meeder, B., & Kleinberg, J. (2011). Predicting Reciprocity in Social Networks. 2011 IEEE Second International Conference on Social Computing (SocialCom).
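To make the distinction between reciprocal and one-way interactions concrete, the sketch below labels pairs of users from a list of directed interactions. It only illustrates the label being predicted, not the paper's prediction method; the function name `split_by_reciprocity` and the sample users are hypothetical.

```python
def split_by_reciprocity(edges):
    """Split directed (sender, receiver) interactions into two groups.

    Returns (reciprocal, one_way): reciprocal holds unordered pairs in which
    both users contacted each other; one_way holds directed pairs that were
    never answered.
    """
    edge_set = set(edges)
    reciprocal, one_way = set(), set()
    for src, dst in edge_set:
        if src == dst:
            continue  # ignore self-interactions
        if (dst, src) in edge_set:
            reciprocal.add(frozenset((src, dst)))
        else:
            one_way.add((src, dst))
    return reciprocal, one_way


# Hypothetical interactions: ann and bob talk to each other; cat only contacts ann.
mentions = [("ann", "bob"), ("bob", "ann"), ("cat", "ann")]
reciprocal, one_way = split_by_reciprocity(mentions)
```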
GoSlow is an iPhone application that helps users think and reflect on their day using daily suggestions. GoSlow is designed to be reflective rather than persuasive, meaning that the application doesn't aim to modify user behavior but rather encourage introspection.
Cheng, J., Bapat, A., Thomas, G., Tse, K., Nawathe, N., Crockett, J. & Leshed, G. (2011). GoSlow: designing for slowness, reflection and solitude. CHI 2011 Extended Abstracts (alt.chi).
Have you ever wondered whether tagging could actually be...fun? How to use color (among other things) to design playful and useful tagging interfaces.
Cheng, J. & Cosley, D. (2010). kultagg: ludic design for tagging interfaces. Proceedings of GROUP 2010.
An Android client that displays traffic congestion, using data from a server that relies on probes to determine congestion information.
Ng, W.S. & Cheng, J. (2009). Delivering visual pertinent information services for commuters. Proceedings of IEEE APSCC, 325-331.