Where Will They Click Next? A Social Foraging Model for Collaborating TeamsModern knowledge work is increasingly collaborative, especially in information-intensive domains such as crisis response, scientific discovery, and software engineering. Software engineering epitomizes these trends through practices like pair programming and collaborative debugging. Yet existing computational models of information foraging remain individual-centric, leaving teams without support for social foraging: leveraging partners’ actions and communication to navigate complex projects. We introduce PFIS-T, a predictive computational model of social information foraging. Building on the PFIS model family, it integrates implicit cues from teammates’ recent navigation and explicit cues from synchronous communication to predict a programmer’s next action. We evaluated PFIS-T with ten three-person debugging teams, finding that it substantially outperforms the strongest individual baseline, PFIS3, predicting 81.5% of navigations and improving accuracy by 16.7%. These results show how predictive models can operationalize social foraging and point to opportunities for collaborative IDEs and interactive systems that adaptively surface social trails to improve coordination and awareness.2026SLShahnewaz Leon et al.North Carolina State UniversityDistributed Team CollaborationComputational Methods in HCIPrototyping & User TestingCHI
Investigating the Interaction of Game Features and Spatial Skills with Performance and Perceived Difficulty in a Block-Based 3D Programming Puzzle GameBlock-based programming puzzle games are popular as engaging, visual tools for introducing novice learners to programming. Many incorporate 3D environments, where players navigate and solve spatial challenges. However, little research has examined how spatial reasoning skills interact with game features to shape player experience. This study investigates the interplay between game features, spatial ability, in-game performance, and perceived difficulty within BOTs, a 3D programming game where players plan spatial movements to solve puzzles. In an online study with 60 players, we examined how feature changes affected performance and perceived difficulty. Spatial skills strongly predicted performance but did not predict perceived difficulty. Larger or more complex layouts increased performance costs, with backward-facing player characters producing the largest spike in performance demands. Loops reliably increased perceived difficulty. Our findings highlight concrete needs for early spatial scaffolds, clearer support for mental-model shifts, and better cues for recognizing repetition and abstraction.2026YRYasitha Rajapaksha et al.North Carolina State UniversitySerious & Functional GamesProgramming Education & Computational ThinkingSTEM Education & Science CommunicationCHI
BiasViz: A Project-Based, Narrative-Centered Learning Tool for Engaging Middle School Students in Critical Thinking about AI BiasesDeveloping the ability to think critically about AI and interpret its outputs requires an understanding of AI bias, a key skill for both AI users and future developers. While some initiatives have introduced teens to algorithmic bias, few have engaged them in actively identifying and quantifying bias in real-world generative AI systems. This paper presents BiasViz, an interactive tool that leverages project-based and narrative-centered learning to help middle school students (11-14 years old) analyze AI bias in large language models. We conducted a study of 28 students’ interactions with BiasViz to evaluate its efficacy in fostering critical thinking about AI bias. Our findings suggest that BiasViz successfully introduced most students to AI bias, and some used the tool to explore personally relevant biases. We identify opportunities for the tool’s iteration and associated curriculum to promote learning and share insights for designing learning environments that foster youth’s critical thinking about AI.2026HDHasti Darabipourshiraz et al.Northwestern UniversityHuman-LLM CollaborationAI Ethics, Fairness & AccountabilityProgramming Education & Computational ThinkingCHI
When AI Gets It Wrong: Scaffolding AI Hallucination Detection for Children Through Chatbot CreationChildren increasingly interact with generative AI systems that can produce hallucinated content, potentially reinforcing misconceptions and undermining critical thinking skills. We investigate how children detect and respond to hallucinations while building and testing LLM-powered chatbots in a development environment. We integrated hallucination-awareness scaffolds such as confidence indicators, fact-checking, repeated questioning, and model comparison. In a study with 48 middle school learners aged 10-14, participants showed significant pre-to-post gains in AI knowledge, hallucination awareness, and confidence in building trustworthy chatbots. They developed multi-layered strategies, including probing inconsistencies and cross-checking with external sources. Key challenges included over-reliance on visible cues, fragmented use of scaffolds, and a tension between creativity and reliability. These findings highlight design implications for children’s AI literacy for responsible AI development: supporting proactive, iterative engagement in the development cycle, integrating scaffolds into coherent workflows, and balancing creativity with accuracy.2026XTXiaoyi Tian et al.North Carolina State UniversityHuman-LLM CollaborationChildren's AI Literacy & Data LiteracyParticipatory DesignCHI
Exploring the Design and Impact of Interactive Worked Examples for Learners with Varying Prior KnowledgeTutoring systems improve learning through tailored interventions, such as worked examples, but often suffer from the aptitude-treatment interaction effect where low prior knowledge learners benefit more. We applied the ICAP learning theory to design two new types of worked examples, Buggy (students fix bugs), and Guided (students complete missing rules), requiring varying levels of cognitive engagement, and investigated their impact on learning in a controlled experiment with 155 undergraduate students in a logic problem solving tutor. Students in the Buggy and Guided examples groups performed significantly better on the posttest than those receiving passive worked examples. Buggy problems helped high prior knowledge learners whereas Guided problems helped low prior knowledge learners. Behavior analysis showed that Buggy produced more exploration-revision cycles, while Guided led to more help-seeking and fewer errors. This research contributes to the design of interventions in logic problem solving for varied levels of learner knowledge and a novel application of behavior analysis to compare learner interactions with the tutor.2026STSutapa Dey Tithi et al.North Carolina State UniversityIntelligent Tutoring Systems & Learning AnalyticsPrototyping & User TestingBehavior Change & Reflection TechnologyCHI
"Shall We Dig Deeper?": Designing and Evaluating Strategies for LLM Agents to Advance Knowledge Co-Construction in Asynchronous Online DiscussionsAsynchronous online discussions enable diverse participants to co-construct knowledge beyond individual contributions. This process ideally evolves through sequential phases, from superficial information exchange to deeper synthesis. However, many discussions stagnate in the early stages. Existing AI interventions typically target isolated phases, lacking mechanisms to progressively advance knowledge co-construction, and the impacts of different intervention styles in this context remain unclear and warrant investigation. To address these gaps, we conducted a design workshop to explore AI intervention strategies (task-oriented and/or relationship-oriented) throughout the knowledge co-construction process, and implemented them in an LLM-powered agent capable of facilitating progression while consolidating foundations at each phase. A within-subject study (N=60) involving five consecutive asynchronous discussions showed that the agent consistently promoted deeper knowledge progression, with different styles exerting distinct effects on both content and experience. These findings provide actionable guidance for designing adaptive AI agents that sustain more constructive online discussions.2026YZYuanhao Zhang et al.Hong Kong University of Science and TechnologyHuman-LLM CollaborationAI-Assisted Decision-Making & AutomationUser Research Methods (Interviews, Surveys, Observation)CHI
Matching Explanation Detail to Scene Complexity: Studying Situational Awareness-Specific AI Feedback in Pedestrian Encounter Driving ScenariosProviding the same level of information through the in-vehicle interface can overwhelm automated vehicle occupants in simple scenarios or leave them underinformed in more demanding situations. This study investigates how human preference for in-vehicle feedback detail scales with scene complexity during pedestrian encounters. We measure scene complexity through driving decision diversity and validate its positive correlation with pedestrian crossing intent uncertainty in an initial experiment (N=68). Using a mock-up in-vehicle interface, the second experiment (N=88) evaluates user preferences for manually crafted feedback concepts simulating three levels of the system's situational awareness. Results indicate that as intent uncertainty increases, users prefer more detailed feedback. While perception-only feedback suffices for simple encounters, in complex situations, information on system comprehension and projection aids better and easier understanding of driving decisions. These findings provide an empirical basis for scaling feedback to situational needs. As this study used manually generated feedback based on ground-truth data, the findings require further investigation considering real-world AI performance in automated vehicles.2026MEMd Fazle Elahi et al.Purdue UniversityAutomated Driving Interface & Takeover DesignExternal HMI (eHMI) — Communication with Pedestrians & CyclistsIn-Vehicle Haptic, Audio & Multimodal FeedbackCHI
Exploring Teacher-Chatbot Interaction and Affect in Block-Based ProgrammingAI-based chatbots have the potential to accelerate learning and teaching, but may also have counterproductive consequences without thoughtful design and scaffolding. To better understand teachers’ perspectives on large language model (LLM) based chatbots, we conducted a study with 11 teams of middle-school teachers using chatbots for a science and computational thinking activity within a block-based programming environment. Based on a qualitative analysis of audio transcripts and chatbot interactions, we propose three profiles (explorer, frustrated, and mixed) that reflect diverse scaffolding needs. In their discussions, teachers perceived chatbot benefits, such as building prompting skills and self-confidence, alongside risks, including potential declines in learning and critical thinking. Key design recommendations include scaffolding the introduction to chatbots, facilitating teacher control of chatbot features, and suggesting when and how chatbots should be used. Our contribution informs the design of chatbots to support teachers and learners in middle school coding activities.2026BRBahare Riahi et al.North Carolina State UniversityHuman-LLM CollaborationProgramming Education & Computational ThinkingIntelligent Tutoring Systems & Learning AnalyticsCHI
Designing Looms as Kits for Collaborative AssemblyBoth engineering kits and collaboration skills are increasingly prevalent in education and are often used in conjunction, yet little research addresses how kit hardware can be designed to support collaboration. Collaboration skills are best acquired through carefully designed collaborative experiences. While prior work highlights technology's important role in fostering collaboration, less research explicitly explores hardware design. This paper posits that certain design features of kits can promote or hinder collaboration. We present a user study evaluating the collaborative assembly of two loom kits: one for higher education (RoboLoom) and one for individuals (Ashford Loom). We developed a coding scheme from the 3Cs framework (coordination, cooperation, and communication) to analyze which hardware features influenced collaboration. We find five design feature categories that may influence collaboration: repetitiveness, specificity, difficulty, parallelizability, and physicality. This paper presents our findings and recommendations for implementing these features into educational kit hardware design to create opportunities for collaboration.2025SSSamantha Speer et al.Makerspace CultureParticipatory DesignPrototyping & User TestingUIST
How Problematic Are Suspenseful Interactions?Current "social acceptability" guidelines for interactive technologies advise against certain, seemingly problematic forms of interaction. Specifically, "suspenseful" interactions, characterized by visible manipulations and invisible effects, are generally considered to be problematic. However, the empirical grounding for this claim is surprisingly weak. To test its validity, this paper presents a controlled replication study (n=281) of the "suspensefulness effect". Although it could be statistically replicated with two out of three social acceptability measures, effect sizes were small (r≤.2), and all compared forms of interaction, including the suspenseful one, had high absolute social acceptability scores. Thus, despite the slight negative effect, suspenseful interactions seem less problematic in the overall scheme of things. We discuss alternative approaches to improve the social acceptability of interactive technology, and recommend engaging more closely with their specific social situatedness.2025AUAlarith UhdeUser Research Methods (Interviews, Surveys, Observation)Prototyping & User TestingMobileHCI
"A five-year-old could understand it" versus "This is way too confusing": Exploring Non-expert Understandings and Perceptions of Cybersecurity DefinitionsExperts struggle with explaining cybersecurity in a language and tone appropriate for non-expert audiences. This communication gap may make it difficult for a broad and diverse audience to fully engage in cybersecurity. Fundamental forms of communication, such as definitions, can be a means for experts to communicate cybersecurity concepts to non-experts. To explore how non-experts perceive cybersecurity definitions and identify potential areas of misunderstanding and misconception, we performed a semi-structured interview study with 30 non-experts of different generations (ages) and education levels. Our findings reveal that non-experts may have incomplete mental models of cybersecurity, misinterpret terms and concepts commonly used in definitions, and express strong preferences for how cybersecurity is defined. While our study focuses on definitions, our results have broader implications for how cybersecurity should be communicated to a diverse range of individuals.2025LNLorenzo C. Neil et al.North Carolina State UniversityPrivacy by Design & User ControlCybersecurity Training & AwarenessCHI
Analyzing the Impact and Accuracy of Facebook Activity on Facebook's Ad-Interest Inference ProcessSocial media platforms like Facebook have become increasingly popular for serving targeted ads to their users. This has led to increased privacy concerns due to the lack of transparency regarding how ads are matched against each user profile. Facebook infers user interests through their activities and targets ads based on those interests. Although Facebook provides explanations for why a particular interest is inferred about a user, there is still a gap in understanding what activities lead to interest inferences and the extent to which the sentiment or context of activities is considered in inferring interests. To obtain insights into how Facebook generates interests from a user’s Facebook activities, we performed controlled experiments by creating new accounts and systematically executing numerous planned activities. This enabled us to make causal inferences about activities that lead to generating specific interests, many of which were not representative of actual user preferences. We also evaluated which activities resulted in interests and found that very naive activities, such as only viewing/scrolling through a page, lead to an interest inference. We found 33.22% of the inferred interests were inaccurate or irrelevant. We further evaluated the interest inference explanations provided by Facebook and found that these explanations were too generalized and, at times, misleading. To understand if our findings hold for a large and diverse sample, we conducted a user study where we recruited 146 participants (through Amazon Mechanical Turk) from different regions of the world to evaluate the accuracy of interests inferred by Facebook. We developed a browser extension to extract data from their own Facebook accounts and ask questions based on such data. Our participants reported a similar range (29%) of inaccuracy as observed in our controlled experiments. We also found that most of our participants were unaware of the availability of Facebook’s ad preference manager, interest inference process, and even interest explanations.2022ASAafaq Sabir et al.Algorithmic Decision-makingCSCW
Hey Alexa, Who Am I Talking to?: Analyzing Users’ Perception and Awareness Regarding Third-party Alexa SkillsThe Amazon Alexa voice assistant provides convenience through automation and control of smart home appliances using voice commands. Amazon allows third-party applications known as skills to run on top of Alexa to further extend Alexa's capability. However, as multiple skills can share the same invocation phrase and request access to sensitive user data, growing security and privacy concerns surround third-party skills. In this paper, we study the availability and effectiveness of existing security indicators or a lack thereof to help users properly comprehend the risk of interacting with different types of skills. We conduct an interactive user study (inviting active users of Amazon Alexa) where participants listen to and interact with real-world skills using the official Alexa app. We find that most participants fail to identify the skill developer correctly (i.e., they assume Amazon also develops the third-party skills) and cannot correctly determine which skills will be automatically activated through the voice interface. We also propose and evaluate a few voice-based skill type indicators, showcasing how users would benefit from such voice-based indicators.2022ASAafaq Sabir et al.North Carolina State UniversityIntelligent Voice Assistants (Alexa, Siri, etc.)Agent Personality & AnthropomorphismPrivacy by Design & User ControlCHI
RMS: Removing Barriers to Analyze the Availability and Surge Pricing of Ridesharing ServicesRidesharing services do not make data of their availability (supply, utilization, idle time, and idle distance) and surge pricing publicly available. This limits the opportunities to study the spatiotemporal trends of the availability and surge pricing of these services. Only a few research studies conducted in North America analyzed these features, and only for Uber and Lyft. Despite the interesting observations, the results of prior works are not generalizable or reproducible because i) the datasets collected in previous publications are spatiotemporally sensitive, i.e., previous works do not represent the current availability and surge pricing of ridesharing services in different parts of the world; ii) the analyses presented in previous works are limited in scope (in terms of countries and ridesharing services they studied). Hence, prior works are not generally applicable to ridesharing services operating in different countries. This paper addresses the issue of ridesharing-data unavailability by presenting Ridesharing Measurement Suite (RMS). RMS removes the barrier of entry for analyzing the availability and surge pricing of ridesharing services for ridesharing users, researchers from various scientific domains, and regulators. RMS continuously collects the data of the availability and surge pricing of ridesharing services. It exposes real-time data of these services through i) graphical user interfaces and ii) public APIs to assist various stakeholders of these services and simplify the data collection and analysis process for future ridesharing research studies. To demonstrate the utility of RMS, we deployed it to collect and analyze the availability and surge pricing data of 10 ridesharing services operating in nine countries for eight weeks before and during the pandemic. Using the data collected and analyzed by RMS, we identify that previous articles miscalculated the utilization of ridesharing services because they did not account for vehicles driving in multiple categories of the same service. We observe that during COVID-19, the supply of ridesharing services decreased by 54%, utilization of available vehicles increased by 6%, and the surge frequency of services increased 5×. We also find that surge occurs in a small geographical region, and its intensity drops by 50% about 0.5 miles away from the location of a surge. We present several other interesting observations on ridesharing services' availability and surge pricing.2022HKHassan Ali Khan et al.North Carolina State UniversityRidesharing PlatformsSustainable HCICHI
Remote, but Connected: How #TidyTuesday Provides an Online Community of Practice for Data Scientists.Data science practitioners face the challenge of continually honing their skills such as data wrangling and visualization. As data scientists seek online spaces to network, learn and share resources with one another, each individual has to employ their own ad-hoc strategy to practice their data science skills. Given these disjointed efforts, it is crucial to ask: how can we build an inclusive, welcoming online community of practice that unites data scientists in their collective efforts to become experts? Daily hashtags on Twitter are used on specific days and have shown promise in forming a community of practice (CoP) in social networking sites like Twitter, but how do they benefit the community and its members? To understand how daily hashtags benefit data scientists and form an online CoP, we conducted a qualitative study on #TidyTuesday (a daily hashtag project for data scientists using R), using the framework of CoP as a lens for analysis. We conducted semi-structured interviews with 26 participants and uncovered motivations behind their participation in #TidyTuesday, how the project benefited them, and how it cultivated an online CoP. Our findings contribute to the CSCW research on communities of practice by providing design trade-offs of using daily hashtags on Twitter, and guidelines on growing and sustaining an online community of practice for data scientists.2021NSNischal Shrestha et al.Expert WorkCSCW
Unravel: A Fluent Code Explorer for Data WranglingData scientists have adopted a popular design pattern in programming called the fluent interface for composing data wrangling code. The fluent interface works by combining multiple transformations on a data table (or dataframe) into a single chain of expressions, which produces an output. Although fluent code promotes legibility, the intermediate dataframes are lost, forcing data scientists to unravel the chain through tedious code edits and re-execution. Existing tools for data scientists do not allow easy exploration or support understanding of fluent code. To address this gap, we designed a tool called Unravel that helps data scientists explore and understand fluent code by applying simple structural edits: drag-and-drop and toggle switch interactions to reorder and (un)comment lines. To help data scientists understand fluent code, Unravel provides function summaries and always-on visualizations highlighting important changes to a dataframe. We discuss the design motivations behind Unravel and how it helps understand and explore fluent code. In a first-use study with 14 data scientists, we found that Unravel facilitated diverse activities such as validating assumptions about the code or data, exploring alternatives, and revealing function behavior.2021NSNischal Shrestha et al.Interactive Data VisualizationComputational Methods in HCIUIST
Morphaces: Exploring Morphable Surfaces for Tangible Sketching in VRThis pictorial documents our inquiry into the design and utility of morphable surfaces to provide tangible feedback while sketching in Virtual Reality (VR). We explored materials and various structures that could enable a surface to morph. We designed and implemented the Morphace ecosystem that includes 3D printed accessories that enable handheld and desk-mounted pen-and-surface interaction for the Oculus Quest VR device. We present this preliminary exploration with the hope that this will be explored further by the design and broader HCI community.2021PPPayod Panda et al.Shape-Changing Interfaces & Soft Robotic MaterialsImmersion & Presence ResearchVR Medical Training & RehabilitationC&C
Anchorhold Afference: Virtual Reality, Radical Compassion, and Embodied PositionalityThis work situates the potential of empathy and affective application in VR systems and explores the role of gamified spaces through digital humanities and critical making. We argue that the material infrastructure of VR technologies makes Anchorhold Afference, a virtual reality model of Julian of Norwich’s anchorhold created by Author 1 with Unity and Oculus, an especially vivid experience. In a time when VR is conflated with video games and in which games are most traditionally associated with conquest, winning, and mastery, Anchorhold Afference opposes this and instead fosters radical compassion, in alignment with feminist media and data understandings, to invite users to an embodied experience. This work considers how VR technology can allow us to discover and evaluate the embodiment and materiality of isolation and confinement through a singular, unified and gamified experience, while also retrospectively considering the rhetorical emergence evoked through this process.2021KDKelsey Virginia Dufresne et al.Social & Collaborative VRTechnology Ethics & Critical HCIInteractive Narrative & Immersive StorytellingC&C
Data Analysts and Their Software Practices: A Profile of the Sabermetrics Community and BeyondFor modern data analytics, practices from software development are increasingly necessary to manage data, but they must be incorporated alongside other statistical and scientific skills. Therefore, we ask: how does a community recontextualize software development through the unique pressures of their work? To answer this, we explore the analytic community around baseball, or sabermetrics. To discover software development's place in the search for robust statistical insight in sports, we interview 10 participants in the sabermetric community and survey over 120 more data analysts, both in baseball and not. We explore how their work lives at the intersection of science and entertainment, and as a consequence, baseball data serves as an accessible yet deep subject to practice analytic skills. Software development exists within an iterative research process that cycles between defining rigorous statistical methods and preserving the flexibility to chase interesting problems. In this question-driven process, members of the community inhabit several overlapping roles of intentional work, in which software development can become the priority to support research and statistical infrastructure, and we discuss the way that the community can foster the balance of these skills.2020JMJustin Middleton et al.Data WorkCSCW
Engaging Students with Instructor Solutions in Online Programming HomeworkStudents working on programming homework do not receive the same level of support as in the classroom, relying primarily on automated feedback from test cases. One low-effort way to provide more support is by prompting students to compare their solution to an instructor's solution, but the best way to design such prompts to support learning is unclear. We designed and deployed a randomized controlled trial during online programming homework, where we provided students with an instructor's solution, and randomized whether they were prompted to compare their solution to the instructor's, to fill in the blanks for a written explanation of the instructor's solution, to do both, or neither. Our results suggest that these prompts can effectively engage students in reflecting on instructor solutions, although the results point to design trade-offs between the amount of effort that different prompts require from students and instructors, and their relative impact on learning.2020TPThomas W. Price et al.North Carolina State UniversityHuman-LLM CollaborationProgramming Education & Computational ThinkingPrototyping & User TestingCHI