SusBench: An Online Benchmark for Evaluating Dark Pattern Susceptibility of Computer-Use Agents
As LLM-based computer-use agents (CUAs) begin to autonomously interact with real-world interfaces, understanding their vulnerability to manipulative interface designs becomes increasingly critical. We introduce SusBench, an online benchmark for evaluating the susceptibility of CUAs to UI dark patterns, designs that aim to manipulate or deceive users into taking unintentional actions. Drawing nine common dark pattern types from existing taxonomies, we developed a method for constructing believable dark patterns on real-world consumer websites through code injections, and designed 313 evaluation tasks across 55 websites. Our study with 29 participants showed that humans perceived our dark pattern injections to be highly realistic, with the vast majority of participants not noticing that these had been injected by the research team. We evaluated five state-of-the-art CUAs on the benchmark. We found that both human participants and agents are particularly susceptible to the dark patterns of Preselection, Trick Wording, and Hidden Information, while being resilient to other overt dark patterns. Our findings inform the development of more trustworthy CUAs, their use as potential human proxies in evaluating deceptive designs, and the regulation of an online environment increasingly navigated by autonomous agents.
2026 | Longjie Guo et al. | University of Washington | Dark Patterns Recognition; Explainable AI (XAI); Algorithmic Transparency & Auditability | IUI

RDoFlow: Automatically assessing under-specified statistical analyses in HCI
When designing and analyzing a study, researchers must navigate a large space of methodological decisions, or "researcher degrees of freedom." If these choices are not preregistered or transparently reported, they can increase the risk of inflated false-positive rates and exaggerated effect sizes, undermining scientific credibility. Drawing on psychology research that characterizes these degrees of freedom, we create a protocol for scoring how hypotheses are reported in the HCI literature (ReportDoF). We manually apply ReportDoF to 100 hypotheses from HCI texts authored between 2015 and 2025, including both preregistrations and papers. Based on this experience, we contribute an LLM workflow and proof-of-concept interactive interface (RDoFlow) that applies ReportDoF to new texts, enabling large-scale analysis of the composition and quality of reported analysis specifications. For example, RDoFlow reveals that HCI research more frequently tests multiple dependent variables for a single hypothesis than psychology research does, a practice that increases the risk of false positives.
2026 | Madeleine Grunde-McLaughlin et al. | University of Washington | User Research Methods (Interviews, Surveys, Observation); Research Ethics & Open Science; Computational Methods in HCI | IUI

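The RDoFlow entry's closing point, that testing several dependent variables for one hypothesis raises the false-positive risk, can be illustrated with the standard family-wise error rate formula, 1 - (1 - alpha)^k. The sketch below is not taken from the paper: the function name `familywise_error_rate`, the significance level of 0.05, and the assumption that the k tests are independent are all illustrative choices.

```python
def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Chance of at least one false positive across `num_tests` tests.

    Assumes the tests are independent and each uses significance level
    `alpha` (an illustrative simplification, not a claim about any study).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if num_tests < 0:
        raise ValueError("num_tests must be non-negative")
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # One hypothesis tested against 1, 3, or 5 dependent variables.
    for k in (1, 3, 5):
        print(k, round(familywise_error_rate(0.05, k), 4))
    # prints:
    # 1 0.05
    # 3 0.1426
    # 5 0.2262
```

Under these assumptions, five dependent variables tested at 0.05 each give roughly a 23% chance of at least one spurious significant result, which is why corrections for multiple comparisons or preregistered primary outcomes matter.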
Interactive Reasoning: Visualizing and Controlling Chain-of-Thought Reasoning in Large Language Models
The output quality of large language models (LLMs) can be improved via “reasoning”: generating segments of chain-of-thought (CoT) content to further condition the model prior to producing user-facing output. While these chains contain valuable information, they are verbose and lack explicit organization, making them tedious to review. Moreover, they lack opportunities for user feedback, such as removing unwanted considerations, adding desired ones, or clarifying unclear assumptions. We introduce Interactive Reasoning, an interaction design that visualizes chain-of-thought outputs as a hierarchy of topics and enables user review and modification. We implement interactive reasoning in Hippo, a prototype for AI-assisted decision making in the face of uncertain trade-offs. In a user study with 16 participants, we find that interactive reasoning in Hippo allows users to quickly identify and interrupt erroneous generations, efficiently steer the model towards customized responses, and better understand both model reasoning and model outputs. Our work contributes to a new paradigm that incorporates user oversight into LLM reasoning processes.
2026 | Rock Yuren Pang et al. | University of Washington | Human-LLM Collaboration; Explainable AI (XAI); AI-Assisted Decision-Making & Automation | IUI

Exploring the Future of AI in Clinical Collaboration: A Study on Tumor Board Case Preparation
Multidisciplinary tumor boards (MTBs) bring specialists together to identify therapies for complex cancer cases, but preparing for them is time-intensive. Clinicians must extract key details from extensive records and evaluate treatment options. While large language models (LLMs) show promise in medicine for basic tasks like summarizing notes, little is known about their role in high-stakes tasks like MTB preparation. We conducted a mixed-methods study with 16 oncologists using two AI systems to prepare patient cases for MTB: an off-the-shelf assistant (Copilot) and a task-specific multi-agent system (Healthcare Agent Orchestrator, HAO). We analyzed oncologist prompts, AI responses, and oncologists' perception of AI. Participants showed greater willingness to adopt HAO but were often overconfident in AI summaries and skeptical of AI-recommended therapies. Trust calibration strategies, such as source links and agent trajectories, failed to align trust with system capabilities. We conclude with how AI systems should be built to support clinicians in high-stakes tasks.
2026 | Jiachen Li et al. | Northeastern University | Human-LLM Collaboration; Explainable AI (XAI); AI-Assisted Decision-Making & Automation | CHI

Rethinking Misinformation: A Holistic Community Model for Youth Resilience through Socioemotional Learning and Sociocultural Design
With the growing prevalence of online mis/disinformation encountered by children, digital media literacy has become an urgent concern. Existing research emphasizes cognitive models, focusing on individual reasoning and specific quantitative criteria to classify people’s information literacy. However, critics argue that focusing solely on a cognitive approach neglects the social, emotional, and cultural contexts that shape how mis/disinformation is created and spread. In this study, we expand beyond the cognitive model by examining socioemotional learning (SEL) and sociocultural (SC) perspectives. To explore how children conceptualize mis/disinformation through these lenses, we conducted 26 co-design workshops with children ages 6–11 over a 2.5-year period. Our findings highlight children’s awareness of emotional responses, peer pressure, financial incentives, and the importance of community support. These insights contribute to HCI by foregrounding the need for design approaches that integrate cognitive, SEL, and SC dimensions. We present an integrated framework to inform how community groups can support children and design recommendations that address the growing sophistication of mis/disinformation.
2026 | Jason Yip et al. | University of Washington | Misinformation & Fact-Checking; Youth Online Safety & Privacy; Digital Parenting & Screen Time Management | CHI

The Toronto Water Atlas: Staging Encounters With Nature Through Design
Recent work in Human-Computer Interaction (HCI) and Science and Technology Studies (STS) argues that improving our relationship with nature demands designing for nature as plural and multiple. This means moving beyond approaches that impose one version of what nature is, toward sustaining its different enactments and relationships. This paper examines how such ontological multiplicity can be sustained through design. Drawing on Mol’s ontological multiplicity and Stengers’ scene-setting, we present the Toronto Water Atlas, a seven-month design project involving artists, scientists, and community members. Through workshops and coworking sessions, we experimented with various design choices that surfaced and sustained multiple enactments of water, in direct contrast with singular formulations of water prevalent in informatics used for policy and planning. Through this work, we draw attention to the infrastructural biases that restrict ontological multiplicity, and demonstrate how design can more deliberately sustain diverse water ontologies by staging conditions for partial and relational enactments.
2026 | Taneea S Agrawaal et al. | University of Toronto | Human-Nature Relationships (More-than-Human Design); Sustainable HCI | CHI

Like, Comment & Caption: A Decade of Social Media Video Caption Research (2015–2025)
As video has become the dominant mode of content on platforms such as YouTube, TikTok, and Instagram, captioning has emerged as a critical factor for accessibility, engagement, and visibility. While prior studies have examined different types of social media video captions or communities' captioning usage, a systematic synthesis has not been undertaken, leading to the risk of proposing interventions that overlook core platform constraints or miss critical accessibility needs. This paper reviews 36 peer-reviewed papers published between 2015 and 2025 across fields such as Human-Computer Interaction (HCI), accessibility, media studies, education, and language learning. We note that captions operate as collective infrastructure co-produced by viewers, creators, and platforms. Deaf and Hard of Hearing (DHH), neurodivergent, and multilingual viewers depend on captions and increasingly expect mechanisms for feedback, while creators face inadequate tool support. Building on these insights, we propose the framework of Participatory Captioning and suggest design implications, highlighting future directions for social media video caption research.
2026 | Huong Nguyen et al. | New Jersey Institute of Technology | Voice Accessibility; Social Platform Design & User Behavior; Deaf & Hard-of-Hearing Support (Captions, Sign Language, Vibration) | CHI

Clay ARTools: Precise Machine Toolpath Editing for Clay 3D Printing With Craft-Inspired Direct Manipulation Tools in AR
Ceramics practice is an embodied activity where creators use manual tools in unique ways to shape physical material. Clay 3D printing uses the same material as manual ceramics craft, enabling new opportunities for form and texture by precisely controlling the 3D printing toolpath. However, current clay 3D printing design workflows require developing forms through digital software rather than tool-based making. We present Clay ARTools, an augmented reality (AR) system for designing clay 3D printed vessels. We developed Clay ARTools in collaboration with a professional ceramicist to create AR toolpath editing operations that reference manual use of ceramic tools. Through the design and fabrication of 3D-printed clay artifacts, we demonstrate how AR ceramic tools enable precise and controllable modifications of the toolpath, from the overall form down to individual toolpath points. We demonstrate how extending physical tool metaphors with digital representations and numerical precision enables craft-like interaction with CAM-based design techniques.
2026 | Emilie Yu et al. | University of California, Santa Barbara | Desktop 3D Printing & Personal Fabrication; AR Navigation & Context Awareness; Physical-Digital Hybrid Interaction | CHI

BikeButler: A Personalized, Context-sensitive Bike Routing Tool using Open Data and VLM-based Analyses of Street View Imagery
Urban cycling benefits personal wellbeing, public health, and global sustainability. While current tools such as Google and Apple Maps provide bike route recommendations, they do not account for a person’s dynamic context (e.g., commuting, recreation). We introduce BikeButler, a personalized, context-sensitive bicycle route generation tool that enables users to generate, compare, virtually preview, and iteratively customize bike routes via custom profiles that encode seven bikeability features, including bike lane existence, slope, vegetation, and surface quality. BikeButler fuses data from OpenStreetMap, open government data, and a custom VLM-based analysis of Street View images. To design BikeButler, we employed a human-centered, iterative approach starting with formative interviews and culminating in a user study (N=16). Our findings demonstrate that bike routing preferences change as a function of context, that BikeButler enables users to quickly create and iterate context-sensitive routes, and that generated routes differ significantly from Google Maps bike routing, reinforcing the importance of personalization.
2026 | Jared Hwang et al. | University of Washington | Micromobility (E-bike, E-scooter) Interaction; Smart Cities & Urban Sensing; Public Transit & Trip Planning | CHI

Promptimizer: User-Led Prompt Optimization for Personal Content Classification
While LLMs now enable social media users to create content classifiers easily through natural language, automatic prompt optimization techniques are often necessary to create performant classifiers. However, such techniques can fail to consider how users want to evolve their classifiers over the course of usage, including desiring to steer them in different ways during initialization and refinement. We introduce a user-centered prompt optimization technique, Promptimizer, that maintains high performance and ease-of-use but additionally (1) allows for user input into the optimization process and (2) produces final prompts that are interpretable. A lab experiment (n=16) found that users significantly preferred Promptimizer’s human-in-the-loop optimization over a fully automatic approach. We also integrate Promptimizer into Puffin, a tool to support YouTube content creators in creating and maintaining personal classifiers to manage their comments. Over a 3-week deployment with 10 creators, participants successfully created diverse filters to better understand their audiences and protect their communities.
2026 | Leijie Wang et al. | University of Washington | Human-LLM Collaboration; AI-Assisted Decision-Making & Automation; Recommender System UX | CHI

Ability Heuristics for Conducting Accessibility Inspections
The accessibility of interactive technologies is often evaluated using checklists that are low-level, numerous, and platform-specific. Such checklists are typically used by accessibility experts, leaving everyday designers and developers with little support for assessing their own interfaces. To make accessibility evaluations easier to conduct, we devised a set of nine "ability heuristics" that prompt designers to engage with accessibility throughout the design process. We empirically evaluated these ability heuristics with 37 design students, comparing them to usability heuristics and WCAG. The ability heuristics emphasized the quality of accessibility features compared to the other methods, and surfaced issues that were more broadly dispersed across disability groups. Further, the students found the heuristics were as easy to use as the alternative methods. We argue that the heuristics help to move beyond binary notions of accessibility, pushing designers to consider the quality of features across diverse disabilities and the range of abilities within.
2026 | Claire L. Mitchell et al. | University of Washington | Universal & Inclusive Design; Cognitive Impairment & Neurodiversity (Autism, ADHD, Dyslexia); Participatory Design | CHI

TraceRing: Touchpad-like Pointing with a Single IMU Ring through Personalized Learning
Achieving touchpad-like pointing with a single IMU ring is highly desirable for portable and wearable interaction, yet challenging due to incomplete motion data and significant user variability. We present TraceRing, a finger-worn IMU system that enables precise two-dimensional cursor control. To address the limitations of generic end-to-end models, we propose a personalized training framework that learns user-specific representations through joint multi-task and contrastive learning, while dynamically selecting the most suitable expert model. This approach enables personalization without requiring per-user fine-tuning, and reduces velocity prediction error by 33.9% over state-of-the-art baselines. Furthermore, a real-time study shows it delivers speed and accuracy far exceeding those of AirMouse (2.26 s vs. 3.01 s in average task completion time). These results demonstrate TraceRing as a portable and comfortable alternative for mobile computing and AR interaction applications.
2026 | Zhe He et al. | Tsinghua University | Haptic Wearables; Hand Gesture Recognition; Mobile Augmented Reality | CHI

The Engagement-Prolonging Designs Teens Encounter on Very Large Online Platforms
In the attention economy, online platforms are incentivized to design products that maximize user engagement, even when such practices conflict with users' best interests. We conducted a structured content analysis of all Very Large Online Platforms (VLOPs) to identify the designs these influential apps and sites use to capture attention and extend engagement. Specifically, we conducted this analysis posing as a teenager to identify the designs that young people are exposed to. We find that VLOPs use four strategies to extend teens' use: pressuring, enticing, trapping, and lulling them into spending more time online. We report on a hierarchical taxonomy organizing the 63 designs that fall under these categories. Applying this taxonomy to all 17 VLOPs, we identify 583 instances of engagement-prolonging designs, with social media platforms using twice as many as other VLOPs. We present three vignettes illustrating how these designs reinforce one another in practice. We further contribute a graphical dataset of videos illustrating these features in the wild.
2026 | Yixin Chen et al. | University of Washington | Dark Patterns Recognition; Cyberbullying & Online Harassment; Youth Online Safety & Privacy | CHI

Towards Understanding Children’s Collaborative Interaction Patterns in Child-AI Co-creative Interfaces
Children are increasingly using generative AI for co-creative activities, such as storytelling. While co-creativity is inherently about collaboration between children and AI, little is known about how children naturally engage, respond, and negotiate collaboration with AI. To address this gap, we conducted a participatory design study with children (ages 8–13) to examine the roles children and AI take and the strategies children use to align AI’s output with their intent. Our findings introduce four novel child–AI collaboration profiles. We found that children were open to technical AI refinements (e.g., adding details to their drawings) as scaffolds for developing drawing skills, but resisted conceptual transformations (e.g., changing objects) that altered their original ideas. We introduce the Child-Centered Co-creative AI (CCAI, “Kai”) framework, grounded in children’s natural collaborative behaviors during co-creation with AI, to inform the design of future child–AI co-creativity interfaces.
2026 | Francesca Fusco et al. | SUPSI | Generative AI (Text, Image, Music, Video); Children's AI Literacy & Data Literacy; Participatory Design | CHI

Cocoa: Co-Planning and Co-Execution with AI Agents
As AI agents take on increasingly long-running tasks involving sophisticated planning and execution, there is a corresponding need for novel interaction designs that enable deeper human-agent collaboration. However, most prior works leverage human interaction to fix "autonomous" workflows that have yet to become fully autonomous or rigidly treat planning and execution as separate stages. Based on a formative study with 9 researchers using AI to support their work, we propose a design that affords greater flexibility in collaboration, so that users can 1) delegate agency to the user or agent via a collaborative plan where individual steps can be assigned; and 2) interleave planning and execution so that plans can adjust after partial execution. We introduce Cocoa, a system that takes design inspiration from computational notebooks to support complex research tasks. A lab study (n=16) found that Cocoa enabled steerability without sacrificing ease-of-use, and a week-long field deployment (n=7) showed how researchers collaborated with Cocoa to accomplish real-world tasks.
2026 | K. J. Kevin Feng et al. | University of Washington | Human-LLM Collaboration; Prototyping & User Testing; Computational Methods in HCI | CHI

Automatic Synthesis of Visualization Design Knowledge Bases
Formal representations of the visualization design space, such as knowledge bases and graphs, consolidate design practices into a shared resource and enable automated reasoning and interpretable design recommendations. However, prior approaches typically depend on fixed, manually authored rules, making it difficult to build novel representations or extend them for different visualization domains. Instead, we propose data-driven methods that automatically synthesize visualization design knowledge bases. Specifically, our methods (1) extract candidate design features from a visualization corpus, (2) select features forward and backward, and (3) render the final knowledge base. In our benchmark evaluation compared to Draco 2, our synthesized knowledge base offers general and interpretable design features and improves the accuracy of predicting effective designs by 1–15% in varied training and test sets. When we apply our approach to genomics visualization, the synthesized knowledge base includes sensible features with accuracy up to 97%, demonstrating the applicability of our approach to other visualization domains.
2026 | Hyeok Kim et al. | University of Washington | Interactive Data Visualization; Medical & Scientific Data Visualization; Computational Methods in HCI | CHI

Depictions of Privacy Invasion and Surveillance in Artworks and Potential Lessons For Privacy Communication
User-facing communication about privacy (e.g., privacy policies, privacy tools' user interfaces) is frequently ignored and often ineffective. In contrast to these arguably staid interfaces, artworks often focus on provocation, engagement, and critical interpretation. For decades, artists have created privacy art: artistic media in galleries relating to the surveillance and privacy of individuals. What are artists saying about privacy, and how? Crucially, what lessons might they have for designing privacy-focused user interfaces? To this end, we compiled over 800 privacy artworks, qualitatively analyzing a sample. Common topics spanned artistic media (from paintings to immersive installations) and eras. Artworks built upon familiar concepts (e.g., cameras, homes) to speculate on society's future and present personal information (e.g., artist, viewer, public). We discuss lessons for making non-artistic privacy communication more engaging and powerful through directing attention (e.g., lighting, collage) and setting a tone (e.g., unsettling, fun, mundane).
2026 | Tess Eschebach et al. | University of Chicago | Privacy by Design & User Control; Digital Art Installations & Interactive Performance; Technology Ethics & Critical HCI | CHI

Generative AI and Creative Mediums for Youth’s Emotion Regulation: An Interview Study with Clinicians
Emotion regulation (ER) is essential to youth well-being, and cognitive-behavioral therapy (CBT) is an established approach for building ER skills. Clinicians often use creative mediums such as visuals and narratives to support ER through CBT, yet access and personalization remain limited. Generative AI (GenAI) shows promise for addressing these limitations, but its benefits and risks in youth ER remain underexplored, underscoring the need for expert perspectives. We interviewed 20 ER specialists (psychotherapists, art therapists, and psychiatrists) using a GenAI technological probe that generated CBT-based visuals and narratives. Clinicians highlighted GenAI’s potential as a “bridge” to help youth concretely identify and express emotions, practice personalized coping skills, and mediate ER conversations between home and clinics. They also cautioned that the vividness and unpredictability of GenAI outputs may trigger trauma or reinforce maladaptive thinking. We propose psychologically grounded design implications for GenAI to foster safe, engaging youth ER as a foundation for lifelong well-being.
2026 | Daeun Yoo et al. | University of Washington | Generative AI (Text, Image, Music, Video); Mental Health Apps & Online Support Communities; Mental Health Technology for Youth | CHI

Nonvisual Support for Understanding and Reasoning about Data Structures
Blind and visually impaired (BVI) computer science students face systematic barriers when learning data structures: current accessibility approaches typically translate diagrams into alternative text, focusing on visual appearance rather than preserving the underlying structure essential for conceptual understanding. More accessible alternatives often do not scale in complexity, cost to produce, or both. Motivated by a recent shift to tools for creating visual diagrams from code, we propose a solution that automatically creates accessible representations from structural information about diagrams. Based on a Wizard-of-Oz study, we derive design requirements for an automated system, Arboretum, that compiles text-based diagram specifications into three synchronized nonvisual formats: tabular, navigable, and tactile. Our evaluation with BVI users highlights the strength of tactile graphics for complex tasks such as binary search; the benefits of offering multiple, complementary nonvisual representations; and limitations of existing digital navigation patterns for structural reasoning. This work reframes access to data structures by preserving their structural properties. The solution is a practical system to advance accessible CS education.
2026 | Brianna L Wimer et al. | University of Notre Dame | Visual Impairment Technologies (Screen Readers, Tactile Graphics, Braille); Motor Impairment Assistive Input Technologies; Special Education Technology | CHI

Evaluating Behavior Change Interventions for Responsible Data Science
The adoption of responsible data science (RDS) practices in AI development remains inadequate despite growing awareness of algorithmic harms. One measure of success is observing practitioners’ behaviors, namely, their adoption of responsible sequences of behaviors in their model building practice. This paper evaluates two interventions for changing problematic behaviors, (i) a motivational priming intervention that introduces short, relevant stories, and (ii) a fairness toolkit (Aequitas), to bridge the gap between ethical principles and practitioner behavior. Through a mixed-methods study with data scientists (N=12), we assess how these interventions influence fairness practices, model outcomes, and cognitive load across credit risk and income classification tasks. Results indicate that both interventions were efficient in promoting responsible data science behaviors and improving the delivered models’ fairness, while maintaining baseline accuracy. We argue that effective behavior change interventions must balance technical tooling with motivational scaffolding to provide actionable insights for fostering sustainable RDS practices.
2026 | Ziwei Dong et al. | Emory University | AI Ethics, Fairness & Accountability; Explainable AI (XAI); Behavior Change & Reflection Technology | CHI