ConverSearch: Supporting Experts in Human Behavior Analysis of Conversational Videos with a Multimodal Scene Search Tool (TIIS)
No abstract available.
2026 | Riku Arakawa et al. | Carnegie Mellon University | Human Pose & Activity Recognition; Conversational Search & QA Systems; Computational Methods in HCI | IUI
Mapping the Design Space of User Experience for Computer Use Agents
Large language model (LLM)-based computer use agents execute user commands by interacting with available UI elements, but little is known about how users want to interact with these agents or what design factors matter for their user experience (UX). We conducted a two-phase study to map the UX design space for computer use agents. In Phase 1, we reviewed existing systems to develop a taxonomy of UX considerations, then refined it through interviews with eight UX and AI practitioners. The resulting taxonomy included categories such as user prompts, explainability, user control, and users’ mental models, with corresponding subcategories and example design features. In Phase 2, we ran a Wizard-of-Oz study with 20 participants, where a researcher acted as a web-based computer use agent and probed user reactions during normal, error-prone, and risky execution. We used the findings to validate the taxonomy from Phase 1 and deepen our understanding of the design space by identifying the connections between design areas and divergence in user needs and scenarios. Our taxonomy and empirical insights provide a map for developers to consider different aspects of user experience in computer use agent design and to situate their designs within users' diverse needs and scenarios.
2026 | Ruijia Cheng et al. | Apple | Human-LLM Collaboration; Explainable AI (XAI); AI-Assisted Decision-Making & Automation | IUI
Visual Lyrics: Generating Animated Text for Music Lyric Videos with an Augmented Text Editor
Animated lyric videos transform song lyrics into dynamic visual experiences, offering a powerful medium for artistic expression and audience engagement. However, creating these videos is challenging, requiring expertise in audio, typography, graphic design, and animation, making it inaccessible to novices. To address this challenge, we introduce Visual Lyrics, a proof-of-concept system for generating animated lyric videos controlled with an augmented text editor interface. We examined existing lyric videos to distill a taxonomy and design guidelines, informing the design of Visual Lyrics. Our key insight is a multimodal music analysis pipeline that is based on the taxonomy and leverages LLMs' strong natural language understanding and code generation capabilities to synthesize creative and semantically meaningful animations. We collected a dataset of over 300 code-driven creative text animations to serve as inspiration for our LLM-driven pipeline, which we open source. In a user study, Visual Lyrics enabled novices to easily create high-quality animated lyric videos with high ratings of enjoyment, inspiration, and exploration.
2026 | David Chuan-En Lin et al. | Carnegie Mellon University | AI-Assisted Creative Writing; Video Production & Editing; Creative Collaboration & Feedback Systems | IUI
SusBench: An Online Benchmark for Evaluating Dark Pattern Susceptibility of Computer-Use Agents
As LLM-based computer-use agents (CUAs) begin to autonomously interact with real-world interfaces, understanding their vulnerability to manipulative interface designs becomes increasingly critical. We introduce SusBench, an online benchmark for evaluating the susceptibility of CUAs to UI dark patterns, designs that aim to manipulate or deceive users into taking unintentional actions. Drawing nine common dark pattern types from existing taxonomies, we developed a method for constructing believable dark patterns on real-world consumer websites through code injections, and designed 313 evaluation tasks across 55 websites. Our study with 29 participants showed that humans perceived our dark pattern injections to be highly realistic, with the vast majority of participants not noticing that these had been injected by the research team. We evaluated five state-of-the-art CUAs on the benchmark. We found that both human participants and agents are particularly susceptible to the dark patterns of Preselection, Trick Wording, and Hidden Information, while being resilient to other overt dark patterns. Our findings inform the development of more trustworthy CUAs, their use as potential human proxies in evaluating deceptive designs, and the regulation of an online environment increasingly navigated by autonomous agents.
2026 | Longjie Guo et al. | University of Washington | Dark Patterns Recognition; Explainable AI (XAI); Algorithmic Transparency & Auditability | IUI
Developer Interaction Patterns with Proactive AI: A Five-Day Field Study
Current in-IDE AI coding tools typically rely on time-consuming manual prompting and context management, whereas proactive alternatives that anticipate developer needs without explicit invocation remain underexplored. Understanding when humans are receptive to such proactive AI assistance during their daily work remains an open question in human-AI interaction research. We address this gap through a field study of proactive AI assistance in professional developer workflows. We present a five-day in-the-wild study with 15 developers who interacted with a proactive feature of an AI assistant integrated into a production-grade IDE that offers code quality suggestions based on in-IDE developer activity. We examined 229 AI interventions across 5,732 interaction points to understand how proactive suggestions are received across workflow stages, how developers experience them, and their perceived impact. Our findings reveal systematic patterns in human receptivity to proactive suggestions: interventions at workflow boundaries (e.g., post-commit) achieved 52% engagement rates, while mid-task interventions (e.g., on declined edit) were dismissed 62% of the time. Notably, well-timed proactive suggestions required significantly less interpretation time than reactive suggestions (45.4s versus 101.4s, W = 109.00, r = 0.533, p = 0.0016), indicating enhanced cognitive alignment. This study provides actionable implications for designing proactive coding assistants, including how to time interventions, align them with developer context, and strike a balance between AI agency and user control in production IDEs.
2026 | Nadine Kuo et al. | JetBrains | AI-Assisted Decision-Making & Automation; Generative AI (Text, Image, Music, Video); Human-LLM Collaboration | IUI
The Behavioral Fabric of LLM-Powered GUI Agents: Human Values and Interaction Outcomes
Large Language Model (LLM)-powered web GUI agents are increasingly automating everyday online tasks. Despite their popularity, little is known about how users' preferences and values impact agents' reasoning and behavior. In this work, we investigate how both explicit and implicit user preferences, as well as the underlying user values, influence agent decision-making and action trajectories. We built a controlled testbed of 14 common interactive web tasks, spanning shopping, travel, dining, and housing, each replicated from real websites and integrated with a low-fidelity LLM-based recommender system. We injected 12 human preferences and values as personas into four state-of-the-art agents and systematically analyzed their task behaviors. Our results show that preference and value-infused prompts consistently guided agents toward outcomes that reflected these preferences and values. While the absence of user preference or value guidance led agents to exhibit a strong efficiency bias and employ shortest-path strategies, their presence steered agents' behavior trajectories through the greater use of corresponding filters and interactive web features. Despite their influence, dominant interface cues, such as discounts and advertisements, frequently overrode these effects, shortening the agents' action trajectories and inducing rationalizations that masked rather than reflected value-consistent reasoning. The contributions of this paper are twofold: (1) an open-source testbed for studying the influence of values in agent behaviors, and (2) an empirical investigation of how user preferences and values shape web agent behaviors.
2026 | Simret Araya Gebreegziabher et al. | University of Notre Dame | Human-LLM Collaboration; AI-Assisted Decision-Making & Automation; AI Ethics, Fairness & Accountability | IUI
Auditorily Embodied Conversational Agents: Effects of Spatialization and Situated Audio Cues on Presence and Social Perception
Embodiment can enhance conversational agents, such as increasing their perceived presence. This is typically achieved through visual representations of a virtual body; however, visual modalities are not always available, such as when users interact with agents using headphones or display-less glasses. In this work, we explore auditory embodiment. By introducing auditory cues of bodily presence (spatially localized voice and situated Foley audio from environmental interactions), we investigate how audio alone can convey embodiment and influence perceptions of a conversational agent. We conducted a 2 (spatialization: monaural vs. spatialized) × 2 (Foley: none vs. Foley) within-subjects study, where participants (n=24) engaged in conversations with agents. Our results show that spatialization and Foley increase co-presence, but reduce users’ perceptions of the agent’s attention and other social attributes.
2026 | Yi Fei Cheng et al. | Carnegie Mellon University | Affective Human-Computer Dialogue; Spatial Audio & 3D Sound; Affective Feedback & Emotion Regulation Interfaces | CHI
Black LLMirror: User (Self) Perceptions in Black American English Interactions with LLMs
The increasing personalization of LLMs to users’ language style raises both excitement and concerns for minority users such as Black American English (BAE) speakers. Yet, previous work has predominantly focused on user perceptions of out-of-context BAE statements by LLMs rather than naturalistic multi-turn interactions, and has ignored such systems’ effects on users’ self-perception. In this work, we examine the effects that multi-turn interactions with speech and text BAE-producing LLMs have on BAE speakers’ perceptions of the LLM and of themselves. We observe a significant change in participant self-esteem following the interactions, and notable qualitative differences between BAE-LLM and Standard American English (SAE) LLM interactions. We also observe significant effects of BAE usage on user perception of the model within speech-based interactions. Our findings suggest that the effects of BAE usage by an LLM agent on model- and self-perception among BAE-speaking users are complex and widely varied.
2026 | Mikayla Campbell et al. | Carnegie Mellon University | Human-LLM Collaboration; AI Ethics, Fairness & Accountability; Algorithmic Fairness & Bias | CHI
Values Across Contexts: Understanding How Older Adults Enact What Matters Through Technology
As populations age and technology becomes more pervasive, understanding the alignment between older adults' values and technology design is paramount. More research is needed to understand how older adults’ living contexts shape their values and the use of technology. To address this, through a multi-context study, we explored how values differ for older adults and how their context of living might influence the adoption and use of technology. We conducted 22 semi-structured interviews with older adults in various residential contexts. We show that older adults tend to prioritize the same core values across living contexts, yet how they express values in each context differs. Technology can amplify or inhibit key values. We describe implications for context-responsive technology and design for continuity, to allow older adults to continually uphold important values through technology use.
2026 | Hugo Simão et al. | Universidade de Lisboa | Aging-Friendly Technology Design; Aging-in-Place Assistance Systems | CHI
Beyond Riding: Passenger Engagement with Driver Labor through Gamified Interactions
Modern cities across the globe increasingly rely on ridehail services for on-demand transportation and mobility. But for drivers, such marketed affordances give rise to hidden burdens and vulnerabilities that evade the oversight of consumers and regulators. To effectively advance worker protections and motivate more socially responsible practices, consumers must understand the realistic labor, logistics and costs involved with ridehail driving. Through nine workshops with 19 drivers and 15 passengers, we explore the potential for gamified in-ride interactions to facilitate engagement with real (and lived) driver experiences, surfacing passenger knowledge gaps around latent ridehail conditions, prompting reflection and shifts in perception of their relative power and consumption behaviors, highlighting drivers' preferences for creating more immersive and contextualized service experiences, and identifying design opportunities for safe and appropriate passenger-driver interactions that motivate solidarity. In sum, we advance conceptual understandings of in-ride social and managerial relations, demonstrate potential for citizen-led advocacy in algorithmically-managed labor, and offer design guidelines for more human-centered workplace technologies.
2026 | Jane Hsieh et al. | Carnegie Mellon University | Ridesharing Platforms; Gamification Design; Behavior Change & Reflection Technology | CHI
WELDAR: Augmenting Live Hands-On Training with In-Situ Guidance for Novice Learners
Extended Reality (XR) systems for physical skill training have largely emphasized simulation rather than real-time in-situ instruction. We present WeldAR, an Augmented Reality (AR) system with five learning modules that overlays real-time guidance during live welding using a headset integrated into a welding helmet and a torch attachment. We conducted an in-situ within-subjects study with 24 novices, comparing AR guidance to video instruction for live welding across practice and unassisted tests. AR improved performance in both assisted practice and unassisted tests, primarily driven by gains in travel speed and work angle. By offering real-time feedback on four performance measures, AR supported novices in carrying embodied knowledge into independent tasks. Our contributions include: (1) WeldAR for in-situ physical skill training; (2) empirical evidence that AR enhances composite welding performance and key physical skills; and (3) implications for the development of AR systems that support in-situ, embodied skill training in welding and related trades.
2026 | Chuhan (Franklin) Xu et al. | Carnegie Mellon University | AR Navigation & Context Awareness; Mixed Reality Workspaces; Surgical Assistance & Medical Training | CHI
TALES: A Taxonomy and Analysis of Cultural Representations in LLM-generated Stories
Millions of users across the globe turn to AI chatbots for their creative needs, inviting widespread interest in understanding how they represent diverse cultures. However, evaluating cultural representations in open-ended tasks remains challenging and underexplored. In this work, we present TALES, an evaluation of cultural misrepresentations in LLM-generated stories for diverse Indian cultural identities. First, we develop TALES-Tax, a taxonomy of cultural misrepresentations by collating insights from participants with lived experiences in India through focus groups (N=9) and individual surveys (N=15). Using TALES-Tax, we evaluate 6 models through a large-scale annotation study spanning 2,925 annotations from 108 annotators with lived experience and native language proficiency from across 71 regions in India and 14 languages. Concerningly, we find that 88% of the generated stories contain misrepresentations, and such errors are more prevalent in mid- and low-resourced languages and stories based in peri-urban regions in India. We also transform the annotations into TALES-QA, a standalone question bank to evaluate the cultural knowledge of models.
2026 | Kirti Bhagat et al. | Indian Institute of Science | Human-LLM Collaboration; AI Ethics, Fairness & Accountability; Low-Resource Languages & Digital Inclusion | CHI
Not Everyone Wins with LLMs: Behavioral Patterns and Pedagogical Implications for AI Literacy in Programmatic Data Science
LLMs promise to democratize technical work in complex domains like programmatic data analysis, but not everyone benefits equally. We study how students with varied experiences use LLMs to complete Python-based data analysis in computational notebooks in a graduate course. Drawing on homework logs, recordings, and surveys from 36 students, we ask: Which experience matters most, and how does it shape AI use? Our mixed-methods analysis shows that technical experience – not AI familiarity or communication skills – remains a significant predictor of success. Students also vary widely in how they leverage LLMs, struggling at stages of forming intent, expressing inputs, interpreting outputs, and assessing results. We identify success and failure behaviors, such as providing context or decomposing prompts, that distinguish effective use. These findings inform AI literacy interventions, highlighting that lightweight demonstrations improve surface fluency but are insufficient; deeper training and scaffolds are needed to cultivate resilient AI use skills.
2026 | Qianou Ma et al. | Carnegie Mellon University | Human-LLM Collaboration; Programming Education & Computational Thinking; Intelligent Tutoring Systems & Learning Analytics | CHI
My Money, Your Name: Challenges and Workarounds in ID-Required Mobile Money in East Africa
Mobile money (MoMo) services have increased access to financial services in low- and middle-income countries (LMICs). However, requirements to register SIM cards with a government-issued identification have left around 18% of users, most without IDs, banking under a third party’s name. Through interviews with 72 urban and rural residents in Kenya and Tanzania, this study provides the first in-depth assessment of how third-party SIM cards are acquired and the challenges and workarounds that arise when using them for MoMo. We document how third-party SIM users use various intermediaries (friends, family, agents, and strangers) to access services and the effects of ID and account misuse by both third-party SIM users and intermediaries. We further outline the personal and systemic challenges that lead to the lack of IDs for SIM registration and discuss how digitization, now underway in both Kenya and Tanzania, should be approached to effectively address these barriers.
2026 | Edith T Luhanga et al. | Carnegie Mellon University Africa | Mobile Finance in Developing Countries; Low-Resource Languages & Digital Inclusion; Developing Countries & HCI for Development (HCI4D) | CHI
Understanding Nature Engagement Experiences of Blind People
Nature plays a crucial role in human health and well-being, but little is known about how blind people experience and relate to it. We conducted a survey of nature relatedness with blind (N=20) and sighted (N=20) participants, along with in-depth interviews with 16 blind participants, to examine how blind people engage with nature and the factors shaping this engagement. Our survey results revealed lower levels of nature relatedness among blind participants compared to sighted peers. Our interview study further highlighted: 1) current practices and challenges of nature engagement, 2) attitudes and values that shape engagement, and 3) expectations for assistive technologies that support safe and meaningful engagement. We also provide design implications to guide future technologies that support nature engagement for blind people. Overall, our findings illustrate how blind people experience nature beyond vision and lay a foundation for technologies that support inclusive nature engagement.
2026 | Mengjie Tang et al. | Southeast University | Visual Impairment Technologies (Screen Readers, Tactile Graphics, Braille); Universal & Inclusive Design; Human-Nature Relationships (More-than-Human Design) | CHI
Privy: Envisioning and Mitigating Privacy Risks for Consumer-facing AI Product Concepts
AI creates and exacerbates privacy risks, yet practitioners lack effective resources to identify and mitigate these risks. We present Privy, a tool that guides practitioners without privacy expertise through structured privacy impact assessments to: (i) identify relevant risks in novel AI product concepts, and (ii) propose appropriate mitigations. Privy was shaped by a formative study with 11 practitioners, which informed two versions: one LLM-powered, the other template-based. We evaluated these two versions of Privy through a between-subjects, controlled study with 24 separate practitioners, whose assessments were reviewed by 13 independent privacy experts. Results show that Privy helps practitioners produce privacy assessments that experts deemed high quality: practitioners identified relevant risks and proposed appropriate mitigation strategies. These effects were augmented in the LLM-powered version. Practitioners themselves rated Privy as being useful and usable, and their feedback illustrates how it helps overcome long-standing awareness, motivation, and ability barriers in privacy work.
2026 | Hao-Ping (Hank) Lee et al. | Carnegie Mellon University | Explainable AI (XAI); Privacy by Design & User Control; Privacy Perception & Decision-Making | CHI
Evidotes: Integrating Scientific Evidence and Anecdotes to Support Uncertainties Triggered by Peer Health Posts
Peer health posts surface new uncertainties, such as questions and concerns for readers. Prior work, focused primarily on improving relevance and accuracy, fails to address users' diverse information needs and the emotions these posts trigger. Instead, we propose directly addressing these through information augmentation. We introduce Evidotes, an information support system that augments individual posts with relevant scientific and anecdotal information retrieved using three user-selectable lenses (dive deeper, focus on positivity, and big picture). In a mixed-methods study with 17 chronic illness patients, Evidotes improved self-reported information satisfaction (3.2 → 4.6) and reduced self-reported emotional cost (3.4 → 1.9) compared to participants' baseline browsing. Moreover, by co-presenting sources, Evidotes unlocked information symbiosis: anecdotes made research accessible and contextual, while research helped filter and generalize peer stories. Our work enables an effective integration of scientific evidence and human anecdotes to help users better manage health uncertainty.
2026 | Shreya Bali et al. | Carnegie Mellon University | Mental Health Apps & Online Support Communities; Chronic Disease Self-Management (Diabetes, Hypertension, etc.); Behavior Change & Reflection Technology | CHI
Sometimes You Need Facts, and Sometimes a Hug: Understanding Older Adults’ Preferences for Explanations in LLM-Based Conversational AI Systems
Designing Conversational AI systems to support older adults requires these systems to explain their behavior in ways that align with older adults’ preferences and context. While prior work has emphasized the importance of AI explainability in building user trust, relatively little is known about older adults’ requirements and perceptions of AI-generated explanations. To address this gap, we conducted an exploratory Speed Dating study with 23 older adults to understand their responses to contextually grounded AI explanations. Our findings reveal the highly context-dependent nature of explanations, shaped by conversational cues such as the content, tone, and framing of the explanation. We also found that explanations are often interpreted as interactive, multi-turn conversational exchanges with the AI, and can be helpful in calibrating urgency, guiding actionability, and providing insights into older adults’ daily lives for their family members. We conclude by discussing implications for designing context-sensitive and personalized explanations in Conversational AI systems.
2026 | Niharika Mathur et al. | Georgia Institute of Technology | Human-LLM Collaboration; Explainable AI (XAI); Aging-Friendly Technology Design | CHI
Code with Me or for Me? How Increasing AI Automation Transforms Developer Workflows
Developers now have access to a growing array of increasingly autonomous AI tools for software development. While many studies examine copilots that provide chat assistance or code completions, evaluations of coding agents, which can automatically write files and run code, still rely on static benchmarks. We present the first controlled study of developer interactions with coding agents, characterizing how more autonomous AI tools affect productivity and experience. We evaluate two leading copilot and agentic coding assistants, recruiting participants who regularly use the former. Our results show agents can assist developers in ways that surpass copilots (e.g., completing tasks humans may not have accomplished) and reduce the effort required to finish tasks. Yet challenges remain for broader adoption, including ensuring users adequately understand agent behaviors. Our findings reveal how workflows shift with coding agents and how interactions differ from copilots, motivating recommendations for researchers and highlighting challenges in adopting agentic systems.
2026 | Valerie Chen et al. | Carnegie Mellon University | Human-LLM Collaboration; AI-Assisted Decision-Making & Automation; Explainable AI (XAI) | CHI
How Does Delegation in Social Interaction Evolve Over Time? Navigation with a Robot for Blind People
Autonomy and independent navigation are vital to daily life but remain challenging for individuals with blindness. Robotic systems can enhance mobility and confidence by providing intelligent navigation assistance. However, fully autonomous systems may reduce users’ sense of control, even when they wish to remain actively involved. Although collaboration between user and robot has been recognized as important, little is known about how perceptions of this relationship change with repeated use. We present a repeated exposure study with six blind participants who interacted with a navigation-assistive robot in a real-world museum. Participants completed tasks such as navigating crowds, approaching lines, and encountering obstacles. Findings show that participants refined their strategies over time, developing clearer preferences about when to rely on the robot versus act independently. This work provides insights into how strategies and preferences evolve with repeated interaction and offers design implications for robots that adapt to user needs over time.
2026 | Rayna Hata et al. | Carnegie Mellon University | Robots in Education & Healthcare; Cognitive Impairment & Neurodiversity (Autism, ADHD, Dyslexia); Elderly Care & Dementia Support | CHI