"Pathways to the Metaverse": Exploring the User Experience Mechanisms Driving Technology Acceptance in Virtual Lab Visits with an LLM-powered AvatarMetaverse environments combined with large language models (LLMs) enable guided interaction through LLM-powered avatars that function as embodied conversational agents. In our study, we examined how scholars interact with an LLM-powered avatar modeled after a real professor during a virtual reality (VR) tour of a research lab. As little is known about how metaverse characteristics shape the user experience (UX) mechanisms that drive acceptance of such technologies, we conducted a 2 (avatar realism: abstract vs. hyperrealistic) × 2 (immersion: desktop vs. headset-based VR) within-subjects study (N = 30), where academic participants engaged in a virtual lab tour guided by the professor avatar. We conducted path analyses on three conceptual models and, based on the results, proposed the Virtual Lab Acceptance Model (VLAM), which features an experiential path (where perceived immersion increases empathy towards the avatar and task enjoyment) and a rational path (where perceived realism increases avatar credibility and task confidence). Flow states amplify these pathways by strengthening task experiences. Task enjoyment is the strongest predictor of behavioral intention. These findings inform HCI research on metaverse characteristics to drive technology acceptance through UX mechanisms, yielding design implications for developing LLM-powered avatars for virtual labs.2026XTXinyi Tu et al.Aalto UniversitySocial & Collaborative VRIdentity & Avatars in XRHuman-LLM CollaborationIUI
Optimal Explanations: A Quantitative Model of Human Error in Causal Graph Interpretation
When Artificial Intelligence (AI) reasoning is explained via causal graphs for human oversight, the human-computer interface is the performance bottleneck for decision-supported actions. As explanations grow more complex, humans' interpretation ability degrades, resulting in ineffective oversight. This paper contributes a quantitative model of human causal reasoning bounds and demonstrates their utility for interpretable AI explanations. Our empirical contribution is a large-scale user study (n=170), allowing the quantification of the bounded human rationality for understanding causal explanations by measuring the impact of the causal complexity on human understanding. Our significant results reveal that users are bounded causal reasoners: while their decision time increases linearly with each added factor (0.65 seconds per node), our data suggests their decision errors increase exponentially. This indicates a cognitive bound on the complexity a user can effectively manage and grounds our framework for establishing a cognitively optimized complexity. Utilizing this evidence, we contribute a theoretical framework, formalizing the influence of explanation complexity on human interpretation error. Based on our empirical results, we introduce a novel method to systematically prune causal explanations to the point of optimal complexity, trading off the explanation's fidelity loss with interpretation error, resulting in an explanation with optimized complexity for human cognitive bounds for directed acyclic graphs with up to 4 relevant nodes and novice users. Our work thereby quantifies the fidelity-interpretability trade-off as a direct relationship between model complexity and interpretation error, providing the foundation for designing structure-aware, explainable AI interfaces, minimizing error for optimal human-AI collaboration.
2026 | Paul-David Zuercher et al. | University of Cambridge | Explainable AI (XAI) | AI-Assisted Decision-Making & Automation | Algorithmic Transparency & Auditability | IUI
“It’s Just a Wild, Wild West”: Harnessing Public Procurement as an AI Governance Mechanism
Public sector AI has the potential to harm citizens, with risks increasing as its use expands. Recent work positions public procurement as a way to shape public sector AI in line with public interests, using the state’s purchasing power to influence which AI systems are procured and under what conditions. This paper examines how this potential can be realised in practice by drawing on semi-structured interviews with UK and EU buyers, providers, and procurement experts. Our findings result in six promising procurement practices that enable the public sector to shape AI in line with public interests, alongside concrete mechanisms to support their uptake. Further, we find that AI-specific procurement approaches remain immature and systems often enter through informal channels with less scrutiny. We provide directions for both research and practice on how public procurement can be used as a governance mechanism for better aligning AI with public interests.
2026 | Anna Ida Hudig et al. | University of Cambridge | AI Ethics, Fairness & Accountability | Privacy by Design & User Control | Algorithmic Fairness & Bias | CHI
How Do We Evaluate Experiences in Immersive Environments?
How do we evaluate experiences in immersive environments? Despite decades of research in immersive technologies such as virtual reality, the field remains fragmented. Studies rely on overlapping constructs, heterogeneous instruments, and little agreement on what counts as immersive experience. To better understand this landscape, we conducted a bottom-up scoping review of 375 papers published in ACM CHI, UIST, VRST, SUI, IEEE VR, ISMAR, and TVCG. Our analysis reveals that evaluation practices are often domain- and purpose-specific, shaped more by local choices than by shared standards. Yet this diversity also points to new directions. Instead of multiplying instruments, researchers benefit from integrating and refining them into smarter measures. Rather than focusing only on system outputs, evaluations must center the user’s lived experience. Computational modeling offers opportunities to bridge signals across methods, but lasting progress requires open and sustainable evaluation practices that support comparability and reuse. Ultimately, our contribution is to map current practices and outline a forward-looking agenda for immersive experience research.
2026 | Xiang Li et al. | University of Cambridge | Immersion & Presence Research | Prototyping & User Testing | Computational Methods in HCI | CHI
Cost-Aware Bayesian Optimization for Interactive Devices
Deciding which idea is worth prototyping is a central concern in iterative design. A prototype should be produced when the expected improvement is high and the cost is low. However, this is hard to decide, because costs can vary drastically: a simple parameter tweak may take seconds, while fabricating hardware consumes material and energy. Such asymmetries can discourage a designer from exploring the design space. In this paper, we present an extension of cost-aware Bayesian optimization to account for diverse prototyping costs. The method builds on the power of Bayesian optimization and requires only a minimal modification to the acquisition function. The key idea is to use designer-estimated costs to guide sampling toward more cost-effective prototypes. In technical evaluations, the method achieved comparable utility to a cost-agnostic baseline while requiring only approximately 70 percent of the cost; under strict budgets, it outperformed the baseline threefold. A within-subjects study with 12 participants in a realistic joystick design task demonstrated similar benefits. These results show that accounting for prototyping costs can make Bayesian optimization more compatible with real-world design projects.
2026 | Thomas Langerak et al. | Aalto University | Prototyping & User Testing | Computational Methods in HCI | CHI
Unbounded: Object-Boundary Interactions in Mixed Reality
Boundaries such as walls, windows, and doors are ubiquitous in the physical world, yet their potential in mixed reality (MR) remains underexplored. We present Unbounded, a Research through Design inquiry into object–boundary interaction (OBI). Building on prior work, we articulate a design space aimed at providing a shared language for OBI. To demonstrate its potential, we design and implement eight examples across productivity and art exploration scenarios, showcasing how OBIs can enrich and reframe everyday interactions. We further engage with six MR experts in one-on-one feedback sessions, using the design space and examples as design probes. Their reflections broaden the conceptual scope of OBI, reveal new possibilities for how the framework may be applied, and highlight implications for future MR interaction design.
2026 | Zhuoyue Lyu et al. | University of Cambridge | Mixed Reality Workspaces | Physical-Digital Hybrid Interaction | Immersion & Presence Research | CHI
Who Does What? Archetypes of Roles Assigned to LLMs During Human-AI Decision-Making
LLMs are increasingly supporting decision-making across high-stakes domains, requiring critical reflection on the socio-technical factors that shape how humans and LLMs are assigned roles and interact during human-in-the-loop decision-making. This paper introduces the concept of human-LLM archetypes, defined as recurring socio-technical interaction patterns that structure the roles of humans and LLMs in collaborative decision-making. We describe 17 human-LLM archetypes derived from a scoping literature review and thematic analysis of 113 LLM-supported decision-making papers. Then, we evaluate these diverse archetypes across real-world clinical diagnostic cases to examine the potential effects of adopting distinct human-LLM archetypes on LLM outputs and decision outcomes. Finally, we present relevant tradeoffs and design choices across human-LLM archetypes, including decision control, social hierarchies, cognitive forcing strategies, and information requirements. Through our analysis, we show that selection of human-LLM interaction archetype can influence LLM outputs and decisions, bringing important risks and considerations for the designers of human-AI decision-making systems.
2026 | Shreya Chappidi et al. | University of Cambridge | Human-LLM Collaboration | AI-Assisted Decision-Making & Automation | AI Ethics, Fairness & Accountability | CHI
Supporting Effective Goal Setting with LLM-Based Chatbots
Each day, individuals set behavioral goals such as eating healthier, exercising regularly, or increasing productivity. While psychological frameworks (i.e., goal setting and implementation intentions) can be helpful, they often need structured external support, which interactive technologies can provide. We thus explored how large language model (LLM)-based chatbots can apply these frameworks to guide users in setting more effective goals. We conducted a preregistered randomized controlled experiment (N = 543) comparing chatbots with different combinations of three design features: guidance, suggestions, and feedback. We evaluated goal quality using subjective and objective measures. We found that, while guidance is already helpful, it is the addition of feedback that makes LLM-based chatbots effective in supporting participants’ goal setting. In contrast, adaptive suggestions were less effective. Altogether, our study shows how to design chatbots by operationalizing psychological frameworks to provide effective support for reaching behavioral goals.
2026 | Michel Schimpf et al. | Cambridge University | Human-LLM Collaboration | Behavior Change & Reflection Technology | Affective Human-Computer Dialogue | CHI
Breaking Negative Cycles: A Reflection-to-Action System for Adaptive Change
Breaking negative mental health cycles, including rumination and recurring regrets, requires reflection that translates awareness into behavioral change. Grounded in the Transtheoretical Model (TTM) and Gross’s Emotion Regulation (ER) Process Model, we examine how Technologies Supporting Self-Reflection (TSR) bridge reflection and action. In a 15-day in-the-wild study (N = 20), participants used a voice-based journaling system to capture regrets and wishes and engaged in WhatIf-Planning, a novel structured reflection module that integrates counterfactual thinking with if–then planning. Participants were randomized to either a free-form condition or Gross-guided condition, which maps the five processes of Gross’s ER model into explicit journaling prompts. We contribute (1) a unified reflection-to-action TSR system that operationalizes the Preparation stage of TTM to bridge Contemplation and Action, and (2) triangulated empirical evidence from an in-the-wild journaling study that operationalizes Gross’s Process Model, revealing effects on coping flexibility and emotion regulation in daily life. Results show significant pre–post improvements in coping flexibility across conditions, indicating adaptive self-regulation, with the Gross-guided group generating more counterfactual alternatives, articulating more concrete if–then action plans, and implementing more plans for self-driven change.
2026 | Minsol Michelle Kim et al. | Massachusetts Institute of Technology | Behavior Change & Reflection Technology | Affective Feedback & Emotion Regulation Interfaces | Mental Health Apps & Online Support Communities | CHI
"It didn’t feel right but I needed a job so desperately": Understanding People’s Emotions and Help Needs During ScamsOnline financial scams represent a long-standing and serious threat for which people seek help. We present a study to understand people’s in situ motivations for engaging with scams and the help needs they express before, during, and after encountering a scam. We identify the main emotions scammers exploited (e.g., fear, hope) and characterize how they did so. We examine factors -- such as financial insecurity and legal precarity -- which elevate people’s risk of engaging with specific scams and experiencing harm. We indicate when people sought help and describe their help-seeking needs and emotions at different stages of the scam. We discuss how these needs could be met through the design of contextually-specific prevention, diagnostic, mitigation, and recovery interventions.2026JCJake Chanenson et al.GooglePrivacy Perception & Decision-MakingOnline Harassment & Counter-ToolsPrivacy by Design & User ControlCHI
The Limits of Stakeholder Participation in Safety-Critical Contexts: Lessons from Air Traffic Control
Calls for participatory AI development often assume that stakeholders can and should substantially shape a system's design. However, this agency may be constrained by competing demands, e.g., safety-related ones. We explore this tension through a case study in Air Traffic Control (ATC) system development. Interviews with ATC operators and a focus group including R&D staff uncovered that operators’ input was confined to small changes, with major decisions made through opaque processes. Safety-related considerations often limited how operator input could be incorporated. Importantly, operators acknowledged that safety should take priority but called for more transparency over decision-making processes and the factors considered thereby. Our findings highlight how general calls for stakeholder empowerment can contradict safety-critical (and other) requirements. We show the importance of engaging broad perspectives to explore conflicting demands before aligning/prioritising these in the application context. We further outline implications for participatory practice relevant for responsible AI and HCI communities.
2026 | Emma Marlene Kallina et al. | UA Ruhr University Duisburg-Essen | AI-Assisted Decision-Making & Automation | Participatory Design | Research Ethics & Open Science | CHI
Human-AI Interaction for Time-Critical Sensemaking in Missing Persons Investigations
Every year an estimated 200,000 people go missing in the UK alone. Missing persons investigations involve challenging time-critical sensemaking tasks based on fragmented data sources. This paper describes a mixed-methods participatory study evaluating data science and AI-driven techniques (summarisation, fact extraction, and data visualisation) for supporting these investigations as part of a human-centered workflow. A series of human-AI interfaces were iteratively designed and tested with search officers and domain experts at Police Scotland. Based on findings, we describe: (1) user and information needs for missing persons investigations; (2) insights on the benefits and challenges of applying LLM-based techniques in high-risk contexts; and (3) lessons for integrating AI for sensemaking tasks in policing more broadly. We highlight that in high-stakes contexts, where accuracy and context-sensitivity are paramount, AI techniques must be balanced with other approaches and designed in close partnership with end-users.
2026 | Pola Zuzanna Labedzka et al. | University of Cambridge | Human-LLM Collaboration | Explainable AI (XAI) | Interactive Data Visualization | CHI
Creating and Evaluating Personas Using Generative AI: A Scoping Review of 81 Articles
As generative AI (GenAI) is increasingly applied in persona development to represent real users, understanding the implications and limitations of this technology is essential for establishing robust practices. This scoping review analyzes how 81 articles (2022-2025) use GenAI techniques for the creation, evaluation, and application of personas. The articles exhibited a good level of reproducibility, with 61% of articles sharing resources (personas, code, or datasets). Furthermore, conversational persona interfaces are increasingly provided alongside traditional profiles. However, nearly half (45%) of the articles lack evaluation, and the majority (86%) use only GPT models. In some articles, GenAI use creates a risk of circularity, in which the same GenAI model both generates and evaluates outputs. Our findings also suggest that GenAI seems to reduce the role of human developers in the persona-creation process. To mitigate the associated risks, we propose actionable guidelines for the responsible integration of GenAI into persona development.
2026 | Danial Amin et al. | University of Vaasa | Generative AI (Text, Image, Music, Video) | Human-LLM Collaboration | Explainable AI (XAI) | CHI
SimStep: Human-in-the-Loop Authoring of Interactive Educational Simulations Through Task-Level Abstractions
Generative AI enables educators to create interactive learning content by describing goals in natural language. However, without programming affordances such as traceability, refinement, and debugging, teachers struggle to align simulations with learners’ needs, refine them step by step, or verify that they reflect intended learning concepts. We propose a task-level abstraction approach that structures authoring as a sequence of representations, mirroring how teachers plan lessons and providing checkpoints for specification, inspection, and refinement. We instantiate this approach in SimStep, an authoring environment that scaffolds simulation design with four abstractions, including Concept Graph, Scenario Graph, Learning Goal Graph, and UI Graph, and introduces an inverse correction process to revise hidden model assumptions without requiring code manipulation. A technical evaluation shows that these abstractions preserve fidelity across transformations, while a user study with educators demonstrates their effectiveness in authoring simulations. Our work reframes AI-assisted programming as human–AI co-authoring through structured, domain-aligned abstractions.
2026 | Zoe Kaputa et al. | University of Washington | Intelligent Tutoring Systems & Learning Analytics | Participatory Design | Prototyping & User Testing | CHI
Grand Challenges around Designing Computers’ Control Over Our Bodies
Advances in emerging technologies, such as on-body mechanical actuators and electrical muscle stimulation, have allowed computers to take control over our bodies. This presents opportunities as well as challenges, raising fundamental questions about agency and the role of our body when interacting with technology. To advance this research field as a whole, we brought together expert perspectives in a week-long seminar to articulate the grand challenges that should be tackled when it comes to the design of computers’ control over our bodies. These grand challenges span technical, design, user, and ethical aspects. By articulating these grand challenges, we aim to initiate a research agenda that positions bodily control not only as a technical feature but as a central, experiential, and ethical concern for future human–computer interaction endeavors.
2026 | Florian 'Floyd' Mueller et al. | Monash University | Electrical Muscle Stimulation (EMS) | Brain-Computer Interface (BCI) & Neurofeedback | Empathy & Emotional Design | CHI
The Eye–Head Mover Spectrum: Modelling Individual and Population Head Movement Tendencies in Virtual Reality
People differ in how much they move their head versus their eyes when shifting gaze, yet such tendencies remain largely unexplored in HCI. We introduce head movement tendencies as a fundamental dimension of individual difference in VR and provide a quantitative account of their population-level distribution. Using a 360° video free-viewing dataset (N=87), we model head contributions to gaze shifts with a hinge-based parametric function, revealing a spectrum of strategies from eye-movers to head-movers. We then conduct a user study (N=28) combining 360° video viewing with a short controlled task using gaze targets. While parameter values differ across tasks, individuals show partial alignment in their relative positions within the population, indicating that tendencies are meaningful but shaped by context. Our findings establish head movement tendencies as an important concept for VR and highlight implications for adaptive systems such as foveated rendering, viewport alignment, and multi-user experience design.
2026 | Jinghui Hu et al. | Lancaster University | Immersion & Presence Research | Eye Tracking & Gaze Interaction | Social & Collaborative VR | CHI
Results-Actionability Gap: Understanding How Practitioners Evaluate LLM Products in the Wild
How do product teams evaluate LLM-powered products? As organizations integrate large language models (LLMs) into digital products, their unpredictable nature makes traditional evaluation approaches inadequate, yet little is known about how practitioners navigate this challenge. Through interviews with nineteen practitioners across diverse sectors, we identify ten evaluation practices spanning informal 'vibe checks' to organizational meta-work. Beyond confirming four documented challenges, we introduce a novel fifth we call the results-actionability gap, in which practitioners gather evaluation data but cannot translate findings into concrete improvements. Drawing on patterns from successful teams, we contribute strategies to bridge this gap, supporting practitioners' formalization journey from ad-hoc interpretive practices (e.g., vibe checks) toward systematic evaluation. Our analysis suggests these interpretive practices are necessary adaptations to LLM characteristics rather than methodological failures. For HCI researchers, this presents a research opportunity to support practitioners in systematizing emerging practices rather than developing new evaluation frameworks.
2026 | Willem van der Maden et al. | IT University of Copenhagen | Human-LLM Collaboration | AI-Assisted Decision-Making & Automation | Explainable AI (XAI) | CHI
Objestures: Everyday Objects Meet Mid-Air Gestures for Expressive Interaction
Everyday object-based interactions (EOIs) and mid-air gesture interactions (MAIs) have been widely explored, yet prior work on their integration often targets narrow use cases or specific technologies, leaving designers and developers with limited guidance that generalizes across diverse EOIs and MAIs. We introduce Objestures (“Obj” + “Gestures”), five interaction types spanning EOIs and MAIs, forming a design space for expressive uni- and bimanual interaction. To evaluate the usefulness of Objestures, we conducted an exploratory user study (N=12) on basic 3D tasks (rotation and scaling), which showed performance comparable to the headset's native freehand manipulation. To understand the user experience, we conducted case studies with the same participants across three applications (Sound, Draw, and Shadow), where participants found the interactions intuitive, engaging, and expressive, and indicated interest in everyday use. We further demonstrate the potential of Objestures across diverse contexts through 30 examples, and discuss limitations and implications.
2026 | Zhuoyue Lyu et al. | University of Cambridge | Mid-Air Haptics (Ultrasonic) | Hand Gesture Recognition | Physical-Digital Hybrid Interaction | CHI
Demystifying Reward Design in Reinforcement Learning for Upper Extremity Interaction: Practical Guidelines for Biomechanical Simulations in HCI
Designing effective reward functions is critical for reinforcement learning-based biomechanical simulations, yet HCI researchers and practitioners often waste (computation) time with unintuitive trial-and-error tuning. This paper demystifies reward function design by systematically analyzing the impact of effort minimization, task completion bonuses, and target proximity incentives on typical HCI tasks such as pointing, tracking, and choice reaction. We show that proximity incentives are essential for guiding movement, while completion bonuses ensure task success. Effort terms, though optional, help refine motion regularity when appropriately scaled. We perform an extensive analysis of how sensitively task success and completion time depend on the weights of these three reward components. From these results we derive practical guidelines to create plausible biomechanical simulations without the need for reinforcement learning expertise, which we then validate on remote control and keyboard typing tasks. This paper advances simulation-based interaction design and evaluation in HCI by improving the efficiency and applicability of biomechanical user modeling for real-world interface development.
2025 | Hannah Selder et al. | Human Pose & Activity Recognition | Computational Methods in HCI | UIST
Design Activity Simulation: Opportunities and Challenges in Using Multiple Communicative AI Agents to Tackle Design Problems
Large Language Models (LLMs) can enhance structured design thinking, yet existing copilot approaches integrate them into human workflows rather than exploring their autonomous potential. This paper investigates how LLM-based communicative AI agents can independently tackle open-ended design problems and how their strengths and limitations inform human-AI collaboration. We iteratively design a system where AI agents play different roles and simulate human design activity through conversational turns. The agents investigate user needs, identify design constraints, and explore the design space, with useful insights emerging from their interactions. To assess reasoning quality, we conducted a human jury evaluation with five HCI researchers and explored potential applications through a contextual inquiry with seven professionals. Our findings demonstrate that integrating human design thinking techniques enhances AI reasoning. AI agents effectively tackle design problems, generating low-novelty yet well-grounded and practical solutions that meet key design requirements.
2025 | Boyin Yang et al. | Human-LLM Collaboration | Creative Collaboration & Feedback Systems | Knowledge Worker Tools & Workflows | CUI