Improving Human Verification of LLM Reasoning through Interactive Explanation Interfaces
The reasoning capabilities of Large Language Models (LLMs) have led to their increasing use in several critical applications, particularly education, where they support problem-solving, tutoring, and personalized study. While a plethora of work shows the effectiveness of LLMs in generating step-by-step solutions through chain-of-thought (CoT) reasoning on reasoning benchmarks, little is understood about whether the generated CoT helps end-users comprehend mathematical reasoning problems and detect errors/hallucinations in LLM-generated solutions. To address this gap and contribute to understanding how reasoning can improve human-AI interaction, we present three new interactive reasoning interfaces: interactive CoT (iCoT), interactive Program-of-Thought (iPoT), and interactive Graph (iGraph), and a novel framework that generates the LLM's reasoning in formats ranging from traditional CoT to alternative, interactive formats. Across 125 participants, we found that interactive interfaces significantly improved performance. Specifically, the iGraph interface yielded the highest clarity and error detection rate (85.6%), followed by iPoT (82.5%) and iCoT (80.6%), all outperforming standard CoT (73.5%). Interactive interfaces also led to faster response times: participants using iGraph were fastest (57.9 seconds), compared to iCoT and iPoT (60 seconds) and the standard CoT baseline (64.7 seconds). Furthermore, participants preferred the iGraph reasoning interface, citing its superior ability to enable users to follow the LLM's reasoning process. We discuss the implications of these results and provide recommendations for the future design of reasoning models. The code and interfaces for this project can be found here: https://github.com/Runtaozhou/Interactive-CoT.
2026 | Runtao Zhou et al. | University of Virginia | Human-LLM Collaboration; Explainable AI (XAI); Prototyping & User Testing | IUI
"It Feels Like I am Invited to Communicate": Mediating Ad-Hoc Bystander-VR User Interruptions Through Proactive Proxies
As VR expands into public spaces, new challenges emerge around spontaneous interactions between bystanders and unfamiliar VR users. While current VR systems often prioritize user awareness of their physical surroundings, they overlook the social dynamics affecting nearby bystanders. We conducted a deception-based study (N=80) examining how interface availability influences bystanders' comfort, confidence, and hesitation when interrupting VR users. We compared traditional static interruption interfaces (e.g., button on screen) with a proactive proxy that actively approached bystanders upon detecting interruption intent. Static interfaces, due to insufficient cueing, frequently caused bystander discomfort, leading to hesitant physical interruptions or complete communication avoidance. In contrast, the proactive proxy implicitly conveyed social permission, significantly enhancing bystanders' comfort and confidence. Our findings provide empirical insights into how bystanders assess availability and initiate interruptions with unfamiliar VR users in shared spaces, offering design implications for VR systems that support bystander agency and comfort during these interactions.
2026 | Adil Rahman et al. | University of Virginia | Social & Collaborative VR; Immersion & Presence Research; Multi-User Large Display Collaboration | CHI
Infrastructuring for Access: Co-Designing Writing Tools with a Dyslexic Academic
Disability Studies and Accessibility HCI document what design elements are (in)accessible to disabled communities and illuminate technological ableism. However, Disability Studies' systemic critique rarely includes roadmaps for design. We articulate a roadmap by marrying HCI's infrastructuring theory with Disability Studies into an approach called Infrastructuring for Access. Focusing on dyslexic writers' experiences with spell checkers, we demonstrate Infrastructuring for Access in a collaborative design process. We co-designed software to address limitations of spell checkers and conducted an eight-month field deployment. Our technological contribution is Jargon Manager, a toolkit with a browser extension for writers to opportunistically save terms in a custom dictionary and then use later via a word processor extension. Our theory contribution moves from a space of critique into a space of repair: Infrastructuring for Access expands the design space from only removing barriers to also institutionalizing disabled practitioners' existing workarounds, therefore alleviating access labor and broadening participation.
2026 | Emily Q. Wang et al. | Oberlin College | Cognitive Impairment & Neurodiversity (Autism, ADHD, Dyslexia); Universal & Inclusive Design; Participatory Design | CHI
Evaluating Peer Fact-Checking on WhatsApp
Private messaging platforms hinder public oversight, making misinformation hard to counter. Meanwhile, platforms are pivoting to crowdsourced verification amid waning trust in institutional fact-checkers. This raises a critical question: how do peer corrections compare with local journalists or fact-checking tiplines? We tested this via a privacy-preserving randomized field study on participants' real WhatsApp group messages in India, complemented by interviews. Fact-checks from a close contact significantly improved accuracy over the control group, while corrections from the local journalist and national tipline did not reach statistical significance. However, none of the interventions improved participants' ability to identify novel misinformation on similar themes, suggesting corrections on WhatsApp are context-bound rather than skill-building. We contribute: (1) the first ecologically valid randomized test of peer-led fact-checking on WhatsApp, benchmarked against journalists and tiplines; (2) an empirical account of how participants make sense of corrections in closed messaging environments; and (3) design implications for community-based fact-checking, including training high-social-capital individuals as embedded verifiers.
2026 | Sudhamshu Hosamane et al. | Rutgers University | Misinformation & Fact-Checking; Community Collaboration & Wikipedia | CHI
DataSpeck: An AI-Driven Human-in-the-Loop System for Automating Transformations in Data Conversion Workflows
In data-driven systems, integrating disparate data sources becomes challenging when incoming data does not conform to the system's specifications. Despite advances in automated schema matching systems, data integration tasks involving complex semantic interrelationships still require users to manually identify and define transformations between datasets, which can be cognitively demanding and time-consuming. We present DataSpeck, an end-to-end system that automates the conversion of disparate data sources to fit any pre-existing data specification. DataSpeck employs an AI-driven human-in-the-loop design, using LLMs to analyze semantic relationships and generate step-by-step transformation pipelines autonomously, while only requesting user attention to resolve semantic ambiguities. In our technical evaluation, DataSpeck successfully automated ~86% of varied data transformations while generating interpretable strategies with confidence scores and targeted clarification requests. In a user study (N=12), participants completed data conversion tasks ~53% faster with significantly reduced cognitive load using DataSpeck compared to Microsoft Excel with Copilot.
2026 | Adil Rahman et al. | University of Virginia | Human-LLM Collaboration; Explainable AI (XAI); AI-Assisted Decision-Making & Automation | CHI
Redirected Pinch: Efficient and Comfortable Bare-Hand Interaction for 2D Windows in VR
Virtual Reality (VR) offers portable and flexible workspaces. However, enabling efficient and comfortable interactions without external input devices remains challenging. We propose leveraging redirected input to enable comfortable and touch-like interaction for quick and intuitive control. Our design study revealed that while touch interaction performs well with direct input, its performance degrades significantly under input redirection. In contrast, using pinch improves redirected input by providing self-haptic feedback and reducing input dimensionality, thereby compensating for spatial discrepancies. Based on these findings, we introduce Redirected Pinch, a bare-hand interaction technique that combines input redirection with pinch confirmation. It creates a virtual plane at waist height, remapping hand movements on the plane to a vertical window, with pinch gestures used for confirmation. A user study demonstrated that Redirected Pinch achieves a strong balance of accuracy, efficiency, comfort, and sense of agency across fundamental interactions.
2026 | Wen Ying et al. | University of Virginia | Immersion & Presence Research; Full-Body Interaction & Embodied Input; In-Vehicle Haptic, Audio & Multimodal Feedback | CHI
Exploring Teacher-Chatbot Interaction and Affect in Block-Based Programming
AI-based chatbots have the potential to accelerate learning and teaching, but may also have counterproductive consequences without thoughtful design and scaffolding. To better understand teachers’ perspectives on large language model (LLM) based chatbots, we conducted a study with 11 teams of middle-school teachers using chatbots for a science and computational thinking activity within a block-based programming environment. Based on a qualitative analysis of audio transcripts and chatbot interactions, we propose three profiles (explorer, frustrated, and mixed) that reflect diverse scaffolding needs. In their discussions, teachers perceived chatbot benefits such as building prompting skills and self-confidence, alongside risks including potential declines in learning and critical thinking. Key design recommendations include scaffolding the introduction to chatbots, facilitating teacher control of chatbot features, and suggesting when and how chatbots should be used. Our contribution informs the design of chatbots to support teachers and learners in middle school coding activities.
2026 | Bahare Riahi et al. | North Carolina State University | Human-LLM Collaboration; Programming Education & Computational Thinking; Intelligent Tutoring Systems & Learning Analytics | CHI
Inferring Affect and Intervention Opportunities for Cancer Survivors from Digital Diaries with Context-Aware LLMs
Cancer survivors face unique mental health challenges, yet nearly half report unmet psychosocial needs. Smartphone interventions could help, but a major obstacle is knowing if, when, and how to intervene because inferring affective states with low-burden methods is hard. We test whether ultra-brief mobile diaries can infer contextual information approximating survivors’ affect, desire to regulate affect, and potential availability for brief digital behavioral interventions. Analyzing 24,183 entries from 407 survivors, we find that administrative and health-related situations align with higher negative affect, whereas leisure/social situations align with higher positive affect. We introduce a Context-Aware LLM (CALLM) framework, which curates context via similarity-aligned peer cases and short personal trajectories, achieving balanced accuracy of 72.96% (positive affect), 73.29% (negative affect), 73.72% (regulation desire), and 60.09% (intervention availability), outperforming baselines. Post-hoc analyses show LLM confidence tracks accuracy, longer entries aid inference, and brief calibration improves personalization. Findings inform future just-in-time adaptive interventions for this underrepresented population.
2026 | Zhiyuan Wang et al. | Department of Systems and Information Engineering, University of Virginia | Human-LLM Collaboration; Mental Health Apps & Online Support Communities; Sleep & Stress Monitoring | CHI
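The CALLM entry above reports balanced accuracy rather than plain accuracy. As a minimal sketch of that metric (the mean of per-class recall), the Python below uses made-up labels that are purely illustrative and are not data from the paper; the function name `balanced_accuracy` is a hypothetical helper, not part of the authors' code.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes that appear in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        # Positions of all samples whose true label is this class.
        idx = [i for i, t in enumerate(y_true) if t == cls]
        hits = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Illustrative, made-up labels: 8 positives, 2 negatives,
# and a classifier that always predicts the majority class.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1] * 10

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)
print(plain, balanced)
```

With imbalanced labels like these, plain accuracy rewards always guessing the majority class, while balanced accuracy stays at chance level, which is one common reason to report it for affect labels.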
ThingMoji: User-Captured Cut-Outs For In-Stream Visual Communication
Live streaming has become increasingly popular, driven by the desire for direct and real-time interactions between streamers and viewers. However, current text-based interactions and pre-defined emojis limit expressiveness, especially when referring to specific stream moments. We propose ThingMoji, a type of user-captured cut-out that enhances user expression and fosters more effective communication between streamers and their audience. ThingMojis are unique digital icons that users create by capturing snapshots and annotating specific areas at any point during the stream. We developed StreamThing, a live-streaming platform integrated with ThingMojis, to explore their use in object-focused live-streaming contexts. A user study with three in-the-wild deployments reveals the expressive use of ThingMojis in diverse live-streaming scenarios with rich visual content. Our findings show that ThingMojis enable viewers to reference specific objects, express emotions, and create shared visual narratives. Streamers found ThingMojis valuable for facilitating on-the-fly communication around visual content and fostering playful interactions. The study also uncovered challenges in ThingMoji comprehension, issues with long-term use of ThingMojis, and potential concerns regarding misuse. Based on these insights, we discuss new opportunities for supporting object-focused communication in live-streaming environments.
2025 | Erzhen Hu et al. | Online Interactions with Friends and Strangers | CSCW
Thing2Reality: Enabling Spontaneous Creation of 3D Objects from 2D Content using Generative AI in XR Meetings
During remote communication, participants often share both digital and physical content, such as product designs, digital assets, and environments, to enhance mutual understanding. Recent advances in augmented communication have enabled users to swiftly create and share digital 2D copies of physical objects from video feeds into a shared space. However, conventional 2D representations of digital objects limit spatial referencing in immersive environments. To address this, we propose Thing2Reality, an Extended Reality (XR) meeting platform that facilitates spontaneous discussions of both digital and physical items during remote sessions. With Thing2Reality, users can quickly materialize ideas or objects in immersive environments and share them as conditioned multiview renderings or 3D Gaussians. Thing2Reality enables users to interact with remote objects or discuss concepts in a collaborative manner. Our user studies revealed that the ability to interact with and manipulate 3D representations of objects significantly enhances the efficiency of discussions, with the potential to augment discussion of 2D artifacts.
2025 | Erzhen Hu et al. | Social & Collaborative VR; Identity & Avatars in XR; Generative AI (Text, Image, Music, Video) | UIST
DialogLab: Authoring, Simulating, and Testing Dynamic Human-AI Group Conversations
Designing compelling multi-party conversations involving both humans and AI agents presents significant challenges, particularly in balancing scripted structure with emergent, human-like interactions. We introduce DialogLab, a prototyping toolkit for authoring, simulating, and testing hybrid human-AI dialogues. DialogLab provides a unified interface to configure conversational scenes, define agent personas, manage group structures, specify turn-taking rules, and orchestrate transitions between scripted narratives and improvisation. Crucially, DialogLab allows designers to introduce controlled deviations from the script (through configurable agents that emulate human unpredictability) to systematically probe how conversations adapt and recover. DialogLab facilitates rapid iteration and evaluation of complex, dynamic multi-party human-AI dialogues. An evaluation with both end users and domain experts demonstrates that DialogLab supports efficient iteration and structured verification, with applications in training, rehearsal, and research on social dynamics. Our findings show the value of integrating real-time, human-in-the-loop improvisation with structured scripting to support more realistic and adaptable multi-party conversation design.
2025 | Erzhen Hu et al. | Conversational Chatbots; Human-LLM Collaboration | UIST
AR-Based Embodied Avatar Assistance for Nonspeaking Autistic People? Design and Feasibility Study
Many nonspeaking autistic individuals rely on Communication and Regulation Partners (CRPs) to develop spelling-based communication using physical letterboards, but this support is often geographically inaccessible. We developed a remote presence system using Augmented Reality (AR) to enable immersive, collaborative spelling instruction. The system features holographic letterboards and fully embodied avatars with real-time head and hand tracking, allowing remote interaction between students and CRPs. In a study with 18 nonspeaking autistic participants, 15 (83%) successfully completed avatar-supported sessions. Interaction was higher in the avatar condition, and participants reported preferring it over voice-only support. These findings demonstrate the feasibility of avatar-based AR telepresence for remote communication training. The system provides a demonstration of AR-supported interaction designed with nonspeaking autistic individuals, an underrepresented group in HCI, and offers design insights for inclusive telepresence technologies that address geographic and accessibility barriers.
2025 | Travis Dow et al. | Identity & Avatars in XR; Augmentative & Alternative Communication (AAC) | DIS
Grab-and-Release Spelling in XR: A Feasibility Study for Nonspeaking Autistic People Using Video-Passthrough Devices
This paper explores the feasibility of using video-passthrough Extended Reality (XR) devices to support communication in nonspeaking autistic individuals. Prior XR work relied on expensive AR headsets and near-hand tapping interactions. We present LetterBox, a novel application for video-passthrough XR headsets (e.g., Meta Quest) that enables spelling via a “grab-snap-release” interaction. The app includes three immersion levels and a dynamic pass-through window to maintain caregiver presence. We conducted a study with 19 participants across four North American sites. All completed a multiphase spelling task and answered open-ended questions. Despite tolerability concerns, all participants wore the headset throughout; only one requested a break. The average spelling accuracy was 90.91%. In open-spelling, 14 participants responded, often independently. Reaction time and interaction speed data highlighted the impact of visual complexity, offering insights for reducing errors. These findings suggest video-passthrough XR is well tolerated and that grab-snap-release interactions may benefit users with motor challenges.
2025 | Lorans Alabood et al. | Identity & Avatars in XR; Special Education Technology | DIS
Diversifying Grain-Based Compliance Illusion by Varying Base Compliance
Grain-based compliance illusion mimics the mechanical vibrations produced when a compliant object deforms, using grain-like, short (~15 ms) impulse-response vibrations. Previous work has demonstrated its robust effect on various types of devices. However, the impact of the device's inherent compliance (i.e., base compliance) on perceived compliance remains unclear. This paper investigates the influence of base compliance on the perception of illusory compliance through three psychophysical experiments. The results show that (1) the compliance illusion remained effective with base compliance, (2) the description of compliance was affected by both illusory and base compliance, and (3) it is possible to render compliance of the same magnitude with multiple different feelings.
2025 | Buyoung Mun et al. | UNIST, TACT Lab, Computer Science & Engineering | Vibrotactile Feedback & Skin Stimulation; Force Feedback & Pseudo-Haptic Weight | CHI
Infrastructures for Inspiration: The Routine of Creative Identity Through Inspiration on the Creative Internet
Online, visual artists have more places than ever to routinely share their creative work and connect with other artists. These interactions support the routine enactment of creative identity in artists and provide inspirational opportunities for artists. As creative work shifts online, interactions between artists and routines around how these artists get inspired to do creative work are mediated by and through the logics of the online platforms where they take place. In an interview study of 22 artists, this paper explores the interplay between the development of artists' creative identities and the, at times, contradictory practices they have around getting inspired. We find that platforms support the disciplined practice of creative work while also supporting spontaneous moments of inspiration, play an increasing role in passive approaches to searching for inspiration, and foster numerous small community spaces for artists to negotiate their creative identities. We discuss how platforms can better support inspiration and embed mechanisms for it into their infrastructures, design, and platform policy.
2025 | Ellen Simpson et al. | University of Virginia, School of Data Science | Creative Collaboration & Feedback Systems; Knowledge Worker Tools & Workflows | CHI
Understanding Attitudes and Trust of Generative AI Chatbots for Social Anxiety Support
Social anxiety (SA) has become increasingly prevalent. Traditional coping strategies often face accessibility challenges. Generative AI (GenAI) chatbots, known for their knowledgeable and conversational capabilities, are emerging as alternative tools for mental well-being. With the increased integration of GenAI, it is important to examine individuals' attitudes and trust in GenAI chatbots' support for SA. Through a mixed-method approach that involved surveys (n = 159) and interviews (n = 17), we found that individuals with severe symptoms tended to trust and embrace GenAI chatbots more readily, valuing their non-judgmental support and perceived emotional comprehension. However, those with milder symptoms prioritized technical reliability. We identified factors influencing trust, such as GenAI chatbots' ability to generate empathetic responses and their context-sensitive limitations, which were particularly important among individuals with SA. We also discuss design implications and practical considerations for using GenAI chatbots to foster cognitive and emotional trust.
2025 | Yimeng Wang et al. | William & Mary | Conversational Chatbots; Human-LLM Collaboration; Mental Health Apps & Online Support Communities | CHI
Your Hands Can Tell: Detecting Redirected Hand Movements in Virtual Reality
In-air hand interactions are prevalent in Virtual Reality (VR), and prior studies have shown that manipulating the visual movement of the hand to be different from the actual hand movement, i.e., hand redirection, could create a more immersive and engaging VR experience. However, this manipulation risks degrading task performance and, if maliciously applied, poses a threat to user safety. Such manipulations may arise from VR applications developed with intentional or inadvertent perceptual manipulations that yield harmful outcomes. We advocate for a user's prerogative to be informed of any such potential manipulations before application usage. To address this, our study introduces an Autoencoder-based anomaly detection technique that leverages users' inherent hand movements to identify hand redirection, thereby preserving the integrity of application use. Our model is trained on regular (i.e., non-manipulated) hand movement patterns and employs a stochastic thresholding approach for anomaly detection. We validated our method through a technical evaluation involving 21 participants engaged in reaching tasks under manipulated and non-manipulated scenarios. The results demonstrated a high accuracy of hand redirection detection at 93.7%, with an F1-score of 93.9%.
2025 | Md Aashikur Rahman Azim et al. | University of Virginia, Department of Computer Science | Vibrotactile Feedback & Skin Stimulation; Hand Gesture Recognition; Full-Body Interaction & Embodied Input | CHI
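The entry above describes training on non-manipulated movements and thresholding reconstruction error to flag redirection. The Python below is a minimal sketch of that general pattern only. It assumes the reconstruction errors have already been produced by some model, and it substitutes a simple fixed rule (mean plus k standard deviations) for the paper's stochastic thresholding, whose details are not given in the abstract. The names `fit_threshold` and `flag_anomalies`, the value of `k`, and all numbers are hypothetical.

```python
from statistics import mean, pstdev


def fit_threshold(normal_errors, k=3.0):
    """Threshold = mean + k * std of errors measured on non-manipulated data.

    This fixed rule is an assumption for illustration, not the paper's
    stochastic thresholding approach.
    """
    return mean(normal_errors) + k * pstdev(normal_errors)


def flag_anomalies(errors, threshold):
    """Return True for each sample whose reconstruction error exceeds the threshold."""
    return [e > threshold for e in errors]


# Hypothetical reconstruction errors (made-up values, not data from the paper).
normal_errors = [0.10, 0.12, 0.09, 0.11, 0.10, 0.13, 0.08]
threshold = fit_threshold(normal_errors, k=3.0)

# The second sample has a much larger error than anything seen in training.
flags = flag_anomalies([0.11, 0.45, 0.09], threshold)
print(threshold, flags)
```

Because the model only ever sees regular movements, a redirected movement tends to reconstruct poorly, so its error lands above the threshold; the choice of `k` trades false alarms against missed detections.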
CommSense: A Wearable-Based Computational Framework for Evaluating Patient-Clinician Interactions
Quality patient-provider communication is critical to improve clinical care and patient outcomes. While progress has been made with communication skills training for clinicians, significant gaps exist in how to best monitor, measure, and evaluate the implementation of communication skills in the actual clinical setting. Advancements in ubiquitous technology and natural language processing make it possible to realize more objective, real-time assessment of clinical interactions and in turn provide more timely feedback to clinicians about their communication effectiveness. In this paper, we propose CommSense, a computational sensing framework that combines smartwatch audio and transcripts with natural language processing methods to measure selected "best-practice" communication metrics captured by wearable devices in the context of palliative care interactions, including understanding, empathy, presence, emotion, and clarity. We conducted a pilot study involving N=40 clinician participants to test the technical feasibility and acceptability of CommSense in a simulated clinical setting. Our findings demonstrate that CommSense effectively captures most communication metrics and is well-received by both practicing clinicians and student trainees. Our study also highlights the potential for digital technology to enhance communication skills training for healthcare providers and students, ultimately resulting in more equitable delivery of healthcare and accessible, lower cost tools for training with the potential to improve patient outcomes.
2024 | Zhiyuan Wang et al. | Session 4b: Patient-Centered Care and Youth Empowerment | CSCW
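Changing Your Tune: Lessons for Using Music to Encourage Physical Activity
Clark et al. analyze the effect of music interventions on physical activity and distill design principles for how music can improve exercise motivation and performance, offering practical strategies for promoting physical activity.
2024 | Matthew Clark et al. | Fitness Tracking & Physical Activity Monitoring; Music Composition & Sound Design Tools | UbiComp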
JupyterLab in Retrograde: Contextual Notifications That Highlight Fairness and Bias Issues for Data Scientists
Current algorithmic fairness tools focus on auditing completed models, neglecting the potential downstream impacts of iterative decisions about cleaning data and training machine learning models. In response, we developed Retrograde, a JupyterLab environment extension for Python that generates real-time, contextual notifications for data scientists about decisions they are making regarding protected classes, proxy variables, missing data, and demographic differences in model performance. Our novel framework uses automated code analysis to trace data provenance in JupyterLab, enabling these notifications. In a between-subjects online experiment, 51 data scientists constructed loan-decision models with Retrograde providing notifications continuously throughout the process, only at the end, or never. Retrograde's notifications successfully nudged participants to account for missing data, avoid using protected classes as predictors, minimize demographic differences in model performance, and exhibit healthy skepticism about their models.
2024 | Galen Harrison et al. | University of Virginia, University of Chicago | Generative AI (Text, Image, Music, Video); Explainable AI (XAI); Algorithmic Transparency & Auditability | CHI