The HEART Interface: Visualizing Risk Score Uncertainty in the Cardiothoracic ICUArtificial Intelligence (AI) holds significant potential for supporting clinical decision-making, particularly in high-pressure environments, such as Cardiothoracic Intensive Care Units (CT-ICU). Care teams in these settings face challenges such as alarm fatigue, rapid staff turnover, time-sensitive decisions, and an overwhelming amount of data. AI-driven Clinical Decision-Support Systems (AI-CDSS) can support care teams in overcoming some of these challenges by providing solutions like detecting and reporting risk scores for adverse events that may lead to increased fatalities or re-admissions, enabling timely intervention. One key challenge with risk scores is missing data, which can create considerable uncertainty in risk score values. AI-CDSSs rarely convey the risk score uncertainty, which is important in the effectiveness and reliability of clinical decision-making. In this paper, we describe the interface design process for HEART, an AI-powered system developed collaboratively with clinical and AI experts over a 16-month iterative design process for a hospital's CT-ICU. The HEART interactive interface integrates understandable visualizations of risk scores and their uncertainty within both a holistic view of all patients in the unit and detailed patient-specific views. We reflect on the user-centered design process, report findings from an expert walkthrough study, and discuss lessons learned as well as broader implications. This work contributes valuable insights into uncertainty visualization design for AI-derived risk scores in a critical care application. Beyond these specific insights, our work illustrates the kind of comprehensive, human-centered design process necessary for responsible AI adoption in critical environments. 
Supplemental Material, including a video demo of the HEART interface and additional details on the algorithm, is available on the project's OSF repository: https://osf.io/akqx8/.2026MNMahsan Nourani et al.Northeastern UniversityExplainable AI (XAI)Uncertainty VisualizationMedical & Scientific Data VisualizationIUI
Balancing Efficiency and Empathy: Healthcare Providers' Perspectives on AI-Supported Workflows for Serious Illness Conversations in the Emergency DepartmentSerious Illness Conversations (SICs)—discussions about values and care preferences for patients with life-threatening illness—rarely occur in Emergency Departments (EDs), despite evidence that early conversations improve care alignment and reduce unnecessary interventions. We interviewed 11 ED providers to identify challenges in SICs and opportunities for technology support, with a focus on AI. Our analysis revealed a four-stage SIC workflow (identification, preparation, conduction, documentation) and barriers at each stage, including fragmented patient information, limited time and space, lack of conversational guidance, and burdensome documentation. Providers expressed interest in AI systems for synthesizing information, supporting real-time conversations, and automating documentation, but emphasized concerns about preserving human connection and clinical autonomy. This tension highlights the need for technologies that enhance efficiency without undermining the interpersonal nature of SICs. We propose design guidelines for ambient and peripheral AI systems to support providers while preserving the essential humanity of these conversations.2026MZMenglin Zhao et al.Northeastern UniversityAI-Assisted Decision-Making & AutomationMental Health Apps & Online Support CommunitiesTelemedicine & Remote Patient MonitoringCHI
Human-AI Narrative Synthesis to Foster Shared Understanding in Civic Decision-MakingCommunity engagement processes in representative political contexts, like school districts, generate massive volumes of feedback that overwhelm traditional synthesis methods, creating barriers to shared understanding not only between civic leaders and constituents but also among community members. To address these barriers, we developed StoryBuilder, a human-AI collaborative pipeline that transforms community input into accessible first-person narratives. Using 2,480 community responses from an ongoing school rezoning process, we generated 124 composite stories and deployed them through a mobile-friendly StorySharer interface. Our mixed-methods evaluation combined a four-month field deployment, user studies with 21 community members, and a controlled experiment examining how narrative composition affects participant reactions. Field results demonstrate that narratives helped community members relate across diverse perspectives. In the experiment, experience-grounded narratives generated greater respect and trust than opinion-heavy narratives. We contribute a human-AI narrative synthesis system and insights on its varied acceptance and effectiveness in a real-world civic context.2026COCassandra Overney et al.Massachusetts Institute of TechnologyHuman-LLM CollaborationParticipatory DesignCommunity Engagement & Civic TechnologyCHI
Exploring the Future of AI in Clinical Collaboration: A Study on Tumor Board Case Preparation Multidisciplinary tumor boards (MTBs) bring specialists together to identify therapies for complex cancer cases, but preparing for them is time-intensive. Clinicians must extract key details from extensive records and evaluate treatment options. While large language models (LLMs) show promise in medicine for basic tasks like summarizing notes, little is known about their role in high-stakes tasks like MTB preparation. We conducted a mixed-methods study with 16 oncologists using two AI systems to prepare patient cases for MTB: an off-the-shelf assistant (Copilot) and a task-specific multi-agent system (Healthcare Agent Orchestrator, HAO). We analyzed oncologist prompts, AI responses, and oncologists' perception of AI. Participants showed greater willingness to adopt HAO but were often overconfident in AI summaries and skeptical of AI-recommended therapies. Trust calibration strategies, such as source links and agent-trajectories, failed to align trust with system capabilities. We conclude with how AI systems should be built to support clinicians in high-stakes tasks.2026JLJiachen Li et al.Northeastern UniversityHuman-LLM CollaborationExplainable AI (XAI)AI-Assisted Decision-Making & AutomationCHI
Dark Patterns Meet GUI Agents: LLM Agent Susceptibility to Manipulative Interfaces and the Role of Human OversightDark patterns, deceptive interface designs that manipulate user behavior, have been extensively studied for their effects on human decision-making and autonomy. Yet, with the rising prominence of LLM-powered GUI agents that automate tasks from high-level intents, understanding how dark patterns affect agents is increasingly important. We present a two-phase empirical study examining how agents, human participants, and human-AI teams respond to 16 types of dark patterns across diverse scenarios. Phase 1 showed that agents often fail to recognize dark patterns and, even when aware, prioritize task completion over protective action. Phase 2 revealed divergent failure modes: humans succumb due to cognitive shortcuts and habitual compliance, while agents falter from procedural blind spots. Human oversight improved avoidance but introduced costs such as attentional tunneling and cognitive load. Our findings show neither humans nor agents are uniformly resilient, and collaboration introduces new vulnerabilities, suggesting design needs for transparency, adjustable autonomy, and oversight.2026JTJingyu Tang et al.University of Notre DameDark Patterns RecognitionHuman-LLM CollaborationAI-Assisted Decision-Making & AutomationCHI
Through the Lens of Human-Human Collaboration: A Configurable Research Platform for Exploring Human-Agent CollaborationIntelligent systems have traditionally been designed as tools rather than collaborators, often lacking critical characteristics that collaborative partnerships require. Recent advances in large language model (LLM) agents open new opportunities for human-LLM-agent collaboration by enabling natural communication and various social and cognitive behaviors. Yet it remains unclear whether principles of computer-mediated collaboration established in HCI and CSCW persist, change, or fail when humans collaborate with LLM agents. To support systematic investigations of these questions, we introduce an open and configurable research platform for HCI researchers. The platform's modular design allows seamless adaptation of classic CSCW experiments and manipulation of theory-grounded interaction controls. We demonstrate the platform's research efficacy and usability through three case studies: (1) two Shape Factory experiments for resource negotiation with 16 participants, (2) one Hidden Profile experiment for information pooling with 16 participants, and (3) a participatory cognitive walkthrough with five HCI researchers to refine the workflows of the researcher interface for experiment setup and analysis.2026BYBingsheng Yao et al.Northeastern UniversityHuman-LLM CollaborationParticipatory DesignPrototyping & User TestingCHI
My Body, Their Business: User Perspectives on Commercial Data Practices in FemTech AppsFemTech, including apps for fertility, menstruation, and menopause, increasingly shapes how users manage intimate aspects of their health. Yet these apps are often built on opaque commercial models, raising ethical concerns about consent, privacy, and misuse of sensitive health data. While prior work has documented these risks, less is known about how users perceive and negotiate commercial data practices in FemTech apps. We conducted an online survey with 187 participants, combining factorial vignettes with provotypes (interface prototypes designed to provoke reflection) to examine user boundaries and discomforts around FemTech data collection and commercial use. Participants drew sharp distinctions across data types, resisting peripheral data collection and pervasive tracking. Commercial practices were often judged conditionally: tolerated only when functionally relevant. Notably, our provotypes, even under exaggerated transparency, elicited more forgiving responses to commercial practices compared to brief text descriptions in the vignettes. We discuss implications for designing transparent, accountable, and user-aligned FemTech.2026GAGhada Alsebayel et al.Northeastern UniversityPrivacy by Design & User ControlPrivacy Perception & Decision-MakingReproductive & Women's HealthCHI
Sometimes You Need Facts, and Sometimes a Hug: Understanding Older Adults’ Preferences for Explanations in LLM-Based Conversational AI SystemsDesigning Conversational AI systems to support older adults requires these systems to explain their behavior in ways that align with older adults’ preferences and context. While prior work has emphasized the importance of AI explainability in building user trust, relatively little is known about older adults’ requirements and perceptions of AI-generated explanations. To address this gap, we conducted an exploratory Speed Dating study with 23 older adults to understand their responses to contextually grounded AI explanations. Our findings reveal the highly context-dependent nature of explanations, shaped by conversational cues such as the content, tone, and framing of explanation. We also found that explanations are often interpreted as interactive, multi-turn conversational exchanges with the AI, and can be helpful in calibrating urgency, guiding actionability, and providing insights into older adults’ daily lives for their family members. We conclude by discussing implications for designing context-sensitive and personalized explanations in Conversational AI systems.2026NMNiharika Mathur et al.Georgia Institute of TechnologyHuman-LLM CollaborationExplainable AI (XAI)Aging-Friendly Technology DesignCHI
"I can take what I want and adapt as needed": BIPOC Identity Making and Resistance Through Internet Aesthetics on TikTokInternet Aesthetics are personal styles that are curated, instantiated, and remade on social media through collections of art, fashion, sensory experiences, literature, and media to communicate and share lifestyle narratives. BIPOC users often use Internet Aesthetics on TikTok as identity-making tools. However, they may experience algorithmic symbolic annihilation in which the platform neglects the existence of BIPOC in particular Internet Aesthetics, reducing their agency over their online identity-making. Using semi-structured interviews, we identify how BIPOC users understand Internet Aesthetics and what strategies BIPOC use to engage with them on TikTok. We discuss how BIPOC users apply algorithmic folk theories and offline strategies to resist symbolic annihilation while engaging in identity-making by extracting joy and meaning from Internet Aesthetics. We also model the uncertainty BIPOC users face around experiencing symbolic annihilation using the concept of microaggressions and give guidance on designing tools to address this phenomenon.2026NCNatalie Chen et al.University of MichiganGender & Race Issues in HCISocial Platform Design & User BehaviorAI Ethics, Fairness & AccountabilityCHI
"Do I Really Need This?": Illuminating Challenges in Integrating Computational Training Tools in Esports CoachingThe rise in popularity and value of esports motivates the creation of computational training tools (CTTs) for learning, assessment, and skill gain. While some tools exist commercially, much of the work in the research literature is rarely used outside of a lab, resulting in a lack of knowledge on the challenges involved in real-world integration. In this work, we develop a bespoke CTT for League of Legends, MySkills, based on prior work and deploy it at a professional training academy for three months. Based on two rounds of stakeholder interviews, we uncover insights into users' perspectives on using CTTs in esports coaching and the challenges inherent in introducing a novel tool into an existing, real-world esports training context. From these results, we connect the domain of esports training technology to existing conversations on translational HCI, challenges in bridging research and practice, and present implications for future work.2026EKErica Kleinman et al.Northeastern UniversityGame UX & Player BehaviorSerious & Functional GamesPrototyping & User TestingCHI
Amplifying Rural Educators’ Perspectives: A Qualitative Study on the Impacts of Generative AI in Rural U.S. High SchoolsRecent breakthroughs in Generative AI (GenAI) are reshaping educational landscapes, presenting challenges and opportunities. While all contexts present unique challenges, rural schools are historically under-resourced, facing persistent technology-related barriers. To understand and reduce these barriers, we studied 31 rural high school educators across three U.S. states to examine their use of GenAI and understand how GenAI introduces new challenges and opportunities and may exacerbate existing educational barriers. Results show that while rural educators use GenAI to streamline teaching tasks, existing resource disparities restrict meaningful integration. Through rural educators' voices, we reveal that issues like infrastructure barriers, resistance to adoption, and lack of AI literacy training create significant obstacles. Nonetheless, educators envision that GenAI can support them and their students, but findings emphasize the need for rural-specific design approaches. As a community, embracing inclusive GenAI design and re-examining assumptions about technology adoption in under-served educational contexts is essential to reducing barriers rather than widening them.2026SMShira Michel et al.Northeastern UniversityGenerative AI (Text, Image, Music, Video)Human-LLM CollaborationInclusive DesignCHI
DiagLink: A Dual-User Diagnostic Assistance System by Synergizing Experts with LLMs and Knowledge GraphsThe global shortage and uneven distribution of medical expertise continue to hinder equitable access to accurate diagnostic care. While existing intelligent diagnostic systems have shown promise, most struggle with dual-user interaction and dynamic knowledge integration, limiting their real-world applicability. In this study, we present DiagLink, a dual-user diagnostic assistance system that synergizes large language models (LLMs), knowledge graphs (KGs), and medical experts to support both patients and physicians. DiagLink uses guided dialogues to elicit patient histories, leverages LLMs and KGs for collaborative reasoning, and incorporates physician oversight for continuous knowledge validation and evolution. The system provides a role-adaptive interface, dynamically visualized history, and unified multi-source evidence to improve both trust and usability. We evaluate DiagLink through a user study, use cases, and expert interviews, demonstrating its effectiveness in improving user satisfaction and diagnostic efficiency, while offering insights for the design of future AI-assisted diagnostic systems.2026ZZZihan Zhou et al.Northeastern UniversityHuman-LLM CollaborationExplainable AI (XAI)Telemedicine & Remote Patient MonitoringCHI
Touch, Gesture, and Conversation: A Case Study in the Choreography of Interaction with a Data PhysicalizationMany potential benefits of data physicalizations are thought to stem from their tangible nature and their presence in a shared space, which allows for physical interaction in a social environment. Prior research has studied where observers touch data physicalizations and how these touches depend on task. However, limited research explores the moment-by-moment details of touches, gestures, and verbal dialogue and how these interactions contribute to the larger data physicalization sensemaking process. This case study offers a new data analysis from a previous study to provide fine-grained accounting of one participant’s touches, gestures, and conversations around a data object. We identify four types of touches and gestures and describe relational patterns between individual interactions. This work provides a foundation for further exploring the reasons people touch or gesture with data physicalizations, connecting these efforts with gesture studies research, and identifying design implications for data physicalization.2026LPLaura J Perovich et al.Northeastern UniversityData PhysicalizationPhysical-Digital Hybrid InteractionComputational Methods in HCICHI
The Pit Beneath the Town Square: How Digital Solastalgia Affects Platform Migration and Community Structures of Transfeminine UsersTransgender women and transfeminine nonbinary people (transfems) rely on social media for community and critical resources that are rare offline. However, transfems face perpetual risks of those communities suddenly turning hostile or transphobic, forcing them to withdraw and lose those resources. To investigate, we conducted a two-stage asynchronous remote communities (ARC) study with 27 total participants, drawn from transfeminine-heavy online communities. We found that transfems forced off crucial platforms by transphobic hostility experience digital solastalgia, distress caused by observing a platform's character significantly erode over time. Moreover, through visual elicitation, we found that transfems perceive characteristics of locality and space in their online communities. These factors shape transfems' disengagement strategies and community structures. Informed by this, we develop community-sourced design guidance for designers and admins on building robust online platforms that can better protect transfeminine communities, and investigate the transferable effects of these suggestions to other marginalized groups, such as disabled users.2026EMErika Melder et al.Northeastern UniversitySocial Platform Design & User BehaviorEmpowerment of Marginalized GroupsTechnology Ethics & Critical HCICHI
Whose Time Counts? Temporal Arrangements in Sociotechnical InfrastructuresThis paper examines how infrastructures organize time in ways that unevenly distribute burden, access, and opportunity across communities. We draw on two ethnographic cases: eviction case filings in Atlanta, part of the state’s legal and housing governance infrastructure, and a sexual healthcare intervention in Chicago, situated within the city’s public health services. We advance HCI’s engagement with temporality by demonstrating how infrastructures sediment layers of political, social, and technical decisions over time. We conceptualize infrastructures as stratified formations where earlier allocations of power become materially and procedurally embedded, configuring present-day experiences of public systems. We define temporal arrangements as the patterned ways infrastructures shape and allocate time, producing unequal demands on who waits, who moves, and who must continually adjust. We describe two temporal arrangements, compression and gaps, to show how systems structure and constrain access to care, support, and basic services. By linking inherited infrastructural logics to everyday temporal burdens, we offer HCI a framework for examining how inequities persist through time.2026CWCatherine Wieczorek et al.Georgia Institute of TechnologyTechnology Ethics & Critical HCIPrivacy by Design & User ControlHCI in Public Health Crises (e.g., COVID-19)CHI
LLM-based Embodied Conversational Agent for Reducing Foreign Language Speaking Anxiety in Social VRForeign language speaking anxiety (FLSA) poses a major challenge for English-language learners, suppressing confidence and triggering a cycle of avoidance that hinders language acquisition. To address this, we explored the use of LLM-based embodied conversational agents (ECA) in social virtual reality (VR), which provide personalized support and multimodal interaction in a contextualized environment. We developed three English-language learning scenarios in social VR and conducted a five-day mixed-methods study where participants (N=20) engaged in daily 30-minute role-play practice with an LLM-based ECA to evaluate the efficacy of the system. Quantitative results showed a significant reduction in self-reported FLSA after 3 days, along with subtle gains in speaking proficiency measures. Qualitatively, learners perceived increased confidence, attributing it to the LLM-based ECA's non-judgmental stance, linguistic scaffolding, affective encouragement, and adaptive feedback. Our findings suggest the potential of LLM-based ECAs in social VR for language learning and offer considerations for future agent design.2026MPMengxu Pan et al.Northeastern UniversityHuman-LLM CollaborationSocial & Collaborative VRImmersion & Presence ResearchCHI
VizCrit: Exploring Strategies for Displaying Computational Feedback in a Visual Design ToolVisual design instructors often provide multi-modal feedback, mixing annotations with text. Prior theory emphasizes the importance of actionable feedback, where “actionability” lies on a spectrum—from surfacing relevant design concepts to suggesting concrete fixes. How might creativity tools implement annotations that support such feedback, and how does the actionability of feedback impact novices’ process-related behaviors, perceptions of creativity, learning of design principles, and overall outcomes? We introduce VizCrit, a system for providing computational feedback that supports the actionability spectrum, realized through algorithmic issue detection and visual annotation generation. In a between-subjects study (N=36), novices revised a design under one of three conditions: textbook-based, awareness-centered, or solution-centered feedback. We found that solution-centered feedback led to fewer design issues and higher self-perceived creativity compared with textbook-based feedback, although expert ratings on creativity showed no significant differences. We discuss the implications for AI in Creativity Support Tools, including the potential of calibrating feedback actionability to help novices balance productivity with learning, growth, and developing design awareness.2026MLMingyi Li et al.Northeastern UniversityGenerative AI (Text, Image, Music, Video)Creative Collaboration & Feedback SystemsCHI
Colour in Translation: Data, Models, and Benchmarking for Cross-Linguistic Colour NamingColour naming links vision and language. Yet, effective cross-linguistic colour communication is limited by the lack of multilingual data and computational models for comprehensive colour name translation. We collected 6,408 unique colour naming responses in five languages using online experiments and fieldwork. For each language, we train a "spin colour forest", a novel partially rotated decision tree model that accurately estimates colour naming distributions across the full gamut, consistently outperforming existing methods. Unlike prior work that assumed 11 universal colour categories, our results reveal cross-linguistic variation in naming granularity: American English uses 47 indispensable colour names, British English 32, French 27, Greek 32, and the Himba 7 to categorise the same perceptually uniform colour space. Building on these findings, we develop a colour translation benchmark, which we demonstrate by evaluating both the lexical and perceptual accuracy of a large language model. Our evaluation reveals a critical lexical-perceptual disconnect, demonstrating that language models lack perceptual grounding in colour translation. Our data, models, and benchmark provide an empirical foundation for inclusive design that reflects how people communicate colour across cultures.2026DMDimitris Mylonas et al.Northeastern University LondonMultilingual & Cross-Cultural Voice InteractionCross-Cultural Usability ResearchExplainable AI (XAI)CHI
Re-Examining the Examiners: Changes in Privacy and Security Perceptions of Exam ProctoringWith the shift to remote learning during the COVID-19 pandemic, educators turned to remote exam proctoring software to support integrity for online tests. However, due to the mechanisms used to surveil test-takers, these systems come with significant privacy and security tradeoffs. At the height of the pandemic, Balash et al. (SOUPS '21) found that test-takers had privacy concerns with remote proctoring but acquiesced due to a number of factors. We investigate how perceptions have changed four years later. To gain a fuller perspective on how users experience these tools now, we replicate Balash et al.'s study with 127 participants who have experienced exam proctoring. We found a significant shift in favor of proctoring software, with greater acceptance of all monitoring methods compared to 2020. This is likely due to the convenience of remote exams and a growing resignation to privacy trade-offs. We discuss these implications and suggest future directions.2026AHAdryana Hutchinson et al.The George Washington UniversityPrivacy by Design & User ControlPrivacy Perception & Decision-MakingCHI
“It Depends”: Re-Authoring Play Through Clinical Reasoning in Wearable AR Rehab GamesAugmented reality (AR) games hold promise for rehabilitation, yet most remain confined to laboratory studies with limited clinical uptake. Recent advances in spatial computing, especially lightweight, glasses-form-factor AR, create a timely opportunity to embed rehabilitative play into clinical practice and daily contexts. To investigate this potential, we systematically reviewed 132 applications and conducted playtesting with 14 licensed physical therapists. Our analysis revealed three ways therapists re-authored AR games: co-authored play (reshaping movements, progressions, and difficulty), situated play (adapting across specialties, conditions, and contexts), and dual play (mediating both physical recovery and psychological support). We reframe therapists’ frequent phrase—“It depends”—as a generative design principle. This study contributes a clinical reasoning–based framework and design principles and guidelines for creating personalized, situated forms of play that align with therapists’ everyday workflows and inform future lab-to-clinic translation.2026BXBinyan Xu et al.Northeastern UniversityVR Medical Training & RehabilitationFitness Tracking & Physical Activity MonitoringMobile Augmented RealityCHI