Evaluating Generative AI in the Lab: Methodological Challenges and Guidelines
Generative AI (GenAI) systems are inherently non-deterministic, producing varied outputs even for identical inputs. While this variability is central to their appeal, it challenges established HCI evaluation practices that typically assume consistent and predictable system behavior. Designing controlled lab studies under such conditions therefore remains a key methodological challenge. We present a reflective multi-case analysis of four lab-based user studies with GenAI-integrated prototypes, spanning conversational in-car assistant systems and image generation tools for design workflows. Through cross-case reflection and thematic analysis across all study phases, we identify five methodological challenges and propose eighteen practice-oriented recommendations, organized into five guidelines. These challenges represent methodological constructs that are either amplified, redefined, or newly introduced by GenAI's stochastic nature: (C1) reliance on familiar interaction patterns, (C2) fidelity-control trade-offs, (C3) feedback and trust, (C4) gaps in usability evaluation, and (C5) interpretive ambiguity between interface and system issues. Our guidelines address these challenges through strategies such as reframing onboarding to help participants manage unpredictability, extending evaluation with constructs such as trust and intent alignment, and logging system events, including hallucinations and latency, to support transparent analysis. This work contributes (1) a methodological reflection on how GenAI's stochastic nature unsettles lab-based HCI evaluation and (2) eighteen recommendations that help researchers design more transparent, robust, and comparable studies of GenAI systems in controlled settings.
2026 | Hyerim Park et al. | BMW Group | Generative AI (Text, Image, Music, Video); Human-LLM Collaboration; User Research Methods (Interviews, Surveys, Observation) | IUI

Enhancing Generative AI Image Refinement with Scribbles and Annotations: A Comparative Study of Multimodal Prompts
Generative AI (GenAI) image tools are increasingly used in design practice, enabling rapid ideation but offering limited support for refinement tasks such as adjusting layout, scale, or visual attributes. While text prompts and inpainting allow localized edits, they often remain inefficient or ambiguous for precise, in-context, and iterative refinement, motivating the exploration of alternative methods. This work examines how pen-based scribbles and annotations can enhance GenAI image refinement. A formative study with seven professional designers informed a prototype supporting three input modalities: text-only, visual-only, and combined prompting. A within-subjects study with 30 designers and design students compared these modalities across closed- and open-ended tasks, evaluating expressiveness, efficiency, workload, user experience, iteration, and multimodal strategies. Visual prompts improved clarity and speed for spatial edits while reducing workload, whereas text remained effective for semantic and global changes. The combined modality received the highest overall ratings, enabling complementary use, balancing spatial precision with semantic detail, and supporting smoother iteration. Task-specific preferences also emerged: adding new objects often required both modalities, while moving or modifying elements was typically handled through visual input. This work contributes (1) an empirical comparison of multimodal prompting for GenAI refinement, (2) a prototype integrating scribbles and annotations, and (3) insights into designers' multimodal strategies to inform future GenAI interfaces that better support refinement in GenAI-supported design workflows.
2026 | Hyerim Park et al. | BMW Group | Generative AI (Text, Image, Music, Video); Creative Collaboration & Feedback Systems; Graphic Design & Typography Tools | IUI

Mixed Presence in Mixed Reality: Charting the Challenges and Opportunities
This paper investigates the challenges of designing mixed-presence environments for Mixed Reality and suggests future research directions derived from an expert workshop. Developing mixed-presence systems is a complex undertaking that combines the intricacies of both co-located and distributed mixed-reality spaces. Current literature in this field describes various promising design and development approaches but lacks a systematic overview, resulting in fragmented solutions to recurring challenges. Therefore, we conducted a comprehensive review of mixed-presence and multi-user remote mixed-reality systems, categorizing the prevalent challenges faced during the development of such systems as well as current trends, common use cases, study tasks, and methodologies. Supported by these results, we then conducted an expert ideation workshop to collect and structure promising future research directions. As a result, we provide a detailed resource to orient and prepare developers for probable challenges and support researchers in making informed design decisions for future mixed-presence studies in Mixed Reality.
2026 | Katja Krug et al. | TUD Dresden University of Technology | Mixed Reality Workspaces; Immersion & Presence Research; Multi-User Large Display Collaboration | CHI

Interaction Methods in Generative AI Image Tools: A Review of Trends and Design Opportunities Across HCI and Industry
Generative AI (GenAI) image tools are increasingly integrated into design workflows, prompting HCI research on their interaction methods and interfaces. We reviewed 37 such tools, including 28 HCI research systems and nine commercial systems (2022 to July 2025), using three analytical frameworks: interaction methods, creative processes, and tool functionalities. We found that text prompts remain the dominant input method, while visual and attribute-based inputs, particularly in academic tools, are gaining traction and are often combined with text for refinement. Commercial systems emphasize parameter control, whereas academic tools focus on semantic attributes and visual organization. Most tools support ideation and exploration, but provide limited support for refinement and evaluation. Based on these findings, we identify nine design opportunities, including advanced visual interaction, simplified parameter control, precision editing, direct manipulation, workflow integration, default settings that support rapid exploration, and user guidance for later stages. We contribute a framework for analyzing GenAI interfaces and actionable directions for designing more usable, creativity-supportive GenAI image systems.
2026 | Hyerim Park et al. | University of Stuttgart | Generative AI (Text, Image, Music, Video); Human-LLM Collaboration; Creative Collaboration & Feedback Systems | CHI

Spatial Context Switches During Knowledge Tasks in Extended Reality
Current Extended Reality (XR) devices are increasingly being used as productivity tools. Compared to conventional setups, they allow for more flexibility in dynamic work locations beyond the desktop while providing a large virtual workspace. Recent research has explored how users organize digital documents and how virtual interfaces could be adapted to different locations and scenarios. However, there has been limited research on how location changes affect productivity tasks in XR environments and how users manually adapt virtual content layouts after such task interruptions. To address this, we conducted an exploratory user study (N=17) in which participants worked on a document-centered organization and planning task while changing locations every five minutes. We examined how these spatial transitions interfered with the task and identified layout strategies and patterns. From our observations and participant responses, we derived a set of design guidelines to inform the development of future XR knowledge work systems in mobile contexts.
2026 | Wolfgang Büschel et al. | University of Stuttgart | Mixed Reality Workspaces; Immersion & Presence Research; Smart Cities & Urban Sensing | CHI

Challenges in Synchronous & Remote Collaboration Around Visualization
We characterize 16 challenges faced by those investigating and developing remote and synchronous collaborative experiences around visualization. Our work reflects the perspectives and prior research efforts of an international group of 29 experts from across human-computer interaction and visualization sub-communities. The challenges are anchored around five collaborative activities that exhibit a centrality of visualization and multimodal communication. These activities include exploratory data analysis, creative ideation, visualization-rich presentations, joint decision making grounded in data, and real-time data monitoring. The challenges also reflect the changing dynamics of these activities in the face of recent advances in extended reality (XR) and artificial intelligence (AI). As an organizing scheme for future research at the intersection of visualization and computer-supported cooperative work, we align the challenges with a sequence of four sets of research and development activities: technological choices, social factors, AI assistance, and evaluation.
2026 | Matthew Brehmer et al. | University of Waterloo | Interactive Data Visualization; Remote Work Tools & Experience; Multi-User Large Display Collaboration | CHI

Surveillance, Spacing, Screaming and Scabbing: How Digital Technology Facilitates Union Busting
Despite high approval ratings for unions and growing worker interest in organizing, employees in the United States still face significant barriers to securing collective bargaining agreements. A key factor is employer counter-organizing: efforts to suppress unionization through rule changes, retaliation, and disruption. Designing sociotechnical tools and strategies to resist these tactics requires a deeper understanding of the role computing technologies play in counter-organizing against unionization. In this paper, we examine three high-profile organizing efforts (at Amazon, Starbucks, and Boston University) using publicly available sources to identify four recurring technological tactics: surveillance, spacing, screaming, and scabbing. We analyze how these tactics operate across contexts, highlighting their digital dimensions and strategic deployment. We conclude with implications for organizing in digitally-mediated workplaces, directions for future research, and emergent forms of worker resistance.
2026 | Frederick Reiber et al. | Boston University | Technology Ethics & Critical HCI; Impact of Automation on Work; Online Harassment & Counter-Tools | CHI

MEDebiaser: A Human-AI Feedback System for Mitigating Bias in Multi-label Medical Image Classification
Medical images often contain multiple labels with imbalanced distributions and co-occurrence, leading to bias in multi-label medical image classification. Close collaboration between medical professionals and machine learning practitioners has significantly advanced medical image analysis. However, traditional collaboration modes struggle to facilitate effective feedback between physicians and AI models, as integrating medical expertise into the training process via engineers can be time-consuming and labor-intensive. To bridge this gap, we introduce MEDebiaser, an interactive system enabling physicians to directly refine AI models using local explanations. By combining prediction with attention loss functions and employing a customized ranking strategy to alleviate scalability issues, MEDebiaser allows physicians to mitigate biases without technical expertise, reducing reliance on engineers and thus enabling more direct human-AI feedback. Our mechanism and user studies demonstrate that it effectively reduces biases, improves usability, and enhances collaboration efficiency, providing a practical solution for integrating medical expertise into AI-driven healthcare.
2025 | Shaohan Shi et al. | Brain-Computer Interface (BCI) & Neurofeedback; Explainable AI (XAI); AI-Assisted Decision-Making & Automation | UIST

VisRing: A Display-Extended Smartring for Nano Visualizations
We introduce VisRing, the first smartring incorporating a bendable 160 × 32 4-bit grayscale organic light-emitting diode display. VisRing stands out by displaying nano visualizations while maintaining a compact design and minimal weight of 6.6 g, with an overall cost of around $35. We exploit opportunities for a system-on-a-chip architecture to tightly integrate an inertial measurement unit, a photoplethysmograph sensor, a temperature sensor, Bluetooth, a microcontroller, and a display unit that spans 270° to 360°, depending on finger size. Our contributions include the hardware design and implementation of VisRing, along with a software library that supports visualizing various data types. A qualitative study with 12 participants demonstrated the comfort, likability, and social acceptance of VisRing's hardware and software. The participants liked the visualizations and found the ring lightweight, but also pointed out possible improvements. All materials are shared under an open-source license to enable the community to extend and improve VisRing.
2025 | Taiting Lu et al. | Haptic Wearables; Data Physicalization; Smartwatches & Fitness Bands | UIST

GuitarPie: Using the Fretboard of an Electric Guitar for Audio-Based Pie Menu Interaction
Nowadays, electric guitars are often used together with digital interfaces. For instance, tablature applications can support guitar practice by rendering and playing back the tabs of individual instrument tracks of a song (guitar, drums, etc.). However, those interfaces are typically controlled via mouse and keyboard or via touch input. This means that controlling and configuring playback during practice can lead to high switching costs, as learners often need to switch between playing and interface control. In this paper, we explore the use of audio input from an unmodified electric guitar to enable interface control without letting go of the guitar. We present GuitarPie, an audio-based pie menu interaction method. GuitarPie utilizes the grid-like structure of a fretboard to spatially represent audio-controlled operations, avoiding the need to memorize note sequences. Furthermore, we implemented TabCtrl, a tablature interface that uses GuitarPie and other audio-based interaction methods for interface control.
2025 | Frank Heyen et al. | Electrical Muscle Stimulation (EMS); Shape-Changing Interfaces & Soft Robotic Materials; Food Culture & Food Interaction | UIST

HAGI: Head-Assisted Gaze Imputation for Mobile Eye Trackers
Mobile eye tracking plays a vital role in capturing human visual attention across both real-world and extended reality (XR) environments, making it an essential tool for applications ranging from behavioural research to human-computer interaction. However, missing values due to blinks, pupil detection errors, or illumination changes pose significant challenges for further gaze data analysis. To address this challenge, we introduce HAGI – a multi-modal diffusion-based approach for gaze data imputation that, for the first time, uses the integrated head orientation sensors to exploit the inherent correlation between head and eye movements. Our method includes a head-movement feature extraction module alongside a novel hybrid feature fusion mechanism that effectively integrates gaze and head motion features at multiple levels. Additionally, we introduce a tailored loss function to enhance gaze imputation accuracy further. Extensive evaluations on the large-scale Nymeria, Ego-Exo4D, and HOT3D datasets demonstrate that HAGI consistently outperforms conventional interpolation methods and deep learning-based time-series imputation baselines, reducing mean angular error by up to 22%. Furthermore, statistical analyses confirm that HAGI produces gaze velocity distributions that more closely match actual human gaze behaviour than baselines, ensuring more realistic gaze imputations. Our method paves the way for more complete and accurate eye gaze recordings in real-world settings and has significant potential for enhancing gaze-based analysis and interaction across various application domains.
2025 | Chuhan Jiao et al. | Eye Tracking & Gaze Interaction; Human Pose & Activity Recognition; Context-Aware Computing | UIST

Text-to-Image Generation for Vocabulary Learning Using the Keyword Method
The 'keyword method' is an effective technique for learning vocabulary of a foreign language. It involves creating a memorable visual link between what a word means and what its pronunciation in a foreign language sounds like in the learner's native language. However, these memorable visual links remain implicit in people's minds and are not easy to remember for a large number of words. To enhance the memorisation and recall of the vocabulary, we developed an application that combines the keyword method with text-to-image generators to externalise the memorable visual links into visuals. These visuals represent additional stimuli during the memorisation process. To explore the effectiveness of this approach, we first ran a pilot study to investigate how difficult it is to externalise the descriptions of mental visualisations of memorable links, by asking participants to write them down. We used these descriptions as prompts for a text-to-image generator (DALL-E2) to convert them into images and asked participants to select their favourites. Next, we compared different text-to-image generators (DALL-E2, Midjourney, Stable and Latent Diffusion) to evaluate the perceived quality of the images generated by each. Despite heterogeneous results, participants mostly preferred images generated by DALL-E2, which was also used for the final study. In this study, we investigated whether providing such images enhances the retention of vocabulary being learned, compared to the keyword method alone. Our results indicate that people did not encounter difficulties describing their visualisations of memorable links and that providing corresponding images significantly increases memory retention.
2025 | Nuwan T Attygalle et al. | Generative AI (Text, Image, Music, Video); Intelligent Tutoring Systems & Learning Analytics | IUI

SummAct: Uncovering User Intentions Through Interactive Behaviour Summarisation
Recent work has highlighted the potential of modelling interactive behaviour analogously to natural language. We propose interactive behaviour summarisation as a novel computational task and demonstrate its usefulness for automatically uncovering latent user goals while interacting with graphical user interfaces. We introduce SummAct – a novel hierarchical method to summarise low-level input actions into high-level goals to tackle this task. SummAct first identifies sub-goals from user actions using a large language model and in-context learning. In a second step, high-level goals are obtained by fine-tuning the model using a novel UI element weighting mechanism to preserve detailed context information embedded within UI elements during summarisation. Through a series of evaluations, we demonstrate that SummAct significantly outperforms baseline methods across desktop and mobile user interfaces and interactive tasks by up to 21.9%. We further introduce two exciting example use cases enabled by our method: interactive behaviour forecasting and automatic behaviour synonym identification.
2025 | Guanhua Zhang et al. | University of Stuttgart, Institute for Visualisation and Interactive Systems | Human-LLM Collaboration; AI-Assisted Decision-Making & Automation | CHI

Traversing Dual Realities: Investigating Techniques for Transitioning 3D Objects between Desktop and Augmented Reality Environments
Desktop environments can integrate augmented reality (AR) head-worn devices to support 3D representations, visualizations, and interactions in a novel yet familiar setting. As users navigate across the dual realities (desktop and AR), a way to move 3D objects between them is needed. We devise three baseline transition techniques based on common approaches in the literature and evaluate their usability and practicality in an initial user study (N=18). After refining both our transition techniques and the surrounding technical setup, we validate the applicability of the overall concept for real-world activities in an expert user study (N=6). In it, computational chemists followed their usual desktop workflows to build, manipulate, and analyze 3D molecular structures, but now aided by AR and our transition techniques. Based on our findings from both user studies, we provide lessons learned and takeaways for the design of 3D object transition techniques in desktop + AR environments.
2025 | Tobias Rau et al. | University of Stuttgart, Visualization Research Center | AR Navigation & Context Awareness; Mixed Reality Workspaces | CHI

Who is in Control? Understanding User Agency in AR-assisted Construction Assembly
Adaptive AR assistance can automatically trigger content to support users based on their context. Such intelligent automation offers many benefits but also alters users' degree of control, which is seldom explored in existing research. In this paper, we compare high- and low-agency control in AR-assisted construction assembly to understand the role of user agency. We designed cognitive and physical assembly scenarios and conducted a lab study (N=24), showing that low-agency control reduced mental workloads and perceived autonomy in several tasks. A follow-up domain expert study with trained carpenters (N=8) contextualised these results in an ecologically valid setting. Through semi-structured interviews, we examined the carpenters' perspectives on AR support in their daily work and the trade-offs of automating interactions. Based on these findings, we summarise key design considerations to inform future adaptive AR designs in the context of timber construction.
2025 | Xiliu Yang et al. | Institute of Computational Design and Construction | AR Navigation & Context Awareness; Knowledge Worker Tools & Workflows; Computational Methods in HCI | CHI

Blending the Worlds: An evaluation of World-Fixed Visual Appearances in Automotive Augmented Reality
With the transition to fully autonomous vehicles, non-driving related tasks (NDRTs) become increasingly important, allowing passengers to use their driving time more efficiently. In-car Augmented Reality (AR) makes it possible to engage in NDRTs while also allowing passengers to engage with their surroundings, for example, by displaying world-fixed points of interest (POIs). This can lead to new discoveries, provide information about the environment, and improve locational awareness. To explore the optimal visualization of POIs using in-car AR, we conducted a field study (N = 38) examining six parameters: positioning, scaling, rotation, render distance, information density, and appearance. In a post-study questionnaire, we also asked about intention of use, preferred seat positions, and preferred automation level for the AR function. Our findings reveal user preferences and general acceptance of the AR functionality. Based on these results, we derived UX guidelines for the visual appearance and behavior of location-based POIs in in-car AR.
2025 | Robin Connor Schramm et al. | Mercedes-Benz Tech Motion GmbH; RheinMain University of Applied Sciences | Head-Up Display (HUD) & Advanced Driver Assistance Systems (ADAS); AR Navigation & Context Awareness | CHI

TutorCraftEase: Enhancing Pedagogical Question Creation with Large Language Models
Pedagogical questions are crucial for fostering student engagement and learning. In daily teaching, teachers pose hundreds of questions to assess understanding, enhance learning outcomes, and facilitate the transfer of theory-rich content. However, even experienced teachers often struggle to generate a large volume of effective pedagogical questions. To address this, we introduce TutorCraftEase, an interactive generation system that leverages large language models (LLMs) to assist teachers in creating pedagogical questions. TutorCraftEase enables the rapid generation of questions at varying difficulty levels with a single click, while also allowing for manual review and refinement. In a comparative user study with 39 participants, we evaluated TutorCraftEase against a traditional manual authoring tool and a basic LLM tool. The results show that TutorCraftEase can generate pedagogical questions comparable in quality to those created by experienced teachers, while significantly reducing their workload and time.
2025 | Wenhui Kang et al. | University of Chinese Academy of Sciences; Institute of Software, Chinese Academy of Sciences, Beijing Key Laboratory of Human-Computer Interaction | Human-LLM Collaboration; Online Learning & MOOC Platforms; Intelligent Tutoring Systems & Learning Analytics | CHI

Examining the Effects of Immersive and Non-Immersive Presenter Modalities on Engagement and Social Interaction in Co-located Augmented Presentations
Head-worn augmented reality (AR) allows audiences to be immersed and engaged in stories told by live presenters. While presenters may also be in AR to have the same level of immersion and awareness as their audience, this symmetric presentation style may diminish important social cues such as eye contact. In this work, we examine the effects this (a)symmetry has on engagement, group awareness, and social interaction in co-located one-on-one augmented presentations. We developed a presentation system incorporating 2D/3D content that audiences can view and interact with in AR, with presenters controlling and delivering the presentation in either a symmetric style in AR, or an asymmetric style with a handheld tablet. We conducted a within- and between-subjects evaluation with 12 participant pairs to examine the differences between these symmetric and asymmetric presentation modalities. From our findings, we extracted four themes and derived strategies and guidelines for designers interested in augmented presentations.
2025 | Matt Gottsacker et al. | J.P. Morgan Chase & Co., Global Technology Applied Research; University of Central Florida, SREAL | AR Navigation & Context Awareness; Interactive Narrative & Immersive Storytelling | CHI

Chartist: Task-driven Eye Movement Control for Chart Reading
To design data visualizations that are easy to comprehend, we need to understand how people with different interests read them. Computational models of predicting scanpaths on charts could complement empirical studies by offering estimates of user performance inexpensively; however, previous models have been limited to gaze patterns and overlooked the effects of tasks. Here, we contribute Chartist, a computational model that simulates how users move their eyes to extract information from the chart in order to perform analysis tasks, including value retrieval, filtering, and finding extremes. The novel contribution lies in a two-level hierarchical control architecture. At the high level, the model uses LLMs to comprehend the information gained so far and applies this representation to select a goal for the lower-level controllers, which, in turn, move the eyes in accordance with a sampling policy learned via reinforcement learning. The model is capable of predicting human-like task-driven scanpaths across various tasks. It can be applied in fields such as explainable AI, visualization design evaluation, and optimization. While it displays limitations in terms of generalizability and accuracy, it takes modeling in a promising direction, toward understanding human behaviors in interacting with charts.
2025 | Danqing Shi et al. | Aalto University | Interactive Data Visualization; Computational Methods in HCI | CHI

Pixel Memories: Do Lifelog Summaries Fail to Enhance Memory but Offer Privacy-Aware Memory Assessments?
We explore the metaphorical "daily memory pill" concept – a brief pictorial lifelog recap aimed at reviving and preserving memories. Leveraging psychological strategies, we explore the potential of such summaries to boost autobiographical memory. We developed an automated lifelogging memory prosthesis and a research protocol (Automated Memory Validation, "AMV") for conducting privacy-aware, in-situ evaluations. We conducted a real-world lifelogging experiment for a month (n=11). We also designed a browser, "Pixel Memories", for browsing one week's worth of lifelogs. The results suggest that daily timelapse summaries, while not yielding significant memory augmentation effects, also do not lead to memory degradation. Participants' confidence in recalled content remains unaltered, but the study highlights the challenge of users' overestimation of memory accuracy. Our core contributions, the AMV protocol and the "Pixel Memories" browser, advance our understanding of memory augmentations and offer a privacy-preserving method for evaluating future ubicomp systems.
2025 | Passant ElAgroudy et al. | German Research Centre for Artificial Intelligence (DFKI); RPTU Kaiserslautern | Context-Aware Computing; Ubiquitous Computing | CHI