Operationalizing Perceptions of Agent Gender: Foundations and Guidelines
The “gender” of intelligent agents, virtual characters, social robots, and other agentic machines has emerged as a fundamental topic in studies of people's interactions with computers. Perceptions of agent gender can help explain user attitudes and behaviours (from preferences to toxicity to stereotyping) across a variety of systems and contexts of use. Yet, standards in capturing perceptions of agent gender do not exist. A scoping review was conducted to clarify how agent gender has been operationalized (labelled, defined, and measured) as a perceptual variable. One-third of studies manipulated but did not measure agent gender. Norms in operationalizations remain obscure, limiting comprehension of results, congruity in measurement, and comparability for meta-analyses. The dominance of the gender binary model and latent anthropocentrism have placed arbitrary limits on knowledge generation and reified the status quo. We contribute a systematically-developed and theory-driven meta-level framework that offers operational clarity and practical guidance for greater rigour and inclusivity.
2026 | Katie Seaborn et al. | Institute of Science Tokyo | Agent Personality & Anthropomorphism; Gender & Race Issues in HCI; Technology Ethics & Critical HCI | CHI
KNIT: Computational Boundary Objects for Real-Time Convergence in Interdisciplinary Teams
Interdisciplinary teams developing complex technologies such as healthtech struggle to align disciplinary perspectives, stakeholder priorities, and evolving problem framings, particularly during rapid iteration, when existing collaboration tools offer limited support for in-session negotiation. We present KNIT, an AI-mediated framework that conceptualises AI-generated artefacts as computational boundary objects. KNIT supports convergence by externalising anonymised individual inputs into shared artefacts, including semantic clusters and stakeholder-centred problem reframings, that surface differences in interpretation and make them available for negotiation. We evaluated KNIT in workshops with seven early-stage healthtech teams (28 participants), analysing 190 interaction episodes using Carlile’s 3T framework. KNIT supported knowledge boundary crossing across syntactic (95.0%), semantic (86.3%), and pragmatic (84.8%) levels. We contribute empirical evidence and design principles showing how computational boundary objects mediate distinct boundary-crossing mechanisms, demonstrating that representational transformation rather than automation is the primary mechanism through which AI enables convergence across disciplinary boundaries.
2026 | Shigeki Saito et al. | Imperial College London | Human-LLM Collaboration; AI-Assisted Decision-Making & Automation; Participatory Design | CHI
Design and Evaluation of a Photorealistic AI Virtual Peer in Elementary Collaborative Classroom
In elementary education, students struggle to articulate uncertainties, limiting diverse perspectives in classroom discussions, particularly in small schools where limited participants constrain collaborative learning. This study designed and evaluated “Saya,” a photorealistic AI virtual peer functioning as an additional student. We implemented five teacher-controlled speech acts (expand, probe, summarize, lighten, and incorrect answer) through dynamic classroom dialogue generation using GPT-4o-mini. Field studies in Japanese elementary schools (large class: 27 students, small class: 2 students) demonstrated that Saya integration increased the proportion of student speaking time by factors of 1.28 and 2.07, respectively, with 95.6% and 100% of students expressing a desire for future Saya-integrated lessons. Teachers reported enhanced student concentration and listening behaviors, noting that interactions with Saya prompted students to reconstruct their own understanding of the learning material. This research provides new insights into design principles for collaborative learning agents in elementary education settings, effective implementation scenarios based on class size, and the future potential of AI-enhanced collaborative learning.
2026 | Satomi Tokida et al. | The University of Tokyo | Programming Education & Computational Thinking; Collaborative Learning & Peer Teaching; Human-LLM Collaboration | CHI
Sensing Your Vocals: Exploring the Activity of Vocal Cord Muscles for Pitch Assessment Using Electromyography and Ultrasonography
Vocal training is difficult because the muscles that control pitch, resonance, and phonation are internal and invisible to learners. This paper investigates how electromyography (EMG) and ultrasonic imaging (UI) can make these muscles observable for training purposes. We report three studies. First, we analyze the EMG and UI data from 16 singers (beginners, experienced, and professionals), revealing differences in muscle control proficiency among the three groups. Second, we use the collected data to create a system that visualizes an expert's muscle activity as a reference. This system is tested in a user study with 12 novices, showing that EMG highlighted muscle activation nuances, while UI provided insights into vocal cord length and dynamics. Third, to compare our approach to traditional methods (audio analysis and coach instructions), we conducted a focus group study with 15 experienced singers. Our results suggest that EMG is promising for improving vocal skill development and enhancing feedback systems. We conclude the paper with a detailed comparison of the analyzed modalities (EMG, UI, and traditional methods), resulting in recommendations to improve vocal muscle training systems.
2026 | Kanyu Chen et al. | Graduate School of Media Design | Biosensors & Physiological Monitoring; Emotion Recognition & Detection; Affective Feedback & Emotion Regulation Interfaces | CHI
Radical Gender Neutrality: Agender Euphoria in Gaming and Play Experiences
Agender euphoria is a new term representing the powerful feelings of happiness, joy, and contentment derived from experiences in gender-free embodiments, spaces, and activities. People with and without agender and adjacent identities (e.g., genderless, gender-free, non-binary, gender-apathetic) may have such experiences under the right circumstances. Video games can offer gender minorities a safe haven for gender euphoric experiences. However, the possibility of agender euphoric experiences was unexplored. We considered this overlooked frame of self-actualization with 142 people who identified as having or desiring agender euphoric experiences. Using the critical incident technique (CIT), we uncovered how games and play experiences create (and inhibit) agender euphoria. We surface this experiential phenomenon and provide empirically-grounded criteria for the design of games to elicit agender euphoric experiences for everyone, but especially agender and agender-adjacent players. This work adds to the growing critical literatures on marginalized experiences in games research and human-computer interaction.
2026 | Katie Seaborn et al. | Institute of Science Tokyo | Game UX & Player Behavior; Gender & Race Issues in HCI; Empowerment of Marginalized Groups | CHI
SoleCoach: Sole Pressure and IMU-based MLLMs for Skill Coaching
In sports training, individualized skill assessment and feedback are essential for athletes to master complex movements and enhance performance. Existing approaches for generating coaching comments primarily rely on externally captured pose information, which limits their applicability in outdoor sports such as skiing that involve large-scale movement. To address this challenge, we propose a method for presenting athletes' postures and generating coaching feedback solely based on foot pressure and IMU data collected from insole sensors. In our approach, a large language model directly interprets foot pressure signals to provide actionable coaching, thereby supporting independent practice. Through model evaluation and user studies, we demonstrate that the proposed method generates expert-level feedback and outperforms pose-based approaches. Furthermore, the user study shows that the feedback helps athletes identify body parts requiring correction and enhances their motivation for training.
2026 | Toshihiro Hirano et al. | Institute of Science Tokyo | Human Pose & Activity Recognition; Generative AI (Text, Image, Music, Video); Fitness Tracking & Physical Activity Monitoring | CHI
Informal Embodied Auditing: Exploring Facial Emotion AI (FEAI) through Community Workshops
Emotion AI (EAI) is increasingly deployed and ethically controversial, motivating a need for greater public understanding, critique, and ethical discussions. Facial Emotion AI (FEAI) is a common type of EAI that infers emotions from facial expressions. We developed Explore-FEAI, an FEAI model and accompanying interactive website that offers open-ended exploration with FEAI firsthand. We designed a workshop wherein participants learn about FEAI using Explore-FEAI and discuss societal implications, partnering with local organizations to host community workshops (N=30). Our findings analyze participants’ growing critical AI literacy through exploring inputs/outputs, mechanistic reasoning, data critiques, sociocultural critiques, ethical concerns, and embodied and material exploration of FEAI. Our discussion offers informal embodied auditing as an approach for critical engagement with AI through embodied and material exploration, as well as reflections on informal auditing for supporting AI literacy, informal auditing for questioning EAI ethics, and expanding participation roles for more holistic EAI training.
2026 | Xingyu Li et al. | Georgia Institute of Technology | Emotion Recognition & Detection; Affective Feedback & Emotion Regulation Interfaces; AI Ethics, Fairness & Accountability | CHI
Chromotion: Controlling Motion-Induced Color on Object Motion Paths via High-Speed Temporal Additive Projection
We present Chromotion, a high-speed projection method that renders intended colors along the motion trajectories of moving objects. When an object moves across a temporally multiplexed sequence, its occlusion of the projected patterns can, through persistence of vision, produce motion-dependent colors along its path. Chromotion exploits this phenomenon by decomposing each static image into a short sequence in which target color frames are interleaved with a single complementary color frame. This temporal design allows moving objects to sample the sequence so that the perceived color along their motion paths converges to the target color, while stationary regions still integrate to the original static color. We built a prototype and conducted a camera-based technical evaluation and user evaluations. The results show that Chromotion reliably produces the target color on motion trajectories without degrading static color fidelity. Because the approach requires no body or gaze tracking and no decoding of embedded information, it scales to public settings and supports multiuser and multimodal interactions. We also discuss limitations and outline application scenarios such as public, ambient displays that blend into the environment.
2026 | Shio Miyafuji et al. | Institute of Science Tokyo | Digital Signage & Ambient Displays; Interactive Floors & Spatial Interfaces | CHI
Access Over Deception: Fighting Deceptive Patterns through Accessibility
Deceptive patterns, i.e., dark patterns and manipulative user interfaces (UI), are a widely used design method that aims to manipulate users to act against their own interests. These patterns may particularly influence people with less education, visual impairments, and older adults. Yet, access is a critical feature of the user experience (UX), development standards, and law. We considered whether and how the Web Content Accessibility Guidelines (WCAG) and related legislation, such as the European Accessibility Act (EAA), can act as a tool against deceptive patterns. We used these guidelines and legal statutes in a heuristic evaluation to analyze whether and how deceptive patterns violate or conform to these standards. Although statistical analysis revealed no significant relationship, we identified three patterns implicated by the WCAG guidelines: Countdown Timer, Auto-Play, and Hidden Information. We offer this approach as one tool in the fight against UI-based deception and in support of inclusive design.
2026 | Tobias Pellkvist et al. | TU Wien | Dark Patterns Recognition; Universal & Inclusive Design; Privacy by Design & User Control | CHI
Hearing Ambiguity: Exploring Beyond-Gender Impressions of Artificial Ambiguous Voices
Voice perception plays a fundamental role in all types of interactions, from human-to-human communication to human-technology interaction. When it comes to technology, we sometimes have the option to choose the type of voice we want to hear. But why is the default (almost) always a feminine or masculine voice? In this research, we evaluated user perceptions of gender-ambiguous voices, a relatively unexplored option. In our novel comparative study, we evaluated six gender-ambiguous voices with participants of diverse gender identities (men, women, and non-binary individuals), with 74 participants in each group. Additionally, half of the participants were told in advance that the voices had been designed to be gender-ambiguous, and half were not. We aimed to move beyond subjective perceptions of voice gender by exploring how such voices are perceived across different dimensions: trustworthiness, appeal, comfort, anthropomorphism, and aversion. Our findings reveal that while men and women had similar perceptions, non-binary participants rated the voices more negatively, with lower trust and higher aversion. Interestingly, priming participants about the voices' ambiguity did not significantly affect overall perceptions, though it increased critical evaluations from non-binary individuals. These findings contribute to growing research on gender-ambiguous voices by providing perceptual comparisons of multiple voices and highlighting the need for more inclusive voice designs that appeal to non-binary users.
2025 | Martina De Cet et al. | Voice User Interface (VUI) Design; Multilingual & Cross-Cultural Voice Interaction; Agent Personality & Anthropomorphism | CUI
Crafting the Unspoken: Engaging Japanese Older Adults with Data Physicalization Workshops
Engaging individuals in creating physical representations of personal data facilitates storytelling and collaborative reflection. However, its potential to encourage older adults to share their personal stories remains underexplored. This study introduces a craft-centered data physicalization approach where participants create tangible representations of their emotions and experiences related to community events, involving both event attendees and event organizers, through a series of workshops. This collaborative crafting process encourages group discussions and collective reflections on past experiences, enabling participants to express thoughts and feelings that would otherwise remain unspoken. Our work contributes a practical workshop approach that merges craft practice with data physicalization to support deep social expressions and connections among older adults.
2025 | Chengtian Li et al. | Data Physicalization; Makerspace Culture; Empowerment of Marginalized Groups | DIS
Unintended, Percolated Work: Overlooked Opportunities for Collaboration Between Informal Caregivers and Healthcare Professionals During the End-Of-Life Care Process
Bereavement often places a psychological burden on families and should be addressed appropriately. Although end-of-life care is a collaborative activity with interaction between family caregivers and medical professionals, further research is needed to explore family caregivers’ support needs as collaborative workers and the challenges they face. This study examined the collaboration during the end-of-life process between family caregivers and medical professionals to understand the cooperative activities and factors surrounding them based on unrealized or regrettable experiences during end-of-life care. Semi-structured interviews with bereaved family caregivers who provided end-of-life care and medical professionals who provided support revealed that family caregivers’ aspirations and medical professionals’ support for family caregivers crossed paths, steering end-of-life caregiving in an unintended direction. We define the characteristic work carried out by each actor in this situation as "unintended, percolated work," consider it an overlooked opportunity for collaboration, and propose support suggestions for handling family caregivers’ original intentions and needs.
2025 | Shun Saito et al. | Institute of Science Tokyo, School of Environment and Society, Department of Innovation Science | Elderly Care & Dementia Support; Aging-in-Place Assistance Systems | CHI
Inter(sectional) Alia(s): Ambiguity in Voice Agent Identity via Intersectional Japanese Self-Referents
Conversational agents that mimic people have raised questions about the ethics of anthropomorphizing machines with human social identity cues. Critics have also questioned assumptions of identity neutrality in humanlike agents. Recent work has revealed that intersectional Japanese pronouns can elicit complex and sometimes evasive impressions of agent identity. Yet, the role of other “neutral” non-pronominal self-referents (NPSR) and voice as a socially expressive medium remains unexplored. In a crowdsourcing study, Japanese participants (N=204) evaluated three ChatGPT voices (Juniper, Breeze, and Ember) using seven self-referents. We found strong evidence of voice gendering alongside the potential of intersectional self-referents to evade gendering, i.e., ambiguity through neutrality and elusiveness. Notably, perceptions of age and formality intersected with gendering as per sociolinguistic theories, especially ぼく (boku) and わたくし (watakushi). This work provides a nuanced take on agent identity perceptions and champions intersectional and culturally-sensitive work on voice agents.
2025 | Takao Fujii et al. | Institute of Science Tokyo, Department of Industrial Engineering and Economics | Intelligent Voice Assistants (Alexa, Siri, etc.); Multilingual & Cross-Cultural Voice Interaction; Agent Personality & Anthropomorphism | CHI
Super Kawaii Vocalics: Amplifying the “Cute” Factor in Computer Voice
"Kawaii" is the Japanese concept of cute, which carries sociocultural connotations related to social identities and emotional responses. Yet, virtually all work to date has focused on the visual side of kawaii, including in studies of computer agents and social robots. In pursuit of formalizing the new science of kawaii vocalics, we explored what elements of voice relate to kawaii and how they might be manipulated, manually and automatically. We conducted a four-phase study (grand N = 512) with two varieties of computer voices: text-to-speech (TTS) and game character voices. We found kawaii "sweet spots" through manipulation of fundamental and formant frequencies, but only for certain voices and to a certain extent. Findings also suggest a ceiling effect for the kawaii vocalics of certain voices. We offer empirical validation of the preliminary kawaii vocalics model and an elementary method for manipulating kawaii perceptions of computer voice.
2025 | Yuto Mandai et al. | Tokyo Institute of Technology, Department of Industrial Engineering and Economics | Intelligent Voice Assistants (Alexa, Siri, etc.); Agent Personality & Anthropomorphism | CHI
PiaMuscle: Improving Piano Skill Acquisition by Cost-effectively Estimating and Visualizing Activities of Miniature Hand Muscles
Understanding neuromusculoskeletal mechanisms significantly impacts skill specialization and proficiency. While existing methods can infer large muscle activities during gross motor movements, the estimation of dexterous motor control involving miniature muscles remains underexplored. Targeting the coordinated hand muscles in advanced piano performance, we learn spatiotemporal discrete representations of electromyography (EMG) data and hand postures utilizing a multimodal dataset. Subsequently, we train a precise and cost-effective neural network model. Based on this model, PiaMuscle is introduced to investigate whether visualizing muscle activities during piano training enhances piano performance. Quantitative and qualitative results of a user study with highly skilled professional pianists demonstrate that PiaMuscle provides reliable muscle activation data to support and optimize force control. Our research underscores the potential of a naturalistic workflow to estimate small muscles' activities from readily accessible human-centric information, and to do so more accurately when combined with tool-centric data, thereby enhancing skill acquisition.
2025 | Ruofan Liu et al. | Tokyo Institute of Technology, School of Computing; Sony Computer Science Laboratories Inc. | Human Pose & Activity Recognition; Biosensors & Physiological Monitoring | CHI
SolePoser: Real-Time 3D Human Pose Estimation using Insole Pressure Sensors
We propose SolePoser, a real-time 3D pose estimation system that leverages only a single pair of insole sensors. Unlike conventional methods relying on fixed cameras or bulky wearable sensors, our approach offers minimal and natural setup requirements. The proposed system utilizes pressure and IMU sensors embedded in insoles to capture the body weight's pressure distribution at the feet and its 6 DoF acceleration. This information is used to estimate the 3D full-body joint positions by a two-stream transformer network. A novel double-cycle consistency loss and a cross-attention module are further introduced to learn the relationship between 3D foot positions and their pressure distributions. We also introduce two different datasets of sports and daily exercises, offering 908k frames across eight different activities. Our experiments show that our method's performance is on par with top-performing approaches that use more IMUs, and that it even outperforms third-person-view camera-based methods in certain scenarios.
2024 | Erwin Wu et al. | Foot & Wrist Interaction; Human Pose & Activity Recognition | UIST
Cross-Cultural Validation of Partner Models for Voice User Interfaces
Recent research has begun to assess people's perceptions of voice user interfaces (VUIs) as dialogue partners, termed partner models. Current self-report measures are only available in English, limiting research to English-speaking users. To improve the diversity of user samples and contexts that inform partner modelling research, we translated, localized, and evaluated the Partner Modelling Questionnaire (PMQ) for non-English speaking Western (German, n=185) and East Asian (Japanese, n=198) cohorts where VUI use is popular. Through confirmatory factor analysis (CFA), we find that the scale produces equivalent levels of “goodness-to-fit” for both our German and Japanese translations, confirming its cross-cultural validity. Still, the structure of the communicative flexibility factor did not replicate directly across Western and East Asian cohorts. We discuss how our translations can open up critical research on cultural similarities and differences in partner model use and design, whilst highlighting the challenges for ensuring accurate translation across cultural contexts.
2024 | Katie Seaborn et al. | Voice User Interface (VUI) Design; Multilingual & Cross-Cultural Voice Interaction | CUI
Silver-Tongued and Sundry: Exploring Intersectional Pronouns with ChatGPT
ChatGPT is a conversational agent built on a large language model. Trained on a significant portion of human output, ChatGPT can mimic people to a degree. As such, we need to consider what social identities ChatGPT simulates (or can be designed to simulate). In this study, we explored the case of identity simulation through Japanese first-person pronouns, which are tightly connected to social identities in intersectional ways, i.e., intersectional pronouns. We conducted a controlled online experiment where people from two regions in Japan (Kanto and Kinki) witnessed interactions with ChatGPT using ten sets of first-person pronouns. We discovered that pronouns alone can evoke perceptions of social identities in ChatGPT at the intersections of gender, age, region, and formality, with caveats. This work highlights the importance of pronoun use for social identity simulation, provides a language-based methodology for culturally-sensitive persona development, and advances the potential of intersectional identities in intelligent agents.
2024 | Takao Fujii et al. | Tokyo Institute of Technology | Multilingual & Cross-Cultural Voice Interaction; Agent Personality & Anthropomorphism; Human-LLM Collaboration | CHI
MOSion: Gaze Guidance with Motion-triggered Visual Cues by Mosaic Patterns
We propose MOSion, a gaze-guiding method that adjusts the guiding strength in response to observers’ motion, based on a high-speed projector and the afterimage effect in the human visual system. Our method decomposes the target area into mosaic patterns to embed visual cues in the perceived images. The patterns direct the attention of only moving observers to the target area. A stationary observer sees the original image with little distortion because of light integration in visual perception. Precomputing the patterns provides the adaptive guiding effect without tracking devices or movement-dependent computational costs. The evaluation and the user study show that the mosaic decomposition enhances the perceived saliency with few visual artifacts, especially in moving conditions. Our method, embedded in white light, works in various situations such as planar posters, advertisements, and curved objects.
2024 | Arisa Kohtani et al. | Tokyo Institute of Technology | Eye Tracking & Gaze Interaction; Visualization Perception & Cognition | CHI
MR Microsurgical Suture Training System with Level-Appropriate Support
The integration of advanced technologies in healthcare necessitates the development of systems accommodating the daily routines in medical practices. Neurosurgeons, in particular, require extensive practice in microsurgical suturing in the long term, even in the busy routine of a medical practice. This study collaboratively developed a Mixed Reality system with neurosurgeons to support self-training in microscopic suturing. Based on the neurosurgeons' opinions, we implemented a level-appropriate microsurgical suture training system. For novices, the system offers shadow-matching training to support the practice of precise movements under the high-sensitivity environment of the microscope. For intermediates, it provides a real-time feedback system, which allows users to practice attention to details. Evaluation involved testing the novice system on students with no medical background and the intermediate system on neurosurgery residents. The effectiveness of the system was demonstrated through the experimental results and subsequent discussion.
2024 | Yuka Tashiro et al. | Tokyo Institute of Technology | Mixed Reality Workspaces; VR Medical Training & Rehabilitation; Robots in Education & Healthcare | CHI