Treading the Transparency Tightrope: A Taxonomy of Risks and Benefits of Foundation Model Data Transparency for Transparency Advocates
Data powering AI is often opaque. Researchers, NGOs, and law and policy leaders have called for greater transparency about how data is used for training, fine-tuning, and evaluation. While data transparency is often championed as crucial, what it concretely enables is largely implicit. Similarly, the concerns developers seem to have about transparency go unstated. This lack of clarity has led some researchers to critique transparency demands as disconnected from the actual benefits, or risks, to specific stakeholders. We analyze documentation from four stakeholder groups to create a taxonomy of the risks and benefits of dataset transparency. Data transparency is perceived as either a risk or a benefit given a stakeholder's position, rather than wholesale. We also propose data availability and data documentation as two lenses through which to consider transparency. We discuss how best to strategically promote situational data transparency that takes into account the relationship between stakeholder position, transparency modality, and benefits/risks.
2026 | Morgan Klaus Scheuerman et al. | Sony AI | Explainable AI (XAI); Algorithmic Transparency & Auditability; Privacy by Design & User Control | CHI

Emergent, not Immanent: A Baradian Reading of Explainable AI
Explainable AI (XAI) is frequently positioned as a technical problem of revealing the inner workings of an AI model. This position is affected by unexamined onto-epistemological assumptions: meaning is treated as immanent to the model, the explainer is positioned outside the system, and a causal structure is presumed recoverable through computational techniques. In this paper, we draw on Barad’s agential realism to develop an alternative onto-epistemology of XAI. We propose that interpretations are material-discursive performances that emerge from situated entanglements of the AI model with humans, context, and the interpretative apparatus. To develop this position, we read a comprehensive set of XAI methods through agential realism and reveal the assumptions and limitations that underpin several of these methods. We then articulate the framework’s ethical dimension and propose design directions for XAI interfaces that support emergent interpretation, using a speculative text-to-music interface as a case study.
2026 | Fabio Morreale et al. | Sony AI | Explainable AI (XAI); AI-Assisted Decision-Making & Automation | CHI

NasoVoce: A Nose-Mounted Low-Audibility Speech Interface for Always-Available Speech Interaction
Silent and whispered speech offer promise for always-available voice interaction with AI, yet existing methods struggle to balance vocabulary size, wearability, silence, and noise robustness. We present NasoVoce, a nose-bridge–mounted interface that integrates a microphone and a vibration sensor. Positioned at the nasal pads of smart glasses, it unobtrusively captures both acoustic and vibration signals. The nasal bridge, close to the mouth, allows access to bone- and skin-conducted speech and enables reliable capture of low-volume utterances such as whispered speech. While the microphone captures high-quality audio, it is highly sensitive to environmental noise. Conversely, the vibration sensor is robust to noise but yields lower signal quality. By fusing these complementary inputs, NasoVoce generates high-quality speech robust against interference. Evaluation with Whisper Large-v2, PESQ, STOI, and MUSHRA ratings confirms improved recognition and quality. NasoVoce demonstrates the feasibility of a practical interface for always-available, continuous, and discreet AI voice conversations.
2026 | Jun Rekimoto et al. | Sony Computer Science Laboratories, Kyoto | Intelligent Voice Assistants (Alexa, Siri, etc.); Affective Human-Computer Dialogue; Context-Aware Computing | CHI

PCGEF: A Framework for Diagnosing Subjective Alignment in Human-Centered Persona-Conditioned Generation
Evaluating LLMs in subjective and expressive domains is challenging, as standard accuracy metrics overlook affect, style, and coherence. We present the Persona-Conditioned Generation Evaluation Framework (PCGEF), which disentangles the effects of persona and continuity controls (via summarization-based memory) across five axes: Affective Alignment, Preference Alignment, Stylistic Expressiveness, Semantic Grounding, and Contextual Coherence. These two levers influence alignment through different mechanisms, and the five axes capture key forms of subjective drift. Unlike prior persona-aware approaches, PCGEF compares model generations under controlled conditions rather than relying on gold standards or LLM judges. We instantiate PCGEF in a red-wine description task with a 2×2 factorial design involving 34 participants and four mid-scale, open-weight LLMs. Results show that persona control improves affective and preference alignment and tends to enhance style, continuity control stabilizes coherence, and semantic grounding remains weak. PCGEF offers a reusable, interpretable framework transferable to sensory/creative domains and interactive dialogue.
2026 | Kana Maruyama et al. | Sony AI | Human-LLM Collaboration; Generative AI (Text, Image, Music, Video); Explainable AI (XAI) | CHI

'The plan is just survival': Data Work in Kenya and the Regime of Entrapment
The rapid expansion of the AI industry relies heavily on the production, verification, and maintenance of data, otherwise known as "data work". Companies outsource and offshore this work through global AI supply chains that operate under exploitative conditions. Drawing on semi-structured interviews with Kenyan data workers across platforms and BPOs, this paper examines how such conditions take shape and persist. We argue that workers are caught within a regime of entrapment, a system of interconnected mechanisms that make it difficult for workers to leave or improve their positions. These mechanisms include the push to invest in the promise of ‘AI’ jobs, the use of precarious contracts to govern workers, the capture of regulatory institutions, and the exploitation of global labor arbitrage. Using complementary lenses of neoliberal governmentality, precarity, and supply chain capitalism, we analyze why labor mobilization in this sector remains uniquely constrained. We conclude by outlining an orientation for research and scholarly practice that can support workers' organizing efforts and contest the structural conditions sustaining this regime.
2026 | Shivani Kapania et al. | Carnegie Mellon University | Developing Countries & HCI for Development (HCI4D); Surgical Assistance & Medical Training; Gig Economy Platforms | CHI

How Data Workers Shape Datasets: The Role of Positionality in Data Collection and Annotation for Computer Vision
Data workers play a key role in the big data industry. Clients hire data workers to collect and annotate data with human identity concepts, like demographic categories or clothing items. Often, such workers are treated as computational: they are expected to quickly and objectively conduct their work, with the goal of having unbiased datasets for training and evaluating models. Computer vision is especially interested in fair and impartial data due to biases and unethical practices in the field. However, far from impartial, data workers imbue computer vision data with "biases" beyond correct versus incorrect answers. Data workers embed their own specific positional perspectives about identity concepts in both collection and annotation processes. Through interviews and ethnographic observations of data workers (both freelance and business process outsourcing (BPO) employees), we show how worker positionality influences decisions during data work. We also show how unintended outcomes, generally portrayed as "biases," occur when positionality is not explicitly considered in client instructions. We discuss how employing a lens of positionality in data work reveals the gulfs between data worker perspectives and client expectations, which are colored by a web of positional actors beyond isolated data workers. We propose positional (il)legibility as an approach to data work that embraces the reality of positionality in classification practices that the lens of "bias" fails to appropriately account for.
2025 | Morgan Klaus Scheuerman et al. | The Gig Economy | CSCW

Conversational Agents on Your Behalf: Opportunities and Challenges of Shared Autonomy in Voice Communication for Multitasking
Advancements in computational agents will enable them to act as surrogates for users in online communication, promising enhanced productivity by supporting multitasking. This capability may be especially powerful when combined with human control, allowing users to retain agency while achieving better performance than either human or agent alone. However, it remains unclear how people might leverage this technology to multitask effectively. We present a study with 18 dyads exploring how users employ automated responses to support an arithmetic task while staying engaged in a voice call. Participants multitasked with a conversational agent under three levels of autonomy: none, shared, and full. Our findings indicate that fully automated systems can maintain conversational engagement, enabling users to multitask effectively. Surprisingly, shared autonomy hindered this ability. Based on our results, we discuss implications for designing shared autonomy in conversations, highlighting new considerations and challenges.
2025 | Yi Fei Cheng et al. | Carnegie Mellon University, Human-Computer Interaction Institute | Conversational Chatbots; Agent Personality & Anthropomorphism | CHI

Realism Drives Interpersonal Reciprocity but Yields to AI-Assisted Egocentrism in a Coordination Experiment
Virtual reality technologies that enhance realism and artificial intelligence (AI) systems that assist human behavior are increasingly interwoven in social applications. However, how these technologies might jointly influence interpersonal coordination remains unclear. We conducted an experiment with 240 participants in 120 pairs who interacted through remote-controlled robot cars in a physical space or virtual cars in a digital space, with or without autosteering assistance, using the chicken game, an established model of interpersonal coordination. We find that both realism and AI assistance help improve user performance but through opposing mechanisms. Real-world contexts enhanced communication, fostering reciprocal actions and collective benefits. In contrast, autosteering assistance diminished the need for interpersonal coordination, shifting participants’ focus towards self-interest. Notably, when combined, the egocentric effects of autosteering assistance outweighed the prosocial effects of realism. The design of HCI systems that involve social coordination will, we believe, need to take such effects into account.
2025 | Hirokazu Shirado et al. | Carnegie Mellon University, School of Computer Science | Automated Driving Interface & Takeover Design; Human-Robot Collaboration (HRC); Technology Ethics & Critical HCI | CHI

PiaMuscle: Improving Piano Skill Acquisition by Cost-effectively Estimating and Visualizing Activities of Miniature Hand Muscles
Understanding neuromusculoskeletal mechanisms significantly impacts skill specialization and proficiency. While existing methods can infer large muscle activities during gross motor movements, the estimation of dexterous motor control involving miniature muscles remains underexplored. Targeting the coordinated hand muscles in advanced piano performance, we learn spatiotemporal discrete representations of electromyography (EMG) data and hand postures utilizing a multimodal dataset. Subsequently, we train a precise and cost-effective neural network model. Based on this model, PiaMuscle is introduced to investigate whether visualizing muscle activities during piano training enhances piano performance. Quantitative and qualitative results of a user study with highly skilled professional pianists demonstrate that PiaMuscle provides reliable muscle activation data to support and optimize force control. Our research underscores the potential of a naturalistic workflow to estimate small muscles' activities from readily accessible human-centric information, and to do so more accurately when combined with tool-centric data, thereby enhancing skill acquisition.
2025 | Ruofan Liu et al. | Tokyo Institute of Technology, School of Computing; Sony Computer Science Laboratories Inc. | Human Pose & Activity Recognition; Biosensors & Physiological Monitoring | CHI

Morphing Identity: Exploring Self-Other Identity Continuum through Interpersonal Facial Morphing Experience
We explored continuous changes in self-other identity by designing an interpersonal facial morphing experience where the facial images of two users are blended and then swapped over time. Both users' facial images are displayed side by side, with each user controlling their own morphing facial images, allowing us to create and investigate a multifaceted interpersonal experience. To explore this with diverse social relationships, we conducted qualitative and quantitative investigations through public exhibitions. We found that there is a window of self-identification as well as a variety of interpersonal experiences in the facial morphing process. From these insights, we synthesized a Self-Other Continuum represented by a sense of agency and facial identity. This continuum has implications in terms of the social and subjective aspects of interpersonal communication, which enables further scenario design and could complement findings from research on interactive devices for remote communication.
2023 | Kye Shimizu et al. | Sony Computer Science Laboratories, Inc | Identity & Avatars in XR; Interactive Narrative & Immersive Storytelling | CHI

“I am both here and there” Parallel Control of Multiple Robotic Avatars by Disabled Workers in a Cafe
Robotic avatars can help disabled people extend their reach in interacting with the world. Technological advances make it possible for individuals to embody multiple avatars simultaneously. However, existing studies have been limited to laboratory conditions and did not involve disabled participants. In this paper, we present a real-world implementation of a parallel control system allowing disabled workers in a café to embody multiple robotic avatars at the same time to carry out different tasks. Our data corpus comprises semi-structured interviews with workers, customer surveys, and videos of café operations. Results indicate that the system increases workers' agency, enabling them to better manage customer journeys. Parallel embodiment and transitions between avatars create multiple interaction loops where the links between disabled workers and customers remain consistent, but the intermediary avatar changes. Based on our observations, we theorize that disabled individuals possess specific competencies that increase their ability to manage multiple avatar bodies.
2023 | Giulia Barbareschi et al. | Keio University | Domestic Robots; Social Robot Interaction; Robots in Education & Healthcare | CHI

Upvotes? Downvotes? No Votes? Understanding the relationship between reaction mechanisms and political discourse on Reddit
A significant share of political discourse occurs online on social media platforms. Policymakers and researchers try to understand the role of social media design in shaping the quality of political discourse around the globe. In the past decades, scholarship on political discourse theory has produced distinct characteristics of different types of prominent political rhetoric such as deliberative, civic, or demagogic discourse. This study investigates the relationship between social media reaction mechanisms (i.e., upvotes, downvotes) and political rhetoric in user discussions by engaging in an in-depth conceptual analysis of political discourse theory. First, we analyze 155 million user comments in 55 political subforums on Reddit between 2010 and 2018 to explore whether users' style of political discussion aligns with the essential components of deliberative, civic, and demagogic discourse. Second, we perform a quantitative study that combines confirmatory factor analysis with difference-in-differences models to explore whether different reaction mechanism schemes (e.g., upvotes only, upvotes and downvotes, no reaction mechanisms) correspond with political user discussion that is more or less characteristic of deliberative, civic, or demagogic discourse. We produce three main takeaways. First, despite being "ideal constructs of political rhetoric," we find that political discourse theories describe political discussions on Reddit to a large extent. Second, we find that discussions in subforums with only upvotes, or both up- and downvotes, are associated with user discourse that is more deliberative and civic. Third, and perhaps most strikingly, social media discussions are most demagogic in subreddits with no reaction mechanisms at all. These findings offer valuable contributions for ongoing policy discussions on the relationship between social media interface design and respectful political discussion among users.
2023 | Orestis Papakyriakopoulos et al. | Sony AI | Social Platform Design & User Behavior; Content Moderation & Platform Governance; Activism & Political Participation | CHI

Machine-Mediated Teaming: Mixture of Human and Machine in Physical Gaming Experience
Technological advancement has opened up opportunities for new sports and physical activities. We introduce a concept called machine-mediated teaming, in which a human and a surrogate machine form a team to participate in physical sports games. To understand the experience of machine-mediated teaming and the guidelines for designing the system to achieve the concept, we built a case study system based on tug-of-war. Our system is a sports game played by two against two. One team consists of a player who actually pulls the rope and another player who participates in the physical game by controlling the machine's actuators. We conducted user studies using this system to investigate the sport experience in this form and to reveal insights to inform future research on machine-mediated teaming. Based on the data obtained from the user studies, we clarified three perspectives, machine stamina, action space, and explicit feedback, that should be considered when designing future machine-mediated teaming systems. The research presented in this paper offers a first step towards exploring how humans and machines can coexist in highly dynamic physical interactions.
2022 | Azumi Maekawa et al. | The University of Tokyo | Full-Body Interaction & Embodied Input; Serious & Functional Games | CHI

Preserving Agency During Electrical Muscle Stimulation Training Speeds up Reaction Time Directly After Removing EMS
Force feedback devices, such as motor-based exoskeletons or wearables based on electrical muscle stimulation (EMS), have the unique potential to accelerate users’ own reaction time (RT). However, this speedup has only been explored while the device is attached to the user. In fact, very little is known about whether this faster reaction time persists after the user removes the device from their body. This is precisely what we investigated by means of a simple RT experiment, in which participants were asked to tap as soon as they saw an LED flash. Participants experienced this in three EMS conditions: (1) fast-EMS, in which the electrical impulses were synced with the LED; (2) agency-EMS, in which the electrical impulse was delivered 40 ms faster than the participant’s own RT, which prior work has shown to preserve one’s sense of agency over this movement; and (3) late-EMS, in which the impulse was delivered after the participant’s own RT. Our results revealed that the participants’ RT was significantly reduced, by approximately 8 ms (up to 20 ms), only after training with the agency-EMS condition. This finding suggests that prioritizing agency during EMS training is key to motor adaptation, i.e., it enables a faster motor response even after the user has removed the EMS device from their body.
2021 | Shunichi Kasahara et al. | Sony CSL, The University of Tokyo | Vibrotactile Feedback & Skin Stimulation; Electrical Muscle Stimulation (EMS) | CHI

Evaluation of Machine Learning Techniques for Hand Pose Estimation on Handheld Device with Proximity Sensor
Tracking finger movement for natural interaction using the hand is commonly studied. In vision-based implementations of finger tracking for virtual reality (VR) applications, finger movement is occluded by the handheld device that is necessary for auxiliary input, so tracking finger movement using cameras is still challenging. Finger tracking controllers using capacitive proximity sensors on the surface are starting to appear. However, research on estimating articulated hand pose from curved capacitance sensing electrodes is still immature. Therefore, we built a prototype with 62 electrodes and recorded training datasets using an optical tracking system. We introduced a 2.5D representation to apply convolutional neural network methods to a capacitive image of the curved surface, and evaluated two types of network architectures, based on recent achievements in the computer vision field, with our dataset. We also implemented real-time interactive applications using the prototype and demonstrated the possibility of intuitive interaction using fingers in VR applications.
2020 | Kazuyuki Arimatsu et al. | Sony Interactive Entertainment Inc. | Hand Gesture Recognition; Context-Aware Computing | CHI

Preemptive Action: Accelerating Human Reaction using Electrical Muscle Stimulation Without Compromising Agency
We enable preemptive force-feedback systems to speed up human reaction time without fully compromising the user's sense of agency. Typically these interfaces actuate by means of electrical muscle stimulation (EMS) or mechanical actuators; they preemptively move the user to perform a task, such as to improve movement performance (e.g., EMS-assisted drumming). Unfortunately, when using preemptive force-feedback, users do not feel in control and lose their sense of agency. We address this by actuating the user's body, using EMS, within a particular time window (160 ms after visual stimulus), which we found to speed up reaction time by 80 ms in our first study. With this preemptive timing, when the user and system move congruently, the user feels that they initiated the motion, yet their reaction time is faster than usual. As our second study demonstrated, this particular timing significantly increased agency when compared to the current practice in EMS-based devices. We conclude by illustrating, using examples from the HCI literature, how to leverage our findings to provide more agency to automated haptic interfaces.
2019 | Shunichi Kasahara et al. | Sony CSL & University of Tokyo | Force Feedback & Pseudo-Haptic Weight; Electrical Muscle Stimulation (EMS) | CHI