ActivitySeeker: Towards Collaborative Personalized Human Activity Discovery and Recognition on Smartphones | Smartphones provide an attractive yet challenging platform for human activity recognition (HAR). They are ubiquitous, but also limit the input of HAR systems to a single IMU. These systems are also challenged by the inherent diversity of human activities and varying phone placement on the user's body. This results in traditional smartphone HAR systems having limited personalization potential or imposing a high user burden. We propose ActivitySeeker, a personalized smartphone HAR system that combines self-supervised activity discovery and low-burden user interaction to collaboratively label IMU data and adapt HAR models to individual users on-device through transfer learning. We evaluated ActivitySeeker through simulated online learning and in-the-wild user experiments, where it discovered 95.5% of personal activity types and achieved high recognition accuracy (93.3%) while maintaining a positive user experience. Leveraging the synergy between user and smartphone, ActivitySeeker opens up new possibilities for HAR-based applications like fitness, health and personalized recommendation. | 2026 | Zhoutong Ye et al. | Tsinghua University | Human Pose & Activity Recognition; Fitness Tracking & Physical Activity Monitoring; Behavior Change & Reflection Technology | CHI
LubDubDecoder: Bringing Micro-Mechanical Cardiac Monitoring to Hearables | We present LubDubDecoder, a system that enables fine-grained monitoring of micro-cardiac vibrations associated with the opening and closing of heart valves across a range of hearables. Our system transforms the built-in speaker, the only transducer common to all hearables, into an acoustic sensor that captures the coarse "lub-dub" heart sounds, leverages their shared temporal and spectral structure to reconstruct the subtle seismocardiography (SCG) and gyrocardiography (GCG) waveforms, and extracts the timing of key micro-cardiac events. In an IRB-approved feasibility study with 25 users, our system achieves correlations of 0.88-0.95 compared to chest-mounted reference measurements in within-user and cross-user evaluations, and generalizes to unseen hearables using a zero-effort adaptation scheme with a correlation of 0.91. Our system is robust across remounting sessions and music playback. | 2026 | Siqi Zhang et al. | Carnegie Mellon University | Biosensors & Physiological Monitoring; Emotion-Sensing Wearables; Fitness Tracking & Physical Activity Monitoring | CHI
Routine Computing: A Systematic Review of Sensing Daily Life Dimensions Towards Human-Centered Goals | Human routines structure daily life, yet remain challenging for computational systems to understand. This paper presents the first systematic review of routine computing, a previously implicit but increasingly recognized field that focuses on computationally sensing and modeling human behaviors. It synthesizes 203 studies published up to August 2025. The paper presents a new taxonomy of the literature, focusing on temporal structures, behavioral interactions, cognitive aspects, and how variability and deviations are addressed. The common goals of routine computing extend across four major application domains, including accessibility care, the promotion of healthy habits, adaptive and context-aware support, and large-scale population insights. Persistent challenges that limit the design of truly human-centered systems are identified, including the gap between low-level activity recognition and high-level intent, the tension between personalization and generalization, unresolved privacy concerns, and data-related limitations. By consolidating these findings, this paper provides a foundational framework for HCI researchers, outlining principles for designing ethical, adaptive, and human-centered routine-aware systems. | 2026 | Borislav Pavlov et al. | Tsinghua University | Human Pose & Activity Recognition; Behavior Change & Reflection Technology; Context-Aware Computing | CHI
Enabling Adaptive Cardio-Respiratory Biofeedback Training on Ubiquitous Hand-Worn Devices | We introduce an adaptive cardio-respiratory biofeedback system implemented on ubiquitous hand-worn devices such as smart watches and rings, enabling accessible and real-time physiological training outside clinical settings. Users place a hand on their abdomen to promote embodied awareness of breathing rhythms, while PPG and IMU sensors continuously capture cardio-respiratory signals. Unlike conventional open-loop biofeedback that delivers fixed breathing guidance irrespective of user response, our system employs a closed-loop adaptation: real-time physiological signals adjust breathing cues to optimize cardio-respiratory coupling, ensuring personalized training trajectories. This shift from static to adaptive guidance markedly improves user engagement and training efficacy. A user performance evaluation study further showed that adaptive biofeedback significantly boosts HRV, prolongs high-HRV states, and enhances user experience, demonstrating clear advantages over non-adaptive methods. Together, these findings position adaptive, hand-worn biofeedback as a promising approach for ubiquitous, user-centered mental health interventions. | 2026 | Ruotong Yu et al. | Tsinghua University | Emotion-Sensing Wearables; Health Self-Tracking; Behavior Change & Reflection Technology | CHI
FlowRing: Integrated Microgesture and Surface Interaction Ring for Versatile XR Input | As Extended Reality (XR) advances, a device has the potential to be used across contexts from immersive productivity at a desk to on-the-go, public scenarios. Existing input solutions lack the versatility to provide both high-throughput, mouse-grade input and subtle, ergonomic interaction. We introduce FlowRing, a novel ring-form device that combines microgestures with precise 2D mouse-like input on surfaces. FlowRing supports five microgestures for discreet interaction and 2D input for richer tasks, using an optical flow sensor, skin-contact microphone, and IMU at the base of the finger. In a study with 11 participants, FlowRing achieved 93.6% microgesture recognition accuracy across sessions and 85.2% across unseen users, rising to 90.1% with just four gesture set examples from a new user. A separate 2D Fitts’ law study demonstrated its effectiveness for continuous input on various surfaces. FlowRing emerges as a versatile, user-friendly solution for the future of interactive technology. | 2025 | Ishan Chatterjee et al. | Hand Gesture Recognition; Foot & Wrist Interaction | MobileHCI
The Odyssey Journey: Top-Tier Medical Resource Seeking for Specialized Disorder in China | It is pivotal for patients to receive accurate health information, diagnoses, and timely treatments. However, in China, the significantly imbalanced doctor-to-patient ratio intensifies the information and power asymmetries in doctor-patient relationships. Health information-seeking, which enables patients to collect information from sources beyond doctors, is a potential approach to mitigate these asymmetries. While HCI research predominantly focuses on common chronic conditions, our study focuses on specialized disorders, which are often familiar to specialists but not to general practitioners and the public. With Hemifacial Spasm (HFS) as an example, we aim to understand patients' health information and top-tier medical resource seeking journeys in China. Through interviews with three neurosurgeons and 12 HFS patients from rural and urban areas, and applying Actor-Network Theory, we provide empirical insights into the roles, interactions, and workflows of various actors in the health information-seeking network. We also identified five strategies patients adopted to mitigate asymmetries and access top-tier medical resources, illustrating these strategies as subnetworks within the broader health information-seeking network and outlining their advantages and challenges. | 2025 | Ka I Chan et al. | Tsinghua University, Global Innovation Exchange | Chronic Disease Self-Management (Diabetes, Hypertension, etc.); Telemedicine & Remote Patient Monitoring | CHI
VAction: A Lightweight and Integrated VR Training System for Authentic Film-Shooting Experience | The film industry exerts significant economic and cultural influence, and its rapid development is contingent upon the expertise of industry professionals, underscoring the critical importance of film-shooting education. However, this process typically necessitates repeated practice in complex professional venues using expensive equipment, presenting a significant obstacle for ordinary learners who struggle to access such training environments. Although VR technology has already shown its potential in education, existing research has not addressed the crucial learning component of replicating the shooting process. Moreover, the limited functionality of traditional controllers hinders the fulfillment of the educational requirements. Therefore, we developed the VAction VR system, combining high-fidelity virtual environments with a custom-designed controller to simulate the real-world camera operation experience. The system’s lightweight design ensures cost-effective and efficient deployment. Experimental results demonstrated that VAction significantly outperforms traditional methods in both practice effectiveness and user experience, indicating its potential and usefulness in film-shooting education. | 2025 | Shaocong Wang et al. | Tsinghua University, Department of Computer Science and Technology | Mixed Reality Workspaces; Home Energy Management | CHI
Actual Achieved Gain and Optimal Perceived Gain: Modeling Human Take-over Decisions Towards Automated Vehicles' Suggestions | Driver decision quality in take-overs is critical for effective human-Autonomous Driving System (ADS) collaboration. However, current research lacks detailed analysis of its variations. This paper introduces two metrics, Actual Achieved Gain (AAG) and Optimal Perceived Gain (OPG), to assess decision quality, with OPG representing optimal decisions and AAG reflecting actual outcomes. Both are calculated as weighted averages of perceived gains and losses, influenced by ADS accuracy. Study 1 (N=315) used a 21-point Thurstone scale to measure perceived gains and losses (key components of AAG and OPG) across typical tasks: route selection, overtaking, and collision avoidance. Studies 2 (N=54) and 3 (N=54) modeled decision quality under varying ADS accuracy and decision time. Results show that with sufficient time (>3.5s), AAG converges towards OPG, indicating rational decision-making, while limited time leads to intuitive and deterministic choices. Study 3 also linked AAG-OPG deviations to irrational behaviors. An intervention study (N=8) and a pilot (N=4) employing voice alarms and multi-modal alarms based on these deviations demonstrated AAG's potential to improve decision quality. | 2025 | Shuning Zhang et al. | Tsinghua University, Institute for Network Sciences and Cyberspace | Automated Driving Interface & Takeover Design; Head-Up Display (HUD) & Advanced Driver Assistance Systems (ADAS); AI-Assisted Decision-Making & Automation | CHI
BIT: Battery-free, IC-less and Wireless Smart Textile Interface and Sensing System | The development of smart textile interfaces is hindered by the inclusion of rigid hardware components and batteries within the fabric, which pose challenges in terms of manufacturability, usability, and environmental concerns related to electronic waste. To mitigate these issues, we propose a smart textile interface and its wireless sensing system to eliminate the need for ICs, batteries, and connectors embedded into textiles. Our technique integrates multi-resonant circuits into smart textile interfaces and uses near-field electromagnetic coupling between two coils to facilitate wireless power transfer and data acquisition from the smart textile interface. A key aspect of our system is the development of a mathematical model that accurately represents the equivalent circuit of the sensing system. Using this model, we developed a novel algorithm to accurately estimate sensor signals based on changes in system impedance. Through simulation-based experiments and a user study, we demonstrate that our technique effectively supports multiple textile sensors of various types. | 2025 | Weiye Xu et al. | Tsinghua University | Electronic Textiles (E-textiles); Shape-Changing Materials & 4D Printing | CHI
Modeling the Impact of Visual Stimuli on Redirection Noticeability with Gaze Behavior in Virtual Reality | While users could embody virtual avatars that mirror their physical movements in Virtual Reality, these avatars' motions can be redirected to enable novel interactions. Excessive redirection, however, could break the user's sense of embodiment due to perceptual conflicts between vision and proprioception. While prior work focused on avatar-related factors influencing the noticeability of redirection, we investigate how the visual stimuli in the surrounding virtual environment affect user behavior and, in turn, the noticeability of redirection. Given the wide variety of visual stimuli and their tendency to elicit varying individual reactions, we propose to use users' gaze behavior as an indicator of their response to the stimuli and model the noticeability of redirection. We conducted two user studies to collect users' gaze behavior and noticeability, investigating the relationship between them and identifying the most effective gaze behavior features for predicting noticeability. Based on the data, we developed a regression model that takes users' gaze behavior as input and outputs the noticeability of redirection. We then conducted an evaluation study to test our model on unseen visual stimuli, achieving a mean squared error (MSE) of 0.012. We further implemented an adaptive redirection technique and conducted a proof-of-concept study to evaluate its effectiveness with complex visual stimuli in two applications. The results indicated that participants experienced lower physical demand and a stronger sense of body ownership when using our adaptive technique, demonstrating the potential of our model to support real-world use cases. | 2025 | Zhipeng Li et al. | ETH Zürich, Department of Computer Science | Eye Tracking & Gaze Interaction; Mixed Reality Workspaces; Immersion & Presence Research | CHI
UbiPhysio: Support Daily Functioning, Fitness, and Rehabilitation with Action Understanding and Feedback in Natural Language | Wang et al. developed UbiPhysio, a system that uses action understanding and natural-language feedback to help users with daily functioning exercises, fitness, and rehabilitation training. | 2024 | Chongyang Wang et al. | Vibrotactile Feedback & Skin Stimulation; Full-Body Interaction & Embodied Input | UbiComp
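The EarSAVAS Dataset: Enabling Subject-Aware Vocal Activity Sensing on Earables | Zhang et al. built the EarSAVAS dataset, which supports subject-aware vocal activity detection on smart ear-worn devices and advances research on related algorithms. | 2024 | Xiyuxing Zhang et al. | Biosensors & Physiological Monitoring | UbiComp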
ReHEarSSE: Recognizing Hidden-in-the-Ear Silently Spelled Expressions | Silent speech interaction (SSI) allows users to discreetly input text without using their hands. Existing wearable SSI systems typically require custom devices and are restricted to a small lexicon, limiting their utility to a small set of command words. This work proposes ReHEarSSE, an earbud-based ultrasonic SSI system capable of generalizing to words that do not appear in its training dataset, providing support for nearly an entire dictionary's worth of words. As a user silently spells words, ReHEarSSE uses autoregressive features to identify subtle changes in ear canal shape. ReHEarSSE infers words using a deep learning model trained to optimize connectionist temporal classification (CTC) loss with an intermediate embedding that accounts for different letters and transitions between them. We find that ReHEarSSE recognizes 100 unseen words with an accuracy of 89.3%. | 2024 | Xuefu Dong et al. | The University of Tokyo | Electrical Muscle Stimulation (EMS); Augmentative & Alternative Communication (AAC) | CHI
PepperPose: Full-Body Pose Estimation with a Companion Robot | Accurate full-body pose estimation across diverse actions in a user-friendly and location-agnostic manner paves the way for interactive applications in realms like sports, fitness, and healthcare. This task becomes challenging in real-world scenarios due to factors like the user's dynamic positioning, the diversity of actions, and the varying acceptability of the pose-capturing system. In this context, we present PepperPose, a novel companion robot system tailored for optimized pose estimation. Unlike traditional methods, PepperPose actively tracks the user and refines its viewpoint, facilitating enhanced pose accuracy across different locations and actions. This allows users to enjoy a seamless action-sensing experience. Our evaluation, involving 30 participants undertaking daily functioning and exercise actions in a home-like space, underscores the robot's promising capabilities. Moreover, we demonstrate the opportunities that PepperPose presents for human-robot interaction, its current limitations, and future developments. | 2024 | Chongyang Wang et al. | Tsinghua University | Human Pose & Activity Recognition; Human-Robot Collaboration (HRC) | CHI
Time2Stop: Adaptive and Explainable Human-AI Loop for Smartphone Overuse Intervention | Despite a rich history of investigating smartphone overuse intervention techniques, AI-based just-in-time adaptive intervention (JITAI) methods for overuse reduction are lacking. We develop Time2Stop, an intelligent, adaptive, and explainable JITAI system that leverages machine learning to identify optimal intervention timings, introduces interventions with transparent AI explanations, and collects user feedback to establish a human-AI loop and adapt the intervention model over time. We conducted an 8-week field experiment (N=71) to evaluate the effectiveness of both the adaptation and explanation aspects of Time2Stop. Our results indicate that our adaptive models significantly outperform the baseline methods on intervention accuracy (>32.8% relative improvement) and receptivity (>8.0%). In addition, incorporating explanations further enhances the effectiveness by 53.8% and 11.4% on accuracy and receptivity, respectively. Moreover, Time2Stop significantly reduces overuse, decreasing app visit frequency by 7.0% to 8.9%. Our subjective data also echoed these quantitative measures. Participants preferred the adaptive interventions and rated the system highly on intervention time accuracy, effectiveness, and level of trust. We envision that our work can inspire future research on JITAI systems with a human-AI loop to evolve with users. | 2024 | Adiba Orzikulova et al. | KAIST | Explainable AI (XAI); AI-Assisted Decision-Making & Automation; Notification & Interruption Management | CHI
MMTSA: Multi-Modal Temporal Segment Attention Network for Efficient Human Activity Recognition | Multimodal sensors provide complementary information to develop accurate machine-learning methods for human activity recognition (HAR), but introduce significantly higher computational load, which reduces efficiency. This paper proposes an efficient multimodal neural architecture for HAR using an RGB camera and inertial measurement units (IMUs) called Multimodal Temporal Segment Attention Network (MMTSA). MMTSA first transforms IMU sensor data into a temporal and structure-preserving gray-scale image using the Gramian Angular Field (GAF), representing the inherent properties of human activities. MMTSA then applies a multimodal sparse sampling method to reduce data redundancy. Lastly, MMTSA adopts an inter-segment attention module for efficient multimodal fusion. Using three well-established public datasets, we evaluated MMTSA's effectiveness and efficiency in HAR. Results show that our method outperforms the previous state-of-the-art (SOTA) methods (an 11.13% improvement in cross-subject F1-score on the MMAct dataset). The ablation study and analysis demonstrate MMTSA's effectiveness in fusing multimodal data for accurate HAR. The efficiency evaluation on an edge device showed that MMTSA achieved significantly better accuracy, lower computational load, and lower inference latency than SOTA methods. https://doi.org/10.1145/3610872 | 2023 | Ziqi Gao et al. | Human Pose & Activity Recognition | UbiComp
DRG-Keyboard: Enabling Subtle Gesture Typing on the Fingertip with Dual IMU Rings | We present DRG-Keyboard, a gesture keyboard enabled by dual IMU rings, allowing the user to swipe the thumb on the index fingertip to perform word gesture typing as if typing on a miniature QWERTY keyboard. With dual IMUs attached to the user's thumb and index finger, DRG-Keyboard can 1) measure the relative attitude while mapping it to the 2D fingertip coordinates and 2) detect the thumb's touch-down and touch-up events by combining the relative attitude data and the synchronous frequency domain data, based on which a fingertip gesture keyboard can be implemented. To understand users' typing behavior on the index fingertip with DRG-Keyboard, we collected and analyzed user data in two typing manners. Based on the statistics of the gesture data, we enhanced the elastic matching algorithm with rigid pruning and distance measurement transform. The user study showed DRG-Keyboard achieved an input speed of 12.9 WPM (68.3% of their gesture typing speed on the smartphone) for all participants. An additional study also demonstrated the superiority of DRG-Keyboard for better form factors and wider usage scenarios. To sum up, DRG-Keyboard not only achieves good text entry speed on a tiny fingertip input surface, but is also well accepted by the participants for its input subtleness, accuracy, good haptic feedback, and availability. https://doi.org/10.1145/3569463 | 2023 | Chen Liang et al. | Vibrotactile Feedback & Skin Stimulation; Haptic Wearables; Hand Gesture Recognition | UbiComp
Modeling the Trade-off of Privacy Preservation and Activity Recognition on Low-Resolution Images | A computer vision system using low-resolution image sensors can provide intelligent services (e.g., activity recognition) while withholding unnecessary privacy-sensitive visual information at the hardware level. However, preserving visual privacy and enabling accurate machine recognition place conflicting demands on image resolution. Modeling the trade-off of privacy preservation and machine recognition performance can guide future privacy-preserving computer vision systems using low-resolution image sensors. In this paper, using at-home activities of daily living (ADLs) as the scenario, we first obtained the most important visual privacy features through a user survey. Then we quantified and analyzed the effects of image resolution on human and machine recognition performance in activity recognition and privacy awareness tasks. We also investigated how modern image super-resolution techniques influence these effects. Based on the results, we proposed a method for modeling the trade-off of privacy preservation and activity recognition on low-resolution images. | 2023 | Yuntao Wang et al. | Tsinghua University | Human Pose & Activity Recognition; Privacy Perception & Decision-Making | CHI
Color-to-Depth Mappings as Depth Cues in Virtual Reality | Despite significant improvements to Virtual Reality (VR) technologies, most VR displays are fixed focus and depth perception is still a key issue that limits the user experience and the interaction performance. To supplement humans' inherent depth cues (e.g., retinal blur, motion parallax), we investigate users' perceptual mappings of distance to virtual objects' appearance to generate visual cues aimed at enhancing depth perception. As a first step, we explore color-to-depth mappings for virtual objects so that their appearance differs in saturation and value to reflect their distance. Through a series of controlled experiments, we elicit and analyze users' strategies of mapping a virtual object's hue, saturation, value and a combination of saturation and value to its depth. Based on the collected data, we implement a computational model that generates color-to-depth mappings fulfilling adjustable requirements on confusion probability, number of depth levels, and consistent saturation/value changing tendency. We demonstrate the effectiveness of color-to-depth mappings in a 3D sketching task, showing that compared to single-colored targets and strokes, with our mappings, the users were more confident in the accuracy without extra cognitive load and reduced the perceived depth error by 60.8%. We also implement four VR applications and demonstrate how our color cues can benefit the user experience and interaction performance in VR. | 2022 | Zhipeng Li et al. | Immersion & Presence Research; Medical & Scientific Data Visualization | UIST
FaceOri: Tracking Head Position and Orientation Using Ultrasonic Ranging on Earphones | Face orientation can often indicate users’ intended interaction target. In this paper, we propose FaceOri, a novel face tracking technique based on acoustic ranging using earphones. FaceOri can leverage the speaker on a commodity device to emit an ultrasonic chirp, which is picked up by the set of microphones on the user’s earphone, and then processed to calculate the distance from each microphone to the device. These measurements are used to derive the user’s face orientation and distance with respect to the device. We conduct a ground truth comparison and user study to evaluate FaceOri’s performance. The results show that the system can determine whether the user is oriented toward the device with 93.5% accuracy within a 1.5-meter range. Furthermore, FaceOri can continuously track the user’s head orientation with a median absolute error of 10.9 mm in distance, 3.7° in yaw, and 5.8° in pitch. FaceOri can allow for convenient hands-free control of devices and produce more intelligent context-aware interaction. | 2022 | Yuntao Wang et al. | Tsinghua University | Eye Tracking & Gaze Interaction; Context-Aware Computing | CHI