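Talk2Care: An LLM-based Voice Assistant for Communication between Healthcare Providers and Older Adults | Yang et al. developed Talk2Care, a voice assistant based on large language models that supports communication between healthcare providers and older adults, with the aim of improving communication in elder care. | 2024 | Ziqi Yang et al. | Intelligent Voice Assistants (Alexa, Siri, etc.); Mental Health Apps & Online Support Communities | UbiComp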
SmartASL: “Point-of-Care” Comprehensive ASL Interpreter Using Wearables | Sign language is an important bridge between d/Deaf and hard-of-hearing (DHH) people and hearing people. Regrettably, most hearing people have difficulty comprehending sign language, which makes sign language translation necessary. However, state-of-the-art wearable-based techniques mainly concentrate on recognizing manual markers (e.g., hand gestures), while frequently overlooking non-manual markers, such as negative head shaking, question markers, and mouthing. This oversight results in the loss of substantial grammatical and semantic information in sign language. To address this limitation, we introduce SmartASL, a novel proof-of-concept system that can 1) recognize both manual and non-manual markers simultaneously using a combination of earbuds and a wrist-worn IMU, and 2) translate the recognized American Sign Language (ASL) glosses into spoken language. Our experiments demonstrate the SmartASL system's significant potential to accurately recognize the manual and non-manual markers in ASL, effectively bridging the communication gaps between ASL signers and hearing people using commercially available devices. | https://dl.acm.org/doi/10.1145/3596255 | 2023 | Yincheng Jin et al. | Foot & Wrist Interaction; Deaf & Hard-of-Hearing Support (Captions, Sign Language, Vibration); Augmentative & Alternative Communication (AAC) | UbiComp
GLOBEM: Cross-Dataset Generalization of Longitudinal Human Behavior Modeling | There is a growing body of research revealing that longitudinal passive sensing data from smartphones and wearable devices can capture daily behavior signals for human behavior modeling, such as depression detection. Most prior studies build and evaluate machine learning models using data collected from a single population. However, to ensure that a behavior model can work for a larger group of users, its generalizability needs to be verified on multiple datasets from different populations. We present the first work evaluating cross-dataset generalizability of longitudinal behavior models, using depression detection as an application. We collect multiple longitudinal passive mobile sensing datasets with over 500 users from two institutes over a two-year span, leading to four institute-year datasets. Using the datasets, we closely re-implement and evaluate nine prior depression detection algorithms. Our experiment reveals the lack of model generalizability of these methods. We also implement eight recently popular domain generalization algorithms from the machine learning community. Our results indicate that these methods also do not generalize well on our datasets, with barely any advantage over the naive baseline of guessing the majority. We then present two new algorithms with better generalizability. Our new algorithm, Reorder, significantly and consistently outperforms existing methods on most cross-dataset generalization setups. However, the overall advantage is incremental and still has great room for improvement. Our analysis reveals that the individual differences (both within and between populations) may play the most important role in the cross-dataset generalization challenge. Finally, we provide an open-source benchmark platform, GLOBEM (short for Generalization of Longitudinal BEhavior Modeling), to consolidate all 19 algorithms. GLOBEM can support researchers in using, developing, and evaluating different longitudinal behavior modeling methods. We call for researchers' attention to model generalizability evaluation for future longitudinal human behavior modeling studies. | https://dl.acm.org/doi/10.1145/3569485 | 2023 | Xuhai Xu et al. | Human Pose & Activity Recognition; Mental Health Apps & Online Support Communities; Biosensors & Physiological Monitoring | UbiComp
Modeling the Trade-off of Privacy Preservation and Activity Recognition on Low-Resolution Images | A computer vision system using low-resolution image sensors can provide intelligent services (e.g., activity recognition) while withholding unneeded privacy-sensitive visual information at the hardware level. However, preserving visual privacy and enabling accurate machine recognition place conflicting demands on image resolution. Modeling the trade-off of privacy preservation and machine recognition performance can guide future privacy-preserving computer vision systems using low-resolution image sensors. In this paper, using at-home activities of daily living (ADLs) as the scenario, we first obtained the most important visual privacy features through a user survey. Then we quantified and analyzed the effects of image resolution on human and machine recognition performance in activity recognition and privacy awareness tasks. We also investigated how modern image super-resolution techniques influence these effects. Based on the results, we proposed a method for modeling the trade-off of privacy preservation and activity recognition on low-resolution images. | 2023 | Yuntao Wang et al. | Tsinghua University | Human Pose & Activity Recognition; Privacy Perception & Decision-Making | CHI
Reviewing and Reflecting Smart Home Research from the Human-Centered Perspective | While there has been rapid growth in smart home research from a technical perspective – focusing on home automation, devices, software, and protocols – few review papers examine the human-centered perspective. A human-centered focus is crucial for achieving the goals of providing natural, convenient, comfortable, friendly, and safe user experiences in the smart home. To understand key innovations in human-centered smart home research, we analyzed keyword changes over time via 19,091 papers from 2000 to 2022, then selected 55 papers from high-impact venues in the last five years, and summarized them through a combination of qualitative and quantitative methods. Our analysis revealed five research trends with unique characteristics and interdependence. Drawing on this review, we elaborate on the future of smart home design research with respect to multidisciplinary development, stakeholder involvement, and the shift of design implications. | 2023 | Yuan Yao et al. | School of Architecture and Design, Academy of Arts & Design | Universal & Inclusive Design; Smart Home Interaction Design | CHI
XAIR: A Framework of Explainable AI in Augmented Reality | Explainable AI (XAI) has established itself as an important component of AI-driven interactive systems. With Augmented Reality (AR) becoming more integrated in daily lives, the role of XAI also becomes essential in AR because end-users will frequently interact with intelligent services. However, it is unclear how to design effective XAI experiences for AR. We propose XAIR, a design framework that addresses when, what, and how to provide explanations of AI output in AR. The framework was based on a multi-disciplinary literature review of XAI and HCI research, a large-scale survey probing 500+ end-users’ preferences for AR-based explanations, and three workshops with 12 experts collecting their insights about XAI design in AR. XAIR's utility and effectiveness were verified via a study with 10 designers and another study with 12 end-users. XAIR can provide guidelines for designers, inspiring them to identify new design opportunities and achieve effective XAI designs in AR. | 2023 | Xuhai Xu et al. | Reality Labs Research, University of Washington | AR Navigation & Context Awareness; Explainable AI (XAI) | CHI
TypeOut: Leveraging Just-in-Time Self-Affirmation for Smartphone Overuse Reduction | Smartphone overuse is related to a variety of issues such as lack of sleep and anxiety. We explore the application of Self-Affirmation Theory on smartphone overuse intervention in a just-in-time manner. We present TypeOut, a just-in-time intervention technique that integrates two components: an in-situ typing-based unlock process to improve user engagement, and self-affirmation-based typing content to enhance effectiveness. We hypothesize that the integration of typing and self-affirmation content can better reduce smartphone overuse. We conducted a 10-week within-subject field experiment (N=54) and compared TypeOut against two baselines: one only showing the self-affirmation content (a common notification-based intervention), and one only requiring typing non-semantic content (a state-of-the-art method). TypeOut reduces app usage by over 50%, and both app opening frequency and usage duration by over 25%, all significantly outperforming baselines. TypeOut can potentially be used in other domains where an intervention may benefit from integrating self-affirmation exercises with an engaging just-in-time mechanism. | 2022 | Xuhai Xu et al. | University of Washington | Mental Health Apps & Online Support Communities; Notification & Interruption Management | CHI
ReflecTrack: Enabling 3D Acoustic Position Tracking Using Commodity Dual-Microphone Smartphones | 3D position tracking on smartphones has the potential to unlock a variety of novel applications, but has not been made widely available due to limitations in smartphone sensors. In this paper, we propose ReflecTrack, a novel 3D acoustic position tracking method for commodity dual-microphone smartphones. A ubiquitous speaker (e.g., smartwatch or earbud) generates inaudible Frequency Modulated Continuous Wave (FMCW) acoustic signals that are picked up by both smartphone microphones. To enable 3D tracking with two microphones, we introduce a reflective surface that can be easily found in everyday objects near the smartphone. Thus, the microphones can receive sound from the speaker and echoes from the surface for FMCW-based acoustic ranging. To simultaneously estimate the distances from the direct and reflective paths, we propose the echo-aware FMCW technique with a new signal pattern and target detection process. Our user study shows that ReflecTrack achieves a median error of 28.4 mm in a 60 cm × 60 cm × 60 cm space and 22.1 mm in a 30 cm × 30 cm × 30 cm space for 3D positioning. We demonstrate the easy accessibility of ReflecTrack using everyday surfaces and objects with several typical applications of 3D position tracking, including 3D input for smartphones, fine-grained gesture recognition, and motion tracking in smartphone-based VR systems. | 2021 | Yuzhou Zhuang et al. | Full-Body Interaction & Embodied Input; Biosensors & Physiological Monitoring | UIST
Understanding the Design Space of Mouth Microgestures | As wearable devices move toward the face (e.g., smart earbuds and glasses), there is an increasing need to facilitate intuitive interactions with these devices. Current sensing techniques can already detect many mouth-based gestures; however, users’ preferences for these gestures are not fully understood. In this paper, we investigate the design space and usability of mouth-based microgestures. We first conducted brainstorming sessions (N=16) and compiled an extensive set of 86 user-defined gestures. Then, with an online survey (N=50), we assessed the physical and mental demand of our gesture set and identified a subset of 14 gestures that can be performed easily and naturally. Finally, we conducted a remote Wizard-of-Oz usability study (N=11) mapping gestures to various daily smartphone operations in sitting and walking contexts. From these studies, we develop a taxonomy for mouth gestures, finalize a practical gesture set for common applications, and provide design guidelines for future mouth-based gesture interactions. | 2021 | Victor Chen et al. | Haptic Wearables; Hand Gesture Recognition | DIS
HulaMove: Using Commodity IMU for Waist Interaction | We present HulaMove, a novel interaction technique that leverages the movement of the waist as a new eyes-free and hands-free input method for both the physical world and the virtual world. We first conducted a user study (N=12) to understand users’ ability to control their waist. We found that users could easily discriminate eight shifting directions and two rotating orientations, and quickly confirm actions by returning to the original position (quick return). We developed a design space with eight gestures for waist interaction based on the results and implemented an IMU-based real-time system. Using a hierarchical machine learning model, our system could recognize waist gestures with an accuracy of 97.5%. Finally, we conducted a second user study (N=12) for usability testing in both real-world scenarios and virtual reality settings. Our usability study indicated that HulaMove significantly reduced interaction time by 41.8% compared to a touch screen method, and greatly improved users’ sense of presence in the virtual world. This novel technique provides an additional input method when users’ eyes or hands are busy, accelerates users’ daily operations, and augments their immersive experience in the virtual world. | 2021 | Xuhai Xu et al. | University of Washington | Full-Body Interaction & Embodied Input; Immersion & Presence Research | CHI
Voicemoji: Emoji Entry Using Voice for Visually Impaired People | Keyboard-based emoji entry can be challenging for people with visual impairments: users have to sequentially navigate emoji lists using screen readers to find their desired emojis, which is a slow and tedious process. In this work, we explore the design and benefits of emoji entry with speech input, a popular text entry method among people with visual impairments. After conducting interviews to understand blind or low vision (BLV) users’ current emoji input experiences, we developed Voicemoji, which (1) outputs relevant emojis in response to voice commands, and (2) provides context-sensitive emoji suggestions through speech output. We also conducted a multi-stage evaluation study with six BLV participants from the United States and six BLV participants from China, finding that Voicemoji significantly reduced entry time by 91.2% and was preferred by all participants over the Apple iOS keyboard. Based on our findings, we present Voicemoji as a feasible solution for voice-based emoji entry. | 2021 | Mingrui Ray Zhang et al. | University of Washington | Intelligent Voice Assistants (Alexa, Siri, etc.); Visual Impairment Technologies (Screen Readers, Tactile Graphics, Braille) | CHI
LightWrite: Teach Handwriting to The Visually Impaired with A Smartphone | Learning to write is challenging for blind and low vision (BLV) people because of the lack of visual feedback. Despite the drastic advancement of digital technology, handwriting is still an essential part of daily life. Although tools designed for teaching BLV people to write exist, many are expensive and require the help of sighted teachers. We propose LightWrite, a low-cost, easy-to-access smartphone application that uses voice-based descriptive instruction and feedback to teach BLV users to write English lowercase letters and Arabic numerals in a specifically designed font. A two-stage study with 15 BLV users with little prior writing knowledge shows that LightWrite can successfully teach users to handwrite characters in an average of 1.09 minutes for each letter. After initial training and 20-minute daily practice for 5 days, participants were able to write an average of 19.9 out of 26 letters that are recognizable by sighted raters. | 2021 | Zihan Wu et al. | Tsinghua University, University of Michigan | Visual Impairment Technologies (Screen Readers, Tactile Graphics, Braille) | CHI
EarBuddy: Enabling On-Face Interaction via Wireless Earbuds | Past research regarding on-body interaction typically requires custom sensors, limiting their scalability and generalizability. We propose EarBuddy, a real-time system that leverages the microphone in commercial wireless earbuds to detect tapping and sliding gestures near the face and ears. We developed a design space to generate 27 valid gestures and conducted a user study (N=16) to select the eight gestures that were optimal for both human preference and microphone detectability. We collected a dataset on those eight gestures (N=20) and trained deep learning models for gesture detection and classification. Our optimized classifier achieved an accuracy of 95.3%. Finally, we conducted a user study (N=12) to evaluate EarBuddy's usability. Our results show that EarBuddy can facilitate novel interaction and that users feel very positively about the system. EarBuddy provides a new eyes-free, socially acceptable input method that is compatible with commercial wireless earbuds and has the potential for scalability and generalizability. | 2020 | Xuhai Xu et al. | University of Washington & Tsinghua University | Haptic Wearables; Foot & Wrist Interaction | CHI
Clench Interface: Novel Biting Input Techniques | People eat every day and biting is one of the most fundamental and natural actions that they perform on a daily basis. Existing work has explored tooth click location and jaw movement as input techniques; however, clenching has the potential to add control to this input channel. We propose clench interaction that leverages clenching as an actively controlled physiological signal that can facilitate interactions. We conducted a user study to investigate users' ability to control their clench force. We found that users can easily discriminate three force levels, and that they can quickly confirm actions by unclenching (quick release). We developed a design space for clench interaction based on the results and investigated the usability of the clench interface. Participants preferred the clench over baselines and indicated a willingness to use clench-based interactions. This novel technique can provide an additional input method in cases where users' eyes or hands are busy, augment immersive experiences such as virtual/augmented reality, and assist individuals with disabilities. | 2019 | Xuhai Xu et al. | University of Washington & Tsinghua University | Haptic Wearables; Foot & Wrist Interaction | CHI