Understanding Parents’ Desires in Moderating Children’s Interactions with GenAI Chatbots through LLM-Generated Probes
This paper studies how parents want to moderate children’s interactions with Generative AI Chatbots, with the goal of informing the design of future GenAI parental control tools. We first used an LLM to generate synthetic Child–GenAI Chatbot interaction scenarios and worked with four parents to validate their realism. From this dataset, we carefully selected 12 diverse examples that evoked varying levels of concern and were rated the most realistic. Each example included a prompt and GenAI Chatbot response. We presented these to parents (N=24) and asked whether they found them concerning, why, and how they would prefer to modify the responses and be informed. Our findings reveal three key insights: (1) parents express concern about interactions that current GenAI Chatbot parental controls neglect; (2) parents want fine-grained transparency and moderation at the conversation level; and (3) parents need personalized controls that adapt to their desired strategies and children's ages.
2026 | John Driscoll et al. | University of California San Diego | Conversational Chatbots; Mental Health Technology for Youth; Children's AI Literacy & Data Literacy | CHI
Decentralized Web3 Non-Fungible Token Community for Societal Prosperity? A Social Capital Perspective
In the rapidly evolving Web3 world, non-fungible token (NFT) communities are reshaping the formation, distribution, and activation of social capital in ways distinct from traditional models. However, despite their growing impact on societal prosperity, a comprehensive understanding of social capital dynamics within Web3 NFT communities remains limited. This study explores the Mfers community, a key example within Web3 NFT ecosystems. By analyzing social media and blockchain data and using a Delphi method-based human-large language model (LLM) collaboration, we uncovered unique social capital patterns across six dimensions. Our findings highlight a compelling blend of decentralization, inclusion, trust, and empowerment but also raise critical questions about wealth inequality, content quality, and ethical challenges. Based on the findings, we discussed the uniqueness of social capital in Web3 NFT communities, the tension between technical and power decentralization, and the multidimensional nature of societal prosperity. We also suggested directions for future research on decentralized online communities in the CSCW field. This study provides a systematic perspective on social capital in Web3 NFT communities and introduces an innovative human-LLM collaborative analysis, offering insights into the design and governance of benign decentralized online communities.
2025 | Hongzhou Chen et al. | Getting Things Done With AI | CSCW
PricoEye: The Eye of Primary Colors for Fast and Convenient 3D Reconstruction of Fine-grained Palmprint on Smartphones
Palm scanning, a cutting-edge payment technology, has garnered widespread attention from industry leaders such as Amazon, Tencent, and Mastercard due to the stability and non-intrusive properties of palmprints. Unlike facial biometrics, palmprints are not easily forged using publicly available social media data. Furthermore, contactless scanning solutions better meet public hygiene needs than contact-based ones such as fingerprint scanning. However, all current solutions rely on costly dedicated devices, such as Amazon One. Moreover, existing palmprint payment technologies remain limited to coarse-level textures (i.e., creases), while fine-level textures (i.e., ridges) have not been leveraged by these solutions. Currently, the reconstruction of fine-level palmprints is typically achieved using costly dedicated scanning equipment. This paper pioneers the design of an ultra-compact smartphone add-on (less than 3 mm thick), costing less than $2, which, paired with ubiquitous smartphones and a custom companion app, delivers the first holistic solution for fine-grained 3D palmprint reconstruction on smartphones. This ubiquitous smartphone-based solution significantly enhances user-friendliness (e.g., remote registration), usability (i.e., ultra-low cost and tiny form factor) for interactions with payment devices, and hygiene in authenticating others (e.g., biometric-based ticketing at concerts or stadiums). Extensive evaluations demonstrate the robustness and effectiveness of the reconstructed 3D texture, which can be reliably used as a biometric feature and has received positive feedback from participants.
2025 | Di Duan et al. | Electrical Muscle Stimulation (EMS); Motor Impairment Assistive Input Technologies; Universal & Inclusive Design | UIST
NoteIt: A System Converting Instructional Videos to Interactable Notes Through Multimodal Video Understanding
Users often take notes for instructional videos to access key knowledge later without revisiting long videos. Automated note generation tools enable users to obtain informative notes efficiently. However, notes generated by existing research or off-the-shelf tools fail to preserve the information conveyed in the original videos comprehensively, nor can they satisfy users’ expectations for diverse presentation formats and interactive features when using notes digitally. In this work, we present NoteIt, a system that automatically converts instructional videos to interactable notes using a novel pipeline that faithfully extracts hierarchical structure and multimodal key information from videos. With NoteIt’s interface, users can interact with the system to further customize the content and presentation formats of the notes according to their preferences. We conducted both a technical evaluation and a comparison user study (N=36). The solid performance in objective metrics and the positive user feedback demonstrated the effectiveness of the pipeline and the overall usability of NoteIt.
2025 | Running Zhao et al. | Voice User Interface (VUI) Design; Data Storytelling; Online Learning & MOOC Platforms | UIST
"Salt is the Soul of Hakka Baked Chicken": Reimagining Traditional Chinese Culinary ICH for Modern Contexts Without Losing TraditionIntangible Cultural Heritage (ICH) like traditional culinary practices face increasing pressure to adapt to globalization while maintaining their cultural authenticity. Centuries-old traditions in Chinese cuisine are subject to rapid changes for adaptation to contemporary tastes and dietary preferences. The preservation of these cultural practices requires approaches that can enable ICH practitioners to reimagine and recreate ICH for modern contexts. To address this, we created workshops where experienced practitioners of traditional Chinese cuisine co-created recipes using GenAI tools and realized the dishes. We found that GenAI inspired ICH practitioners to innovate recipes based on traditional workflows for broader audiences and adapt to modern dining contexts. However, GenAI-inspired co-creation posed challenges in maintaining the accuracy of original ICH workflows and preserving traditional flavors in the culinary outcomes. This study offers implications for designing human-AI collaborative processes for safeguarding and enhancing culinary ICH.2025SLSijia Liu et al.Generative AI (Text, Image, Music, Video)Participatory DesignFood Culture & Food InteractionC&C
Where is the Boundary? Understanding How People Recognize and Evaluate Generative AI-extended Videos
The rise of video generative models that produce high-quality content has made it increasingly difficult to discern video authenticity. AI-extended videos, which mix real-world footage with generative content, pose new challenges in distinguishing real from manipulated segments. AI-extended videos might be utilized to deceive humans, but they also have the capacity to assist video creators and offer people novel video experiences. Despite these concerns, research on how people recognize and evaluate AI-extended videos remains limited. To address this, we conducted a user study where participants interacted with AI-extended videos on a web-based system, identifying boundaries between raw and generated content, followed by a survey and one-on-one interviews. Our quantitative and qualitative analyses revealed how individuals perceive these videos, the factors influencing their perception, evaluations and attitudes. We believe that these insights will aid the future development of AI-extended video technologies and ecosystems.
2025 | Ke Wang et al. | The Chinese University of Hong Kong, Shenzhen, School of Science and Engineering | Generative AI (Text, Image, Music, Video); Explainable AI (XAI); Misinformation & Fact-Checking | CHI
SeQR: A User-Friendly and Secure-by-Design Configurator for Enterprise Wi-Fi
A classic problem in enterprise Wi-Fi is client-side misconfiguration, which enables credential theft via “Evil Twin” (ET) attacks. To mitigate this, we design, develop, and evaluate a new configurator, SeQR, which allows users to effortlessly and securely set up an enterprise Wi-Fi connection. Utilizing existing authenticated channels, SeQR fully automates the client-side enterprise Wi-Fi configuration process with a simple scan, leaving no room for misconfigurations. Specifically, SeQR thwarts ET attacks by making it impossible for users to opt out of the security-critical certificate validation. We evaluate the efficacy of SeQR on two fronts. First, we implement a prototype of SeQR in Android, and test its functionality and runtime performance. Next, we compare the usability of SeQR against two existing Wi-Fi configuration interfaces of Android in an in-person user study (n=41) with real devices. Our evaluation shows that SeQR achieves noticeable usability improvements over existing designs, and prevents users from misconfiguring.
2025 | S Mahmudul Hasan et al. | Syracuse University | Passwords & Authentication; Privacy Perception & Decision-Making; IoT Device Privacy | CHI
Designing Scaffolding Strategies for Conversational Agents in Dialog Task of Neurocognitive Disorders Screening
Regular screening is critical for individuals at risk of neurocognitive disorders (NCDs) to receive early intervention. Conversational agents (CAs) have been adopted to administer dialog-based NCD screening tests for their scalability compared to human-administered tests. However, unique communication skills are required for CAs during NCD screening, e.g., clinicians often apply scaffolding to ensure subjects’ understanding of and engagement in screening tests. Based on scaffolding theories and analysis of clinicians' practices from human-administered test recordings, we designed a scaffolding framework for the CA. In an exploratory wizard-of-Oz study, the CA empowered by ChatGPT administered tasks in the Grocery Shopping Dialog Task with 15 participants (10 diagnosed with NCDs). Clinical experts verified the quality of the CA's scaffolding and we explored its effects on task understanding of the participants. Moreover, we proposed implications for the future design of CAs that enable scaffolding for scalable NCD screening.
2024 | Jiaxiong Hu et al. | Hong Kong University of Science and Technology | Conversational Chatbots; Cognitive Impairment & Neurodiversity (Autism, ADHD, Dyslexia) | CHI
DeepTreeSketch: Neural Graph Prediction for Faithful 3D Tree Modeling from Sketches
We present DeepTreeSketch, a novel AI-assisted sketching system that enables users to create realistic 3D tree models from 2D freehand sketches. Our system leverages a tree graph prediction network, TGP-Net, to learn the underlying structural patterns of trees from a large collection of 3D tree models. The TGP-Net simulates the iterative growth of botanical trees and progressively constructs the 3D tree structures in a bottom-up manner. Furthermore, our system supports a flexible sketching mode for both precise and coarse control of the tree shapes by drawing branch strokes and foliage strokes, respectively. Combined with a procedural generation strategy, users can freely control the foliage propagation with diverse and fine details. We demonstrate the expressiveness, efficiency, and usability of our system through various experiments and user studies. Our system offers a practical tool for 3D tree creation, especially for natural scenes in games, movies, and landscape applications.
2024 | Zhihao Liu et al. | The University of Tokyo | 3D Modeling & Animation; Customizable & Personalized Objects | CHI
An Empathy-Based Sandbox Approach to Bridge the Privacy Gap among Attitudes, Goals, Knowledge, and Behaviors
Managing privacy to reach privacy goals is challenging, as evidenced by the privacy attitude-behavior gap. Mitigating this discrepancy requires solutions that account for both system opaqueness and users' hesitations in testing different privacy settings due to fears of unintended data exposure. We introduce an empathy-based approach that allows users to experience how privacy attributes may alter system outcomes in a risk-free sandbox environment from the perspective of artificially generated personas. To generate realistic personas, we introduce a novel pipeline that augments the outputs of large language models (e.g., GPT-4) using few-shot learning, contextualization, and chain of thoughts. Our empirical studies demonstrated the adequate quality of generated personas and highlighted the changes in privacy-related applications (e.g., online advertising) caused by different personas. Furthermore, users demonstrated cognitive and emotional empathy towards the personas when interacting with our sandbox. We offered design implications for downstream applications in improving user privacy literacy.
2024 | Chaoran Chen et al. | University of Notre Dame | Explainable AI (XAI); Algorithmic Transparency & Auditability; Privacy by Design & User Control | CHI
PepperPose: Full-Body Pose Estimation with a Companion Robot
Accurate full-body pose estimation across diverse actions in a user-friendly and location-agnostic manner paves the way for interactive applications in realms like sports, fitness, and healthcare. This task becomes challenging in real-world scenarios due to factors like the user's dynamic positioning, the diversity of actions, and the varying acceptability of the pose-capturing system. In this context, we present PepperPose, a novel companion robot system tailored for optimized pose estimation. Unlike traditional methods, PepperPose actively tracks the user and refines its viewpoint, facilitating enhanced pose accuracy across different locations and actions. This allows users to enjoy a seamless action-sensing experience. Our evaluation, involving 30 participants undertaking daily functioning and exercise actions in a home-like space, underscores the robot's promising capabilities. Moreover, we demonstrate the opportunities that PepperPose presents for human-robot interaction, its current limitations, and future developments.
2024 | Chongyang Wang et al. | Tsinghua University | Human Pose & Activity Recognition; Human-Robot Collaboration (HRC) | CHI
Altruistic and Profit-oriented: Making Sense of Roles in Web3 Community from Airdrop Perspective
In any community, incentivizing users is necessary for sustainable operations. In blockchain-backed Web3 communities, known for their transparency and security, the airdrop serves as a widespread incentive mechanism for allocating capital and power. However, how to justify airdrops as a means to incentivize and empower decentralized governance remains controversial. In this paper, we use ParaSwap as an example to propose a role taxonomy methodology through a data-driven study to understand the characteristics of community members and the effectiveness of airdrops. We find that users who receive more rewards tend to take positive actions towards the community. We summarize several arbitrage patterns and confirm that current detection is not sufficient to screen out airdrop hunters. Based on these results, we discuss the challenges and possible research directions for decentralized communities from the aspects of interaction, financialization, and system design.
2023 | Sizheng Fan et al. | The Chinese University of Hong Kong, Shenzhen | Misinformation & Fact-Checking; Empowerment of Marginalized Groups | CHI
Competent but Rigid: Identifying the Gap in Empowering AI to Participate Equally in Group Decision-Making
Existing research on human-AI collaborative decision-making focuses mainly on the interaction between AI and individual decision-makers. There is a limited understanding of how AI may perform in group decision-making. This paper presents a wizard-of-oz study in which two participants and an AI form a committee to rank three English essays. One novelty of our study is that we adopt a speculative design by endowing the AI with power equal to that of humans in group decision-making. We enable the AI to discuss and vote equally with other human members. We find that although the voice of AI is considered valuable, AI still plays a secondary role in the group because it cannot fully follow the dynamics of the discussion and make progressive contributions. Moreover, the divergent opinions of our participants regarding an "equal AI" shed light on the possible future of human-AI relations.
2023 | Chengbo Zheng et al. | Hong Kong University of Science and Technology | Human-LLM Collaboration; AI-Assisted Decision-Making & Automation | CHI
TalkTive: A Conversational Agent Using Backchannels to Engage Older Adults in Neurocognitive Disorders Screening
Conversational agents (CAs) have great potential to mitigate clinicians' burden in screening for neurocognitive disorders among older adults. It is important, therefore, to develop CAs that can be engaging, to elicit conversational speech input from older adult participants for supporting assessment of cognitive abilities. As an initial step, this paper presents research in developing the backchanneling ability in CAs in the form of a verbal response to engage the speaker. We analyzed 246 conversations of cognitive assessments between older adults and human assessors, and derived the categories of reactive backchannels (e.g., "hmm") and proactive backchannels (e.g., "please keep going"). These categories are used in the development of TalkTive, a CA which can predict both timing and form of backchanneling during cognitive assessments. The study then invited 36 older adult participants to evaluate the backchanneling feature. Results show that proactive backchanneling is more appreciated by participants than reactive backchanneling.
2022 | Zijian Ding et al. | University of Maryland | Intelligent Voice Assistants (Alexa, Siri, etc.); Agent Personality & Anthropomorphism; Cognitive Impairment & Neurodiversity (Autism, ADHD, Dyslexia) | CHI
"I can show what I really like.": Eliciting Preferences via Quadratic VotingSurveys are a common instrument to gauge self-reported opinions from the crowd for scholars in the CSCW community, the social sciences, and many other research areas. Researchers often use surveys to prioritize a subset of given options when there are resource constraints. Over the past century, researchers have developed a wide range of surveying techniques, including one of the most popular instruments, the Likert ordinal scale, to elicit individual preferences. However, the challenge to elicit accurate and rich self-reported responses with surveys in a resource-constrained context still persists today. In this study, we examine Quadratic Voting (QV), a voting mechanism powered by the affordances of a modern computer and straddles ratings and rankings approaches, as an alternative online survey technique. We argue that QV could elicit more accurate self-reported responses compared to the Likert scale when the goal is to understand relative preferences under resource constraints. We conducted two randomized controlled experiments on Amazon Mechanical Turk, one in the context of public opinion polling and the other in a human-computer interaction user study. Based on our Bayesian analysis results, a QV survey with a sufficient amount of voice credits, aligned significantly closer to participants' incentive-compatible behaviors than a Likert scale survey, with a medium to high effect size. In addition, we extended QV's application scenario from typical public policy and education research to a problem setting familiar to the CSCW community: a prototypical HCI user study. Our experiment results, QV survey design, and QV interface serve as a stepping stone for CSCW researchers to further explore this surveying methodology in their studies and encourage decision-makers from other communities to consider QV as a promising alternative.2021TCTi-Chung Cheng et al.Methods and Design ApproachesCSCW
SimpModeling: Sketching Implicit Field to Guide Mesh Modeling for 3D Animalmorphic Head Design
Head shapes play an important role in 3D character design. In this work, we propose SimpModeling, a novel sketch-based system for helping users, especially amateur users, easily model 3D animalmorphic heads, a prevalent kind of head in character design. Although sketching provides an easy way to depict desired shapes, it is challenging to infer dense geometric information from sparse line drawings. Recently, deepnet-based approaches have been taken to address this challenge and try to produce rich geometric details from very few strokes. However, while such methods reduce users' workload, they also reduce the controllability of target shapes. This is mainly due to the uncertainty of the neural prediction. Our system tackles this issue and provides good controllability from three aspects: 1) we separate coarse shape design and geometric detail specification into two stages and provide different sketching means for each; 2) in coarse shape designing, sketches are used for both shape inference and geometric constraints to determine global geometry, and in geometric detail crafting, sketches are used for carving surface details; 3) in both stages, we use advanced implicit-based shape inference methods, which have a strong ability to handle the domain gap between freehand sketches and synthetic ones used for training. Experimental results confirm the effectiveness of our method and the usability of our interactive system. We also contribute a dataset of high-quality 3D animal heads, which are manually created by artists.
2021 | Zhongjin Luo et al. | 3D Modeling & Animation; Laser Cutting & Digital Fabrication | UIST
Rapido: Prototyping Interactive AR Experiences through Programming by Demonstration
Programming by Demonstration (PbD) is a well-known technique that allows non-programmers to describe interactivity by performing examples of the expected behavior, but it has not been extensively explored for AR. We present Rapido, a novel early-stage prototyping tool to create fully interactive mobile AR prototypes from non-interactive video prototypes using PbD. In Rapido, designers use a mobile AR device to record a video prototype to capture context, sketch assets, and demonstrate interactions. They can demonstrate touch inputs, animation paths, and rules to, e.g., have a sketch follow the focus area of the device or the user's world-space touches. Simultaneously, a live website visualizes an editable overview of all the demonstrated examples and infers a state machine of the user flow. Our key contribution is a method that enables designers to turn a video prototype into an executable state machine through PbD. The designer switches between these representations to interactively refine the final interactive prototype. We illustrate the power of Rapido's approach by prototyping the main interactions of three popular AR mobile applications.
2021 | German Leiva et al. | AR Navigation & Context Awareness; Prototyping & User Testing | UIST
PTeacher: a Computer-Aided Personalized Pronunciation Training System with Exaggerated Audio-Visual Corrective Feedback
Second language (L2) English learners often find it difficult to improve their pronunciations due to the lack of expressive and personalized corrective feedback. In this paper, we present Pronunciation Teacher (PTeacher), a Computer-Aided Pronunciation Training (CAPT) system that provides personalized exaggerated audio-visual corrective feedback for mispronunciations. Though the effectiveness of exaggerated feedback has been demonstrated, it is still unclear how to define the appropriate degrees of exaggeration when interacting with individual learners. To fill in this gap, we interview 100 L2 English learners and 22 professional native teachers to understand their needs and experiences. Three critical metrics are proposed for both learners and teachers to identify the best exaggeration levels in both audio and visual modalities. Additionally, we incorporate the personalized dynamic feedback mechanism given the English proficiency of learners. Based on the obtained insights, a comprehensive interactive pronunciation training course is designed to help L2 learners rectify mispronunciations in a more perceptible, understandable, and discriminative manner. Extensive user studies demonstrate that our system significantly promotes the learners' learning efficiency.
2021 | Yaohua Bu et al. | Tsinghua University | Voice User Interface (VUI) Design; Conversational Chatbots; Special Education Technology | CHI
GrabAR: Occlusion-aware Grabbing Virtual Objects in AR
Existing augmented reality (AR) applications often ignore the occlusion between real hands and virtual objects when incorporating virtual objects in users' views. The challenges come from the lack of accurate depth and the mismatch between real and virtual depth. This paper presents GrabAR, a new approach that directly predicts the real-and-virtual occlusion and bypasses the depth acquisition and inference. Our goal is to enhance AR applications with interactions between hand (real) and grabbable objects (virtual). With paired images of hand and object as inputs, we formulate a compact deep neural network that learns to generate the occlusion mask. To train the network, we compile a large dataset, including synthetic data and real data. We then embed the trained network in a prototyping AR system to support real-time grabbing of virtual objects. Further, we demonstrate the performance of our method on various virtual objects, compare our method with others through two user studies, and showcase a rich variety of interaction scenarios, in which we can use a bare hand to grab virtual objects and directly manipulate them.
2020 | Xiao TANG et al. | Full-Body Interaction & Embodied Input; AR Navigation & Context Awareness | UIST
Connecting Distributed Families: Camera Work for Three-party Mobile Video Calls
Mobile video calling technologies have become a critical link to connect distributed families. However, these technologies have been principally designed for video calling between two parties, whereas family video calls involving young children often comprise three parties, namely a co-present adult (a parent or grandparent) helping with the interaction between the child and another remote adult. We examine how manipulation of phone cameras and management of co-present children are used to stage parent-child interactions. We present results from a video-ethnographic study based on 40 video recordings of video calls between 'left-behind' children and their migrant parents in China. Our analysis reveals a key practice of 'facilitation work', performed by grandparents, as a crucial feature of three-party calls. Facilitation work offers a new concept for HCI's broader conceptualisation of mobile video calling, suggesting revisions that design might take into consideration for triadic interactions in general.
2020 | Yumei Gan et al. | The Chinese University of Hong Kong | Remote Work Tools & Experience; Teleoperation & Telepresence | CHI