Improving Low-Vision Chart Accessibility via On-Cursor Visual Context
Despite widespread use, charts remain largely inaccessible for Low-Vision Individuals (LVI). Reading charts requires viewing data points within a global context, which is difficult for LVI who may rely on magnification or experience a partial field of vision. We aim to improve exploration by providing visual access to critical context. To inform this, we conducted a formative study with five LVI. We identified four fundamental contextual elements common across chart types: axes, legend, grid lines, and the overview. We propose two pointer-based interaction methods to provide this context: Dynamic Context, a novel focus+context interaction, and Mini-map, which adapts overview+detail principles for LVI. In a study with N=22 LVI, we compared both methods and evaluated their integration into current tools. Our results show that Dynamic Context had a significant positive impact on access, usability, and effort reduction; however, it worsened visual load. Mini-map strengthened spatial understanding, but was less preferred for this task. We offer design insights to guide the development of future systems that support LVI with visual context while balancing visual load.
2026 | Yotam Sechayk et al. | The University of Tokyo | Visual Impairment Technologies (Screen Readers, Tactile Graphics, Braille); Interactive Data Visualization; Uncertainty Visualization | CHI

Peeking Ahead of the Field Study: Exploring VLM Personas as Support Tools for Embodied Studies in HCI
Field studies are irreplaceable but costly, time-consuming, and error-prone, and they need careful preparation. Inspired by rapid prototyping in manufacturing, we propose a fast, low-cost evaluation method using Vision-Language Model (VLM) personas to simulate outcomes comparable to field results. While LLMs show human-like reasoning and language capabilities, autonomous vehicle (AV)-pedestrian interaction requires spatial awareness, emotional empathy, and behavioral generation. This raises our research question: To what extent can VLM personas mimic human responses in field studies? We conducted parallel studies: 1) one real-world study with 20 participants, and 2) one video study using 20 VLM personas, both on a street-crossing task. We compared their responses and interviewed five HCI researchers on potential applications. Results show that VLM personas mimic human response patterns (e.g., average crossing times of 5.25 s vs. 5.07 s) but lack behavioral variability and depth. They show promise for formative studies, field study preparation, and human data augmentation.
2026 | Xinyue Gui et al. | The University of Tokyo | Automated Driving Interface & Takeover Design; External HMI (eHMI): Communication with Pedestrians & Cyclists; User Research Methods (Interviews, Surveys, Observation) | CHI

Don't Worry, Just Follow Me: Prototyping and In-the-Wild Evaluation of Smart Pole Interaction Unit with Mobility
Pedestrian–automated vehicle (AV) encounters in shared spaces often involve hesitation and ambiguity. Vehicle-mounted external human–machine interfaces (eHMIs) can help, but obscured or poorly timed communications create significant challenges. To address this, we present a mobile smart pole interaction unit (SPIU) with integrated cameras and LED displays, designed as a pedestrian-side system to deliver explicit cues ("WALK," "STOP"). An in-the-wild evaluation of the SPIU (N=21) using a four-factor analysis (CarBehavior, Mobility, eHMI, SPIU) showed that the SPIU improved understandability, trust, and perceived safety, and reduced workload compared with the baseline, with the combination (eHMI+SPIU) yielding the strongest results. Beyond these quantitative benefits, participants appreciated the mobility of the SPIU for its "clear" and "easy to decide" mediation. This work contributes (1) a design and deployment framework for a mobile SPIU and (2) an in-the-wild evaluation protocol for pedestrian–AV interactions in nonsignalized spaces. Our work sparks discussions on real-world evaluations involving detailed vehicle kinematics and accessible multimodality (e.g., audio), focusing on the role of personal robots as user-side eHMIs.
2026 | Vishal Chauhan et al. | The University of Tokyo | External HMI (eHMI): Communication with Pedestrians & Cyclists; Social Robot Interaction; Teleoperation & Telepresence | CHI

No Pixel Left Behind: Filling Gaps in Anime Colorization
Animation production workflows often involve digital colorization of line art, where small unpainted regions ("gaps") frequently occur and remain an underexplored challenge. We conducted a formative study in Japanese animation (anime) pipelines and found that while the paint bucket tool is widely used for base coloring, tiny enclosed areas are frequently overlooked, resulting in time-consuming manual detection and filling. We introduce GapFill, a tool grounded in professional practices that reduces the effort of gap detection, zooming, and color selection. Our deep-learning method suggests appropriate fill colors by referencing surrounding regions, leveraging the flat-color nature of anime-style images. In a user study with 13 professional colorists, our system improved performance and usability in gap-filling tasks over conventional methods. The study also suggested that prediction accuracy alone is not the primary factor for usability, that appropriate colors can be contextually ambiguous, and that GapFill can complement existing tools depending on users' trust in new AI-powered assistance.
2026 | Masahiro Kono et al. | The University of Tokyo | Generative AI (Text, Image, Music, Video); Creative Collaboration & Feedback Systems; Graphic Design & Typography Tools | CHI

skCAD: A Design Tool for Solid Knitting with Automatic Pattern Generation
Solid knitting is a fabrication technique for producing dense 3D volumes through knitting. Designing such objects is difficult because yarn paths must remain continuous (formed from one or a few yarns) while handling increases and decreases across three dimensions. To address this challenge, we introduce skCAD, a block-based design tool that allows users to compose 3D forms by stacking rectangular blocks, which are then automatically converted into solid-knitting patterns. To build skCAD, we also standardized a grammar of solid knitting that formalizes stitch and row operations, making it possible to construct patterns beyond basic shapes. Our tool and grammar enable the creation of complex solid-knitted objects and help those interested learn and explore this technique. We evaluated the system in a workshop with knitters, yielding insights into design needs and directions for future development.
2026 | Yuichi Hirose et al. | Carnegie Mellon University | Shape-Changing Interfaces & Soft Robotic Materials; Desktop 3D Printing & Personal Fabrication | CHI

MiniMates: Miniature Avatars for AR Remote Meetings within Limited Physical Spaces
Remote meetings using 3D avatars in Augmented Reality (AR) allow effective communication and enable users to retain awareness of their surroundings. However, positioning 3D avatars effectively and consistently for all users in AR is challenging since most spaces, such as offices or living rooms, are not large enough to accommodate multiple life-sized avatars without interference. To address this issue, we contribute MiniMates, a novel approach leveraging miniature avatars, which make it possible to place multiple remote users in a limited physical space. We see MiniMates as complementary to traditional 2D video conferencing and immersive telepresence. Our approach automatically adjusts the formation of avatars and redirects users' head and body orientation to facilitate communication. Results from our user study (n = 24) show that participants experience a higher sense of co-presence compared to video conferencing, and that MiniMates enabled them to communicate the direction of their interactions non-verbally as well as manage multiple simultaneous conversations.
2025 | Akihiro Kiuchi et al. | The University of Tokyo | Social & Collaborative VR; Mixed Reality Workspaces; Context-Aware Computing | CHI

SpineLoft: Interactive Spine-based 2D-to-3D Modeling
3D artists (professionals and novices alike) often take inspiration from sketches or photos to guide their designs. Yet, existing modeling systems are not tailored to fully make use of such input. Consequently, significant effort and expertise are needed when creating model prototypes or exploring design options. In this work, we introduce a system to support the exploratory modeling process by enabling the transformation of 2D image elements into geometric 3D objects. Our solution relies on a novel d2 distance function, supporting a region-based lofting process, and delivers easily editable 3D geometric "spine-rib" representations. The user draws a spine, and the system generates and modifies a generalized cylinder around it, considering image edges. The proposed approach, driven by simple user-defined scribble definitions, can robustly handle various image sources, ranging from photos to hand-drawn content.
2025 | Alexandre Thiault et al. | Institut Polytechnique de Paris, Telecom Paris | 3D Modeling & Animation; Customizable & Personalized Objects | CHI

FontCraft: Multimodal Font Design Using Interactive Bayesian Optimization
Creating new fonts requires a lot of human effort and professional typographic knowledge. Despite the rapid advancements of automatic font generation models, existing methods require users to prepare pre-designed characters with target styles using font-editing software, which poses a problem for non-expert users. To address this limitation, we propose FontCraft, a system that enables font generation without relying on pre-designed characters. Our approach integrates the exploration of a font-style latent space with human-in-the-loop preferential Bayesian optimization and multimodal references, facilitating efficient exploration and enhancing user control. Moreover, FontCraft allows users to revisit previous designs, retracting their earlier choices in the preferential Bayesian optimization process. Once users finish editing the style of a selected character, they can propagate it to the remaining characters and further refine them as needed. The system then generates a complete outline font in OpenType format. We evaluated the effectiveness of FontCraft through a user study comparing it to a baseline interface. Results from both quantitative and qualitative evaluations demonstrate that FontCraft enables non-expert users to design fonts efficiently.
2025 | Yuki Tatsukawa et al. | The University of Tokyo, Igarashi Lab | Graphic Design & Typography Tools; Customizable & Personalized Objects | CHI

XR-penter: Material-Aware and In Situ Design of Scrap Wood Assemblies
Woodworkers have to navigate multiple considerations when planning a project, including available resources, skill level, and intended effort. Do-it-yourself (DIY) woodworkers face these challenges most acutely because of tight material constraints and a desire for custom designs tailored to specific spaces. To address these needs, we present XR-penter, an extended reality (XR) application that supports in situ, material-aware woodworking for casual makers. Our system enables users to design virtual scrap wood assemblies directly in their workspace, encouraging sustainable practices through the use of discarded materials. Users register physical material as virtual twins, manipulate these twins into an assembly in XR (while receiving feedback on material usage and alignment with their surroundings), and preview cuts needed for fabrication. We conducted a case study and feedback sessions demonstrating that XR-penter supports improvisational workflows in practice, and found that woodworkers who prioritize material-driven and adaptive workflows would benefit most from our system.
2025 | Ramya Iyer et al. | Georgia Institute of Technology | Mixed Reality Workspaces; Shape-Changing Materials & 4D Printing | CHI

Draw2Cut: Direct On-Material Annotations for CNC Milling
Creating custom artifacts with computer numerical control (CNC) milling machines typically requires mastery of complex computer-aided design (CAD) software. To lower this barrier, we introduce Draw2Cut, a novel system that allows users to design and fabricate artifacts by sketching directly on physical materials. Draw2Cut employs a custom drawing language to convert user-drawn lines, symbols, and colors into toolpaths, thereby enabling users to express their creative intent intuitively. The key features include real-time alignment between material and virtual toolpaths, a preview interface for validation, and an open-source platform for customization. Through technical evaluations and user studies, we demonstrate that Draw2Cut lowers the entry barrier for personal fabrication, enabling novices to create customized artifacts with precision and ease. Our findings highlight the potential of the system to enhance creativity, engagement, and accessibility in CNC-based woodworking.
2025 | Xinyue Gui et al. | The University of Tokyo | Desktop 3D Printing & Personal Fabrication; Customizable & Personalized Objects | CHI

CompAct: Designing Interconnected Compliant Mechanisms with Targeted Actuation Transmissions
Compliant mechanisms enable the creation of compact and easy-to-fabricate devices for tangible interaction. This work explores interconnected compliant mechanisms consisting of multiple joints and rigid bodies to transmit and process displacements as signals that result from physical interactions. As these devices are difficult to design due to their vast and complex design space, we developed a graph-based design algorithm and computational tool to help users program and customize such computational functions and procedurally model physical designs. When combined with active materials with actuation and sensing capabilities, these devices can also render and detect haptic interaction. Our design examples demonstrate the tool's capability to respond to relevant HCI concepts, including building modular physical interface toolkits, encrypting tangible interactions, and customizing user augmentation for accessibility. We believe the tool will facilitate the generation of new interfaces with enriched affordance.
2025 | Humphrey Yang et al. | Carnegie Mellon University, Human-Computer Interaction Institute | Shape-Changing Interfaces & Soft Robotic Materials; Customizable & Personalized Objects | CHI

Proactive Conversational Agents with Inner Thoughts
One of the long-standing aspirations in conversational AI is to enable agents to autonomously take the initiative in conversations, i.e., to be proactive. This is especially challenging for multi-party conversations. Prior NLP research focused mainly on predicting the next speaker from contexts like preceding conversations. In this paper, we demonstrate the limitations of such methods and rethink what it means for AI to be proactive in multi-party, human-AI conversations. We propose that, just like humans, rather than merely reacting to turn-taking cues, a proactive AI formulates its own inner thoughts during a conversation and seeks the right moment to contribute. Through a formative study with 24 participants and inspiration from linguistics and cognitive psychology, we introduce the Inner Thoughts framework. Our framework equips AI with a continuous, covert train of thoughts in parallel to the overt communication process, which enables it to proactively engage by modeling its intrinsic motivation to express these thoughts. We instantiated this framework into two real-time systems: an AI playground web app and a chatbot. In a technical evaluation and user studies with human participants, our framework significantly surpassed existing baselines on aspects like anthropomorphism, coherence, intelligence, and turn-taking appropriateness.
2025 | Xingyu "Bruce" Liu et al. | UCLA, HCI Research | Conversational Chatbots; Agent Personality & Anthropomorphism; Human-LLM Collaboration | CHI

Shrinkable Arm-based eHMI on Autonomous Delivery Vehicle for Effective Communication with Other Road Users
When employing autonomous driving technology in logistics, small autonomous delivery vehicles (aka delivery robots) encounter challenges different from those of passenger vehicles when interacting with other road users. We conducted an online video survey as a pre-study and found that autonomous delivery vehicles need external human-machine interfaces (eHMIs) to ask for help due to their small size and functional limitations. Inspired by everyday human communication, we chose arms as the eHMI to convey the vehicles' requests through limb motion and gesture. We held an in-house workshop to identify the requirements for designing an arm with shrink-ability (conspicuous when delivering messages but not affecting traffic at other times). We prototyped a small delivery robot with a shrinkable arm and filmed the experiment videos. We conducted two studies (one video-based and one 360-degree-photo VR-based) with 18 participants. We demonstrated that arms on delivery robots can increase interaction efficiency by drawing more attention and communicating specific information.
2024 | Xinyue Gui et al. | External HMI (eHMI): Communication with Pedestrians & Cyclists | AutoUI

MR Microsurgical Suture Training System with Level-Appropriate Support
The integration of advanced technologies in healthcare necessitates the development of systems that accommodate the daily routines of medical practice. Neurosurgeons, in particular, require extensive long-term practice in microsurgical suturing, even amid the busy routine of a medical practice. In this study, we developed a Mixed Reality system in collaboration with neurosurgeons to support self-training in microscopic suturing. Based on the neurosurgeons' opinions, we implemented a level-appropriate microsurgical suture training system. For novices, the system offers shadow-matching training to support the practice of precise movements under the high-sensitivity environment of the microscope. For intermediates, it provides a real-time feedback system, which allows users to practice attention to detail. Evaluation involved testing the novice system on students with no medical background and the intermediate system on neurosurgery residents. The effectiveness of the system was demonstrated through the experimental results and subsequent discussion.
2024 | Yuka Tashiro et al. | Tokyo Institute of Technology | Mixed Reality Workspaces; VR Medical Training & Rehabilitation; Robots in Education & Healthcare | CHI

iPose: Interactive Human Pose Reconstruction from Video
Reconstructing 3D human poses from video has wide applications, such as character animation and sports analysis. Automatic 3D pose reconstruction methods have demonstrated promising results, but failure cases can still appear due to the diversity of human actions, capturing conditions, and depth ambiguities. Thus, manual intervention remains indispensable, which can be time-consuming and require professional skills. We thus present iPose, an interactive tool that facilitates intuitive human pose reconstruction from a given video. Our tool incorporates both human perception in specifying pose appearance to achieve controllability, and video frame processing algorithms to achieve precision and automation. A user manipulates the projection of a 3D pose via 2D operations on top of video frames, and the 3D poses are updated correspondingly while satisfying both kinematic and video frame constraints. The pose updates are propagated temporally to reduce user workload. We evaluate the effectiveness of iPose with a user study on the 3DPW dataset and expert interviews.
2024 | Jingyuan Liu et al. | The University of Tokyo | Human Pose & Activity Recognition; 3D Modeling & Animation | CHI

SyncLabeling: A Synchronized Audio Segmentation Interface for Mobile Devices
Manual audio segmentation is a time-consuming process, especially when there is more than one sound playing simultaneously that needs to be segmented and annotated (e.g., target and background sounds). In conventional audio annotation interfaces, users need to repeatedly pause and replay the audio to complete an overlap segmentation task, which is very inefficient. In this paper, we propose “SyncLabeling,” a synchronized audio segmentation interface for smartphones that allows users to segment and annotate two overlapping sounds in a single audio stream at a time using a game-like labeling interface on mobile devices. We conducted a user study to compare the proposed SyncLabeling interface with a conventional audio annotation interface on four types of audio segmentation tasks. The results showed that the proposed interface is much more efficient than the conventional interface (2.4× faster) under comparable annotation accuracy in most tasks. In addition, more than half of the participants enjoyed using the proposed SyncLabeling interface and showed willingness to use it.
2023 | Yi Tang et al. | Gamification Design | MobileHCI

ODEN: Live Programming for Neural Network Architecture Editing
In deep learning application development, programmers tend to try different architectures and hyper-parameters until they are satisfied with the model performance. Although programmers may want to smoothly go back and forth between neural network (NN) architecture editing and experimentation, program crashes due to tensor shape mismatch and other issues prevent them, especially novice programmers, from doing so. We propose to leverage live programming techniques in NN architecture editing to show an always-on visualization. When the user edits the program, the visualization synchronously displays tensor states and provides warning messages by continuously executing the program, which helps prevent program crashes during experimentation. We implement the live visualization and integrate it into an IDE called ODEN that seamlessly supports the “edit→experiment→edit→···” repetition. With ODEN, the user can construct the neural network with the live visualization and transition into experimentation to instantly train and test the NN architecture. An exploratory user study is conducted to evaluate the usability, the limitations, and the potential of live visualization in ODEN.
2022 | Chunqi Zhao et al. | Prototyping & User Testing; Computational Methods in HCI | IUI

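The tensor shape mismatch mentioned in the ODEN abstract can be illustrated with a small shape-propagation check. This is only a sketch under stated assumptions, not ODEN's implementation: the `Linear` layer spec and the `check_shapes` function are hypothetical names, and only fully connected layers on `(batch, features)` inputs are modeled. It collects warnings instead of raising, in the spirit of showing a message while the user edits.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Linear:
    """Hypothetical layer spec: maps (batch, in_features) to (batch, out_features)."""
    in_features: int
    out_features: int


def check_shapes(input_shape: Tuple[int, int], layers: List[Linear]) -> List[str]:
    """Propagate the feature dimension through the layers and collect warnings.

    Returns one warning string per mismatching layer instead of raising,
    so an editor could display them without the program crashing.
    """
    warnings: List[str] = []
    _batch, features = input_shape
    for index, layer in enumerate(layers):
        if features != layer.in_features:
            warnings.append(
                f"layer {index}: expected {layer.in_features} input features, got {features}"
            )
        # Continue with the declared output size so later layers are still checked.
        features = layer.out_features
    return warnings


if __name__ == "__main__":
    model = [Linear(784, 128), Linear(64, 10)]  # second layer is mis-sized on purpose
    for message in check_shapes((32, 784), model):
        print(message)
```

Continuing past a mismatch with the declared output size is a design choice of this sketch: it lets one pass report every inconsistent layer rather than stopping at the first.
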
Per Garment Capture and Synthesis for Real-time Virtual Try-on
Virtual try-on is a promising application of computer graphics and human-computer interaction that can have a profound real-world impact, especially during this pandemic. Existing image-based works try to synthesize a try-on image from a single image of a target garment, but this inherently limits the ability to react to possible interactions. It is difficult to reproduce the change of wrinkles caused by pose and body size change, as well as pulling and stretching of the garment by hand. In this paper, we propose an alternative per garment capture and synthesis workflow to handle such rich interactions by training the model with many systematically captured images. Our workflow is composed of two parts: garment capturing and clothed person image synthesis. We designed an actuated mannequin and an efficient capturing process that collects the detailed deformations of the target garments under diverse body sizes and poses. Furthermore, we propose to use a custom-designed measurement garment, and we captured paired images of the measurement garment and the target garments. We then learn a mapping between the measurement garment and the target garments using deep image-to-image translation. The customer can then try on the target garments interactively during online shopping. The proposed workflow requires certain manual labor, but we believe that the cost is acceptable given that retailers are already paying significant costs for hiring professional photographers, models, stylists, and editors to take photographs for promotion. Our method can remove the need to hire these costly professionals. We evaluated the effectiveness of the proposed system with ablation studies and a quality comparison with previous virtual try-on methods. We performed a user study to demonstrate the virtual try-on performance. Moreover, we demonstrate the use of our method for changing virtual costumes in video conferences. Finally, we provide the collected dataset as a cloth dataset parameterized by various viewing angles, body poses, and sizes.
2021 | Toby Chong et al. | Full-Body Interaction & Embodied Input; AR Navigation & Context Awareness; Mixed Reality Workspaces | UIST

Interactive Hyperparameter Optimization with Paintable Timelines
We propose a method to integrate more interactivity into automatic hyperparameter optimization systems to leverage the user's prior knowledge of the parameter distribution. In our method, the user continuously observes the progress of the automatic optimization and dynamically specifies where to search in the parameter space. We present a prototype implementation of an interactive dashboard for an optimizer to show our method's feasibility. The interactive dashboard's main feature is a "paintable timeline," where the user can not only observe the past parameter values tested, as in a standard timeline, but also specify the range of future parameters to be tested with simple painting operations. We show three examples where user intervention might improve the performance of automatic optimizations. We ran a user study with experts, and the results show that, with prior knowledge about the parameter distribution of the target problem, interactive optimization can reach better results compared to fully automatic optimization.
2021 | Keita Higuchi et al. | Human-LLM Collaboration; AutoML Interfaces | DIS

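The idea of a user specifying the range of future parameters during an ongoing search can be sketched with plain random search over one parameter. This is an illustration under stated assumptions, not the paper's optimizer or dashboard: `search_with_painted_ranges` and the `painted` mapping (trial index to a new range that applies from that trial onward) are hypothetical stand-ins for the painting interaction, and uniform random sampling replaces whatever optimizer the prototype uses.

```python
import random
from typing import Callable, Dict, List, Tuple

Range = Tuple[float, float]


def search_with_painted_ranges(
    objective: Callable[[float], float],
    initial_range: Range,
    painted: Dict[int, Range],
    n_trials: int,
    seed: int = 0,
) -> List[Tuple[float, float]]:
    """Random search whose sampling range can be narrowed mid-run.

    `painted` maps a trial index to the range the (hypothetical) user
    painted; that range is used from that trial until the next change.
    Returns the history of (parameter value, objective value) pairs.
    """
    rng = random.Random(seed)
    low, high = initial_range
    history: List[Tuple[float, float]] = []
    for trial in range(n_trials):
        if trial in painted:
            low, high = painted[trial]  # the user intervenes before this trial
        value = rng.uniform(low, high)
        history.append((value, objective(value)))
    return history


if __name__ == "__main__":
    # Toy objective with its minimum at x = 2; the user narrows the range at trial 5.
    history = search_with_painted_ranges(
        objective=lambda x: (x - 2.0) ** 2,
        initial_range=(-10.0, 10.0),
        painted={5: (1.0, 3.0)},
        n_trials=10,
    )
    best_value, best_score = min(history, key=lambda pair: pair[1])
    print(f"best x = {best_value:.3f}, loss = {best_score:.3f}")
```
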
Data-centric disambiguation for data transformation with programming-by-example
Programming-by-example (PBE) can be a powerful tool to reduce manual work in repetitive data transformation tasks. However, a small number of examples often leaves ambiguity and may cause undesirable data transformation by the system. This ambiguity can be resolved by allowing the user to directly edit the synthesized programs; however, this is difficult for non-programmers. Here, we present a novel approach: data-centric disambiguation for data transformation, where users resolve the ambiguity in data transformation by examining and modifying the output rather than the program. The key idea is to focus on the given set of data the user wants to transform instead of pursuing the synthesized program's generality or completeness. Our system provides visualization and interaction methods that allow users to efficiently examine and fix the transformed outputs, which is much simpler than understanding and modifying the program itself. The user study suggests that our system can successfully help non-programmers to more easily and efficiently process data.
2021 | Minori Narita et al. | Interactive Data Visualization | IUI

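The ambiguity described in this abstract can be made concrete with a toy example: two candidate programs agree on the single example a user gives, yet disagree on other rows of the data. The sketch below is an assumption-laden illustration rather than the paper's system; the candidate set and the function names `consistent` and `ambiguous_rows` are hypothetical. It flags only the rows where the consistent candidates disagree, so a user could inspect outputs instead of reading programs.

```python
from typing import Callable, Dict, List, Tuple

Program = Callable[[str], str]

# Hypothetical candidate programs a synthesizer might propose.
CANDIDATES: Dict[str, Program] = {
    "first_four_chars": lambda s: s[:4],
    "text_before_first_dash": lambda s: s.split("-")[0],
}


def consistent(examples: List[Tuple[str, str]]) -> Dict[str, Program]:
    """Keep only the candidates that reproduce every input/output example."""
    return {
        name: program
        for name, program in CANDIDATES.items()
        if all(program(source) == target for source, target in examples)
    }


def ambiguous_rows(
    data: List[str], programs: Dict[str, Program]
) -> List[Tuple[str, Dict[str, str]]]:
    """Return the rows on which the remaining candidates produce different outputs."""
    rows: List[Tuple[str, Dict[str, str]]] = []
    for value in data:
        outputs = {name: program(value) for name, program in programs.items()}
        if len(set(outputs.values())) > 1:
            rows.append((value, outputs))
    return rows


if __name__ == "__main__":
    programs = consistent([("2021-03-04", "2021")])  # one example, two explanations
    for value, outputs in ambiguous_rows(["2020-01-01", "99-12-31"], programs):
        print(value, outputs)
```

Restricting attention to the rows that actually disagree mirrors the abstract's point that the goal is a correct result on the given data, not a fully general program.
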