Making Multimodal LLMs Reliable Chart Data Extractors: A Benchmark and Training Framework
Chart data extraction, which reverse-engineers data tables from chart images, is essential for reproducibility, analysis, retrieval, and redesign. Existing interactive tools are reliable but tedious, and mixed-initiative systems, while more efficient, lack generalizability. Recent multimodal large language models (MLLMs) offer a unified interface for chart interpretation, yet their ability to extract accurate data tables, especially without visible labels, remains unclear. We build a benchmark featuring diverse real-world charts without data labels to evaluate this capability. Results show that, while current MLLMs reliably reconstruct table structures, they struggle with precise value recovery. To address this, we revisit chart data extraction from a human-centered perspective and argue that extraction should follow a progressive learning process similar to how people read charts. Our training framework substantially improves numerical accuracy, achieving state-of-the-art performance with a 7B-parameter model. A user study further shows that our model effectively supports mixed-initiative workflows for reliable chart data extraction.
2026 | Yuchen He et al. | Zhejiang University | Generative AI (Text, Image, Music, Video) | Human-LLM Collaboration | Interactive Data Visualization | CHI

Cerebra: Aligning Implicit Knowledge in Interactive SQL Authoring
LLM-driven tools have significantly lowered barriers to writing SQL queries. However, user instructions are often underspecified, assuming the model understands implicit knowledge, such as dataset schemas, domain conventions, and task-specific requirements, that isn't explicitly provided. This results in frequently erroneous scripts that require users to repeatedly clarify their intent. Additionally, users struggle to validate generated scripts because they cannot verify whether the model correctly applied implicit knowledge. We present Cerebra, an interactive NL-to-SQL tool that aligns implicit knowledge between users and LLMs during SQL authoring. Cerebra automatically retrieves implicit knowledge from historical SQL scripts based on user instructions, presents this knowledge in an interactive tree view for code review, and supports iterative refinement to improve generated scripts. To evaluate the effectiveness and usability of Cerebra, we conducted a user study with 16 participants, demonstrating its improved support for customized SQL authoring. The source code of Cerebra is available at https://github.com/zjuidg/CHI26-Cerebra.
2026 | Yunfan Zhou et al. | Zhejiang University | Human-LLM Collaboration | Explainable AI (XAI) | Computational Methods in HCI | CHI

Visualizing Tree-of-analysis: Facilitating Conversational Visual Analytics for Novices
Conversational visual analytics (CVA) makes data exploration accessible to novices but often leaves users disoriented during multi-turn conversations. Previous approaches provide data-centric recommendations but fail to help users regain their orientation. To bridge this gap, we conducted a formative study (N=12) revealing that novices are insensitive to analytical cues and rely on vague queries, leading to disorientation and task failures. In contrast, experts are sensitive to two types of analytical cues and use seven types of queries to organize workflows. Based on these findings, we propose ToA, a novel approach that structures the CVA process as an interactive analysis tree. Moreover, we visualize this tree, with AI outputs as nodes (containing two cue types) and user queries as edges (categorized by seven query types), to provide novices with an overview of their analysis journey. We evaluated ToA through user studies (N=12) and expert interviews (N=3). The results suggest that ToA eliminates task failure and increases per-turn insights (+58.3%), despite longer per-turn thinking time (+17.7%). Expert interviews further confirm its potential to democratize visual analytics.
2026 | Feiyuan Qu et al. | Zhejiang University | Interactive Data Visualization | Exploratory Search & Information Seeking | Explainable AI (XAI) | CHI

NoteFlow: Leveraging Charts as Sight Glasses for Consistent and Continuous Data Flow Tracing
Computational notebooks offer a flexible environment for exploratory data analysis (EDA), but this flexibility often leads to disorganized and iterative execution of notebook cells, making it difficult to track how data states evolve. Consequently, data scientists must devote extra mental effort to staying aware of data states, which is both tedious and prone to overlooking anomalies. To address this challenge, we developed NoteFlow, a notebook extension that leverages charts as "sight glasses" to provide a consistent and continuous tracing of data flow. NoteFlow allows users to (1) validate various facets of the current data state using recommended charts provided immediately after each cell execution, and (2) trace the global evolution of selected charts to continuously observe how particular data attributes evolve throughout the EDA process. We evaluated NoteFlow's effectiveness through a controlled study with 12 participants and a one-month field study with 2 data scientists on real-world workflows.
2026 | Yuan Tian et al. | Zhejiang University | Interactive Data Visualization | Data-Driven Personal Decision-Making | User Research Methods (Interviews, Surveys, Observation) | CHI

TSEditor: Interactive Time Series Editing for Privacy Preservation
Publishing time series datasets raises substantial privacy concerns, as the underlying patterns (e.g., trends, values) can lead to the disclosure of individual identities. Mitigating these concerns remains challenging due to difficulties in pinpointing specific privacy-leaking patterns and protecting them without significantly compromising the analytical utility of the published data. Existing methods remain vulnerable to identity attacks utilizing diverse temporal patterns and may compromise data utility for subsequent analytical tasks. To address these limitations, we collaborated with domain experts to summarize a taxonomy of privacy risks in time series data and developed TSEditor, an interactive editing system. TSEditor integrates coordinated views for multi-perspective analysis of privacy risks and introduces six editing operations for targeted modifications, providing visual feedback. We demonstrate the effectiveness and usability of TSEditor through two case studies, an expert interview, a model evaluation, and a user study.
2026 | Zihan Xu et al. | Zhejiang University | Privacy Perception & Decision-Making | Interactive Data Visualization | Explainable AI (XAI) | CHI

From Sports Videos to Immersive Training: Augmenting Human Motion to Enrich Basketball Training Experience
Video plays a crucial role in sports training, enabling participants to analyze their movements and identify opponents' weaknesses. Despite the easy access to sports videos, the rich motion data within them remains underutilized due to the lack of clear performance indicators and discrepancies from real-game conditions. To address this, we employed advanced computer vision algorithms to reconstruct human motions in an immersive environment, where users can freely observe and interact with the movements. Basketball shooting was chosen as a representative scenario to validate this framework, given its fast pace and extensive physical contact. Collaborating with experts, we iteratively designed motion-related visualizations to improve the understanding of complex movements. A one-on-one matchup simulating real games was also provided, allowing users to compete directly with the reconstructed motions. Our user studies demonstrate that this method enhances participants' movement comprehension and engagement, while insights derived from interviews inform future immersive training designs.
2025 | Yihong Wu et al. | Full-Body Interaction & Embodied Input | Human Pose & Activity Recognition | UIST

ViseGPT: Towards Better Alignment of LLM-generated Data Wrangling Scripts and User Prompts
Large language models (LLMs) enable the rapid generation of data wrangling scripts based on natural language instructions, but these scripts may not fully adhere to user-specified requirements, necessitating careful inspection and iterative refinement. Existing approaches primarily assist users in understanding script logic and spotting potential issues themselves, rather than providing direct validation of correctness. To enhance debugging efficiency and optimize the user experience, we develop ViseGPT, a tool that automatically extracts constraints from user prompts to generate comprehensive test cases for verifying script reliability. The test results are then transformed into a tailored Gantt chart, allowing users to intuitively assess alignment with semantic requirements and iteratively refine their scripts. Our design decisions are informed by a formative study (N=8) that explores user practices and challenges. We further evaluate the effectiveness and usability of ViseGPT through a user study (N=18). Results indicate that ViseGPT significantly improves debugging efficiency for LLM-generated data-wrangling scripts, enhances users’ ability to detect and correct issues, and streamlines the workflow experience.
2025 | Jiajun Zhu et al. | Human-LLM Collaboration | Explainable AI (XAI) | Interactive Data Visualization | UIST

CAnnotator: Photo-Guided Color Annotation for Degraded Ancient Paintings
Ancient paintings suffer irreversible color degradation due to aging and improper conservation. Labeling degraded paintings with authentic colors is vital to protecting this valuable cultural heritage, but it is challenging because color information is missing. Users typically need to investigate relevant photos to infer authentic colors and then validate these colors by mixing traditional pigments. However, such a task can be exhausting. To ease the difficulty, we propose an interactive visualization tool, namely CAnnotator, that streamlines efficient human-AI collaboration for the color annotation of degraded ancient paintings. CAnnotator consists of three views: a paint-annotation view, a photo-reference view, and a pigment-mixing view. Given an ancient painting, the paint-annotation view is developed to help users extract its color-degraded object textures that would be propagated to the relevant photos using a texture tracking model. Based on the tracking results, the photo-reference view provides texture-color and object-posture filters to explore the photos that include the given texture colors and object postures. We train a deep learning model to simulate the mixing of physical pigments and employ the chain rule to support progressive pigment mixture using a novel flow-based color visualization. We demonstrate the usage of CAnnotator through a use case and evaluate its effectiveness through model experiments and an in-lab user study. Compared to the baseline, CAnnotator could improve user confidence in labeled colors and foster user engagement at the cost of additional time.
2025 | Tan Tang et al. | Museum & Cultural Heritage Digitization | Interactive Narrative & Immersive Storytelling | UIST

ReSpark: Leveraging Previous Data Reports as References to Generate New Reports with LLMs
Creating data reports is a labor-intensive task involving iterative data exploration, insight extraction, and narrative construction. A key challenge lies in composing the analysis logic, from defining objectives and transforming data to identifying and communicating insights. Manually crafting this logic can be cognitively demanding. While experienced analysts often reuse scripts from past projects, finding a perfect match for a new dataset is rare. Even when similar analyses are available online, they usually share only results or visualizations, not the underlying code, making reuse difficult. To address this, we present ReSpark, a system that leverages large language models (LLMs) to reverse-engineer analysis logic from existing reports and adapt it to new datasets. By generating draft analysis steps, ReSpark provides a warm start for users. It also supports interactive refinement, allowing users to inspect intermediate outputs, insert objectives, and revise content. We evaluate ReSpark through comparative and user studies, demonstrating its effectiveness in lowering the barrier to generating data reports without relying on existing analysis code.
2025 | Yuan Tian et al. | Human-LLM Collaboration | Interactive Data Visualization | Data Storytelling | UIST

VisMimic: Integrating Motion Chain in Feedback Video Generation for Motor Coaching
Augmented video is a common medium for remote sports coaching, facilitating communication between trainees and coaches. Existing video augmentation techniques struggle to simultaneously convey both the overall motion dynamics and static key poses. This limitation hinders feedback comprehension in motor learning, making it difficult to understand where errors occur and how to correct them. To address this, we first reviewed popular video augmentation solutions. In collaboration with professional coaches, we integrated motion chain into feedback videos to combine key poses with motion trajectories. It supports multi-view observation and feedback explanation from overview to detail. To assist coaches in creating feedback videos, we present VisMimic, a human-AI interaction system that automatically analyzes trainee videos against reference movements, generates animated feedback, and enables customization. User studies show VisMimic's usability and effectiveness in enhancing motion analysis and communication for motor coaching.
2025 | Liqi Cheng et al. | Full-Body Interaction & Embodied Input | Human Pose & Activity Recognition | UIST

TableCanoniser: Interactive Grammar-Powered Transformation of Messy, Non-Relational Tables to Canonical Tables
TableCanoniser is a declarative grammar and interactive system for constructing relational tables from messy tabular inputs such as spreadsheets. We propose the concept of axis alignment to categorise input types and characterise the expanded scope of our system relative to existing tools. The declarative grammar consists of match conditions, which specify repeating patterns of input cells, and extract operations, which specify how matched values map to the output table. In the interactive interface, users can specify match and extract patterns by interacting with an input table, or author more advanced specifications in the coding panel. To refine and verify specifications, users interact with grammar-based provenance visualisations such as linked highlighting of input and output values, tree-based visualisation of matching patterns, and a mini-map overview of matched instances of patterns with annotations showing where cells are extracted to. We motivate and illustrate our work with real-world usage scenarios and workflows.
2025 | Kai Xiong et al. | Zhejiang University, State Key Lab of CAD&CG | Interactive Data Visualization | Prototyping & User Testing | CHI

RidgeBuilder: Interactive Authoring of Expressive Ridgeline Plots
Ridgeline plots are frequently employed to visualize the evolution or distributions of multiple series with a pile of overlapping line, area, or bar charts, highlighting the peak patterns. While traditionally viewed as small multiple visualizations, their ridge-like patterns have increasingly attracted graphic designers to create appealing customized ridgeline plots. However, many tools only support creating basic ridgeline plots and overlook their diverse layouts and styles. This paper introduces a comprehensive design space for ridgeline plots, focusing on their varied layouts and expressive styles. We present RidgeBuilder, an intuitive tool for creating expressive ridgeline plots with customizable layouts and styles. In particular, we summarize three goals for refining the layout of ridgeline plots and propose an optimization method. We assess RidgeBuilder's usability and usefulness through a reproduction study and evaluate the layout optimization algorithm through anonymized questionnaires. The effectiveness is demonstrated with a gallery of ridgeline plots created by RidgeBuilder.
2025 | Shuhan Liu et al. | State Key Lab of CAD & CG, Zhejiang University | Interactive Data Visualization | Data Storytelling | CHI

ProTAL: A Drag-and-Link Video Programming Framework for Temporal Action Localization
Temporal Action Localization (TAL) aims to detect the start and end timestamps of actions in a video. However, the training of TAL models requires a substantial amount of manually annotated data. Data programming is an efficient method to create training labels with a series of human-defined labeling functions. However, its application in TAL is hindered by the difficulty of defining complex actions in the context of temporal video frames. In this paper, we propose ProTAL, a drag-and-link video programming framework for TAL. ProTAL enables users to define key events by dragging nodes representing body parts and objects and linking them to constrain the relations (direction, distance, etc.). These definitions are used to generate action labels for large-scale unlabelled videos. A semi-supervised method is then employed to train TAL models with such labels. We demonstrate the effectiveness of ProTAL through a usage scenario and a user study, providing insights into designing video programming frameworks.
2025 | Yuchen He et al. | Zhejiang University, State Key Lab of CAD&CG | Interactive Data Visualization | Computational Methods in HCI | CHI

StructVizor: Interactive Profiling of Semi-Structured Textual Data
Data profiling plays a critical role in understanding the structure of complex datasets and supporting numerous downstream tasks, such as social media analytics and financial fraud detection. While existing research predominantly focuses on structured data formats, a substantial portion of semi-structured textual data still requires ad-hoc and arduous manual profiling to extract and comprehend its internal structures. In this work, we propose StructVizor, an interactive profiling system that facilitates sensemaking and transformation of semi-structured textual data. Our tool mainly addresses two challenges: a) extracting and visualizing the diverse structural patterns within data, such as how information is organized or related, and b) enabling users to efficiently perform various wrangling operations on textual data. Through automatic data parsing and structure mining, StructVizor enables visual analytics of structural patterns, while incorporating novel interactions to enable profile-based data wrangling. A comparative user study involving 12 participants demonstrates the system's usability and its effectiveness in supporting exploratory data analysis and transformation tasks.
2025 | Yanwei Huang et al. | Zhejiang University, State Key Lab of CAD&CG | Interactive Data Visualization | Time-Series & Network Graph Visualization | Visualization Perception & Cognition | CHI

Xavier: Toward Better Coding Assistance in Authoring Tabular Data Wrangling Scripts
Data analysts frequently employ code completion tools in writing custom scripts to tackle complex tabular data wrangling tasks. However, existing tools do not sufficiently link the data contexts, such as schemas and values, with the code being edited. This leads not only to poor code suggestions but also to frequent interruptions in the coding process, as users need additional code to locate and understand relevant data. We introduce Xavier, a tool designed to enhance data wrangling script authoring in computational notebooks. Xavier maintains users' awareness of data contexts while providing data-aware code suggestions. It automatically highlights the most relevant data based on the user's code, integrates both code and data contexts for more accurate suggestions, and instantly previews data transformation results for easy verification. To evaluate the effectiveness and usability of Xavier, we conducted a user study with 16 data analysts, showing its potential to streamline data wrangling script authoring.
2025 | Yunfan Zhou et al. | Zhejiang University, State Key Lab of CAD&CG | Interactive Data Visualization | Computational Methods in HCI | CHI

VisCourt: In-Situ Guidance for Interactive Tactic Training in Mixed Reality
In team sports like basketball, understanding and executing tactics (coordinated plans of movements among players) are crucial yet complex, requiring extensive practice. These tactics require players to develop a keen sense of spatial and situational awareness. Traditional coaching methods, which mainly rely on basketball tactic boards and video instruction, often fail to bridge the gap between theoretical learning and the real-world application of tactics, due to shifts in view perspectives and a lack of direct experience with tactical scenarios. To address this challenge, we introduce VisCourt, a Mixed Reality (MR) tactic training system, in collaboration with a professional basketball team. To set up the MR training environment, we employed semi-automatic methods to simulate realistic 3D tactical scenarios and iteratively designed visual in-situ guidance. This approach enables full-body engagement in interactive training sessions on an actual basketball court and provides immediate feedback, significantly enhancing the learning experience. A user study with athletes and enthusiasts shows the effectiveness and satisfaction with VisCourt in basketball training and offers insights for the design of future SportsXR training systems.
2024 | Liqi Cheng et al. | Full-Body Interaction & Embodied Input | Mixed Reality Workspaces | Immersion & Presence Research | UIST

Understanding Nonlinear Collaboration between Human and AI Agents: A Co-design Framework for Creative Design
Creative design is a nonlinear process where designers generate diverse ideas in the pursuit of an open-ended goal and converge towards consensus through iterative remixing. In contrast, AI-powered design tools often employ a linear sequence of incremental and precise instructions to approximate design objectives. Such operations violate customary creative design practices and thus hinder AI agents' ability to complete creative design tasks. To explore better human-AI co-design tools, we first summarize human designers’ practices through a formative study with 12 design experts. Taking graphic design as a representative scenario, we formulate a nonlinear human-AI co-design framework and develop a proof-of-concept prototype, OptiMuse. We evaluate OptiMuse and validate the nonlinear framework through a comparative study. We notice a subconscious change in people's attitudes towards AI agents, shifting from perceiving them as mere executors to regarding them as opinionated colleagues. This shift effectively fostered the exploration and reflection processes of individual designers.
2024 | Jiayi Zhou et al. | Zhejiang University | Generative AI (Text, Image, Music, Video) | Human-LLM Collaboration | Creative Collaboration & Feedback Systems | CHI

VAID: Indexing View Designs in Visual Analytics System
Visual analytics (VA) systems have been widely used in various application domains. However, VA systems are complex in design, which poses a serious problem: although the academic community constantly creates and implements new designs, these designs are difficult for subsequent designers to query, understand, and refer to. As a step toward tackling this problem, we index VA designs in an expressive and accessible way, transforming the designs into a structured format. We first conducted a workshop study with VA designers to learn user requirements for understanding and retrieving professional designs in VA systems. Thereafter, we developed an index structure, VAID, to describe advanced and composited visualization designs with comprehensive labels about their analytical tasks and visual designs. The usefulness of VAID was validated through user studies. Our work opens new perspectives for enhancing the accessibility and reusability of professional visualization designs.
2024 | Lu Ying et al. | Zhejiang University | Interactive Data Visualization | Visualization Perception & Cognition | CHI

Table Illustrator: Puzzle-based interactive authoring of plain tables
Plain tables excel at displaying data details and are widely used in data presentation, often polished to an elaborate appearance for readability in many scenarios. However, existing authoring tools fail to provide both flexible and efficient support for altering the table layout and styles, motivating us to develop an intuitive and swift tool for table prototyping. To this end, we contribute Table Illustrator, a table authoring system taking a novel visual metaphor, puzzle, as the primary interaction unit. Through combinations and configurations on puzzles, the system enables rapid table construction and supports a diverse range of table layouts and styles. The tool design is informed by practical challenges and requirements from interviews with 10 table practitioners and a structured design space based on an analysis of over 2,500 real-world tables. User studies showed that Table Illustrator achieved comparable performance to Microsoft Excel while reducing users' completion time and perceived workload.
2024 | Yanwei Huang et al. | Zhejiang University | Interactive Data Visualization | Data Storytelling | CHI

PColorizor: Re-coloring Ancient Chinese Paintings with Ideorealm-congruent Poems
Color restoration of ancient Chinese paintings plays a significant role in Chinese culture protection and inheritance. However, traditional color restoration is challenging and time-consuming because it requires professional restorers to conduct detailed literature reviews on numerous paintings for reference colors. After that, they have to fill in the inferred colors on the painting manually. In this paper, we present PColorizor, an interactive system that integrates advanced deep-learning models and novel visualizations to ease the difficulties of color restoration. PColorizor is established on the principle of poem-painting congruence. Given a color-fading painting, we employ both explicit and implicit color guidance implied by ideorealm-congruent poems to associate reference paintings. To enable quick navigation of color schemes extracted from the reference paintings, we introduce a novel visualization based on a mountain metaphor that shows color distribution over time at the ideorealm and imagery levels. Moreover, we demonstrate the ideorealm understood by deep learning models through intuitive visualizations to bridge the communication gap between human restorers and deep learning models. We also adopt intelligent color-filling techniques to accelerate manual color restoration further. To evaluate PColorizor, we collaborate with domain experts to conduct two case studies to collect their feedback. The results suggest that PColorizor could be beneficial in enabling the effective restoration of color-fading paintings.
2023 | Tan Tang et al. | Generative AI (Text, Image, Music, Video) | Data Storytelling | Museum & Cultural Heritage Digitization | UIST