Criticality: Scaffolding Decision-Making with Interactive Critical Thinking and Evidence-Based Reasoning Traces
Decision-making requires examining underlying assumptions and concepts, considering diverse perspectives, and weighing potential consequences with clear, accurate reasoning. Recent large language models (LLMs) show promise for assisting decision-makers by combining reasoning capabilities with the ability to retrieve relevant information from large documents. However, our formative study with five professional decision-makers revealed key limitations of using LLMs in their workflows: time-consuming alignment of user goals, lack of evidence-based grounding, overwhelmingly long outputs, and unsurfaced assumptions, all of which undermined user trust in the LLM output and the validity of the final decision. We introduce Criticality, a system that operationalizes the Paul-Elder Critical Thinking framework to structure reasoning into interactive Elements of Thought (e.g., purpose, assumptions, perspectives, implications), and evaluates and guides reasoning using Intellectual Standards (e.g., clarity, fairness, logic). It also retrieves evidence for each claim, classifies it as supporting, neutral, or contradictory, and explains the claim-evidence link. A within-subjects study (n=13) comparing Criticality to ChatGPT 5 Pro, a state-of-the-art reasoning model with a conversational interface, found that Criticality improved users' ability to steer and repair the decision-making process, producing better decision rationales than the baseline.
2026 | Minsuk Chang et al. | Georgia Institute of Technology | Human-LLM Collaboration; Explainable AI (XAI); User Research Methods (Interviews, Surveys, Observation) | IUI
"I Need to Find That One Chart": How Data Workers Navigate, Summarize and Communicate Analytical ConversationsConversational interfaces are increasingly used for data analysis, enabling data workers to express complex analytical intents in natural language. Yet, these interactions unfold as long, linear transcripts that are misaligned with the iterative, nonlinear nature of real-world analyses. Revisiting and summarizing conversations for different contexts is therefore challenging. This paper investigates how data workers navigate, make sense of, and communicate prior analytical conversations. To study behaviors beyond those supported by standard interfaces (i.e., scrolling and keyword search), we develop a design probe that supplements analytical conversations with structured elements and affordances (e.g., filtering, multi-level navigation and detail-on-demand). In a user study ($n = 10$), participants used the probe to navigate and communicate past analyses, fulfilling information needs (recall, reorient, prioritize) through navigation strategies (visual recall, sequential and abstractive) and summarization practices (adding process details and context). Based on these findings, we discuss design implications to support re-visitation and communication of analytical conversations.2026KGKen Gu et al.University of WashingtonUser Research Methods (Interviews, Surveys, Observation)Prototyping & User TestingInteractive Data VisualizationCHI
Lexara: A User-Centered Toolkit for Evaluating Large Language Models for Conversational Visual Analytics
Large Language Models (LLMs) are transforming Conversational Visual Analytics (CVA) by enabling data analysis through natural language. However, evaluating LLMs for CVA remains a challenge: existing approaches require programming expertise, overlook real-world complexity, and lack interpretable metrics for multi-format (visualizations and text) outputs. Through interviews with 22 CVA developers and 16 end-users, we identified use-cases, evaluation criteria and workflows. We present Lexara, a user-centered evaluation toolkit for CVA that operationalizes these insights into: (i) test-cases spanning real-world scenarios; (ii) interpretable metrics covering visualization quality (data fidelity, semantic alignment, functional correctness, design clarity) and language quality (factual grounding, analytical reasoning, conversational coherence) using rule-based and LLM-as-a-judge methods; and (iii) an interactive toolkit enabling experimental setup and multi-format and multi-level exploration of results without programming expertise. We conducted a two-week diary study with six CVA developers, drawn from our initial cohort of 22. Their feedback demonstrated Lexara's effectiveness for guiding appropriate model and prompt selection.
2026 | Srishti Palani et al. | Tableau Research | Human-LLM Collaboration; Interactive Data Visualization; Explainable AI (XAI) | CHI
To Search or To Gen? Design Dimensions Integrating Web Search and Generative AI in Programmers' Information-Seeking Process
Programmers now use both generative AI (GenAI) and traditional web search for information-seeking, yet how these tools are used individually or in combination remains unclear. To answer this, we conducted a multi-phase investigation, including retrospective interviews to identify foraging behaviours and challenges and an observational study with a technology probe to analyze how contextual information flows across tools. Our findings reveal that effective information-seeking requires adaptable strategies and varying levels of contextual detail. Building on these insights, we propose five design dimensions for developing tools that integrate web search, GenAI, and code editors. We further demonstrated the generative power of these design dimensions with a proof-of-concept prototype, validated through a user study, offering actionable design implications for enhancing integrated information-seeking workflows across web search and GenAI in programming.
2025 | Ryan Yen et al. | Human-LLM Collaboration; Recommender System UX | DIS
Pluto: Authoring Semantically Aligned Text and Charts for Data-Driven Communication
Textual content (including titles, annotations, and captions) plays a central role in helping readers understand a visualization by emphasizing, contextualizing, or summarizing the depicted data. Yet, existing visualization tools provide limited support for jointly authoring the two modalities of text and visuals such that both convey semantically-rich information and are cohesively integrated. In response, we introduce Pluto, a mixed-initiative authoring system that uses features of a chart's construction (e.g., visual encodings) as well as any textual descriptions a user may have drafted to make suggestions about the content and presentation of the two modalities. For instance, a user can begin to type out a description and interactively brush a region of interest in the chart, and Pluto will generate a relevant auto-completion of the sentence. Similarly, based on a written description, Pluto may suggest lifting a sentence out as an annotation or the visualization's title, or may suggest applying a data transformation (e.g., sort) to better align the two modalities. A preliminary user study revealed that Pluto's recommendations were particularly useful for bootstrapping the authoring process and helped identify different strategies participants adopt when jointly authoring text and charts. Based on study feedback, we discuss design implications for integrating interactive verification features between charts and text, offering control over text verbosity and tone, and enhancing the bidirectional flow in unified text and chart authoring tools.
2025 | Arjun Srinivasan et al. | Interactive Data Visualization; Data Storytelling | IUI
Plume: Scaffolding Text Composition in Dashboards
Text in dashboards plays multiple critical roles, including providing context, offering insights, guiding interactions, and summarizing key information. Despite its importance, most dashboarding tools focus on visualizations and offer limited support for text authoring. To address this gap, we developed Plume, a system to help authors craft effective dashboard text. Through a formative review of exemplar dashboards, we created a typology of text parameters and articulated the relationship between visual placement and semantic connections, which informed Plume’s design. Plume employs large language models (LLMs) to generate contextually appropriate content and provides guidelines for writing clear, readable text. A preliminary evaluation with 12 dashboard authors explored how assisted text authoring integrates into workflows, revealing strengths and limitations of LLM-generated text and the value of our human-in-the-loop approach. Our findings suggest opportunities to improve dashboard authoring tools by better supporting the diverse roles that text plays in conveying insights.
2025 | Maxim Lisnic et al. | University of Utah | Human-LLM Collaboration; Data Storytelling | CHI
AI-Enabled Conversational Journaling for Advancing Parkinson's Disease Symptom Tracking
Journaling plays a crucial role in managing chronic conditions by allowing patients to document symptoms and medication intake, providing essential data for long-term care. While valuable, traditional journaling methods often rely on static, self-directed entries, lacking interactive feedback and real-time guidance. This gap can result in incomplete or imprecise information, limiting its usefulness for effective treatment. To address this gap, we introduce PATRIKA, an AI-enabled prototype designed specifically for people with Parkinson's disease (PwPD). The system incorporates cooperative conversation principles, clinical interview simulations, and personalization to create a more effective and user-friendly journaling experience. Through two user studies with PwPD and iterative refinement of PATRIKA, we demonstrate conversational journaling's significant potential for engaging patients and collecting clinically valuable information. Our results showed that, by generating probing questions, PATRIKA turned journaling into a bi-directional interaction. Additionally, we offer insights for designing journaling systems for healthcare and future directions for promoting sustained journaling.
2025 | Mashrur Rashik et al. | University of Massachusetts Amherst | Human-LLM Collaboration; Chronic Disease Self-Management (Diabetes, Hypertension, etc.); Diet Tracking & Nutrition Management | CHI
Jupybara: Operationalizing a Design Space for Actionable Data Analysis and Storytelling with LLMs
Mining and conveying actionable insights from complex data is a key challenge of exploratory data analysis (EDA) and storytelling. To address this challenge, we present a design space for actionable EDA and storytelling. Synthesizing theory and expert interviews, we highlight how semantic precision, rhetorical persuasion, and pragmatic relevance underpin effective EDA and storytelling. We also show how this design space subsumes common challenges in actionable EDA and storytelling, such as identifying appropriate analytical strategies and leveraging relevant domain knowledge. Building on the potential of LLMs to generate coherent narratives with commonsense reasoning, we contribute Jupybara, an AI-enabled assistant for actionable EDA and storytelling implemented as a Jupyter Notebook extension. Jupybara employs two strategies (design-space-aware prompting and multi-agent architectures) to operationalize our design space. An expert evaluation confirms Jupybara’s usability, steerability, explainability, and reparability, as well as the effectiveness of our strategies in operationalizing the design space framework with LLMs.
2025 | Huichen Will Wang et al. | University of Washington, Paul G. Allen School of Computer Science & Engineering; Tableau Research | Human-LLM Collaboration; Interactive Data Visualization; Data Storytelling | CHI
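SonicVista: Towards Creating Awareness of Distant Scenes through Sonification
Gupta et al. developed SonicVista, a system that uses sonification to convert information about distant scenes into sound, enhancing users' awareness of faraway environments.
2024 | Chitralekha Gupta et al. | Context-Aware Computing | UbiComp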
SlopeSeeker: A Search Tool for Exploring a Dataset of Quantifiable Trends
Natural language and search interfaces intuitively facilitate data exploration and provide visualization responses to diverse analytical queries based on the underlying datasets. However, these interfaces often fail to interpret more complex analytical intents, such as discerning subtleties and quantifiable differences between terms like “bump” and “spike” in the context of COVID cases. We address this gap by extending the capabilities of a data exploration search interface for interpreting semantic concepts in time series trends. We first create a comprehensive dataset of semantic concepts by mapping quantifiable univariate data trends such as slope and angle to crowdsourced, semantically meaningful trend labels. The dataset contains quantifiable properties that capture the slope-scalar effect of semantic modifiers like “sharply” and “gradually”, as well as multi-line trends (e.g., “peak,” “valley”). We demonstrate the utility of this dataset in SlopeSeeker, a tool that supports natural language querying of quantifiable trends, such as “show me stocks that tanked in 2010.” The tool incorporates novel scoring and ranking techniques based on semantic relevance and visual prominence to present relevant trend chart responses containing these semantic trend concepts. In addition, SlopeSeeker provides a faceted search interface for users to navigate a semantic hierarchy of concepts from general trends (e.g., “increase”) to more specific ones (e.g., “sharp increase”). A preliminary user evaluation of the tool demonstrates that the search interface supports greater expressivity of queries containing concepts that describe data trends. We identify potential future directions for leveraging our publicly available quantitative semantics dataset in other data domains and for novel visual analytics interfaces.
2024 | Alexander Bendeck et al. | Time-Series & Network Graph Visualization; Data Storytelling; Visualization Perception & Cognition | IUI
Olio: A Semantic Search Interface for Data Repositories
Search and information retrieval systems are becoming more expressive in interpreting user queries beyond the traditional weighted bag-of-words model of document retrieval. For example, searching for a flight status or a game score returns a dynamically generated response along with supporting, pre-authored documents contextually relevant to the query. In this paper, we extend this hybrid search paradigm to data repositories that contain curated data sources and visualization content. We introduce a semantic search interface, OLIO, that provides a hybrid set of results comprising both auto-generated visualization responses and pre-authored charts to blend analytical question-answering with content discovery search goals. We specifically explore three search scenarios: question-and-answering, exploratory search, and design search over data repositories. The interface also provides faceted search support for users to refine and filter the conventional best-first search results based on parameters such as author name, time, and chart type. A preliminary user evaluation of the system demonstrates that OLIO's interface and the hybrid search paradigm collectively afford greater expressivity in how users discover insights and visualization content in data repositories.
2023 | Vidya Setlur et al. | Interactive Data Visualization; Data Storytelling | UIST
Troubling Collaboration: Matters of Care for Visualization Design Study
A common research process in visualization is for visualization researchers to collaborate with domain experts to solve particular applied data problems. While there is existing guidance and expertise around how to structure collaborations to strengthen research contributions, there is comparatively little guidance on how to navigate the implications of, and power produced through, the socio-technical entanglements of collaborations. In this paper, we qualitatively analyze reflective interviews of past participants of collaborations from multiple perspectives: visualization graduate students, visualization professors, and domain collaborators. We juxtapose the perspectives of these individuals, revealing tensions about the tools that are built and the relationships that are formed: a complex web of competing motivations. Through the lens of matters of care, we interpret this web, concluding with considerations that both trouble and necessitate reformation of current patterns around collaborative work in visualization design studies to promote more equitable, useful, and care-ful outcomes.
2023 | Derya Akbaba et al. | Linköping University | Interactive Data Visualization; User Research Methods (Interviews, Surveys, Observation) | CHI
Exploring Chart Question Answering for Blind and Low Vision Users
Data visualizations can be complex or involve numerous data points, making them impractical to navigate using screen readers alone. Question answering (QA) systems have the potential to support visualization interpretation and exploration without overwhelming blind and low vision (BLV) users. To investigate if and how QA systems can help BLV users in working with visualizations, we conducted a Wizard of Oz study with 24 BLV people where participants freely posed queries about four visualizations. We collected 979 queries and mapped them to popular analytic task taxonomies. We found that retrieving value and finding extremum were the most common tasks, participants often made complex queries and used visual references, and the data topic notably influenced the queries. We compile a list of design considerations for accessible chart QA systems and make our question corpus publicly available to guide future research and development.
2023 | Jiho Kim et al. | University of Wisconsin-Madison | Visual Impairment Technologies (Screen Readers, Tactile Graphics, Braille); Interactive Data Visualization | CHI
Tracing and Visualizing Human-ML/AI Collaborative Processes through Artifacts of Data Work
Automated Machine Learning (AutoML) technology can lower barriers in data work yet still requires human intervention to be functional. However, the complex and collaborative process resulting from humans and machines trading off work makes it difficult to trace what was done, by whom (or what), and when. In this research, we construct a taxonomy of data work artifacts that captures AutoML and human processes. We present a rigorous methodology for its creation and discuss its transferability to the visual design process. We operationalize the taxonomy through the development of AutoML Trace, an interactive visual sketch showing both the context and temporality of human-ML/AI collaboration in data work. Finally, we demonstrate the utility of our approach via a usage scenario with an enterprise software development team. Collectively, our research process and findings explore challenges and fruitful avenues for developing data visualization tools that interrogate the sociotechnical relationships in automated data work.
2023 | Jen Rogers et al. | Scientific Computing and Imaging Institute | Human-LLM Collaboration; Interactive Data Visualization | CHI
Augmented Chironomia for Presenting Data to Remote Audiences
To facilitate engaging and nuanced conversations around data, we contribute a touchless approach to interacting directly with visualization in remote presentations. We combine dynamic charts overlaid on a presenter's webcam feed with continuous bimanual hand tracking, demonstrating interactions that highlight and manipulate chart elements appearing in the foreground. These interactions are simultaneously functional and deictic, and some allow for the addition of "rhetorical flourish", or expressive movement used when speaking about quantities, categories, and time intervals. We evaluated our approach in two studies with professionals who routinely deliver and attend presentations about data. The first study considered the presenter perspective, where 12 participants delivered presentations to a remote audience using a presentation environment incorporating our approach. The second study considered the audience experience of 17 participants who attended presentations supported by our environment. Finally, we reflect on observations from these studies and discuss related implications for engaging remote audiences in conversations about data.
2022 | Brian D. Hall et al. | Full-Body Interaction & Embodied Input; Interactive Data Visualization; Visualization Perception & Cognition | UIST
Recommendations for Visualization Recommendations: Exploring Preferences and Priorities in Visualization Recommendations for Public Health
The promise of visualization recommendation systems is that analysts will be automatically provided with relevant and high-quality visualizations that will reduce the work of manual exploration or chart creation. However, little research to date has focused on what analysts value in the design of visualization recommendations. We interviewed 18 analysts in the public health sector and explored how they made sense of a popular in-domain dataset (the National Health and Nutrition Examination Survey 2013-2014) in service of generating visualizations to recommend to others. We also explored how they interacted with a corpus of both automatically- and manually-generated visualization recommendations, with the goal of uncovering how the design values of these analysts are reflected in current visualization recommendation systems. We find that analysts champion simple charts with clear takeaways that are nonetheless connected with existing semantic information or domain hypotheses. We conclude by recommending that visualization recommendation designers explore ways of integrating context and expectation into their systems.
2022 | Calvin S. Bao et al. | University of Maryland | Recommender System UX; Interactive Data Visualization | CHI
How do you Converse with an Analytical Chatbot? Revisiting Gricean Maxims for Designing Analytical Conversational Behavior
Chatbots have garnered interest as conversational interfaces for a variety of tasks. While general design guidelines exist for chatbot interfaces, little work explores analytical chatbots that support conversing with data. We explore Gricean Maxims to help inform the basic design of effective conversational interaction. We also draw inspiration from natural language interfaces for data exploration to support ambiguity and intent handling. We ran Wizard of Oz studies with 30 participants to evaluate user expectations for text and voice chatbot design variants. Results identified preferences for intent interpretation and revealed variations in user expectations based on the interface affordances. We subsequently conducted an exploratory analysis of three analytical chatbot systems (text + chart, voice + chart, voice-only) that implement these preferred design variants. Empirical evidence from a second 30-participant study informs implications specific to data-driven conversation such as interpreting intent, data orientation, and establishing trust through appropriate system responses.
2022 | Vidya Setlur et al. | Tableau Research | Conversational Chatbots; Human-LLM Collaboration; Visualization Perception & Cognition | CHI
Snowy: Recommending Utterances for Conversational Visual Analysis
Natural language interfaces (NLIs) have become a prevalent medium for conducting visual data analysis, enabling people with varying levels of analytic experience to ask questions of and interact with their data. While there have been notable improvements with respect to language understanding capabilities in these systems, fundamental user experience and interaction challenges including the lack of analytic guidance (i.e., knowing what aspects of the data to consider) and discoverability of natural language input (i.e., knowing how to phrase input utterances) persist. To address these challenges, we investigate utterance recommendations that contextually provide analytic guidance by suggesting data features (e.g., attributes, values, trends) while implicitly making users aware of the types of phrasings that an NLI supports. We present SNOWY, a prototype system that generates and recommends utterances for visual analysis based on a combination of data interestingness metrics and language pragmatics. Through a preliminary user study, we found that utterance recommendations in SNOWY support conversational visual analysis by guiding the participants' analytic workflows and making them aware of the system's language interpretation capabilities. Based on the feedback and observations from the study, we discuss potential implications and considerations for incorporating recommendations in future NLIs for visual analysis.
2021 | Arjun Srinivasan et al. | Human-LLM Collaboration; Interactive Data Visualization | UIST
Collecting and Characterizing Natural Language Utterances for Specifying Data Visualizations
Natural language interfaces (NLIs) for data visualization are becoming increasingly popular both in academic research and in commercial software. Yet, there is a lack of empirical understanding of how people specify visualizations through natural language. We conducted an online study (N = 102), showing participants a series of visualizations and asking them to provide utterances they would pose to generate the displayed charts. From the responses, we curated a dataset of 893 utterances and characterized the utterances according to (1) their phrasing (e.g., commands, queries, questions) and (2) the information they contained (e.g., chart types, data aggregations). To help guide future research and development, we contribute this utterance dataset and discuss its applications toward the creation and benchmarking of NLIs for visualization.
2021 | Arjun Srinivasan et al. | Tableau Research | Voice User Interface (VUI) Design; Interactive Data Visualization | CHI
User Ex Machina: Simulation as a Design Probe in Human-in-the-Loop Text Analytics
Topic models are widely used analysis techniques for clustering documents and surfacing thematic elements of text corpora. These models remain challenging to optimize and often require a "human-in-the-loop" approach where domain experts use their knowledge to steer and adjust. However, the fragility, incompleteness, and opacity of these models mean that even minor changes could induce large and potentially undesirable changes in the resulting model. In this paper we conduct a simulation-based analysis of human-centered interactions with topic models, with the objective of measuring the sensitivity of topic models to common classes of user actions. We find that user interactions have impacts that differ in magnitude but often negatively affect the quality of the resulting modelling in a way that can be difficult for the user to evaluate. We suggest the incorporation of sensitivity and "multiverse" analyses to topic model interfaces to surface and overcome these deficiencies.
2021 | Anamaria Crisan et al. | Tableau Research | Explainable AI (XAI); AI-Assisted Decision-Making & Automation; Algorithmic Transparency & Auditability | CHI