Interactive Reasoning: Visualizing and Controlling Chain-of-Thought Reasoning in Large Language Models
The output quality of large language models (LLMs) can be improved via “reasoning”: generating segments of chain-of-thought (CoT) content to further condition the model prior to producing user-facing output. While these chains contain valuable information, they are verbose and lack explicit organization, making them tedious to review. Moreover, they lack opportunities for user feedback, such as removing unwanted considerations, adding desired ones, or clarifying unclear assumptions. We introduce Interactive Reasoning, an interaction design that visualizes chain-of-thought outputs as a hierarchy of topics and enables user review and modification. We implement interactive reasoning in Hippo, a prototype for AI-assisted decision making in the face of uncertain trade-offs. In a user study with 16 participants, we find that interactive reasoning in Hippo allows users to quickly identify and interrupt erroneous generations, efficiently steer the model towards customized responses, and better understand both model reasoning and model outputs. Our work contributes to a new paradigm that incorporates user oversight into LLM reasoning processes.
2026 | Rock Yuren Pang et al. | University of Washington | Human-LLM Collaboration; Explainable AI (XAI); AI-Assisted Decision-Making & Automation | IUI

Depictions of Privacy Invasion and Surveillance in Artworks and Potential Lessons For Privacy Communication
User-facing communication about privacy (e.g., privacy policies, privacy tools' user interfaces) is frequently ignored and often ineffective. In contrast to these arguably staid interfaces, artworks often focus on provocation, engagement, and critical interpretation. For decades, artists have created privacy art: artistic media in galleries relating to the surveillance and privacy of individuals. What are artists saying about privacy, and how? Crucially, what lessons might they have for designing privacy-focused user interfaces? To this end, we compiled over 800 privacy artworks, qualitatively analyzing a sample. Common topics spanned artistic media (from paintings to immersive installations) and eras. Artworks built upon familiar concepts (e.g., cameras, homes) to speculate on society's future and present personal information (e.g., artist, viewer, public). We discuss lessons for making non-artistic privacy communication more engaging and powerful through directing attention (e.g., lighting, collage) and setting a tone (e.g., unsettling, fun, mundane).
2026 | Tess Eschebach et al. | University of Chicago | Privacy by Design & User Control; Digital Art Installations & Interactive Performance; Technology Ethics & Critical HCI | CHI

Regulating AI: Where U.S. State Policy and HCI (Mis)align
Artificial intelligence (AI) technologies are increasingly adopted into everyday life, with most investment and development concentrated in the U.S. In response to rapid AI integration and scant federal guidelines, U.S. states have formed AI committees charged with studying AI-related societal trade-offs. We analyzed the 18 existing state-level AI committee reports to understand how policymakers discuss AI-related benefits and risks. We then compared the risks surfaced by policymakers to an established taxonomy of AI risks aggregated from literature and examined how policymakers’ concerns align, or misalign, with those of HCI scholars. These insights provide important mileposts for shaping currently ongoing policy initiatives and future research. Our findings reveal important gaps: while committees invoke responsible AI, their framings often omit broader socio-technical concerns emphasized in HCI. We discuss opportunities for HCI to support socio-technical perspectives, employ participatory design, and close the gap between research and policy.
2026 | Nino Migineishvili et al. | University of Washington | AI Ethics, Fairness & Accountability; Algorithmic Fairness & Bias; Technology Ethics & Critical HCI | CHI

Decoupling of Usefulness and Novelty: Evaluating the Impact of Generative AI on Design Outputs and Novice Designers' Creative Thinking
People are increasingly leveraging generative AI (GenAI) for design tasks, making it critical to understand GenAI's impact on design outcomes and users' creative capabilities. We conducted a within-subjects experiment where 36 participants designed advertisements both with and without GenAI. Evaluations from clients and online volunteers revealed that GenAI-supported designs were perceived as significantly more creative and unconventional. Additionally, online volunteers, but not clients, rated these designs as more visually appealing. However, neither group perceived differences in usefulness, and clients noted no improvement in brand alignment, highlighting a notable decoupling of novelty and usefulness (two established components of creativity) in GenAI-supported design outputs. Although short-term GenAI use did not broadly influence participants' creative thinking or experience, subgroup analyses indicated increases in divergent thinking among participants new to GenAI relative to participants with GenAI experience. We discuss the implications of the decoupling effect and GenAI's influence on humans' creativity.
2026 | Yue Fu et al. | University of Washington | Generative AI (Text, Image, Music, Video); AI-Assisted Decision-Making & Automation; AI-Assisted Creative Writing | CHI

Should AI Mimic People? Understanding AI-Supported Writing Technology Among Black Users
AI-supported writing technologies (AISWT) that provide grammatical suggestions, autocomplete sentences, or generate and rewrite text are now a regular feature integrated into many people's workflows. However, little is known about how people perceive the suggestions these tools provide. In this paper, we investigate how Black American users perceive AISWT, motivated by prior findings in natural language processing that highlight how the underlying large language models can contain racial biases. Using interviews and observational user studies with 13 Black American users of AISWT, we found a strong tradeoff between the perceived benefits of using AISWT to enhance their writing style and feeling like "it wasn't built for us". Specifically, participants reported AISWT's failure to recognize commonly used names and expressions in African American Vernacular English, experiencing its corrections as hurtful and alienating and fearing it might further minoritize their culture. We end with a reflection on the tension between AISWT that fail to include Black American culture and language, and AISWT that attempt to mimic it, with attention to accuracy, authenticity, and the production of social difference.
2025 | Jeffrey K Basoah et al. | AI & Writing | CSCW

Know Your Audience: The benefits and pitfalls of generating plain language summaries beyond the "general" audience
Language models (LMs) show promise as tools for communicating science to the general public by simplifying and summarizing complex language. Because models can be prompted to generate text for a specific audience (e.g., college-educated adults), LMs might be used to create multiple versions of plain language summaries for people with different levels of familiarity with scientific topics. However, it is not clear what the benefits and pitfalls of adaptive plain language are. When is simplifying necessary, what are the costs in doing so, and do these costs differ for readers with different background knowledge? Through three within-subjects studies in which we surface summaries for different envisioned audiences to participants of different backgrounds, we found that while simpler text led to the best reading experience for readers with little to no familiarity with a topic, high familiarity readers tended to ignore certain details in overly plain summaries (e.g., study limitations). Our work provides methods and guidance on ways of adapting plain language summaries beyond the single "general" audience.
2024 | Tal August et al. | Allen Institute for AI | Human-LLM Collaboration; Explainable AI (XAI); AI Ethics, Fairness & Accountability | CHI

BLIP: Facilitating the Exploration of Undesirable Consequences of Digital Technologies
Digital technologies have positively transformed society, but they have also led to undesirable consequences not anticipated at the time of design or development. We posit that insights into past undesirable consequences can help researchers and practitioners gain awareness and anticipate potential adverse effects. To test this assumption, we introduce BLIP, a system that extracts real-world undesirable consequences of technology from online articles, summarizes and categorizes them, and presents them in an interactive, web-based interface. In two user studies with 15 researchers in various computer science disciplines, we found that BLIP substantially increased the number and diversity of undesirable consequences they could list in comparison to relying on prior knowledge or searching online. Moreover, BLIP helped them identify undesirable consequences relevant to their ongoing projects, made them aware of undesirable consequences they “had never considered,” and inspired them to reflect on their own experiences with technology.
2024 | Rock Yuren Pang et al. | University of Washington | Dark Patterns Recognition; Technology Ethics & Critical HCI | CHI

Why, when, and from whom: considerations for collecting and reporting race and ethnicity data in HCI
Engaging diverse participants in HCI research is critical for creating safe, inclusive, and equitable technology. However, there is a lack of guidelines on when, why, and how HCI researchers collect study participants' race and ethnicity. Our paper aims to take the first step toward such guidelines by providing a systematic review and discussion of the status quo of race and ethnicity data collection in HCI. Through an analysis of 2016-2021 CHI proceedings and a survey with 15 authors who published in these proceedings, we found that reporting race and ethnicity of participants is very rare (<3%) and that researchers are far from consensus. Drawing from multidisciplinary literature and our findings, we devise considerations for HCI researchers to decide why, when, and from whom to collect race and ethnicity data. For truly inclusive, equitable technologies, we encourage deliberate decisions rather than default omissions.
2023 | Yiqun T. Chen et al. | University of Washington, Seattle, Stanford University | AI Ethics, Fairness & Accountability; Inclusive Design; Gender & Race Issues in HCI | CHI

How Language Formality in Security and Privacy Interfaces Impacts Intended Compliance
Strong end-user security practices benefit both the user and hosting platform, but it is not well understood how companies communicate with their users to encourage these practices. This paper explores whether web companies and their platforms use different levels of language formality in these communications and tests the hypothesis that higher language formality leads to users' increased intention to comply. We contribute a dataset and systematic analysis of 1,817 English language strings in web security and privacy interfaces across 13 web platforms, showing strong variations in language. An online study with 512 participants further demonstrated that people perceive differences in the language formality across platforms and that a higher language formality is associated with higher self-reported intention to comply. Our findings suggest that formality can be an important factor in designing effective security and privacy prompts. We discuss implications of these results, including how to balance formality with platform language style. In addition to being the first piece of work to analyze language formality in user security, these findings provide valuable insights into how platforms can best communicate with users about account security.
2023 | Jackson Stokes et al. | University of Washington | Privacy by Design & User Control; Privacy Perception & Decision-Making | CHI

An HCI Research Agenda for Online Science Communication
Social media, blogs, podcasts, and other computer-mediated communication technology have become an integral way for the public to access and engage with research. However, despite the evolving challenges researchers face navigating these platforms, and the high stakes of online science communication, relatively little HCI research has focused on understanding and supporting online science communication through these participatory platforms. Through a review of the literature and a set of interviews with HCI researchers (n = 24), we identify challenges currently facing researchers who try to engage with the public about their work, and establish a research agenda for HCI to study, design, and evaluate technology to support science communication. Specifically, we advocate for the design of tools to support audience analytics, automated summary and outreach workflows, and providing quantitative and qualitative feedback about online outreach efforts, as well as additional research to elucidate the impacts of self-directed science communication efforts and the evolving roles of scientists on the participatory web. With shifting online platforms placing researchers in the role of advocates and participants in science communication, understanding and supporting these interactions is now more important than ever.
2022 | Spencer Williams et al. | Current and Future Practices in HCI and CSCW | CSCW

Apéritif: Scaffolding Preregistrations to Automatically Generate Analysis Code and Methods Descriptions
The HCI community has been advocating preregistration as a practice to improve the credibility of scientific research. However, it remains unclear how HCI researchers preregister studies and what preregistration users perceive as benefits and challenges. By systematically reviewing the past four CHI proceedings and surveying 11 researchers, we found that only 1.11% of papers presented preregistered studies, though both authors and reviewers of preregistered studies perceive it as beneficial. Our formative studies revealed key challenges ranging from a lack of detail about the study design, hindering comprehensibility, to inconsistencies between preregistrations and published papers. To explore ways for addressing these issues, we developed Apéritif, a research prototype that scaffolds the preregistration process and automatically generates analysis code and a methods description. In an evaluation with 17 HCI researchers, we found that Apéritif reduces the effort of preregistering a study, facilitates researchers' workflows, and promotes consistency between research artifacts.
2022 | Yuren Pang et al. | University of Washington | Computational Methods in HCI; Research Ethics & Open Science | CHI

VoxLens: Making Online Data Visualizations Accessible with an Interactive JavaScript Plug-In
JavaScript visualization libraries are widely used to create online data visualizations but provide limited access to their information for screen-reader users. Building on prior findings about the experiences of screen-reader users with online data visualizations, we present VoxLens, an open-source JavaScript plug-in that, with a single line of code, improves the accessibility of online data visualizations for screen-reader users using a multi-modal approach. Specifically, VoxLens enables screen-reader users to obtain a holistic summary of presented information, play sonified versions of the data, and interact with visualizations in a "drill-down" manner using voice-activated commands. Through task-based experiments with 21 screen-reader users, we show that VoxLens improves the accuracy of information extraction and interaction time by 122% and 36%, respectively, over existing conventional interaction with online data visualizations. Our interviews with screen-reader users suggest that VoxLens is a "game-changer" in making online data visualizations accessible to screen-reader users, saving them time and effort.
2022 | Ather Sharif et al. | University of Washington | Visual Impairment Technologies (Screen Readers, Tactile Graphics, Braille) | CHI

Do Cross-Cultural Differences in Visual Attention Patterns Affect Search Efficiency on Websites?
Prior work in cross-cultural psychology and neuroscience has shown robust variations in visual attention patterns. People from East Asian societies, in which a holistic thinking style predominates, have been found to attend to contextual information in scenes more than Westerners, whose tendency to think analytically expresses itself in greater attention to foreground objects. This paper applies these findings to website design, using an online study to evaluate whether Japanese (N=65) remember more and are faster at finding contextual website information than US Americans (N=84). Our results do not support this hypothesis. Instead, Japanese took overall significantly longer to find information than US participants (a difference that was exacerbated by an increase in website complexity), suggesting that Japanese may be holistically taking in a website before engaging with detailed information. We discuss implications of these findings for website design and cross-cultural research.
2021 | Amanda Baughan et al. | University of Washington | Eye Tracking & Gaze Interaction; Voice User Interface (VUI) Design; Multilingual & Cross-Cultural Voice Interaction | CHI

How WEIRD is CHI?
Computer technology is often designed in technology hubs in Western countries, invariably making it "WEIRD", because it is based on the intuition, knowledge, and values of people who are Western, Educated, Industrialized, Rich, and Democratic. Developing technology that is universally useful and engaging requires knowledge about members of WEIRD and non-WEIRD societies alike. In other words, it requires us, the CHI community, to generate this knowledge by studying representative participant samples. To find out to what extent CHI participant samples are from Western societies, we analyzed papers published in the CHI proceedings between 2016 and 2020. Our findings show that 73% of CHI study findings are based on Western participant samples, representing less than 12% of the world's population. Furthermore, we show that most participant samples at CHI tend to come from industrialized, rich, and democratic countries with generally highly educated populations. Encouragingly, recent years have seen a slight increase in non-Western samples and those that include several countries. We discuss suggestions for further broadening the international representation of CHI participant samples.
2021 | Sebastian Linxen et al. | University of Basel | Inclusive Design; Technology Ethics & Critical HCI; Developing Countries & HCI for Development (HCI4D) | CHI

Keep it Simple: How Visual Complexity and Preferences Impact Search Efficiency on Websites
We conducted an online study with 165 participants in which we tested their search efficiency and information recall. We confirm that the visual complexity of a website has a significant negative effect on search efficiency and information recall. However, the search efficiency of those who preferred simple websites was more negatively affected by highly complex websites than those who preferred high visual complexity. Our results suggest that diverse visual preferences need to be accounted for when assessing search response time and information recall in HCI experiments, testing software, or A/B tests.
2020 | Amanda Baughan et al. | University of Washington | Visualization Perception & Cognition; Prototyping & User Testing | CHI

Explain like I am a Scientist: The Linguistic Barriers of Entry to r/science
As an online community for discussing research findings, r/science has the potential to contribute to science outreach and communication with a broad audience. Yet previous work suggests that most of the active contributors on r/science are science-educated people rather than a lay general public. One potential reason is that r/science contributors might use a different, more specialized language than used in other subreddits. To investigate this possibility, we analyzed the language used in more than 68 million posts and comments from 12 subreddits from 2018. We show that r/science uses a specialized language that is distinct from other subreddits. Transient (newer) authors of posts and comments on r/science use less specialized language than more frequent authors, and those that leave the community use less specialized language than those that stay, even when comparing their first comments. These findings suggest that the specialized language used in r/science has a gatekeeping effect, preventing participation by people whose language does not align with that used in r/science. By characterizing r/science's specialized language, we contribute guidelines and tools for increasing the number of contributors in r/science.
2020 | Tal August et al. | University of Washington | Community Collaboration & Wikipedia; Activism & Political Participation | CHI

Tea: A High-level Language and Runtime System for Automating Statistical Analysis
Though statistical analyses are centered on research questions and hypotheses, current statistical analysis tools are not. Users must first translate their hypotheses into specific statistical tests and then perform API calls with functions and parameters. To do so accurately requires that users have statistical expertise. To lower this barrier to valid, replicable statistical analysis, we introduce Tea, a high-level declarative language and runtime system. In Tea, users express their study design, any parametric assumptions, and their hypotheses. Tea compiles these high-level specifications into a constraint satisfaction problem that determines the set of valid statistical tests, and then executes them to test the hypothesis. We evaluate Tea using a suite of statistical analyses drawn from popular tutorials. We show that Tea generally matches the choices of experts while automatically switching to non-parametric tests when parametric assumptions are not met. We simulate the effect of mistakes made by non-expert users and show that Tea automatically avoids both false negatives and false positives that could be produced by the application of incorrect statistical tests.
2019 | Eunice Jun et al. | Computational Methods in HCI | UIST

r/science: Challenges and Opportunities in Online Science Communication
Online discussion websites, such as Reddit's r/science forum, have the potential to foster science communication between researchers and the general public. However, little is known about who participates, what is discussed, and whether such websites are successful in achieving meaningful science discussions. To find out, we conducted a mixed-methods study analyzing 11,859 r/science posts and conducting interviews with 18 community members. Our results show that r/science facilitates rich information exchange and that the comments section provides a unique science communication document that guides engagement with scientific research. However, this community-sourced science communication comes largely from a knowledgeable public. We conclude with design suggestions for a number of critical problems that we uncovered: addressing the problem of topic newsworthiness and balancing broader participation and rigor.
2019 | Ridley Jones et al. | University of Washington | STEM Education & Science Communication; Community Collaboration & Wikipedia; User Research Methods (Interviews, Surveys, Observation) | CHI

The Impact of Web Browser Reader Views on Reading Speed and User Experience
As reading increasingly shifts from paper to online media, many web browsers now provide a "Reader View," which modifies web page layout and design for better readability. However, research has yet to establish whether Reader Views are effective in improving readability and how they might change the user experience. We characterize how Mozilla Firefox's Reader View significantly reduces the visual complexity of websites by excluding menus, images, and content. We then conducted an online study with 391 participants (including 42 who self-reported having been diagnosed with dyslexia), showing that compared to standard websites the Reader View increased reading speed by 5% for readers on average, and significantly improved perceived readability and visual appeal. We suggest guidelines for the design of websites and browsers that better support people with varying reading skills.
2019 | Qisheng Li et al. | University of Washington | Universal & Inclusive Design | CHI

Pay Attention, Please: Formal Language Improves Attention in Volunteer and Paid Online Experiments
Participant engagement in online studies is key to collecting reliable data, yet achieving it remains an often discussed challenge in the research community. One factor that might impact engagement is the formality of language used to communicate with participants throughout the study. Prior work has found that language formality can convey social cues and power hierarchies, affecting people's responses and actions. We explore how formality influences engagement, measured by attention, dropout, time spent on the study and participant performance, in an online study with 369 participants on Mechanical Turk (paid) and LabintheWild (volunteer). Formal language improves participant attention compared to using casual language in both paid and volunteer conditions, but does not affect dropout, time spent, or participant performance. We suggest using more formal language in studies containing complex tasks where fully reading instructions is especially important. We also highlight trade-offs that different recruitment incentives provide in online experimentation.
2019 | Tal August et al. | University of Washington | User Research Methods (Interviews, Surveys, Observation) | CHI