Polite But Boring? Trade-offs Between Engagement and Psychological Reactance to Chatbot Feedback Styles
As conversational agents become increasingly common in behaviour change interventions, understanding optimal feedback delivery mechanisms becomes increasingly important. However, choosing a style that both lessens psychological reactance (perceived threats to freedom) and elicits feelings of surprise and engagement represents a complex design problem. We explored how three different feedback styles, 'Direct', 'Politeness', and 'Verbal Leakage' (slips or disfluencies to reveal a desired behaviour), affect user perceptions and behavioural intentions. Matching expectations from literature, the 'Direct' chatbot led to lower behavioural intentions and higher reactance, while the 'Politeness' chatbot evoked higher behavioural intentions and lower reactance. However, 'Politeness' was also seen as unsurprising and unengaging by participants. In contrast, 'Verbal Leakage' evoked reactance, yet also elicited higher feelings of surprise, engagement, and humour. These findings highlight that effective feedback requires navigating trade-offs between user reactance and engagement, with novel approaches such as 'Verbal Leakage' offering promising alternative design opportunities.
2026; Samuel Rhys Cox et al.; Aalborg University; Conversational Chatbots; Affective Human-Computer Dialogue; Behavior Change & Reflection Technology; CHI
Impact of Explanation Techniques and Representations on Users Comprehension and Confidence in Explainable AI
Local explainability, an important sub-field of eXplainable AI, focuses on describing the decisions of AI models for individual use cases by providing the underlying relationships between a model's inputs and outputs. While the machine learning community has made substantial progress in improving explanation accuracy and completeness, these explanations are rarely evaluated by the final users. In this paper, we evaluate the impact of various explanation and representation techniques on users' comprehension and confidence. Through a user study on two different domains, we assessed three commonly used local explanation techniques (feature-attribution, rule-based, and counterfactual) and explored how their visual representation (graphical or text-based) influences users' comprehension and trust. Our results show that the choice of explanation technique primarily affects user comprehension, whereas the graphical representation impacts user confidence.
2025; Julien Delaunay et al.; Explainable AI (XAI); CSCW
Cognitive Forcing for Better Decision-Making: Reducing Overreliance on AI Systems Through Partial Explanations
In AI-assisted decision-making, explanations aim to enhance transparency and user trust but can also lead to negligence. In two separate studies, we explore the use of partial explanations to activate cognitive forcing and increase user engagement. In Study I (N = 264), we present participants with weighted graphs and ask them to identify the shortest paths. In Study II (N = 210), participants correct spelling and grammar mistakes in short text segments. In both studies, we provide a solution suggestion accompanied by either no explanation, a full explanation, or a partial explanation. Our results show that partial explanations reduce overreliance on incorrect AI suggestions, performing significantly better than the baseline but not as well as full explanations. Individuals with a high need for cognition benefit more from AI explanations and consequently perform better. Our work suggests that partial explanations can be valuable in domains where reducing overreliance on AI is critical, like medical diagnosis. It also underscores the need to consider explanation effectiveness across different task difficulties, a factor often overlooked in contemporary human-AI studies.
2025; Sander de Jong et al.; Humans vs. AI for Decision Making; CSCW
The Impact of a Chatbot's Ephemerality-Framing on Self-Disclosure Perceptions
Self-disclosure, the sharing of one's thoughts and feelings, is affected by the perceived relationship between individuals. While chatbots are increasingly used for self-disclosure, the impact of a chatbot's framing on users' self-disclosure remains under-explored. We investigated how a chatbot's description of its relationship with users, particularly in terms of ephemerality, affects self-disclosure. Specifically, we compared a Familiar chatbot, presenting itself as a companion remembering past interactions, with a Stranger chatbot, presenting itself as a new, unacquainted entity in each conversation. In a mixed factorial design, participants engaged with either the Familiar or Stranger chatbot in two sessions across two days, with one conversation focusing on Emotional-disclosure and the other on Factual-disclosure. When Emotional-disclosure was sought in the first chatting session, Stranger-condition participants felt more comfortable self-disclosing. However, when Factual-disclosure was sought first, these differences were replaced by more enjoyment among Familiar-condition participants. Qualitative findings showed Stranger afforded anonymity and reduced judgement, whereas Familiar sometimes felt intrusive unless rapport was built via low-risk Factual-disclosure.
2025; Samuel Rhys Cox et al.; Conversational Chatbots; Agent Personality & Anthropomorphism; CUI
Beyond Productivity: Rethinking the Impact of Creativity Support Tools
Creativity Support Tools (CSTs) are widely used across diverse creative domains, with generative AI recently increasing the abilities of CSTs. To better understand how the success of CSTs is determined in the literature, we conducted a review of outcome measures used in CST evaluations. Drawing from CST evaluations in the ACM Digital Library (n=173), we identified the metrics commonly employed to assess user interactions with CSTs. Our findings reveal prevailing trends in current evaluation practices, while exposing underexplored measures that could broaden the scope of future research. Based on these results, we argue for a more holistic approach to evaluating CSTs, encouraging the HCI community to consider not only user experience and the quality of the generated output, but also user-centric aspects such as self-reflection and well-being as critical dimensions of assessment. We also highlight a need for validated measures specifically suited to the evaluation of generative AI in CSTs.
2025; Samuel Rhys Cox et al.; Generative AI (Text, Image, Music, Video); Creative Collaboration & Feedback Systems; C&C
General Practitioners’ Perspectives on a Pre-Consultation Chatbot for Shared Decision-Making
General practitioner (GP) consultations are the typical starting point for a patient's healthcare journey. Here, GPs aim to support and inform patients to enable a shared decision-making process. In this work we explore how an interactive chatbot, designed to prepare patients for their GP consultation, is perceived by GPs to impact patient consultations, patient-GP interaction, and their work. We conducted an in-depth evaluation and interview with 15 GPs from 12 different practices. Our findings provide insights into common challenges in shared decision-making, GP perspectives on the role of chatbots in preparing patients, and how chatbot technology could impact and transform general practice. Finally, we reflect on patient and GP agency in shared decision-making and the impact of technology on this complex relationship.
2025; Mana Samiee et al.; Conversational Chatbots; Human-LLM Collaboration; AI-Assisted Decision-Making & Automation; DIS
Prompt Machine: A Tangible Generative AI Tool for Supporting Children's Learning and Literacy
Generative AI technologies are moving into school settings. However, there is confusion about how, when, and why these technologies should be used. Our aim has been to provide insights on how AI technology can be meaningfully integrated into schools, with a specific focus on secondary school education. Informed by ten teachers, we developed Prompt Machine, a tangible learning tool that serves three central purposes: 1) scaffold curriculum learning, 2) support development of AI literacy, and 3) act as a focal point among pupils and teachers for discussing possibilities and limitations of AI. Based on a study with 33 pupils and their teachers, we present findings on tangible and collaborative AI interactions, facilitation of AI, and integration of AI into curricula. Additionally, we reflect on challenges and opportunities for AI in education from the perspective of teachers and learners and discuss future steps for tangible AI.
2025; Martin Lindrup et al.; Generative AI (Text, Image, Music, Video); K-12 Digital Education Tools; Intelligent Tutoring Systems & Learning Analytics; DIS
Coordination Mechanisms in AI Development: Practitioner Experiences on Integrating UX Activities
Software development relies on collaboration and alignment between a variety of roles, including software developers and user experience designers. The increasing focus on artificial intelligence in today's development projects has given rise to new challenges in this collaboration. We extend previous work on the process of designing human-AI systems by analysing collaborative practices between UX designers and AI developers through Mintzberg's theory on coordination mechanisms. We conducted 15 in-depth interviews with UX designers and AI developers currently working on AI projects. We contribute by identifying how coordination mechanisms impact the UX design process when developing AI systems, inter-team (a)symmetries in power relations, and a growing need for tools and cross-disciplinary knowledge to support these collaborative efforts. In particular, we outline the risks of coordinating AI development work through the standardisation of output and skills in separately organised UX and AI development teams.
2025; Anders Bruun et al.; Computer Science, Aalborg University; Human-LLM Collaboration; Knowledge Worker Tools & Workflows; Impact of Automation on Work; CHI
Chatbots for Data Collection in Surveys: A Comparison of Four Theory-Based Interview Probes
Surveys are a widespread method for collecting data at scale, but their rigid structure often limits the depth of qualitative insights obtained. While interviews naturally yield richer responses, they are challenging to conduct across diverse locations and large participant pools. To partially bridge this gap, we investigate the potential of using LLM-based chatbots to support qualitative data collection through interview probes embedded in surveys. We assess four theory-based interview probes: descriptive, idiographic, clarifying, and explanatory. Through a split-plot study design (N=64), we compare the probes' impact on response quality and user experience across three key stages of HCI research: exploration, requirements gathering, and evaluation. Our results show that probes facilitate the collection of high-quality survey data, with specific probes proving effective at different research stages. We contribute practical and methodological implications for using chatbots as research tools to enrich qualitative data collection.
2025; Rune Møberg Jacobsen et al.; Aalborg University, Department of Computer Science; Conversational Chatbots; Human-LLM Collaboration; CHI
Enhancing Self-Efficacy in Health Self-Examination through Conversational Agent's Encouragement
Health self-examination, such as checking for changes to skin moles, is key to identifying potential negative changes to one's body. A major barrier to initiating a self-examination is a perceived lack of confidence or knowledge. In this study, we use a 2 x 2 between-subjects design to evaluate the effect of an AI conversational agent (CA) on participant self-efficacy and trust. We manipulated both participants' perceived skill in self-examination (based on prior perceived Success vs. Failure) and the CA's verbal persuasions (Encouraging vs. Neutral), with participants asked to complete a series of skin self-assessment tasks. Our findings show that participants' self-efficacy increased when exposed to encouraging CA persuasion. Additionally, we observed that an encouraging CA significantly increased participants' trust scores in perceived benevolence compared to a neutral-sounding CA. Our results inform the design of CAs to support users' independent self-examination.
2025; Naja Kathrine Kollerup et al.; Department of Computer Science; Conversational Chatbots; Mental Health Apps & Online Support Communities; CHI
Visual Augmentations for Ultrasound Assessment Training of Medical Students
Ultrasound assessments are key in assessing traumatic injuries to the human body during urgent medical emergencies. Obtaining proficiency in conducting ultrasound assessments is challenging, and relies on hands-on, individually instructed training provided by a scarce number of ultrasound experts. We investigate how to support medical students' learning of ultrasound assessment through visual augmentations. By enhancing the learning process, we seek to support medical students in reaching higher proficiency in ultrasound assessments. We followed an ultrasound assessment course to identify the primary challenges faced by medical students learning to conduct ultrasound assessments. Based on our findings, we designed four distinct visual augmentations in collaboration with a course educator that guide students in achieving better ultrasound image quality. We evaluated these visual augmentations in a mixed-method study with 15 medical students. Our findings provide insights on the use of digital technology in supporting clinical training, and the possibilities of bridging existing training practices.
2025; Helena Bøjer Djernæs et al.; Aalborg University, Department of Computer Science; Medical & Scientific Data Visualization; Surgical Assistance & Medical Training; CHI
Challenging Futures: Using Chatbots to Reflect on Aging and Dementia
Intertemporal reflection, flexibly thinking forward and backward in time, is vital for one's future planning. Yet, cultivating intertemporal reflection about encountering difficult futures, e.g., developing a progressive cognitive condition like dementia, can be challenging. We assessed people's attitudes towards dementia after conversing with a chatbot presented as either neurotypical or simulating dementia symptoms. While neither the chatbot's presentation nor the framing of participants' future selves impacted attitudes toward dementia, both influenced participants' experiences. When framed as future selves, the chatbot evoked a strong emotional connection, leading to reflection on aging, particularly with the chatbot simulating dementia symptoms. Participants interacting with the chatbot framed as a stranger with simulated symptoms often felt frustrated, especially when they had a task-oriented mindset. Chatbots can be promising tools for prompting reflections on challenging futures, such as dementia, although their effectiveness varies due to the tensions between simulated cognitive decline and expectations for effective communication.
2025; Rucha Khot et al.; Eindhoven University of Technology, Industrial Design; Augmentative & Alternative Communication (AAC); Mental Health Apps & Online Support Communities; Empowerment of Marginalized Groups; CHI
Effect of Explanation Conceptualisations on Trust in AI-assisted Credibility Assessment
As misinformation increasingly proliferates on social media platforms, it has become crucial to explore how to best convey automated news credibility assessments to end-users, and foster trust in fact-checking AIs. In this paper, we investigate how model-agnostic, natural language explanations influence trust and reliance on a fact-checking AI. We construct explanations from four Conceptualisation Validations (CVs) – namely consensual, expert, internal (logical), and empirical – which are foundational units of evidence that humans utilise to validate and accept new information. Our results show that providing explanations significantly enhances trust in AI, even in a fact-checking context where influencing pre-existing beliefs is often challenging, with different CVs causing varying degrees of reliance. We find consensual explanations to be the least influential, with expert, internal, and empirical explanations exerting twice as much influence. However, we also find that users could not discern whether the AI directed them towards the truth, highlighting the dual nature of explanations to both guide and potentially mislead. Further, we uncover the presence of automation bias and aversion during collaborative fact-checking, indicating how users' previously established trust in AI can moderate their reliance on AI judgements. We also observe the manifestation of a 'boomerang'/backfire effect often seen in traditional corrections to misinformation, with individuals who perceive AI as biased or untrustworthy doubling down and reinforcing their existing (in)correct beliefs when challenged by the AI. We conclude by presenting nuanced insights into the dynamics of user behaviour during AI-based fact-checking, offering important lessons for social media platforms.
2024; Saumya Pareek et al.; Session 3e: Trust and Understanding in Explainable AI; CSCW
Exploring VUI-Supported Mindfulness Techniques for Smoking Cessation
This study investigates the effectiveness of Voice User Interfaces (VUIs) in supporting mindfulness techniques for smoking cessation. We conducted a month-long between-subject study involving nine participants, comparing a VUI on smart speakers against an augmented VUI (a blend of VUI and Graphical User Interface) on mobile devices. Specifically, we evaluated how these interfaces support individuals in quitting smoking through mindfulness practices. Our results include qualitative insights on participants' experiences with mindfulness, their smoking cessation motivation, and engagement with the VUI prototypes, alongside quantitative data on their usage patterns. Our findings offer insights into the potential application of VUIs in smoking cessation and suggest design guidelines for future health-oriented applications. The study underscores the importance of device context in designing effective health interventions and sets the direction for future work in HCI and mindfulness applications.
2024; Simon Bak Kjaerulff et al.; Voice User Interface (VUI) Design; Mental Health Apps & Online Support Communities; CUI
Manual, Hybrid, and Automatic Privacy Covers for Smart Home Cameras
Smart home cameras (SHCs) offer convenience and security to users, but also cause greater privacy concerns than other sensors due to constant collection and processing of sensitive data. Moreover, privacy perceptions may differ between primary users and other users at home. To address these issues, we developed three physical cover prototypes for SHCs: Manual, Hybrid, and Automatic, based on design criteria of observability, understandability, and tangibility. With 90 SHC users, we ran an online survey using video vignettes of the prototypes. We evaluated how the physical covers alleviated privacy concerns by measuring perceived creepiness and trustworthiness. Our results show that the physical covers were well received, even though primary SHC users valued always-on surveillance. We advocate for the integration of physical covers into future SHC designs, emphasizing their potential to establish a shared understanding of surveillance status. Additionally, we provide design recommendations to support this proposition.
2024; Sujay Shalawadi et al.; Privacy by Design & User Control; Smart Home Privacy & Security; DIS
How Can I Signal You To Trust Me: Investigating AI Trust Signalling in Clinical Self-Assessments
Individuals are increasingly interested in and responsible for assessing their own health. This study evaluates a fictional AI dermatologist for assistance in the self-assessment of moles. Building on Signalling Theory, we tested how textual descriptions provided by a virtual dermatologist, manipulated across 'Ability', 'Integrity', and 'Benevolence', along with the clinical assessment ('benign' or 'malignant'), affect users' trust in the aforementioned trust pillars. Our study (N = 40) follows a 2 (Ability low/high) x 2 (Integrity low/high) x 2 (Benevolence low/high) x 2 (mole assessment benign/malignant) within-subject factorial design. Our results demonstrate that we can successfully influence perceptions of ability and benevolence by manipulating the corresponding aspects of trust but not perceived integrity. Further, in the case of a malignant assessment, participants' perception of trust increased across all aspects. Our results provide insights into the design of AI support systems for sensitive use cases, such as clinical self-assessments.
2024; Naja Kathrine Kollerup et al.; Explainable AI (XAI); AI-Assisted Decision-Making & Automation; Mental Health Apps & Online Support Communities; DIS
“As an AI language model, I cannot”: Investigating LLM Denials of User Requests
Users ask large language models (LLMs) to help with their homework, for lifestyle advice, or for support in making challenging decisions. Yet LLMs are often unable to fulfil these requests, either as a result of their technical inabilities or policies restricting their responses. To investigate the effect of LLMs denying user requests, we evaluate participants' perceptions of different denial styles. We compare specific denial styles (baseline, factual, diverting, and opinionated) across two studies, respectively focusing on LLMs' technical limitations and their social policy restrictions. Our results indicate significant differences in users' perceptions of the denials between the denial styles. The baseline denial, which provided participants with brief denials without any motivation, was rated significantly higher on frustration and significantly lower on usefulness, appropriateness, and relevance. In contrast, we found that participants generally appreciated the diverting denial style. We provide design recommendations for LLM denials that better meet people's denial expectations.
2024; Joel Wester et al.; Aalborg University; Human-LLM Collaboration; AI Ethics, Fairness & Accountability; CHI
“If I Had All the Time in the World”: Ophthalmologists' Perceptions of Anchoring Bias Mitigation in Clinical AI Support
Clinical needs and technological advances have resulted in increased use of Artificial Intelligence (AI) in clinical decision support. However, such support can introduce new and amplify existing cognitive biases. Through contextual inquiry and interviews, we set out to understand the use of an existing AI support system by ophthalmologists. We identified concerns regarding anchoring bias and a misunderstanding of the AI's capabilities. Following this, we evaluated clinicians' perceptions of three bias mitigation strategies as integrated into their existing decision support system. While clinicians recognised the danger of anchoring bias, we identified a concern around the impact of bias mitigation on procedure time. Our participants were divided in their expectations of any positive impact on diagnostic accuracy, stemming from varying reliance on the decision support. Our results provide insights into the challenges of integrating bias mitigation into AI decision support.
2023; Anne Kathrine Petersen Bach et al.; Aalborg University; Explainable AI (XAI); AI-Assisted Decision-Making & Automation; CHI
Method for Appropriating the Brief Implicit Association Test to Elicit Biases in Users
Implicit tendencies and cognitive biases play an important role in how information is perceived and processed, a fact that can be both utilised and exploited by computing systems. The Implicit Association Test (IAT) has been widely used to assess people's associations of target concepts with qualitative attributes, such as the likelihood of being hired or convicted depending on race, gender, or age. The condensed version, the Brief IAT (BIAT), aims to elicit implicit biases by measuring the reaction time to concept classifications. To use this measure in HCI research, however, we need a way to construct and validate target concepts, which tend to quickly evolve and depend on geographical and cultural interpretations. In this paper, we introduce and evaluate a new method to appropriate the BIAT using crowdsourcing to measure people's leanings on polarising topics. We present a web-based tool to test participants' bias on custom themes, where self-assessments often fail. We validated our approach with 14 domain experts and assessed the fit of crowdsourced test construction. Our method allows researchers of different domains to create and validate bias tests that can be geographically tailored and updated over time. We discuss how our method can be applied to surface implicit user biases and run studies where cognitive biases may impede reliable results.
2022; Tilman Dingler et al.; University of Melbourne; Algorithmic Fairness & Bias; Computational Methods in HCI; CHI
Do You See What I Hear? — Peripheral Absolute and Relational Visualisation Techniques for Sound Zones
Sound zone technology allows multiple simultaneous sound experiences for multiple people in the same room without interference. However, given the inherently invisible and intangible nature of sound zones, it is unclear how to communicate the position and size of sound zones to users. This paper compares two visualisation techniques, absolute visualisation and relational visualisation, as well as a baseline condition without visualisations. In a within-subject experiment (N=33), we evaluated these techniques for effectiveness and efficiency across four representative tasks. Our findings show that the absolute and relational visualisation techniques increase effectiveness in multi-user tasks but not in single-user tasks. The efficiency for all tasks was improved using visualisations. We discuss the potential of visualisations for sound zones and highlight future research opportunities for sound zone interaction.
2022; Rune Møberg Jacobsen et al.; Aalborg University; Interactive Data Visualization; Visualization Perception & Cognition; CHI