"The AI tool can’t make it any worse." Investigating Developers’ Security Behavior with AI Assistants in a Password Storage StudyPast research showed that software developers often require explicit instructions to implement security measures. With the rapid rise of AI assistant tools such as ChatGPT, it remains unclear whether AI assistance supports or undermines secure practices, whether explicit security instructions are still essential, and how developers behave without guidance. To investigate these research questions, we conducted a qualitative lab study with 21 computer science students and a quantitative online study with 80 freelance developers. We focused on secure password storage and asked participants to implement registration logic under four conditions: without instructions, with AI assistance, with security instructions, or with both AI assistance and security instructions. Our study reveals a clear behavioral shift: In our task, many participants relied on AI-assisted code generation for security-related tasks, often prioritizing convenience over security. However, explicit security-focused instructions can redirect this behavior toward secure outcomes, demonstrating that AI tools alone are insufficient without targeted guidance.2026AYAsli Yardim et al.Ruhr University BochumExplainable AI (XAI)Passwords & AuthenticationGenerative AI (Text, Image, Music, Video)CHI
Sensing Your Vocals: Exploring the Activity of Vocal Cord Muscles for Pitch Assessment Using Electromyography and Ultrasonography
Vocal training is difficult because the muscles that control pitch, resonance, and phonation are internal and invisible to learners. This paper investigates how Electromyography (EMG) and ultrasonic imaging (UI) can make these muscles observable for training purposes. We report three studies. First, we analyze the EMG and UI data from 16 singers (beginners, experienced singers, and professionals), revealing differences in muscle control proficiency among the three groups. Second, we use the collected data to create a system that visualizes an expert's muscle activity as a reference. This system is tested in a user study with 12 novices, showing that EMG highlighted muscle activation nuances, while UI provided insights into vocal cord length and dynamics. Third, to compare our approach to traditional methods (audio analysis and coach instructions), we conducted a focus group study with 15 experienced singers. Our results suggest that EMG is promising for improving vocal skill development and enhancing feedback systems. We conclude the paper with a detailed comparison of the analyzed modalities (EMG, UI, and traditional methods), resulting in recommendations to improve vocal muscle training systems.
CHI 2026. Kanyu Chen et al., Graduate School of Media Design. Topics: Biosensors & Physiological Monitoring; Emotion Recognition & Detection; Affective Feedback & Emotion Regulation Interfaces.

It Shouldn't Be This Difficult: Researcher Perspectives on Diversity and Inclusion in Usable Privacy and Security Research
While recent usable privacy and security (UPS) research has made progress in moving beyond “the average user,” a systematic account of how UPS researchers navigate diversity and inclusion in their work remains lacking. Through 20 in-depth semi-structured interviews with experienced researchers, we examine how and why they recruit diverse, underserved populations in their work, as well as the challenges they face in doing so, including conceptual difficulties in defining who is underserved, limited access to target populations, and inflexible peer review and publishing norms. Participants also reflected on their own positionality when planning and conducting studies, often expressing uncertainty about how to account for and articulate their positionality. We identify strategies researchers use to overcome challenges and highlight areas where collective action from the research community and institutions is needed to foster greater inclusion in UPS research practices.
CHI 2026. Priyasha Chatterjee et al., MPI-SP. Topics: Privacy by Design & User Control; Privacy Perception & Decision-Making; Inclusive Design.

Robust Methods for Developer Screening in Rapidly Evolving AI Contexts
The rise of AI-powered tools like ChatGPT enables non-programmers to bypass programming screening questions, undermining internal validity in usable security and privacy, and software engineering studies. Past ChatGPT-resistant tasks proposed static visual questions, which ChatGPT can now circumvent. Therefore, we tested alternative approaches such as video- and audio-based screeners that reveal key information step by step under strict time constraints to distinguish programmers from non-programmers. To this end, we conducted a study with 74 participants across three groups: programmers, non-programmers without AI assistance, and non-programmers using ChatGPT. Our results showed that audio-based screeners were robust against ChatGPT-based cheating, as non-programmers struggled to find correct answers within time limits, whereas programmers demonstrated high accuracy with minimal time pressure. Based on our findings, we recommend six audio-based ChatGPT-resistant screening questions that maximize screening effectiveness and efficiency and suggest a 215-second instrument that includes 95.87% of programmers while excluding 99.69% of non-programmers.
CHI 2026. Raphael Serafini et al., University of Cologne. Topics: Explainable AI (XAI); AI-Assisted Decision-Making & Automation; Privacy by Design & User Control.

Certified AI System = Trustworthy? Exploring Expert and Lay User Perceptions and Needs Regarding AI Certification
AI certification has emerged as a promising mechanism to enhance transparency, accountability, and public trust. However, end-user perspectives remain largely unexplored. This study investigates two groups with differing AI expertise. Through qualitative interviews with 30 participants (15 experts, 15 lay users), we examined how AI certification influences trust, who should conduct it, transparency needs, post-certification monitoring, and certification fraud. Results reveal key differences between the two groups. Lay users perceive AI certification more positively than experts. Both groups prefer independent certifiers, with experts being more open to certification by private companies. Experts favor post-certification monitoring tied to system updates, whereas lay users prefer annual checks. Both groups value transparency, but the specific details they require differ. Regarding fraudulent AI certification, experts emphasize technical safeguards, while lay users focus on legal enforcement. The study discusses the implications of its findings and offers several recommendations for improving AI certification schemes.
CHI 2026. Sarah Abdelwahab Gaballah et al., Ruhr University Bochum. Topics: Explainable AI (XAI); AI Ethics, Fairness & Accountability; Privacy by Design & User Control.

"That's another doom I haven't thought about": A User Study on AI Labels as a Safeguard Against Image-Based MisinformationAs generative AI is increasingly contributing to the spread of deceptively realistic misinformation, lawmakers have introduced regulations requiring the disclosure of AI-generated content. However, it is unclear if labels reduce the risk of users falling for AI-generated misinformation. To address this research gap, we study the effect of labels on users' perception and the implications of mislabeling, focusing on AI-generated images. We first explored users' opinions and expectations of labels using five focus groups. Although participants were wary of practical implementations, they considered labeling helpful in identifying AI-generated images and avoiding deception. Second, we conducted a survey with 1,354 participants to assess how labels affect users' ability to recognize misinformation. While labels reduced participants' belief in false claims supported by AI-generated images, we found evidence of overreliance, leading to unintended side effects: Participants were more susceptible to false claims accompanied by human-made images, and were more hesitant to believe true claims illustrated with labeled AI-generated images.2026SHSandra Höltervennhoff et al.CISPA Helmholtz Center for Information SecurityExplainable AI (XAI)AI Ethics, Fairness & AccountabilityDeepfake & Synthetic Media DetectionCHI
Campus AI vs. Commercial AI: Comparing How Students and Employees Perceive their University’s LLM Chatbot vs. ChatGPT
As the use of LLM chatbots by students and researchers becomes more prevalent, universities are pressed to develop AI strategies. One strategy that many universities pursue is to customize pre-trained LLM-as-a-service (LLMaaS) chatbots. While most studies on LLMaaS chatbots prioritize technical adaptations, these systems are often mainly characterized by user-salient front-end customizations, e.g., interface changes. Yet, no existing studies have examined how users perceive such systems compared to commercial LLM chatbots. In a field study, we investigate how students and employees (N = 526) at a German university perceive and use their institution's customized LLMaaS chatbot compared to ChatGPT. Participants using both systems (n = 116) reported greater trust, higher perceived privacy, and fewer perceived hallucinations with their university's customized LLMaaS chatbot compared to ChatGPT. We discuss implications for research on users' trustworthiness assessment process, and offer guidance for the design and deployment of LLMaaS chatbots.
CHI 2026. Leon Hannig et al., University of Duisburg-Essen. Topics: Human-LLM Collaboration; Explainable AI (XAI); AI-Assisted Decision-Making & Automation.

Women Security Experts Are Not The Enemy: A Qualitative Study on Gender-Related Communication Challenges
Effective communication is crucial for meeting security needs, yet gender-related communication challenges faced by women security experts within software development remain underexplored. In an interview study with 25 women security experts, we investigated gender-related communication challenges hindering the adoption of security requirements, and strategies to overcome these. Key challenges included the undervaluation of women’s security expertise, communication barriers, resistance to women’s security-related suggestions, and instances of hostility. Communication challenges with stakeholders who were men disrupted team collaboration, resulting in delays, weakened security measures, and increased organizational risk. Consequently, women security experts often had to adopt strategies, such as leveraging allied men and overpreparing, to assert their security competence. We further offer insights into women’s participation in security studies. Based on our findings, we provide recommendations on how to address gender-related challenges.
CHI 2025. Asli Yardim et al., Ruhr University Bochum. Topics: Gender & Race Issues in HCI; Technology Ethics & Critical HCI.

The TaPSI Research Framework - A Systematization of Knowledge on Tangible Privacy and Security Interfaces
This paper presents a comprehensive Systematization of Knowledge on tangible privacy and security interfaces (TaPSI). Tangible interfaces provide physical forms for digital interactions. They can offer significant benefits for privacy and security applications by making complex and abstract security concepts more intuitive, comprehensible, and engaging. Through a literature survey, we collected and analyzed 80 publications. We identified the terminology used in these publications as well as the usable privacy and security domains they address, their contributions, applied methods, implementation details, and the opportunities or challenges inherent to TaPSI. Based on our findings, we define TaPSI and propose the TaPSI Research Framework, which guides future research by offering insights into when and how to conduct research on privacy and security involving TaPSI as well as a design space of TaPSI.
CHI 2025. Sarah Delgado Rodriguez et al., University of the Bundeswehr Munich. Topics: Privacy by Design & User Control; Passwords & Authentication; Privacy Perception & Decision-Making.

Bridging the Gap Between Usable Security Research and Open-Source Practice - Lessons From a Long-Term Engagement With VeraCrypt
VeraCrypt is a freely available open-source encryption tool popular with tech-savvy users. In a 4-year effort to improve VeraCrypt’s usability to reach less tech-savvy users, we conducted 3 user studies (N=77) and found that participants struggled to successfully encrypt their devices with VeraCrypt. We iteratively redesigned the UI and instructions and suggested significant usability improvements to the VeraCrypt community. Since 7 professional developers struggled to compile the project, we created a step-by-step compilation guide and contributed 5 pull requests for bug fixes and interface improvements. However, our efforts to translate academic findings into practical applications were unsuccessful. In this work, we explore why our usability improvements failed. Due to code complexity and a lack of transparency, the open-source community was concerned our changes could undermine security. Based on our findings, we provide recommendations for researchers collaborating with open-source communities.
CHI 2025. Felix Reichmann et al., Ruhr University Bochum. Topics: Privacy by Design & User Control; Privacy Perception & Decision-Making; Research Ethics & Open Science.

Security Knight in Shining Armor: What and Who VPN Providers Claim to Shield Consumers Against
Consumer virtual private network (VPN) providers promise online security and privacy by tunneling user traffic through their servers. However, there is a growing disparity between users' perceptions of achievable security and privacy and the actual limitations of such services. In a large-scale, multi-step mixed methods study, we holistically investigated the degree to which 78 consumer VPN providers support or undermine proper mental models for their products and services. We collected search queries from 300 participants, coming from five countries across four continents, to identify suitable VPN providers and, subsequently, their security and privacy promises. Among VPN providers’ statements, a large share contains misleading or false information, and more than half do not mention any threat agent at all. Our results extend the current research on consumer VPNs and provide a more realistic, holistic, and accurate overview of information on VPN provider websites.
CHI 2025. Felix Reichmann et al., Ruhr University Bochum. Topics: Privacy by Design & User Control; Privacy Perception & Decision-Making.

ReverSim: An Open-Source Environment for the Controlled Study of Human Aspects in Hardware Reverse Engineering
Hardware Reverse Engineering (HRE) is a technique for analyzing integrated circuits. Experts employ HRE for security-critical tasks, like detecting Trojans or intellectual property violations, relying not only on their experience and customized tools but also on their cognitive abilities. In this work, we introduce ReverSim, a software environment that models key HRE subprocesses and integrates standardized cognitive tests. ReverSim enables quantitative studies with easier-to-recruit non-experts to uncover cognitive factors relevant to HRE. We empirically evaluated ReverSim in three studies. Semi-structured interviews with 14 HRE professionals confirmed its comparability to real-world HRE processes. Two online user studies with 170 novices and intermediates revealed effective differentiation of participant performance across a spectrum of difficulties, and correlations between participants’ cognitive processing speed and task performance. ReverSim is available as open-source software, providing a robust platform for controlled experiments to assess cognitive processes in HRE, potentially opening new avenues for hardware protection.
CHI 2025. Steffen Becker et al., Ruhr University Bochum; Max Planck Institute for Security and Privacy. Topics: Explainable AI (XAI); Computational Methods in HCI.

Exploring the Impact of Intervention Methods on Developers’ Security Behavior in a Manipulated ChatGPT Study
Increased AI use in software development raises concerns about AI-generated code security. We investigated the impact of security prompts, insecure AI suggestion warnings, and the use of password storage guidelines (OWASP, NIST) on the security behavior of software developers when presented with insecure AI assistance. In an online lab setting, we conducted a study with 76 freelance developers who completed a password storage task divided into four conditions. Three conditions included a manipulated ChatGPT-like AI assistant that suggested an insecure MD5 implementation. We found a high level of trust in AI-generated code, even when insecure suggestions were presented. While security prompts, AI warnings, and guidelines improved security awareness, 32% of those notified about insecure AI recommendations still accepted weak implementation suggestions, mistakenly considering them secure and often expressing confidence in their choice. Based on our results, we discuss security implications and provide recommendations for future research.
CHI 2025. Raphael Serafini et al., Ruhr University Bochum. Topics: Explainable AI (XAI); Algorithmic Transparency & Auditability.

A Qualitative Study of Adoption Barriers and Challenges for Passwordless Authentication in German Public Administrations
Public administrations provide critical services and manage sensitive data for a country's citizens. Recent phishing campaigns targeting public sector employees highlight their attractiveness as targets. Deploying state-of-the-art authentication technologies, such as FIDO2, can improve overall security. We conducted a mixed-methods study in Germany to better understand the practices and challenges of deploying passwordless authentication in the public sector. First, we conducted an online survey (N=108) among German public sector employees to gain insights into their experiences and challenges. Next, we partnered with an e-government vendor and performed an in-situ experiment in which 11 public sector employees experienced FIDO2 under real-world conditions. Our results show that only a minority of our participants were aware of current passwordless authentication procedures. In our experiment, FIDO2-based methods left an overall positive impression. Hierarchical and heterogeneous public sector structures and the need for more technical expertise and equipment were barriers to adoption.
CHI 2025. Jan-Ulrich Holtgrave et al., CISPA Helmholtz Center for Information Security. Topics: Passwords & Authentication; Privacy Perception & Decision-Making.

A Comparative Long-Term Study of Fallback Authentication Schemes
Fallback authentication, the process of re-establishing access to an account when the primary authenticator is unavailable, holds critical significance. Approaches range from secondary channels like email and SMS to personal knowledge questions (PKQs) and social authentication. A key difference from primary authentication is that the duration between enrollment and authentication can be much longer, typically months or years. However, few systems have been studied over extended timeframes, making it difficult to know how well these systems truly help users recover their accounts. We also lack meaningful comparisons of schemes, as most prior work examined two mechanisms at most. We report the results of an 18-month user study of the usability of fallback authentication, providing a fair comparison of the four most commonly used fallback authentication methods. We show that users prefer email and SMS-based methods, while mechanisms based on PKQs and trustees lag regarding successful resets and convenience.
CHI 2024. Leona Lassak et al., Ruhr University Bochum. Topics: Passwords & Authentication; Privacy Perception & Decision-Making.

Do You Need to Touch? Exploring Correlations between Personal Attributes and Preferences for Tangible Privacy Mechanisms
This paper explores how personal attributes, such as age, gender, technological expertise, or "need for touch", correlate with people's preferences for properties of tangible privacy protection mechanisms, for example, physically covering a camera. For this, we conducted an online survey (N = 444) where we captured participants' preferences of eight established tangible privacy mechanisms well-known in daily life, their perceptions of effective privacy protection, and personal attributes. We found that the attributes that correlated most strongly with participants' perceptions of the established tangible privacy mechanisms were their "need for touch" and previous experiences with the mechanisms. We use our findings to identify desirable characteristics of tangible mechanisms to better inform future tangible, digital, and mixed privacy protections. We also show which individuals benefit most from tangibles, ultimately motivating a more individual and effective approach to privacy protection in the future.
CHI 2024. Sarah Delgado Rodriguez et al., University of the Bundeswehr Munich. Topics: Privacy by Design & User Control; Privacy Perception & Decision-Making.

Self-Efficacy and Security Behavior: Results from a Systematic Review of Research Methods
Amidst growing IT security challenges, psychological underpinnings of security behaviors have received considerable interest, e.g. cybersecurity Self-Efficacy (SE), the belief in one’s own ability to enact cybersecurity-related skills. Due to diverging definitions and proposed mechanisms, research methods in this field vary considerably, potentially impeding replicable evidence and meaningful research synthesis. We report a preregistered systematic literature review investigating (a) cybersecurity SE measures, (b) SE’s proposed roles, and (c) intervention approaches. We minimized selection bias by detailed exclusion criteria, interdisciplinary search strategy, and double coding. Among 174 cybersecurity SE studies (2010-2021) from 18 databases with 55,758 subjects, we identified 173 different SE measures with considerable differences in psychometric quality and validity evidence. We found 276 variables as assumed causes/outcomes of cybersecurity SE and identified 13 intervention designs. This review demonstrates the extent of methodological and conceptual fragmentation in cybersecurity SE research. We offer recommendations to inspire our research community toward standardization.
CHI 2024. Nele Borgert et al., Ruhr University Bochum. Topics: Privacy Perception & Decision-Making; Cybersecurity Training & Awareness.

Out-of-Device Privacy Unveiled: Designing and Validating the Out-of-Device Privacy Scale (ODPS)
This paper proposes the Out-of-Device Privacy Scale (ODPS), a reliable, validated psychometric privacy scale that measures the importance users attribute to out-of-device privacy. In contrast to existing scales, ODPS is designed to capture the importance individuals attribute to protecting personal information from out-of-device threats in the physical world, which is essential when designing privacy protection mechanisms. We iteratively developed and refined ODPS in three high-level steps: item development, scale development, and scale validation, with a total of N=1378 participants. Our methodology included ensuring content validity by following various approaches to generate items. We collected insights from experts and target audiences to understand response variability. Next, we explored the underlying factor structure using multiple methods and performed dimensionality, reliability, and validity tests to finalise the scale. We discuss how ODPS can support future work predicting user behaviours and designing protection methods to mitigate privacy risks.
CHI 2024. Habiba Farzand et al., University of Glasgow. Topics: Privacy by Design & User Control; Privacy Perception & Decision-Making.

Understanding Users' Interaction with Login Notifications
Login notifications intend to inform users about sign-ins and help them protect their accounts from unauthorized access. Notifications are usually sent if a login deviates from previous ones, potentially indicating malicious activity. They contain information like the location, date, time, and device used to sign in. Users are challenged to verify whether they recognize the login (because it was them or someone they know) or to protect their account from unwanted access. In a user study, we explore users' comprehension, reactions, and expectations of login notifications. We utilize two treatments to measure users' behavior in response to notifications sent for a login they initiated or based on a malicious actor relying on statistical sign-in information. We find that users identify legitimate logins but need more support to halt malicious sign-ins. We discuss the identified problems and give recommendations for service providers to ensure usable and secure logins for everyone.
CHI 2024. Philipp Markert et al., Ruhr University Bochum. Topics: Privacy by Design & User Control; Passwords & Authentication.

I see an IC: A Mixed-Methods Approach to Study Human Problem-Solving Processes in Hardware Reverse Engineering
Trust in digital systems depends on secure hardware, often assured through Hardware Reverse Engineering (HRE). This work develops methods for investigating human problem-solving processes in HRE, an underexplored yet critical aspect. Since reverse engineers rely heavily on visual information, eye tracking holds promise for studying their cognitive processes. To gain further insights, we additionally employ verbal thought protocols during and immediately after HRE tasks: Concurrent and Retrospective Think Aloud. We evaluate the combination of eye tracking and Think Aloud with 41 participants in an HRE simulation. Eye tracking accurately identifies fixations on individual circuit elements and highlights critical components. Based on two use cases, we demonstrate that eye tracking and Think Aloud can complement each other to improve data quality. Our methodological insights can inform future studies in HRE, a specific setting of human-computer interaction, and in other problem-solving settings involving misleading or missing information.
CHI 2024. René Walendy et al., Ruhr University Bochum; Max Planck Institute for Security and Privacy. Topics: Eye Tracking & Gaze Interaction; Visualization Perception & Cognition.