Usability testing of a healthcare chatbot: Can we use conventional methods to assess conversational user interfaces?

William Holmes, Anne Moorhead, RR Bond, Huiru Zheng, Vivien Coates, Mike McTear

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

92 Citations (Scopus)
2735 Downloads (Pure)


Chatbots are becoming increasingly popular as a human-computer interface. The traditional best practices normally applied to User Experience (UX) design cannot easily be applied to chatbots, nor can conventional usability testing techniques guarantee accuracy. WeightMentor is a bespoke self-help motivational tool for weight loss maintenance. This study addresses the following four research questions: (1) How usable is the WeightMentor chatbot, according to conventional usability methods? (2) To what extent do different conventional usability questionnaires correlate when evaluating chatbot usability, and how do they correlate with a tailored chatbot usability survey score? (3) What is the optimum number of users required to identify chatbot usability issues? (4) How many task repetitions are required for first-time chatbot users to reach optimum task performance (i.e. efficiency based on task completion times)? This paper describes the procedure for testing the WeightMentor chatbot, assesses correlation between typical usability testing metrics, and suggests that conventional wisdom on participant numbers for identifying usability issues may not apply to chatbots. The study design was a usability study. WeightMentor was tested using a pre-determined usability testing protocol, evaluating ease of task completion, unique usability errors and participant opinions on the chatbot (collected using usability questionnaires). WeightMentor usability scores were generally high, and correlation between questionnaires was strong. The optimum number of users for identifying chatbot usability errors was 26, which challenges previous research. Chatbot users reached optimum proficiency in tasks after just one repetition.
Usability test outcomes confirm what is already known about chatbots: they are highly usable (due to their simple interface and conversation-driven functionality), but conventional methods for assessing usability and user experience may not be as accurate when applied to chatbots.
Original language: English
Title of host publication: ECCE 2019 Proceedings of the 31st European Conference on Cognitive Ergonomics
Subtitle of host publication: "Design for Cognition"
Publisher: Association for Computing Machinery
Number of pages: 8
ISBN (Electronic): 9781450371667
ISBN (Print): 978-1-4503-7166-7
Publication status: Published (in print/issue) - 10 Sept 2019
Event: 31st European Conference on Cognitive Ergonomics: Design for Cognition - Belfast, United Kingdom
Duration: 10 Sept 2019 to 13 Sept 2019

Publication series

Name: Proceedings of the 31st European Conference on Cognitive Ergonomics
Publisher: Association for Computing Machinery


Conference: 31st European Conference on Cognitive Ergonomics
Abbreviated title: ECCE 2019
Country/Territory: United Kingdom

Bibliographical note

Had confirmation from Raymond that no Embargo applies as far as he is aware.

Publisher Copyright: © 2019 Copyright held by the owner/author(s). Publication rights licensed to ACM. Copyright: Copyright 2019 Elsevier B.V., All rights reserved.


  • Usability Testing
  • Chatbots
  • Conversational UI
  • UX Testing

