It is often beneficial to be able to hear what someone is thinking while they carry out a task such as buying a product on a website. It provides an extra dimension of user understanding in addition to seeing what actions they’re taking and what they’re looking at.

Concurrent verbal protocol (think aloud protocol)

The concurrent verbal protocol is carried out at the same time as the task is being undertaken. This will often feel unnatural to the person speaking and can therefore slow them down and affect their gaze patterns, so it is not ideal if time on task is an important metric or if eye tracking is being used. It does ensure you hear what they’re thinking at the point they’re thinking it, which is ideal for capturing those telltale “I can’t see how to reach X page <pause of a few seconds> oh there it is. Silly me, it was there all along” comments. Such comments can indicate, for example, that a link was poorly worded or in an unexpected location on the page.

Retrospective verbal protocol

This is carried out after the task has been completed and is entirely separate from it. The immediate benefit is that it will not affect performance or time on task and will not create unexpected gaze patterns that would affect eye tracking data. The main drawback is that you no longer get an immediate vocalisation of a thought, so you can lose helpful clues. The quality and accuracy of what the person says depend on how long after the task they are asked to speak their thoughts and on whether they are given a visual cue (either the original environment, such as a website, or a video of their own actions). Memory fades quickly, and people will automatically create fresh interpretations of what they were doing, which may result in false information being recorded.