Moderator: Now I would like to understand how you would switch this device on. Begin at the sound of the buzzer. Let me know when you have completed switching it on.
Usability testing is a combination of observing user behavior and gathering explicit quantifiable data from the users. The above scenario tests users on multiple levels.
a) How long it would take the user to switch on the device.
b) Whether the user is successful in this goal.
c) The number of attempts taken and the errors made by the user.
d) Whether the device can be switched on easily.
Recording time and errors is quantitative analysis of user behavior. The machine (or software on the test device) records the time taken and perhaps the errors made by the user. Testing how “easy” it was to switch the device on is qualitative analysis, because ease of use is the subjective opinion of either the user or the test taker about what has taken place.
Usability testing gathers both quantitative and qualitative data. Quantitative data is gathered by measuring the time taken to complete the task, the error rate while performing a task or the success or failure to complete a task.
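As a rough illustration of the quantitative side, here is a minimal Python sketch that summarizes the three measures named above: time on task, errors, and success or failure. The `Attempt` record, the `summarize` helper, and the sample numbers are all made up for this example; they are not output from any real study or tool.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Attempt:
    participant: str
    seconds: float    # time taken on the task
    errors: int       # number of errors made
    succeeded: bool   # whether the task was completed


def summarize(attempts):
    """Return success rate, mean time of successful attempts, and mean errors."""
    successes = [a for a in attempts if a.succeeded]
    return {
        "success_rate": len(successes) / len(attempts),
        # Time is averaged over successful attempts only (an assumption of this sketch).
        "mean_time_success": mean(a.seconds for a in successes) if successes else None,
        "mean_errors": mean(a.errors for a in attempts),
    }


# Hypothetical data for the "switch the device on" task.
attempts = [
    Attempt("P1", 4.0, 0, True),
    Attempt("P2", 10.0, 2, True),
    Attempt("P3", 30.0, 4, False),
    Attempt("P4", 6.0, 2, True),
]

summary = summarize(attempts)
print(summary)
```

Averaging time only over successful attempts is one common choice; a study might equally report time for all attempts, as long as the report says which.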
Usability tests also include qualitative analysis of test devices (websites, web applications, mobile phones, etc.); participants are asked their opinion of the test device’s performance, based on its ease of use, its appearance, and so on. Users are usually asked these questions directly or asked to fill in a questionnaire.
There are several variations in usability studies.
1. Studies that gather only quantitative data.
2. Studies that gather only qualitative data.
3. Studies that mix ‘em up.
Whether the test protocol has qualitative or quantitative test questions depends on whether the study is formative or summative in nature.
However, the primary purpose of a usability test is to observe user behavior. While it may help you quantify error rate, time taken to complete a task, and success rate, the most effective use of usability testing is to understand the “whys” of user behavior.
Most usability tests gather:
a) Subjective opinion of the test object’s performance. (A test object is anything that is being studied, be it a mobile phone, website or a web application.)
b) How frequently users perform a behavior.
c) User preferences.
This information is usually gathered by:
a) Observing users’ behavior.
b) Probing users to understand the whys behind their behavior.
c) Providing questionnaires for users to fill in, then analyzing their responses.
What are the tools that help us gather this information?
There are various scales of measurement used in statistical analysis.
a) Nominal scale
b) Ordinal scale
c) Interval scale and Ratio scale
In user research, however, nominal and ordinal scales are the most commonly used.
Nominal scale:
A nominal scale is a tool that asks the user to classify data into named categories that have no inherent order. It offers either two options (binary) or multiple options to select from.
Example of binary nominal scale:
Please specify gender:
Male
Female
Example of a multiple-choice nominal scale:
Q: Select the mobile phone brand you own:
Nokia
Samsung
Sony Ericsson
Motorola
LG
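Because nominal categories have no order, the analysis is essentially counting: how many responses fell in each category, and which category is the most common. A minimal sketch in Python, using invented responses to the brand question above:

```python
from collections import Counter

# Hypothetical answers to "Select the mobile phone brand you own".
responses = ["Nokia", "Samsung", "Nokia", "LG", "Motorola", "Nokia", "Samsung"]

counts = Counter(responses)
total = len(responses)

# Report each category's count and share of all responses.
for brand, n in counts.most_common():
    print(f"{brand}: {n} ({n / total:.0%})")

# The mode (most frequent category) is the only "average" that makes sense here.
mode_brand, mode_count = counts.most_common(1)[0]
print("Most common brand:", mode_brand)
```

Note that nothing here computes a mean or a median; those would require the categories to have an order, which brand names do not.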
Ordinal Scale:
An ordinal scale is used when we want to rate the quality of something. This is usually:
a) Something the user has experienced.
b) Something the user is going to experience.
There are various kinds of ordinal scales:
a) Likert Scale
b) Semantic differential scale
c) Rank order scaling
Likert Scale
The Likert scale is the most commonly used rating scale. It is typically a 5-point scale, framed such that:
1 = strongly unfavorable to the concept
2 = somewhat unfavorable to the concept
3 = Undecided
4 = somewhat favorable to the concept
5 = strongly favorable to the concept
Sample Likert Scale
1 = Very Difficult
2 = Difficult
3 = Neutral
4 = Easy
5 = Very Easy
The Likert scale helps us understand individual as well as group responses. It is effective for reporting the percentage of responses at each point. Example: “98% of the participants found it ‘difficult’ to return to the home page, while 2% of the participants found it ‘very difficult.’” Analyzing and interpreting this data provides a strong case for your design direction.
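The percentage reporting described above can be sketched in a few lines of Python. The `likert_distribution` helper and the ratings are hypothetical, and the labels are taken from the sample scale; the median is included because it respects the ordering of the scale without assuming equal gaps between points.

```python
from collections import Counter
from statistics import median

LABELS = {1: "Very Difficult", 2: "Difficult", 3: "Neutral", 4: "Easy", 5: "Very Easy"}


def likert_distribution(ratings):
    """Return the percentage of responses at each scale point (1 to 5)."""
    if not ratings:
        raise ValueError("no ratings given")
    invalid = [r for r in ratings if r not in LABELS]
    if invalid:
        raise ValueError(f"ratings outside the 1-5 scale: {invalid}")
    counts = Counter(ratings)
    return {point: 100 * counts[point] / len(ratings) for point in LABELS}


# Hypothetical answers to "How easy was it to return to the home page?"
ratings = [2, 2, 1, 2, 3, 2, 4, 2, 1, 2]

dist = likert_distribution(ratings)
for point, pct in dist.items():
    print(f"{point} ({LABELS[point]}): {pct:.0f}%")

print("Median rating:", median(ratings))
```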
Semantic differential scale
Semantic differential scales are good for understanding users’ attitudes. The scale is bipolar: it places opposite adjectives at either end. It is beneficial because it doesn’t have a neutral “definition”; in the example below, the user rates the degree of “attractiveness” versus “unattractiveness” of the website.
Having interacted with this website, how would you rate its looks?
(7) Attractive
(6)
(5)
(4)
(3)
(2)
(1) Unattractive
Rank order scaling
Rank order scaling asks participants to rank items by priority: their preferences, likings, the efficacy of the test protocol, etc.
Rank the following paper prototypes in the order of your preference. Rank the prototype that you favor most as “1”, the one you favor next as “2”, and so on. No two prototypes can get the same rank.
___ Prototype 1
___ Prototype 2
___ Prototype 3
___ Prototype 4
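One simple way to combine rank-order responses is to check that each participant used every rank exactly once, then average the ranks per prototype. The sketch below assumes that approach; the `validate` and `mean_ranks` helpers and the three sample rankings are invented for illustration, and other aggregation methods exist.

```python
from statistics import mean

PROTOTYPES = ["Prototype 1", "Prototype 2", "Prototype 3", "Prototype 4"]


def validate(ranking):
    """A ranking must cover every prototype and use ranks 1..n exactly once."""
    if set(ranking) != set(PROTOTYPES):
        raise ValueError("ranking must cover every prototype")
    if sorted(ranking.values()) != list(range(1, len(PROTOTYPES) + 1)):
        raise ValueError("ranks must be 1..n with no ties or gaps")


def mean_ranks(rankings):
    """Average rank per prototype across participants (lower is more preferred)."""
    for ranking in rankings:
        validate(ranking)
    return {p: mean(r[p] for r in rankings) for p in PROTOTYPES}


# Hypothetical responses from three participants.
rankings = [
    {"Prototype 1": 2, "Prototype 2": 1, "Prototype 3": 4, "Prototype 4": 3},
    {"Prototype 1": 1, "Prototype 2": 2, "Prototype 3": 3, "Prototype 4": 4},
    {"Prototype 1": 2, "Prototype 2": 1, "Prototype 3": 3, "Prototype 4": 4},
]

averages = mean_ranks(rankings)
order = sorted(PROTOTYPES, key=lambda p: averages[p])
print("Overall preference order:", order)
```

The validation step enforces the “no two prototypes can get the same rank” rule from the question, so a badly filled form is caught before it skews the averages.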
All the information gathered through ordinal scaling methods gives us the relative performance of test devices (especially in benchmark studies like the example cited below) but not the magnitude of the difference (how different is a rating of 4 from a rating of 3?).
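To see why ordinal data gives order but not magnitude, consider relabeling the scale points with different numbers that keep the same order. In the hypothetical sketch below (made-up ratings for two devices, and an arbitrary `recode` mapping chosen only for illustration), the device that rates higher stays higher, but the size of the gap between the mean ratings changes with the labels.

```python
from statistics import mean, median

# Hypothetical 5-point ratings for two devices in a benchmark study.
device_a = [4, 4, 5, 3, 4]
device_b = [3, 3, 4, 2, 3]

# An arbitrary relabeling of the five scale points that preserves their order.
recode = {1: 1, 2: 2, 3: 3, 4: 10, 5: 11}


def recoded(ratings):
    """Apply the order-preserving relabeling to a list of ratings."""
    return [recode[r] for r in ratings]


gap_original = mean(device_a) - mean(device_b)
gap_recoded = mean(recoded(device_a)) - mean(recoded(device_b))

# The ordering of the two devices survives the relabeling...
print(median(device_a) > median(device_b))
print(median(recoded(device_a)) > median(recoded(device_b)))

# ...but the size of the gap depends entirely on the labels chosen.
print(gap_original, gap_recoded)
```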