Analyzing responses on a rating scale can pose a challenge because there is no universal standard for measuring each point on the scale. We do not know how much severity differs between “strongly disagree” and “disagree.” There is no standard, measurable or even equal distance between the steps from “strongly disagree” to “strongly agree.”
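Because the distances between the labels are unknown, one common way to handle such data is to treat it as ordered categories and report frequencies and the median label rather than an average. The snippet below is only a minimal sketch of that idea; the five labels in SCALE, the summarize function and the sample responses are made up for illustration and do not come from any study described here.

```python
from collections import Counter
from statistics import median_low

# Hypothetical five-point agreement scale, ordered from lowest to highest.
SCALE = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]

def summarize(responses):
    """Return how often each label was chosen, plus the median label."""
    counts = Counter(responses)
    # Convert labels to their position on the scale; only the order is used.
    ranks = sorted(SCALE.index(r) for r in responses)
    # median_low always returns a value that occurs in the data,
    # so it maps back to a real label even with an even number of responses.
    median_label = SCALE[median_low(ranks)]
    frequencies = {label: counts.get(label, 0) for label in SCALE}
    return frequencies, median_label

# Illustrative responses, not real data.
responses = ["agree", "agree", "strongly agree", "disagree", "neutral", "agree"]
frequencies, median_label = summarize(responses)
print(frequencies)
print(median_label)
```

The median and the counts rely only on the order of the labels, so they stay meaningful even when the gap between “disagree” and “strongly disagree” cannot be measured.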
How effective is the rating scale?
While testing in India, we have found using rating scales quite ineffective.
1. Many participants have never rated anything before. Even when they have not grasped the concept after patient explanation, they urge moderators to get on with it. Of course, they don’t want to appear daft, especially if a female moderator is present. The result is that we get inaccurate ratings that contradict the observed behavior.
This is quite a threat to objective reporting because numbers are trusted more than the moderator’s word. Let me give you a sample scenario.
Moderator: Could you tell me how easy it would be to forward a message using this mobile phone?
User: But I haven’t used the mobile phone yet.
Moderator: Yeah. But even before using the mobile phone, how would you rate it?
User: How can I rate it without using it first!
Moderator: Think of it like this. When you see the promos of “Singh is King” (a Bollywood movie), you get a sense of whether it is likely to be a hit or a flop. You guess what kind of movie it might be. You also see reviews and how many stars it has got. In the same way, even before using the mobile, how would you rate it?
User: OK. I would rate it “Neither easy nor difficult.”
2. Participants blame themselves for not completing tasks. After taking 15 minutes to complete a task (as opposed to the 4 minutes assigned), the participant is asked to rate the task completion on the mobile.
Moderator: Having completed the task, how would you rate “forwarding messages” using this mobile phone?
User: It is easy to use. I was not able to find it. I have learned to do it now. Next time onwards it will be easier.
Moderator: We are trying to understand how your experience of forwarding the message was and how difficult or easy it was to complete it the first time.
User: I know. But it is easy on the mobile phone. It is I who couldn’t find it. Once I located where the forward message option is, I could easily do the task.
Moderator: So you feel you could not locate the “forward message” on the mobile easily.
User: It is easy on the mobile madam. I was not able to find it!
Needless to say, the user rates the experience favorably.
3. Participants give their ratings taking into account their family and neighbors. Here’s a scenario.
Moderator: Having completed the task, how would you rate using this mobile phone?
User: Well it was easy for me to do because I am educated, but if my father or mother were to use this mobile, it would be very difficult to do.
Moderator: We are testing your experience; we would like to take into account only your experiences.
User: But when you are testing a mobile phone, you should account for everybody. How will it be any good otherwise?
4. Participant ratings often do not match their experience. The following example should give you an idea.
Moderator: Could you tell me how easy it will be to forward a message using this mobile phone?
User: I think it will be difficult. I will give it 15.
After completing the task, the user is asked:
Moderator: Having forwarded the message, how would you rate the ease of forwarding a message?
User: It wasn’t as difficult as I thought, actually it was quite easy. I would rate it a 14.
Notice how slim the distance between “difficult” and “quite easy” is: a single point. These are real responses, not ones coined for the purpose of this article.
5. A universal complaint against rating scales is that users avoid giving extreme responses. This is only partially true of Indian participants. While they readily give extreme positive ratings, they never choose the negative extreme.
6. India is a country where people’s perception of self means a lot. People care about how they appear to others. It is highly likely that they would rate the mobile phone’s performance favorably because they want to appear favorable.
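The mismatch in the fourth observation above, where a user moves from “difficult” to “quite easy” while the score shifts by only one point, is the kind of case that could be flagged before any numbers go into a report. What follows is a rough sketch under stated assumptions, not a description of how our studies were analyzed: the Rating class, the flag_inconsistent function and the min_shift threshold of 3 are all hypothetical, and the verbal label is assumed to be coded by the moderator from what the user said.

```python
from dataclasses import dataclass

@dataclass
class Rating:
    verbal: str  # "easy" or "difficult", coded by the moderator (assumption)
    score: int   # the number the user gave on the scale

def flag_inconsistent(before: Rating, after: Rating, min_shift: int = 3) -> bool:
    """Return True when the verbal judgment flipped but the score barely moved.

    min_shift is an arbitrary illustrative threshold, not an established value.
    """
    verbal_changed = before.verbal != after.verbal
    score_shift = abs(after.score - before.score)
    return verbal_changed and score_shift < min_shift

# The pre-task and post-task responses from the fourth observation.
before = Rating("difficult", 15)
after = Rating("easy", 14)
print(flag_inconsistent(before, after))
```

A flagged pair tells the moderator to fall back on what was observed and said, rather than on the number.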
Should ratings be used in India at all?
This is a tricky question; I would say it depends on who you are testing. If the product is aimed at people in SEC (socio-economic classification) A and perhaps B, rating scales could be a good tool for getting information. People in these groups can discern the difference between testing mobile phones and testing themselves.
This is not to say that people from other SECs are unintelligent. There are many socio-cultural factors that come into play when performing a usability test. When testing for SEC C, D and E, it is wiser to rely on observation and a few probing questions than to have participants learn the concept of rating and then rely on that data. With SEC C, D and E, social desirability and acquiescence biases are highly likely to feature in responses.
Are there benefits to using rating scales?
Of course there are! Having participants rate their experiences and preferences is an effective way to gather a large number of responses in the short span of one hour. It helps gather data that is not observable in the test laboratory, for example, past usage patterns. It is most effective because it gives users a chance to introspect and give shape to their experience with the test device.
Here is a sample conversation:
Moderator: You have interacted with this website; could you tell me your experience with the ’sign in’ process?
User: Hmm… it was OK. I was able to do it.
However, there is no way to quantify “ok.” Asking participants to rate their performance gives them a chance to evaluate their responses. It instructs them to quantify the degree of success of the test object at satisfying their goal.
Moderator: Having interacted with the website, how would you rate the following statement:
“I could complete this task easily.”
User:

2nd April, 2009 at 7:42 am
Really interesting post, Sneha. My knowledge of research is basic. So bear with me.
Do you come across users who rate everything low because they are difficult to please, or those who rate everything high because they see the positive in everything? What role does the user’s personality play in ratings?
Since you have the users perform a task before they rate, it is more effective than the strongly agree and disagree questionnaires.
2nd April, 2009 at 8:45 am
Thanks Archi! Yes… there are people who are very hard and very easy to please. During one study, a user in the teaching profession was very hard to please, giving thought and justification before rating.
The user’s personality plays a big role in how they rate, but somewhere we need this personality and individuality to kick in. Else, everybody would have the same experiences with the device.
The problem arises when there is a disparity between what is experienced and how that experience is rated, especially when users call something “very difficult” yet give it 10 points out of 20, or call something “very easy” after having committed a thousand errors.