Chapter 14 Lecture Slides
ITS-536
Why, what, where, and when to evaluate
Iterative design and evaluation is a continuous process that examines:
Why: To check users’ requirements and confirm that users can utilize the product and that they like it
What: A conceptual model, early and subsequent prototypes of a new system, more complete prototypes, and a prototype to compare with competitors’ products
Where: In natural, in-the-wild, and laboratory settings
When: Throughout design; finished products can be evaluated to collect information to inform new products
Bruce Tognazzini tells you why you need to evaluate
“Iterative design, with its repeating cycle of design and testing, is the only validated methodology in existence that will consistently produce successful results. If you don’t have user-testing as an integral part of your design process you are going to throw buckets of money down the drain.”
See AskTog.com for topical discussions about design and evaluation
Testing vs. Not Testing ROI
If you test iteratively, you reduce cost. If your testing is limited, your costs increase. If you don’t test, your product is likely to have significant failures.
Types of evaluation
Controlled settings that directly involve users (for example, usability and research labs)
Limited interaction where the environment is predefined and conditions are set (controlled)
Natural settings involving users (for instance, online communities and products that are used in public places)
Often there is little or no control over what users do, especially in in-the-wild settings
A study done at a park, a school, or a restaurant is natural
Any setting that doesn’t directly involve users (for example, consultants and researchers critique the prototypes and may predict and model how successful they will be when used by users)
Evaluation case studies
A classic experimental investigation into the physiological responses of players of a computer game
An ethnographic study of visitors at the Royal Highland Show in which participants are directed and tracked using a mobile phone app
Crowdsourcing in which the opinions and reactions of volunteers (for example, from the crowd) inform technology evaluation
Challenge and engagement in a collaborative immersive game
Physiological measures were used
Players were more engaged when playing against another person than when playing against a computer
Why does human-to-human interaction matter?
What is the problem with man vs. computer?
Why was the physiological data collected normalized?
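One common reason for normalizing physiological data is that baseline levels, such as skin conductance, differ widely between individuals, so raw values cannot be compared directly across participants. The sketch below illustrates the idea in Python, assuming a min-max (percentage-of-range) normalization per participant. The slide does not say which method the study used, and the function name and sample readings are hypothetical.

```python
def normalize_to_percent_of_range(readings):
    """Rescale one participant's readings to 0-100% of their own range."""
    lowest, highest = min(readings), max(readings)
    if highest == lowest:
        # A flat signal carries no variation; report 0% for every sample.
        return [0.0 for _ in readings]
    span = highest - lowest
    return [100.0 * (value - lowest) / span for value in readings]


# Hypothetical raw skin-response readings for two participants
# whose baselines differ by roughly a factor of five.
participant_a = [2.0, 2.5, 3.0]
participant_b = [10.0, 15.0, 20.0]

normalized_a = normalize_to_percent_of_range(participant_a)
normalized_b = normalize_to_percent_of_range(participant_b)
print(normalized_a)
print(normalized_b)
```

After normalization, both hypothetical participants are expressed on the same 0 to 100 scale, so a rise during an event can be compared between them even though their raw baselines differ.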
Example of physiological data
A participant’s skin response when scoring a goal against a friend (a), and another participant’s response when engaging in a hockey fight against a friend versus against the computer (b).
Source: Mandryk and Inkpen (2004), “The Physiological Indicators for the Evaluation of Co-located Collaborative Play,” CSCW’2004, pp 102-111. Reproduced with permission of ACM Publications.
Ethnobot app used at the Royal Highland Show
Source: Tallyn et al. (2018) Reproduced with permission of ACM Publications.
The Ethnobot directed Billy to a particular place (Aberdeenshire Village)
Next, Ethnobot asks “…what’s going on?”
The screen shows five of the experience buttons from which Billy needs to select a response
Directs based on context and pre-conceived questions. Makes assumptions to guide.
We call these recommendation apps, but for the HCI professional this is a type of walkthrough.
Experience responses submitted in Ethnobot
Number of prewritten experience responses submitted by participants to the pre-established questions that Ethnobot asked them about their experiences
Does this data have anything of value? Can it help guide the user to the NBD (next best decision)?
Source: Tallyn et al. (2018) Reproduced with permission of ACM Publications.
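A count of prewritten responses like the one in this chart can be produced by tallying each submission. The sketch below shows one way to do this in Python. The response log is hypothetical and is not data from the study.

```python
from collections import Counter

# Hypothetical log of prewritten responses chosen by participants
submitted_responses = [
    "I enjoyed something",
    "I learned something",
    "I enjoyed something",
    "I tried something",
    "I enjoyed something",
    "I learned something",
]

response_counts = Counter(submitted_responses)

# List responses from most to least frequently submitted
for response, count in response_counts.most_common():
    print(f"{count:>2}  {response}")
```

A tally like this shows which kinds of experience participants reported most often, but it does not explain why, which is where the follow-up questions come in.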
What did we learn from the case studies?
How to observe users in the lab and in natural settings
How evaluators exert different levels of control in the lab, in natural settings, and in crowdsourcing evaluation studies
Use of different evaluation methods
How to develop different data collection and analysis techniques to evaluate user experience goals such as challenge and engagement
Evaluation methods
| Method | Controlled settings | Natural settings | Without users |
| Observing | x | x | |
| Asking users | x | x | |
| Asking experts | | x | x |
| Testing | x | | |
| Modeling | | | x |
Controlled settings: knowledge is gained from users’ experiences, which results in variation in the data.
Natural settings: there are unknown variables; however, there is a dialogue that produces outcomes.
Without users: the expert shapes an opinion based on measured details and known factors.
The key terms used by HCI experts in evaluation
Analytics
Analytical evaluation
Biases
Controlled experiment
Crowdsourcing
Ecological validity
Expert review or criticism
Field study
Formative evaluation
Heuristic evaluation
Informed consent form
In-the-wild evaluation
Living laboratory
Predictive evaluation
Reliability
Scope
Summative evaluation
Usability laboratory
User studies
Usability testing
Users or participants
Validity
Participants’ rights and getting their consent
Participants need to be told why the evaluation is being done and what they will be asked to do, and they need to be informed about their rights
Consent gives the end user an out should they want to discontinue participation at any time. It also protects you, because the user knows what you and your organization will use the data for – no surprises.
Informed consent forms provide this information and act as a contract between participants and researchers. Often this is the contract for potential compensation as well.
Institutional Review Board in Academia
If you EVER complete a research study of ANY type that requires the collection of human data, in survey format or otherwise, a university requires you to complete an IRB report as a contract not only between you and the subjects BUT also between you and the university. For all purposes, the research is treated as scientific, no matter how unscientific it may seem.
Before you go in front of an IRB or conduct a research study, you need to get cleared by completing a certification that prepares you for institutional research.
Collaborative Institutional Training Initiative - CITI Program
The design of the informed consent form, the evaluation process, data analysis, and data storage methods are typically approved by a higher authority, such as the Institutional Review Board
Things to consider when interpreting data
Reliability: Does the method produce the same results on separate occasions?
Validity: Does the method measure what it is intended to measure?
Ecological validity: Does the environment of the evaluation distort the results?
Biases: Are there biases that distort the results?
Scope: How generalizable are the results?
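The reliability question can be made concrete with a test-retest check: run the same measure with the same participants on two occasions and see how closely the two sets of scores agree. The sketch below is a minimal illustration in Python, assuming a Pearson correlation is an acceptable agreement measure. The function name and the task-completion times are hypothetical.

```python
from math import sqrt


def pearson_correlation(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_first = sum(first) / len(first)
    mean_second = sum(second) / len(second)
    # Sum of products of deviations from each list's mean
    cross_products = sum(
        (a - mean_first) * (b - mean_second) for a, b in zip(first, second)
    )
    spread_first = sqrt(sum((a - mean_first) ** 2 for a in first))
    spread_second = sqrt(sum((b - mean_second) ** 2 for b in second))
    if spread_first == 0 or spread_second == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cross_products / (spread_first * spread_second)


# Hypothetical task-completion times (seconds) for the same five
# participants, measured on two separate occasions.
session_one = [30.0, 45.0, 50.0, 62.0, 80.0]
session_two = [32.0, 44.0, 53.0, 60.0, 78.0]

retest_r = pearson_correlation(session_one, session_two)
print(f"test-retest correlation: {retest_r:.2f}")
```

A correlation close to 1 suggests the method gives similar results on separate occasions. It says nothing about validity, since a method can be consistent and still measure the wrong thing.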
[Figure (a): When Goal Scored, Participant 2. x-axis: Time (seconds); series: Friend, Computer; marker: Goal]
[Figure (b): Fight Sequence, Participant 9. x-axis: Time (seconds); series: Friend, Computer; markers: Fight begin, Fight end]
[Figure: bar chart of experience responses. Categories: I tried something, I didn’t like something, I experienced something, I learned something, I enjoyed something, I bought something. Values as extracted (order relative to the categories not recoverable): 17, 9, 0, 20, 16, 8]