CASE STUDY 3
By Joel Smith, Dr Pepper Snapple Group
Before you perform any statistical analysis, it is important that you determine whether your measurement system can provide you with trustworthy data. Another article could be written on the ill effects that a poor measurement system can have on an analysis, but for now, let’s agree that trustworthy data are a very important part of data analysis.
Measurement system analysis is composed of many different tools to evaluate your measurement system, but the most common tool of choice in lean Six Sigma and most quality settings is gage repeatability and reproducibility (R&R) studies.
Why do I need a gage R&R study?
A gage R&R study helps you investigate the sources of variation in your data and how much of that variation is from the measurement system. See Figure 1.
The standard gage R&R study, suggested by the Automotive Industry Action Group and adopted by virtually all lean Six Sigma programs, analyzes the variation when, for example, three operators measure 10 parts two times each.
By including multiple operators, parts and replicates as a designed experiment, you can estimate how much of the variation is truly between parts, and how much of the variation is due to measurement error. Measurement error can be split into:
• Repeatability: Error that occurs when multiple measurements are taken on the same part by the same operator.
• Reproducibility: Error due to differences between operators, such as some operators tending to generally measure higher or lower than others, or differences in how operators measure specific parts compared to other operators.
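The split described above can be written as a sum of variance components. This is a sketch in standard gage R&R notation, which the column itself does not spell out; the subscripts are my own labels, and the last line is the usual definition of the percentage contribution metric used later in the column.

```latex
\begin{align*}
\sigma^2_{\text{total}} &= \sigma^2_{\text{part}} + \sigma^2_{\text{meas}} \\
\sigma^2_{\text{meas}} &= \sigma^2_{\text{repeatability}} + \sigma^2_{\text{reproducibility}} \\
\text{percentage contribution} &= 100 \times \frac{\sigma^2_{\text{meas}}}{\sigma^2_{\text{total}}}
\end{align*}
```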
Selecting operators and parts for your study
Selecting operators is hopefully straightforward: You start with a certain number of people who perform these measurements, and from these people, you randomly select three.
Selecting parts would seem to be just as straightforward. Yet often, sampling schemes that are used can result in inaccurate results.
When I posed the question of selecting parts for an adequate study to about 100 attendees at a presentation at the 2015 ASQ World Conference on Quality and Improvement, at least two-thirds of the audience reported selecting parts in a way that I know produces bad results.
So, how can we know whether a certain scheme for sampling parts worked well? The answer is simulation. By creating simulated experiments on a computer, we can know the true underlying variation between parts and the variation due to the measurement system. Thus, we can experiment with different situations and know how accurately and precisely we estimated the components of variation.
Unlike real life, where we perform the experiment and estimate these values without knowing where the true underlying value in the process is, simulations demonstrate the distribution of results we are likely to see in real life given a particular scenario.
To compare various scenarios, 1,000 simulated gage R&R studies were performed for each scenario. The metric used to assess the quality of the measurement system is “percentage contribution.”
For example, I perform the standard gage R&R study 1,000 times with three randomly selected operators and 10 randomly selected parts measured two times each. If I use a true value of 5% for “percentage contribution” (the midpoint of the marginal range), Figure 2 shows the distribution of values from my simulations.
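A simulation of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the figures: it assumes normally distributed part, operator and repeatability effects, no true operator-by-part interaction, an even split of measurement variance between repeatability and reproducibility (the hypothetical `repeat_share` parameter), and the usual ANOVA estimates of the variance components. All function and parameter names are my own.

```python
import numpy as np

def gage_rr_pct_contribution(y):
    """ANOVA estimate of percentage contribution for a crossed study.

    y has shape (parts, operators, replicates).
    """
    p, o, r = y.shape
    grand = y.mean()
    part_means = y.mean(axis=(1, 2))
    op_means = y.mean(axis=(0, 2))
    cell_means = y.mean(axis=2)

    # Mean squares for parts, operators, interaction and repeatability.
    ms_part = o * r * np.sum((part_means - grand) ** 2) / (p - 1)
    ms_op = p * r * np.sum((op_means - grand) ** 2) / (o - 1)
    resid = cell_means - part_means[:, None] - op_means[None, :] + grand
    ms_int = r * np.sum(resid ** 2) / ((p - 1) * (o - 1))
    ms_err = np.sum((y - cell_means[:, :, None]) ** 2) / (p * o * (r - 1))

    # Variance components, with negative estimates truncated at zero.
    var_repeat = ms_err
    var_int = max((ms_int - ms_err) / r, 0.0)
    var_op = max((ms_op - ms_int) / (p * r), 0.0)
    var_part = max((ms_part - ms_int) / (o * r), 0.0)

    var_meas = var_repeat + var_op + var_int
    return 100.0 * var_meas / (var_meas + var_part)

def simulate_pct_contribution(true_pct=5.0, n_studies=1000, n_parts=10,
                              n_ops=3, n_reps=2, repeat_share=0.5, seed=0):
    """Return one percentage contribution estimate per simulated study."""
    rng = np.random.default_rng(seed)
    var_part = 1.0
    frac = true_pct / 100.0
    # Solve var_meas / (var_part + var_meas) = frac for var_meas.
    var_meas = var_part * frac / (1.0 - frac)
    sd_repeat = np.sqrt(repeat_share * var_meas)        # assumed split
    sd_op = np.sqrt((1.0 - repeat_share) * var_meas)    # assumed split

    estimates = np.empty(n_studies)
    for s in range(n_studies):
        parts = rng.normal(0.0, np.sqrt(var_part), size=(n_parts, 1, 1))
        ops = rng.normal(0.0, sd_op, size=(1, n_ops, 1))
        noise = rng.normal(0.0, sd_repeat, size=(n_parts, n_ops, n_reps))
        estimates[s] = gage_rr_pct_contribution(parts + ops + noise)
    return estimates
```

Calling `simulate_pct_contribution()` returns 1,000 estimates whose histogram can be compared with the true value of 5%. The exact shape depends on the assumptions listed above.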
RANDOM THOUGHTS
Selecting Parts for a Gage R&R Study
20 I F E B R U A R Y 2 0 1 7 I W W W . A S Q . O R G
Figure 1. Gage repeatability and reproducibility. Total variation splits into part-to-part variation and measurement error; measurement error splits into repeatability and reproducibility.
Figure 2 demonstrates that in the standard experiment, there is quite a bit of variation around the true value of 5%. In fact, about 17% of the time someone who does this experiment in real life will classify the measurement system as poor (the percentage contribution > 9%), despite the true value being right in the middle of the marginal range.
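The classification used here can be expressed as a small helper. This sketch assumes the cutoffs implied by the text and figures: below 1% is excellent, above 9% is poor and anything in between is marginal. The function names are hypothetical.

```python
def classify_gage(pct_contribution, excellent_below=1.0, poor_above=9.0):
    """Label a measurement system from its percentage contribution (in %)."""
    if pct_contribution < excellent_below:
        return "excellent"
    if pct_contribution > poor_above:
        return "poor"
    return "marginal"

def share_classified_as(estimates, label):
    """Fraction of simulated studies that receive the given label."""
    labels = [classify_gage(e) for e in estimates]
    return labels.count(label) / len(labels)
```

Applied to a set of simulated estimates, `share_classified_as(estimates, "poor")` gives the kind of misclassification rate quoted above.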
Before diving further, it is worth noting that if you do not supply a historical value for the part-to-part variation, it is estimated from your data. If a historical estimate of total variation is available for the measurement system being evaluated, you should use that value rather than relying on the experiment for the estimation of the part-to-part variation. If this is your situation, the discussion in this article is no longer relevant.
Comparing schemes for selecting parts
An entire book could be written on whether the standard experiment is best or whether other alternatives should be used. The purpose of this column, however, is to simply focus on one aspect of the experiment: how the parts are selected.
In my career, I have heard of several schemes, but three come up most frequently and will be the focus of my results, along with a fourth theoretical one. The schemes are:
1. Randomly selected parts: In this scheme, the parts that are used for the study are randomly selected from available parts over time, in the hopes that they will represent the true distribution of parts produced.
2. Perfectly selected parts: This scheme is not possible in real life, but is included to show the effect of randomness in the random scheme. In the perfect scheme, parts are selected at equal percentiles of the true underlying distribution of parts, such that the mean and standard deviation precisely match that of the population.
3. Uniformly selected parts: In this scheme, parts are selected at equally spaced intervals across the range of the specifications. This helps ensure a more accurate estimate of the part-to-part variation.
4. Extremely selected parts: In this scheme, eight of the 10 parts are selected randomly, but one part is purposely selected a little bit outside of each specification limit. The intention is to ensure the measurement system can reliably measure parts outside of the specifications.
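The four schemes can be sketched as functions that return the true sizes of the selected parts. This is an illustration under stated assumptions rather than the simulation code behind the figures: parts are normally distributed, the process is centered so the specification limits sit 3 × Ppk standard deviations from the mean, and "a little bit outside" is modeled by a hypothetical `overshoot` parameter. All names are my own.

```python
from statistics import NormalDist

import numpy as np

def spec_limits(mean, sd, ppk):
    """Spec limits for a centered process (assumption): mean +/- 3 * ppk * sd."""
    half_width = 3.0 * ppk * sd
    return mean - half_width, mean + half_width

def random_parts(rng, n, mean, sd):
    """Scheme 1: parts drawn at random from the process distribution."""
    return rng.normal(mean, sd, size=n)

def perfect_parts(n, mean, sd):
    """Scheme 2: evenly spaced percentiles, rescaled to match mean and sd exactly."""
    z = np.array([NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)])
    z = (z - z.mean()) / z.std(ddof=1)
    return mean + sd * z

def uniform_parts(n, lsl, usl):
    """Scheme 3: equally spaced parts across the specification range."""
    return np.linspace(lsl, usl, n)

def extreme_parts(rng, n, mean, sd, lsl, usl, overshoot=0.05):
    """Scheme 4: n - 2 random parts plus one part just outside each limit.

    overshoot is a hypothetical fraction of the spec width.
    """
    margin = overshoot * (usl - lsl)
    middle = rng.normal(mean, sd, size=n - 2)
    return np.concatenate(([lsl - margin], middle, [usl + margin]))
```

For example, `uniform_parts(10, *spec_limits(0.0, 1.0, 1.5))` spreads 10 parts across a specification range that is much wider than the process itself.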
Random scheme
Because the most common metrics for assessing measurement error (percentage contribution and percentage of study variation) relate the measurement system variation to the overall variation (which includes part-to-part variation), it is important that the experiment estimates the part-to-part variation as accurately as possible. A logical way to do this is to select parts randomly over time.
Earlier, we saw the distribution of percentage contribution values seen when this was done for a marginal gage. In the interest of conciseness, only the marginal gage is shown, but the patterns described from here forward hold true for excellent and poor gages as well. For randomly sampled parts, the distribution of percentage contribution will start near zero and peak at some value a little smaller than the true value, with a skewed distribution that extends far to the right.
Thus, we see that the random sampling of parts may result in quite a bit of variation around the true value.
S I X S I G M A F O R U M M A G A Z I N E I F E B R U A R Y 2 0 1 7 I 21
Figure 2. Simulated gage percentage contribution. Standard experiment, true gage = “marginal.” Histogram of percentage contribution with reference lines at 1.00%, the true value and 9.00%.
Perfect scheme
One reason that many practitioners avoid the random sampling of parts is that they fear they will inadvertently pull parts that are bigger or smaller than the true distribution, or will select parts that show unusually high or low variation. In short, they are concerned that the distribution of just 10 parts will deviate from the true distribution in a meaningful way and not be representative.
Let’s compare a simulation with parts that perfectly match the underlying distribution of parts with the simulation from the random scheme. In other words, compare the estimates that are obtained when the parts are taken from evenly spaced percentiles of the distribution, such that the mean and standard deviation of the parts taken perfectly match the underlying population from the random scheme.
Figure 3 demonstrates that the perfect scheme also shows considerable variation in the estimates of percentage contribution, although it is slightly less than the variation with the random scheme. Thus, we conclude that the majority of variation in estimates is not due to parts being chosen randomly.
Of course, we cannot use the perfect scheme in real life. First, we can never know the true underlying distribution of parts. Second, because of measurement error, even if we wanted to sample a part of a certain true size, we can’t ever really confirm that we have done so.
Uniform scheme
Many practitioners want to make sure that their measurement system is adequate, not just in the range where parts are being produced, but also across the entire range of interest. Thus, an attempt is made to select parts at evenly spaced intervals between the specification limits (and sometimes beyond the specification limits).
Before discussing the impact of this on percentage contribution estimates, it is worth noting that just as with the perfect scheme, intentionally selecting parts at certain points is likely impossible as measurement error prevents us from knowing the true value anyway.
Figure 3. Simulated percentage contribution, randomly vs. perfectly. True gage = 5% (marginal). Panels for randomly and perfectly selected parts, with reference lines at 1.00%, the true value and 9.00%.
Figure 4. Simulated percentage contribution, randomly vs. uniformly. True gage = 5% (marginal). Upper panel: randomly selected parts. Lower panel: uniformly selected parts for Ppk = 0.5, 1.0 and 1.5. Reference lines at 1.00%, the true value and 9.00%.
When evaluating the performance of the uniform scheme, you must consider that the parts chosen depend on the width of the specifications relative to the true distribution. Thus, this scheme is simulated for processes with Ppk values of 0.5, 1.0 and 1.5. In Figure 4, the results are in the lower panel, with the random scheme displayed in the upper panel for comparison.
For the low-quality process with Ppk = 0.5, the sampling scheme produces roughly similar results to the random scheme. As the quality improves, however, the estimates fall considerably below the true percentage contribution of 5% and begin to suggest that you have an excellent measurement system.
In fact, for a high-quality process, this sampling scheme will result in the measurement system appearing to be excellent most of the time when in reality it is a marginal measurement system.
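A rough back-of-the-envelope calculation shows why this happens. The sketch below is not the simulation behind Figure 4. It assumes a centered process with specification limits 3 × Ppk part standard deviations from the mean, and it treats the sample variance of the 10 equally spaced parts as the part-to-part variance the study would report. The function name is hypothetical.

```python
import numpy as np

def apparent_pct_contribution_uniform(ppk, true_pct=5.0, n_parts=10):
    """Approximate percentage contribution seen with uniformly selected parts."""
    sd_part = 1.0
    frac = true_pct / 100.0
    # Measurement variance that gives the true percentage contribution.
    var_meas = sd_part ** 2 * frac / (1.0 - frac)
    # Equally spaced parts across the spec range (centered-process assumption).
    half_width = 3.0 * ppk * sd_part
    parts = np.linspace(-half_width, half_width, n_parts)
    var_part_apparent = parts.var(ddof=1)
    return 100.0 * var_meas / (var_meas + var_part_apparent)

for ppk in (0.5, 1.0, 1.5):
    print(ppk, round(apparent_pct_contribution_uniform(ppk), 2))
```

Under these assumptions the apparent value is roughly 4.9% at Ppk = 0.5, 1.3% at Ppk = 1.0 and 0.6% at Ppk = 1.5. The wider the specifications are relative to the process, the more the uniformly spread parts inflate the part-to-part variance and the smaller the measurement system appears.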
Extreme scheme
Given that we want our parts to be representative of the underlying population, it shouldn’t be surprising that choosing uniformly distributed parts across the specification range rather than the actual range of the data produces poor results.
The extreme scheme, however, may allow us to have it both ways: Eight out of 10 parts are chosen randomly and just two parts are selected to ensure capability outside of the specifications. Again, even with just two parts based on the location of the specifications, the results will vary based on the quality of the process, so the same Ppk levels are evaluated. The results are again compared to the random scheme (Figure 5).
Now, unfortunately, all three quality levels provide inaccurate results, biasing our estimates of percentage contribution to tell us the measurement system is much better than it actually is.
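The direction of this bias can be checked with a short calculation. The sketch below is an approximation, not the simulation behind Figure 5. It assumes a centered process with limits at 3 × Ppk part standard deviations, places the two extreme parts a hypothetical `overshoot_sd` standard deviations beyond each limit, and plugs the expected sample variance of the 10 parts into the percentage contribution formula. For n parts with n − 2 drawn at random and two fixed at ±a, that expected variance is (n − 2)σ²/n + 2a²/(n − 1).

```python
def apparent_pct_contribution_extreme(ppk, true_pct=5.0, n_parts=10,
                                      overshoot_sd=0.25):
    """Approximate percentage contribution seen with the extreme scheme."""
    var_part = 1.0
    frac = true_pct / 100.0
    # Measurement variance that gives the true percentage contribution.
    var_meas = var_part * frac / (1.0 - frac)
    # Distance of each extreme part from the mean, in part standard deviations.
    a = 3.0 * ppk + overshoot_sd
    # Expected sample variance of n - 2 random parts plus two parts at +/- a.
    var_part_apparent = ((n_parts - 2) * var_part / n_parts
                         + 2.0 * a ** 2 / (n_parts - 1))
    return 100.0 * var_meas / (var_meas + var_part_apparent)

for ppk in (0.5, 1.0, 1.5):
    print(ppk, round(apparent_pct_contribution_extreme(ppk), 2))
```

Under these assumptions all three Ppk levels give a value below the true 5%, and the gap widens as Ppk increases, because the two extreme parts alone are enough to inflate the apparent part-to-part variance.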
Which scheme?
Unless you have a historical estimate of the process standard deviation, only one part-selection scheme provides unbiased results, and that is random part selection. Other tools exist for accomplishing the intentions of the other schemes (most notably, a bias and linearity study) and those tools should be used instead. Otherwise the results of the gage R&R study are likely invalid and don’t represent the true capability of the measurement system.
JOEL SMITH is director of rapid continuous improvement at Dr Pepper Snapple Group in Plano, TX. He holds a master’s degree in statistics from Virginia Tech in Blacksburg, VA. Smith is a senior member of ASQ and a former chair of the ASQ Statistics Division.
Figure 5. Simulated percentage contribution, randomly vs. extremely. True gage = 5% (marginal). Upper panel: randomly selected parts. Lower panel: extremely selected parts for Ppk = 0.5, 1.0 and 1.5. Reference lines at 1.00%, the true value and 9.00%.