Green Belt Case study
Six Sigma Green Belt Book 5 | Module 5
©2020 Bisk Education, Inc. and Villanova University. All rights reserved.
No part of this document may be reproduced in any form or by any electronic or mechanical means,
including information storage and retrieval systems, without written permission from the copyright owner.
Company, product, and service names used herein may be trademarks of their respective owners and
are used in an editorial fashion with no intention of infringement of the respective owner’s trademark
rights. The information in this study guide is distributed on an “as is” basis, without warranty. Neither
the copyright owner nor the author(s) shall have any liability to any person or entity with respect to any
actual or alleged damage caused by the information contained herein.
Permission to print this document is limited to one copy per student.
Six Sigma Green Belt | 3
Table of Contents
Introduction
Objectives
Assignment Checklist
Correlation and Scatter Plots
Performance Indicators
Types of Data
Levels of Measurement
Operational Definitions and Target Types
Tollgate Review: Performance Indicator Identification
Gage Repeatability and Reproducibility (R&R)
Tollgate Review: Data Collection Plan
Data Collection
Baseline Performance I
Baseline Performance II
Process Capability Analysis
Tollgate Review: Baseline Performance Measurement
Conclusion to the Measure Phase
Module 5 Introduction
Week 5 continues the discussion of the Measure phase of DMAIC, starting with more graphical data
representation tools. You will then learn about the three major categories of data collection sources,
the types of data that can be collected, and the levels of measurement that apply to that data. You will
be introduced to operational definitions and target types, and an overview of the performance indicator
identification and data collection tollgate review meetings will be provided. Baseline performance will
also be discussed.
Objectives
• Apply a Pareto analysis.
• Define your Data Collection Plan.
• Interpret histograms.
• Describe the Motorola Shift.
• Calculate the Process Performance, Pp and Ppk, based on the current process.
Assignment Checklist
☐ ____________________________________
☐ ____________________________________
☐ ____________________________________
☐ ____________________________________
☐ ____________________________________
☐ ____________________________________
☐ ____________________________________
☐ ____________________________________
☐ ____________________________________
☐ ____________________________________
☐ ____________________________________
☐ ____________________________________
☐ ____________________________________
☐ ____________________________________
Correlation and Scatter Plots
Introduction
In order to improve a specific performance metric, we must find the upstream variables that influence
that metric. To accomplish this task, we need methods that allow us to evaluate the influence one factor
might have over a critical-to-quality variable. Correlation analysis is one of the statistical methods we
use.
Correlation analysis attempts to measure the association between numerical variables. That is, the
result of correlation analysis is a quantitative measure of the strength of association between numerical
variables. However, before we compute a correlation measure, we first check the data graphically with
a scatter plot to examine the nature (if any) of the association. A scatter plot (or scatter diagram) is a
graph that displays the relationship between two variables.
Example: suppose a professional service company is concerned that the wait time (minutes) for an
important service is related to the number of complaints it receives. The company grouped its customers
by typical wait time and totaled the number of complaints received for each wait-time group. The data
and scatter plot are shown below.
Wait Time (minutes)   Complaints
8                     17
12                    25
16                    33
20                    38
24                    45
28                    51
Figure 1
Figure 2
The scatter plot clearly shows that the number of complaints increases as the wait time increases. That
is, there is a positive association between wait time and number of complaints.
The scatter plot allows us to perform an initial inspection of the data. If the variables are negatively
associated, then as one of the variables increases, the other variable decreases. If the two variables
lack much of an association, then the scatter plot will exhibit a random pattern.
Correlation Analysis
Again, correlation analysis attempts to measure the association between numerical variables. Typically,
for a suspected linear correlation, the Pearson correlation coefficient is used. The measure is named for
Karl Pearson (a British statistician). The sample Pearson correlation coefficient (r_xy) for two variables
x and y is computed as follows.
r_xy = [∑xy − (∑x)(∑y) / n] / [(n − 1) · s_x · s_y]
Note that n is the sample size, s_x is the sample standard deviation for x, and s_y is the sample standard
deviation for y. ∑ is the symbol for “sum.”
The Pearson sample correlation coefficient varies between -1 and +1. Values close to -1 or +1 indicate
a strong linear relationship, while values close to 0 indicate a weak linear relationship.
Example: consider our previous example concerning wait times and the number of complaints. First,
let’s organize our data so that the calculation will be more direct.
Wait Time, x      Complaints, y     xy
8                 17                136
12                25                300
16                33                528
20                38                760
24                45                1080
28                51                1428
Sums              108      209      4232
Std. Deviations   7.483    12.592
Figure 3
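Substituting the sums and standard deviations from the table (with n = 6) gives
r_xy = [4232 − (108)(209) / 6] / [(5)(7.483)(12.592)] = 470 / 471.1 = 0.998.
The sample Pearson correlation coefficient is 0.998 (almost 1), which indicates a very strong positive
correlation.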
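As a cross-check on the hand calculation, the short Python sketch below recomputes the coefficient from the wait time and complaint data in the table above. The function names (sample_std, pearson_r) are illustrative choices, not part of the course material, and the formula used is the common computational form based on ∑xy, the sample size, and the two sample standard deviations.

```python
from math import sqrt

# Data from the wait time / complaints example.
wait_time = [8, 12, 16, 20, 24, 28]
complaints = [17, 25, 33, 38, 45, 51]


def sample_std(values):
    """Sample standard deviation (divides by n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def pearson_r(x, y):
    """Sample Pearson correlation from sums and sample standard deviations."""
    n = len(x)
    sum_xy = sum(a * b for a, b in zip(x, y))
    numerator = sum_xy - sum(x) * sum(y) / n
    return numerator / ((n - 1) * sample_std(x) * sample_std(y))


r = pearson_r(wait_time, complaints)
print(round(r, 3))  # 0.998
```

The intermediate values (sums of 108, 209, and 4232, and standard deviations of about 7.483 and 12.592) match the table, which is a useful check before trusting the final coefficient.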
Performance Indicators
There are three categories of ways in which a project team can gather knowledge about customers’
needs and requirements.
Sources of VOC
• Direct – you are directly involved with the customer and observe directly.
• Indirect – comes from an internal party such as sales or customer service.
• Third party – industry experts, associations, competitor data, market research.
Direct
In the direct category, you are directly in contact with the customers:
• Conduct interviews.
• Participate in focus groups.
• Survey customers.
• Observe directly.
• Become the customer.
• Use various scorecards.
Indirect
In the indirect category, data comes from an internal party:
• Complaints department.
• Service department.
• Salespeople.
Third Party
In the third party category, data comes from outside the company:
• Industry experts that publish information about customer groups.
• Industry associations.
• Competitor data.
• Market research.
Advantages and Disadvantages
• Each category has its own advantages and disadvantages.
• Always weigh your pros and cons and mix up your approaches.
Source        Advantages                         Disadvantages
Direct        • Firsthand information            • Costs
              • Target groups                    • Groupthink
              • Face-to-face contact             • Response rate
Indirect      • Immediate resolution             • Dissatisfaction not expressed
              • Explicitly expressed concerns
              • Direct communication
Third-Party   • Competitor data                  • Cost of consultants
              • Research data                    • Consultants’ timeline
              • Reach more people                • Too much data
Figure 4
Matching Needs and Requirements
• Customers’ needs are equivalent to a process output.
• Customer needs and requirements do not always match up.
- Requirements are supposed to identify a specific characteristic of an output.
- Confirm that the need and the requirement actually match up.
Figure 5
Feedback Challenges
There can be challenges to receiving feedback:
• Instead of feedback, customers sometimes give solutions to issues that you are not ready to solve.
• Sometimes needs go unspoken; these are called latent needs.
- Customers may not even know their needs.
- Your presumptions about your product/service may not match what the customer needs/requires.
• You may run into competing critical-to-quality characteristics.
- Salespeople want fast and easy, while risk management needs a lot of information.
Actionable List of Measures
• What kind of data do we really need to develop the Actionable List of Measures (ALoM)?
• What the customers want based on their true needs, regardless of what your process is capable of
delivering.
• Don’t bias your data collection with your present process capability.
• Customers’ perception of your process performance compared to your competitors.
SIPOC
• Start your search for customers with your high-level process mapping tools, such as a SIPOC.
- The “C” in SIPOC stands for “customer.”
- This will help you find a list of key internal and external customers.
- The map relates the customers to specific processes your project has targeted.
Figure 6
Leading and Lagging Indicators
• Leading and lagging indicators are types of measures.
- Leading indicators are things you can change in order to get better results in your lagging
indicators.
- Lagging indicators cannot be changed directly, so focus more on leading indicators.
- We tend to focus more on the lagging indicators and then wonder why we can’t change them.
Twofers
• There are three areas you need data for.
• Customers require quality – use two effectiveness measures.
- Inputs from the suppliers – effectiveness.
- Outputs – effectiveness.
- Process – efficiency.
• In each of these areas, there are two types of measures; that’s why they are called twofers.
- There are two measures for each data area.
Conclusion
In this section, we explored where project teams can get information, pros and cons for each approach,
challenges with feedback, alignment of requirements with customer needs, and actionable measures.
Types of Data
Qualitative vs. Quantitative Statistics
Figure 7
Qualitative data, or the quality of something, can be described as:
• The divisions between sources or categories of data.
• A good example of these is the 6Ms.
Figure 8
When we start talking about quantitative data, we’re talking about divisions of actual
values:
• Name.
• Order.
• Interval.
• Origin.
Figure 9
• Attribute data are things that are counted.
- This is discrete data.
- The answers to these questions are “yes or no,” “go or no go,” or counted data.
• Variable data are things that are measured.
- Here, we are talking about continuous measurements such as distances, heights, or time.
Measurement Scale - Examples
There are four types: nominal, ordinal, interval, and ratio.
• An example of a nominal scale is color.
• An ordinal scale can be days of the week, since they follow a sequence.
• An example of an interval scale is clock or calendar time.
- It shares the properties of nominal and ordinal data, and differences between values are meaningful,
but it does not have a defined starting point.
• Ratio data can be distance, weight, or temperature measured from absolute zero (kelvins).
- It has a fixed starting point.
Independence
• If two events are independent, the outcome of one event does not impact the outcome
of the other.
• If they’re dependent, then the outcome of one does impact the outcome of the other.
Figure 10
Mutual Exclusivity
• Mutual exclusivity means two or more events cannot occur at the same time.
• Example: when rolling a die, the result cannot be two numbers at the same time.
Figure 11
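The difference between independence and mutual exclusivity can be checked by simply counting outcomes. The sketch below uses the single-die example from the text for mutual exclusivity and, as an added assumption for illustration, a roll of two dice for independence. The helper name probability is hypothetical.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)  # the six faces of a fair die


def probability(event, outcomes):
    """Fraction of equally likely outcomes for which the event is true."""
    outcomes = list(outcomes)
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))


# Mutually exclusive: one roll cannot be both a 1 and a 2.
p_one_and_two = probability(lambda roll: roll == 1 and roll == 2, faces)
print(p_one_and_two)  # 0

# Independent (assumed two-dice illustration): the first die does not
# affect the second, so P(A and B) equals P(A) * P(B).
two_rolls = list(product(faces, repeat=2))
p_a = probability(lambda r: r[0] == 6, two_rolls)
p_b = probability(lambda r: r[1] == 6, two_rolls)
p_a_and_b = probability(lambda r: r[0] == 6 and r[1] == 6, two_rolls)
print(p_a_and_b == p_a * p_b)  # True
```

Note that mutually exclusive events with nonzero probability are never independent: knowing that one occurred tells you the other did not.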
Mutually Exclusive, Complementary
The result must be one outcome or the other.
Figure 12
Mutually Exclusive, Non-Complementary
Figure 13
Conclusion
In this section, we saw examples of qualitative and quantitative statistics, explored measurement
scales, and learned the meanings of mutually exclusive and independent events.
Levels of Measurement
Introduction
Understanding the nature of your data and how to represent it can affect the types of statistical tests
that you can do, so we’re going to look at four levels of scales. These are:
• Nominal.
• Ordinal.
• Interval.
• Ratio.
Nominal Scale
• Has the least number of options.
• Consists of categories:
- Names, labels, telephone numbers, street addresses, social security numbers.
• Cannot be arranged in an ordering scheme.
• No arithmetic operations are performed for nominal data.
• Possesses no order, distance, or origin.
Ordinal Scale
With ordinal scale, data is:
• Arranged in some order, but differences between data values either cannot be determined or are
meaningless.
• Ordered but with no distance or origin.
• Examples include:
- Movie ratings – PG, X, R.
- Course grades – A, B, C, D, F.
- Horse race – 1st place, 2nd place, 3rd place.
Interval Scale
With interval scale, data can be:
• Arranged in some order and for which differences in data values are meaningful.
• Arranged in an ordering scheme and differences can be interpreted.
• Arranged by order and distance, but without origin.
• Examples include:
- Calendar dates (the time between two dates is meaningful).
- Average temperatures (in degrees Fahrenheit or Celsius).
Ratio Scale
With ratio scale, data can be:
• Ranked, and all arithmetic operations, including division, can be performed on it.
• Measured from an absolute zero, where a value of zero indicates a complete absence of the
characteristic of interest.
• Arranged by order, distance, and origin.
Examples include:
• Speed of a jet.
• Crime rates.
• Production rates.
Scale      Central Location             Dispersion                      Significance Test
Nominal    Mode                         Information only                Chi-square
Ordinal    Median                       Percentages                     Sign or run test
Interval   Arithmetic mean              Standard or average deviation   t test, F test, correlation analysis
Ratio      Geometric or harmonic mean   Percent variation               (same as usual)
Figure 14
Conclusion
This section discussed the nominal, ordinal, interval, and ratio measurement scales and the different
ways that central location, dispersion, and significance can be measured for each.
Operational Definitions and Target Types
Operational Definitions
When I’m working with teams, I stress the importance of operational definitions. Take the example of
door-to-balloon time in a hospital setting. The definition of door-to-balloon time is the amount of time
that passes between when a patient arrives at the hospital and when they get treatment. (Here, the
term “balloon” refers to the small balloon that is inflated during angioplasty to push the blockage
apart.)
Could that definition be refined in order to measure it over time and see if it is getting better or not?
For example, what door are we talking about? The word “door” could refer to many things: the door to
the emergency helicopter, the ambulance door, the front door of the hospital, or the emergency room
door. If two people are measuring this, and they have different understandings of what “door” refers
to, there could be measurement error. We need to operationally define things ahead of time in order to
avoid confusion and prevent the introduction of variation into the process. Ultimately, an operational
definition isn’t the most exact definition – it’s the definition that everyone can agree to.
Types of Targets
When we are measuring things, there are different types of targets. No matter what the target type is,
you want as little variation around that target as possible.
Larger Is Better
• The target could be infinity or 100%.
- When talking about tires on a car, the target would be infinity because we want the tires to last
for an infinite number of miles.
- When talking about attendance, the target would be 100% because we want 100% attendance.
Smaller Is Better
• Zero is the target.
- If we’re measuring contamination in a chemical bath, we would want zero contaminants.
Nominal Is Best
• The target is some middle value.
- Kicking a field goal right down the middle of the goal posts.
- Landing a plane in the middle of a runway.
To illustrate nominal is best, imagine driving down a single-lane mountain road. There are no cars
coming the other way, but there are no guard rails on either side. You could drop off 1,000 feet on one
side or you could scrape against the wall of rock on the other side. The rework side is the rock wall – if
you scrape it, you could rework your car and fix it. The drop-off is the scrap side – if you go off the
edge, there’s no coming back. Which side are you likely to favor? Most people would probably favor the
rework side over the scrap side. Instead of thinking about the lower and upper spec limits, which mark
the most marginally acceptable product or service measurements, look down the middle of the road and
keep as little variation around that as possible. This should all be part of the operational definition
ahead of time. What are we measuring? Is larger better? Is smaller better? Or is some middle value best?
Real-Life Examples
You can apply these concepts in your everyday life. For example:
• At a bakery, will you choose the fresh, hot bread that just came out of the oven, or the bread
that came out of the oven an hour ago? Most people would want the bread right out of the oven,
meaning zero time has passed.
• If you’re buying milk, will you choose a carton with a pull date that is sooner or later? Even though
the label says it is good for seven days after the pull date, many people will try to find one with a
date that is as far out as possible.
• If there are 15 registers at the front of the store, which one will you pick? You will most likely pick
the one with the shortest line.
Scenarios
• Is the payment of an invoice a larger-is-better, smaller-is-better, or nominal-is-best scenario? You
don’t want to pay it too early or too late, so in this scenario the target is some ideal amount of time
in the middle.
• Regarding salary, is a larger, smaller, or nominal target best? If you are an employee, you would say
larger is better. If you are in top management, larger is not best because then you give up some of
your profits. Management doesn’t want to pay too little either, because then employees will leave.
So, there is an ideal salary for each position that falls somewhere in the middle.
• What about departure times in an airport? Is larger better, smaller better, or nominal best?
Flights may be one or two minutes late every now and then, but typically you would want the delay
to be at zero with as little variation as possible.
• What is the target for an 18-hole round of golf? Many people would say par; however, the
ideal target for 18 holes of golf is a hole-in-one on every hole. Par is not really a target, because
people do better than par. A target is what you’re gravitating toward. If larger
is better, for example, you’re gravitating toward 100 percent or you’re gravitating toward infinity.
Is it likely you’re going to get there? No, but that’s what you’re pulling toward. A goal is different:
if you have an average golf score of 115, next year you could have a goal of getting it down to 100,
but ultimately you’re still gravitating toward that even smaller score.
Conclusion
Often, the decision of whether larger is better, smaller is better, or nominal is best is in the eye of the
beholder. However, we always want as little variation around that target as possible.
Tollgate Review: Performance Indicator Identification
At the performance indicator identification tollgate review meeting, the sponsor will want the following
information:
• What you are going to measure.
• A clear explanation of the performance measures you chose and why they are good choices.
- Do not focus on outcome measures; instead, focus on upstream measures, leading indicators,
etc.
If data already exists for the measures that you suggest, bring some of that information with you to the
meeting. If the organization doesn’t have exactly what you are looking for but has things that
are close to it, the sponsor may want to know why you are suggesting that the team gather new data
instead of using the existing information that is close to what you want. So, be prepared to answer that
at the meeting. If the data is not available, you will have to figure out how to get it by the next tollgate,
but try not to make promises about gathering the data until you put the data collection plan together.
Gage Repeatability and Reproducibility (R&R)
Introduction
The purpose of a Gage Repeatability and Reproducibility (Gage R&R) study, a type of Measurement
System Analysis (MSA), is to discover whether the variation caused by the measurement system is small
enough that when we measure something with it, we are seeing mostly the true variation.
Definitions
A Gage R&R study addresses two questions about the measurement system:
1. What percentage of the variation in the measurements is due to the measurement system?
2. How precise is my measurement system?
Precision
The extent to which the measurement system is subject to spread. Precision is divided into repeatability
and reproducibility.
Reproducibility
The variation due to differences between operators.
Repeatability
The variation that remains when the same item is measured repeatedly while all circumstances, such as
measurement instrument, person, and location, are kept equal for each of those repeated measurements.
Accuracy
How close the measurement is to a standard or a true value. In order to determine the accuracy of the
measurement system, the measurement is compared to a specific value from an instrument that is
designated as the standard, or master gage.
Bias
A systematic measurement error: the difference between the average of multiple measurements on
the same object and the reference value. Bias can be corrected by calibration of the measurement
equipment.
Gage R&R
A Gage R&R:
• Checks the repeatability and reproducibility of the measurement system, not just the gage itself.
• Answers the question, “What percentage of the variation in the measurements is due to the
measurement system?”
• Answers the question, “How precise is my measurement system?”
• Does not tell you how accurate the measurement system is.
How to Conduct a Gage R&R
Most people follow the automotive industry’s model for conducting Gage R&R studies:
Ten parts are each measured
three times (three trials) by
three operators,
for a total of 10 × 3 × 3 = 90 measurements.
It is possible to do a smaller study with only 5 parts, each measured 3 times by 3 operators, but the
results will not be as reliable or, as we say, as statistically significant. Or we could include only one
operator, but if we do, we would not get any information about reproducibility.
After collecting the data for all the parts, trials, and operators, the data is usually analyzed by one of
two methods:
• Xbar and R.
• Analysis of Variance (ANOVA).
The actual calculations for these methods are beyond the scope of this class, but it is important to
remember the main point: we are trying to find out whether the variation caused by the gage and
operators (repeatability and reproducibility) is small enough for the tolerance needed.
P/T Ratio
The P/T (precision-to-tolerance) ratio is the key outcome from a Gage R&R. It compares the precision
of the measurement system against the total tolerance allowed by our customer for whatever part or
service we are providing.
The P/T ratio is interpreted as follows:
• 0.10 or less is good.
• 0.30 or greater is not acceptable.
• In between, the measurement system is marginal and we have to make arrangements.
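The thresholds above can be turned into a small helper. The sketch below is only an illustration: the course text does not give a formula for the P/T ratio, so the function pt_ratio assumes one common convention, P/T = k × (measurement system standard deviation) / (USL − LSL) with k = 6, and the numbers in the usage line are hypothetical. Other references use a different multiplier, so confirm the convention your organization follows.

```python
def pt_ratio(sigma_ms, lsl, usl, k=6.0):
    """Precision-to-tolerance ratio under the assumed k * sigma / tolerance form."""
    tolerance = usl - lsl
    if tolerance <= 0:
        raise ValueError("USL must be greater than LSL")
    return k * sigma_ms / tolerance


def classify_pt(ratio):
    """Apply the thresholds from the text: <= 0.10 good, >= 0.30 not acceptable."""
    if ratio <= 0.10:
        return "good"
    if ratio >= 0.30:
        return "not acceptable"
    return "marginal"


# Hypothetical values: measurement std. dev. 0.02 on a 9.0 to 11.0 tolerance.
print(classify_pt(pt_ratio(0.02, 9.0, 11.0)))  # good
```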
Tollgate Review: Data Collection Plan
At the data collection plan tollgate review meeting, the sponsor will want the following information:
• A plan that tells the sponsor exactly what data you are going to collect, how you are going to collect
it, and how much of it you are going to need.
• An estimate of the cost and time that it will take to get that data.
Give the actual data collection plan worksheet to the sponsor ahead of time. Be prepared to give a clear
and concise explanation of each measure at the review meeting.
If the sponsor finds anything unreasonable or not acceptable within the data collection plan, you
may need to revisit the performance indicator identification tollgate. You might want to come to the
data collection plan tollgate meeting prepared for that and have a backup recommendation ready to
go, especially if some of the data is going to be expensive, time-consuming to gather, or disruptive to
business.
Data Collection
Introduction
In Six Sigma, Y = f(X). Good data helps us understand the potential effect of the X’s on the overall Y.
When we have narrowed down the X’s to a vital few, we need to be able to validate them with data.
Collecting data in a deliberate fashion helps us:
• Understand the baseline performance of our process.
• Measure the magnitude and frequency of our defects or issues.
• Ensure we can sustain the gains or quality improvements.
How Much Data to Collect
Some considerations when deciding how much data to collect include:
1. Finances can restrict how much data you can realistically collect.
2. Time may also be a factor in how much data you can collect for any given project.
3. Governmental agencies and/or regulatory groups may set requirements for the data you collect.
Formulas are available to help determine a good minimum sample size, and you should consult your
Black Belt, Master Black Belt, or a statistician if you have any concerns or questions regarding sample
size.
Benefits of a Data Collection Plan
Data collection plans:
• Ensure you are collecting only the data that is needed.
• Prioritize the collected data.
• Determine how often to collect the data.
• Determine how much data to collect and where in the process to collect it.
Two additional considerations in any data collection plan:
• Destructive versus nondestructive measurement: is what we are measuring going to be
destroyed in the process or not?
• Return on investment: how much improvement of our output variable or project Y are we going to
get for this data point?
Granularity
There are two guidelines that help us determine how many data points we need; which one to
use depends on the type of data we have.
• For continuous data, collect at least 30 data points.
• For discrete data, collect at least 100 data points.
Data Collection Plan
The data collection plan should be easy to read and understand. How you actually set up the data
collection plan is up to you and your team. However, every data plan should:
• Define all the columns.
• Ensure that it presents the opportunity to get the right kinds of data in the right amounts (appropriate
granularity).
• Identify what data is already available through document control.
• Consider whether the data is discrete or continuous.
- This will help in the Analyze phase.
Using Images
• When possible, convert check sheet data into a picture.
• Use a Pareto chart for discrete data; it prioritizes causes for you.
• Plot continuous data, such as cycle time, in a histogram; this shows frequency or distribution
patterns in the data.
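To show how a Pareto chart prioritizes causes, the sketch below builds the table behind one: categories sorted from most to least frequent, with a running cumulative percentage. It reuses the defect counts from the mortgage application example in the Baseline Performance I section of this module; the function name pareto_table is an illustrative choice, and plotting the bars is left out to keep the example self-contained.

```python
# Defect counts taken from the mortgage application example in this module.
defect_counts = {
    "W-2 with incorrect year": 15,
    "Pay stub not recent enough": 11,
    "Incomplete list of debts": 8,
    "Missing 1099 form": 4,
}


def pareto_table(counts):
    """Return (category, count, cumulative percent) rows, largest count first."""
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    rows = []
    cumulative = 0
    for category, count in ordered:
        cumulative += count
        rows.append((category, count, round(100 * cumulative / total, 1)))
    return rows


rows = pareto_table(defect_counts)
for category, count, cumulative_pct in rows:
    print(f"{category:<28} {count:>3} {cumulative_pct:>6}%")
```

Reading down the cumulative column shows how few categories account for most of the defects, which is where improvement effort usually goes first.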
Conclusion
Use data to support your decision making. Choose substance over “gut feelings” and you will get the
information needed to improve processes.
Baseline Performance I
Introduction
Baseline quality performance focuses on ways to measure the starting (or baseline) quality performance
for your situation. There are multiple ways of looking at baseline quality performance.
1. Defects per unit (DPU) looks at total defects over total units produced. It is a simple metric for
assessing the cost of poor quality; however, DPU does not account for varying complexity.
2. Rolled throughput yield (RTY) looks at the yields of a process multiplied together. RTY only accounts
for whether a product or service is defective, not whether it has multiple defects per process step.
3. Defects per million opportunities (DPMO) accounts for the opportunities for failure. DPMO allows
an organization to compare products or services of varying complexity across the same process
steps.
Defects Per Unit (DPU)
Defects per unit (DPU) is the total number of defects found within a specific period of time divided by
the number of units produced in that time.
DPU = Total Number of Defects / Number of Units Processed
Example: Mortgage Applications
How many mortgage application processing defects were there for one client in the past month? During
that time, 85 applications were processed. There were 15 W-2s with the incorrect year, 11 pay stubs
that weren’t recent enough, eight incomplete lists of debts, and four missing 1099 forms.
DPU = (15 + 11 + 8 + 4) / 85 = 38 / 85 = 0.45
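The same calculation can be written as a short Python sketch. The function name defects_per_unit is an illustrative choice; the inputs are the counts from the mortgage application example.

```python
def defects_per_unit(defect_counts, units):
    """DPU = total number of defects / number of units processed."""
    if units <= 0:
        raise ValueError("units must be positive")
    return sum(defect_counts) / units


# 15 W-2 errors, 11 stale pay stubs, 8 incomplete debt lists, 4 missing 1099s.
dpu = defects_per_unit([15, 11, 8, 4], 85)
print(round(dpu, 2))  # 0.45
```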
Rolled Throughput Yield
Rolled Throughput Yield (RTY) is the probability that a product or service will go through all the steps
of the process without a defect. Rolled throughput yield is equal to the first pass yield (FPY) of the first
process step multiplied by the first pass yield of the second process step, and so on for all
process steps. It accounts for both scrap and rework.
RTY = FPY1 × FPY2 × ... × FPYn
Example: Three Processes
• Process A: 20 units, 4 defective, so FPY = 16 / 20 = 0.80 = 80%
• Process B: 20 units, 2 defective, so FPY = 18 / 20 = 0.90 = 90%
• Process C: 20 units, 1 defective, so FPY = 19 / 20 = 0.95 = 95%
Therefore, RTY = 0.80 × 0.90 × 0.95 = 0.684 = 68.4%.
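The three-process example can be reproduced with a few lines of Python. This is a minimal sketch; the function names are illustrative, and each step is described by its unit count and number of defective units as in the example.

```python
from math import prod


def first_pass_yield(units, defective):
    """Fraction of units that pass a step without a defect."""
    if units <= 0:
        raise ValueError("units must be positive")
    return (units - defective) / units


def rolled_throughput_yield(step_yields):
    """RTY is the product of the first pass yields of all steps."""
    return prod(step_yields)


# (units, defective) for Processes A, B, and C from the example.
steps = [(20, 4), (20, 2), (20, 1)]
yields = [first_pass_yield(units, defective) for units, defective in steps]
rty = rolled_throughput_yield(yields)
print(round(rty, 3))  # 0.684
```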
DPU vs. RTY
Key differences between defects per unit and rolled throughput yield include:
• DPU considers how many total defects existed in a given number of units.
- If multiple defects can exist on one unit, the calculated DPU can be greater than one.
• RTY only considers whether a unit was defective, regardless of how many defects existed.
- A unit with one defect can have the same impact on RTY as a unit with 10 defects.
When to Use One vs. the Other
• Use DPU when the cost of rework is high and multiple defects per unit are possible.
• Use RTY if there are checks and yields from multiple process steps, and if multiple defects per unit
are rare and do not significantly affect the cost of quality.
Drawbacks to Using Defects Per Unit and Rolled Throughput Yield
• They do not consider the complexity of the product or service being analyzed.
• The more complex a product or service is, the worse its DPU or RTY will tend to be when compared
to similar products that are less complex.
• Only considering DPU or RTY may cause an organization to choose the wrong priority when
deciding which services or products should receive resources for improvement.
Defects Per Million Opportunities
Defects per million opportunities (DPMO) is a good metric to use for more complex products and
services�
The formula:
DPMO = Total Defects in Sample / (Total Opportunities × 1,000,000)
DPMO is something like a percent, but 1,000,000 is better for low levels of defects� To get from
percent to DPMO, multiply by 10,000� For example, 0�08%= 800 DPMO�
Benefits of Using DPMO as a Metric
DPMO:
• Provides numbers that are easy to discuss�
• Makes it easier to compare different products or industries�
Calculating DPMO
What are the steps for calculating defects per million opportunities?
1. Determine the number of defects (D).
2. Determine the number of units processed or services offered (U).
3. Determine opportunities per unit (OP).
4. Calculate the total opportunities (TOP).
5. Divide the total defects by the total opportunities (DPO).
6. Multiply that answer by one million.
Example: Gold Wedding Bands
A gold wedding band might have opportunities for defects regarding size, purity, and weight. Therefore,
each ring could have three opportunities for defects per unit.
• If we find a total of 150 defects, then D = 150.
• Assuming we have 10,000 wedding bands, then U = 10,000.
• Opportunities per unit = OP = 3 (size, purity, weight).
• Total opportunities = TOP = 3 × 10,000 = 30,000.
• Defects per opportunity = DPO = 150 / 30,000 = 0.005.
• Defects per million opportunities = DPMO = 0.005 × 1,000,000 = 5,000.
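The six calculation steps can be sketched as a small function and applied to the wedding band example. The function name dpmo and its parameter names are illustrative assumptions, not part of the course material.

```python
def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects per million opportunities, following steps 1 through 6."""
    total_opportunities = units * opportunities_per_unit  # TOP
    dpo = defects / total_opportunities                   # DPO
    return dpo * 1_000_000


# Gold wedding bands: D = 150, U = 10,000, OP = 3 (size, purity, weight)
bands_dpmo = dpmo(defects=150, units=10_000, opportunities_per_unit=3)
print(round(bands_dpmo))  # 5000
```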
Conclusion
This lecture discussed:
• Defects per unit (DPU).
- Total defects over total units produced.
- Simple metric for assessing the cost of poor quality.
- Doesn’t account for varying complexity.
• Rolled throughput yield (RTY).
- The yields of a process multiplied together.
- Only accounts for whether a product or service is defective, not whether it has multiple defects per
process step.
• Defects per million opportunities (DPMO).
- Accounts for the opportunities for failure.
- Allows an organization to compare products or services of varying complexity across the
same process steps.
Baseline Performance II
Calculating Sigma
There are multiple approaches to calculating sigma:
• Use a Z-score table (which is covered in greater detail at the Black Belt level).
• Use a Sigma table, such as the one provided in this course material. There are two kinds of Sigma tables:
- With a 1.5 sigma shift – Six Sigma is 3.4 defects per million (preferred).
- Without a 1.5 sigma shift – Six Sigma is around two defects per billion.
• Use a Sigma calculator, which can be found online or in some statistics programs.
The 1.5 sigma shift is a practical way to account for the fact that our process will produce more
defects in the long run than we will probably measure over the short time of our project.
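The two kinds of Sigma table can be reproduced from the standard normal distribution. The sketch below rests on two assumptions: the shifted table counts one tail after moving the mean by 1.5 sigma, and the unshifted table counts both tails of a centered process. The function names are illustrative.

```python
from math import erfc, sqrt


def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable."""
    return 0.5 * erfc(z / sqrt(2))


def dpmo_with_shift(sigma_level: float, shift: float = 1.5) -> float:
    """Defects per million, one tail, after shifting the mean by `shift` sigma."""
    return upper_tail(sigma_level - shift) * 1_000_000


def dpmo_without_shift(sigma_level: float) -> float:
    """Defects per million for a centered process, counting both tails."""
    return 2 * upper_tail(sigma_level) * 1_000_000


for level in (2.5, 3.0, 3.5, 6.0):
    print(level, round(dpmo_with_shift(level), 1), dpmo_without_shift(level))
```

At a sigma level of 6.0, the shifted value rounds to 3.4 defects per million and the unshifted value is roughly 0.002 defects per million, or about two per billion.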
Example: Defects Per Million Units
A call center monitors employee performance and listens to 500 calls. They are only listening for
whether or not the customer is given the correct answer, so this is a defects per unit problem. They find
75 wrong answers, or defects, over the course of 500 calls.
1. Defect rate = 75 / 500 = 0.15 = 15%.
2. DPMU = 0.15 × 1,000,000 = 150,000.
3. Use the Sigma table.
- One column shows defects per million units and the next column shows Sigma.
- The largest defect counts appear at the top and decrease until the table reaches Six Sigma (3.4 defects per
million units).
- Find a number slightly larger than 150,000 and a number slightly less than 150,000.
- Between those numbers, we’ll see that 150,000 defects per million units equals somewhere
between 2.5 and 3 Sigma.
Example: Defects Per Million Opportunities
For the same call center, we can now look at total service. 500 calls are monitored and 75 defects are
found in the areas of answer speed, courtesy of service, and correct response.
1. Opportunities for a defect = 500 × 3 = 1,500.
2. Defect rate = 75 / 1,500 opportunities = 0.05 = 5%.
3. DPMO = 0.05 × 1,000,000 = 50,000.
4. Use the Sigma table to find that 50,000 defects per million opportunities is somewhere between
3 and 3.5 sigma.
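Both call center lookups can be cross-checked without a table. This is a minimal sketch that assumes the 1.5 sigma shift convention and uses Python's standard library normal distribution; the function name sigma_level is an illustrative assumption.

```python
from statistics import NormalDist


def sigma_level(defects_per_million: float, shift: float = 1.5) -> float:
    """Convert defects per million into a sigma level with a 1.5 sigma shift."""
    yield_fraction = 1 - defects_per_million / 1_000_000
    return NormalDist().inv_cdf(yield_fraction) + shift


calls_dpmu = 75 / 500 * 1_000_000        # correct answer only
calls_dpmo = 75 / (500 * 3) * 1_000_000  # speed, courtesy, correct response

print(round(sigma_level(calls_dpmu), 2))  # falls between 2.5 and 3
print(round(sigma_level(calls_dpmo), 2))  # falls between 3 and 3.5
```

The same 75 defects give a higher sigma level once they are spread over three opportunities per call. This is why DPMO suits comparisons between processes of different complexity.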
Process Capability Analysis
Introduction
Process capability analysis consists of a set of metrics that assess the propensity of a stable (in-control)
process to meet the stated requirements. Clearly, in the end, we want processes to meet our operational
definition of quality: that is, minimum deviation from the appropriate target. The way we establish
specifications and performance requirements for our processes (and our products and services) is a
result of our understanding of the voice of the customer. Our ongoing monitoring of the critical-to-quality
characteristics provides actionable direction for our improvement efforts. Our objective is to match the
voice of the customer with the voice of our process so that we meet the customer’s requirements (see
the graph below).
Figure 15
Process capability analysis provides us with a method to measure the extent to which our processes do
or do not meet our requirements (and, so, whether we do or do not meet the expectations of the voice
of the customer as we have deployed it).
Process Capability Metrics
Over time, several metrics were developed in process capability analysis. We typically work with them
in two groups: (1) Cp and Cpk, and (2) Pp and Ppk. In general, Cp and Cpk are computed from the
results of a control chart and Pp and Ppk are computed using all the individual data values collected.
Example: Cp and Cpk. Suppose we are monitoring the time (in minutes) for the process for completing
a specific task with an Average and Range Control Chart (Control Charts are discussed in Module 6).
Let’s assume the Average and Range Chart shows the process to be stable with an average of 8.9 minutes
and a process standard deviation (sp) of 1.77. Our experience indicates the customer is likely to be
dissatisfied if the task takes longer than 11 minutes. Moreover, our own experience with this process
leads us to believe employees make more mistakes if they attempt to complete the process in less than
7 minutes. So, the organization decided to use an Upper Specification Limit (USL) of 11 minutes and
a Lower Specification Limit (LSL) of 7 minutes.
Cp measures the ratio of the allowed tolerance to the estimated process variation (using 6 times the
process standard deviation, or sp).
Cp = (USL − LSL) / (6 × sp) = (11 − 7) / (6 × 1.77) = 0.38
The Cp of 0.38 indicates the estimated process variation (the denominator) is much larger than the
allowed tolerance (numerator). So, we know the process will not consistently meet our requirements
because the process variation is too large.
Also, the Cp does not consider where the process is centered or how close the center (average) might
be to a specification limit. To provide that information, we use the Cpk.
Cpk enhances Cp by considering the average of the process relative to the specification limits.
Note that Cpk is a minimum of two calculations.
Cpk = min[(USL − Average) / (3 × sp), (Average − LSL) / (3 × sp)]
= min[(11 − 8.9) / (3 × 1.77), (8.9 − 7) / (3 × 1.77)] = min[0.40, 0.36] = 0.36
Note that the Cpk is close to the Cp since the process average is close to the center of the tolerance.
However, the Cpk is also poor because the process contains too much variation.
Technically, the Cpk must be at least 1.0 for us to say we have a process that is (barely) capable. Many
organizations set a higher minimum Cpk (for example, 1.33).
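The Cp and Cpk for the task-time example can be computed directly from the values given in the text. This is a minimal sketch using the standard definitions of the two metrics; the function names cp and cpk are illustrative assumptions.

```python
def cp(usl: float, lsl: float, sigma: float) -> float:
    """Allowed tolerance divided by six process standard deviations."""
    return (usl - lsl) / (6 * sigma)


def cpk(usl: float, lsl: float, average: float, sigma: float) -> float:
    """Smaller of the distances from the average to each limit, in units of 3 sigma."""
    upper = (usl - average) / (3 * sigma)
    lower = (average - lsl) / (3 * sigma)
    return min(upper, lower)


USL, LSL = 11.0, 7.0       # specification limits in minutes
average, sp = 8.9, 1.77    # from the Average and Range Chart

cp_value = cp(USL, LSL, sp)
cpk_value = cpk(USL, LSL, average, sp)
print(round(cp_value, 2), round(cpk_value, 2))

# Compare against the minimums mentioned in the text
for minimum in (1.0, 1.33):
    print(minimum, cpk_value >= minimum)
```

The lower-limit calculation is the smaller of the two because the average of 8.9 sits slightly closer to the LSL of 7 than to the USL of 11.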
This process is not capable of meeting the specifications that have been established (that is, a USL of
11 and an LSL of 7). See the graph below.
Figure 16
Pp measures the ratio of the allowed tolerance to the estimated process variation (using 6 times the
sample standard deviation, or s). The sample standard deviation (using all the data available) for
this example is 1.63.
Pp = (USL − LSL) / (6 × s) = (11 − 7) / (6 × 1.63) = 0.41
The Pp of 0.41 also indicates the estimated process variation (the denominator) is much larger than the
allowed tolerance (numerator). So, we know the process will not consistently meet our requirements
because the process variation is too large.
Ppk enhances Pp by considering the average of the process relative to the specification limits (again,
using the sample standard deviation, s).
Note that Ppk is a minimum of two calculations.
Ppk = min[(USL − Average) / (3 × s), (Average − LSL) / (3 × s)]
= min[(11 − 8.9) / (3 × 1.63), (8.9 − 7) / (3 × 1.63)] = min[0.43, 0.39] = 0.39
Again, the Ppk is close to the Pp since the process average is close to the center of the tolerance.
The Ppk is also poor because the process contains too much variation.
Technically, the Ppk must be at least 1.0 for us to say we have a process that is (barely) capable. Many
organizations set a higher minimum Ppk (for example, 1.50).
As before, this process is not capable of meeting the specifications.
Remember, the objective of capability analysis is to assess whether or not our process is capable of
meeting our requirements. The two sets of metrics above (Cp, Cpk and Pp, Ppk) consider two different
aspects of the process data. The difference between them is the approach each uses to capture the
variation in the process. It is good practice to calculate both sets of metrics.
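To illustrate how the two approaches to variation can differ, the sketch below uses hypothetical task times in three subgroups of five. It assumes the control chart estimate of the process standard deviation is the average range divided by the constant d2 (2.326 for subgroups of five), which is one common method and is not stated in this material. The overall estimate is the sample standard deviation of all individual values.

```python
from statistics import mean, stdev

# Hypothetical task times (minutes): three subgroups of five observations
subgroups = [
    [8, 9, 10, 9, 8],
    [9, 10, 11, 10, 9],
    [7, 8, 9, 8, 7],
]
USL, LSL = 11.0, 7.0
D2_N5 = 2.326  # control chart constant d2 for subgroups of size 5

# Within-subgroup estimate (assumed method: average range / d2)
r_bar = mean(max(g) - min(g) for g in subgroups)
sigma_within = r_bar / D2_N5

# Overall estimate: sample standard deviation of every individual value
all_values = [x for g in subgroups for x in g]
s_overall = stdev(all_values)

cp_value = (USL - LSL) / (6 * sigma_within)
pp_value = (USL - LSL) / (6 * s_overall)
print(round(sigma_within, 3), round(s_overall, 3))
print(round(cp_value, 2), round(pp_value, 2))
```

In this made-up data the subgroup averages move around, so the overall standard deviation is larger than the within-subgroup estimate and Pp comes out lower than Cp. Calculating both sets of metrics makes this kind of gap visible.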
Tollgate Review: Baseline Performance Measurement
At the baseline performance measurement tollgate review meeting, the sponsor will want the following
information:
• Charts and graphs that depict the measure(s), the target, and the difference between the two.
• This might involve calculating the actual sigma measurement and/or conducting a capability study
by measuring process capability, Pp and Ppk.
• Any necessary information to make some final updates to the charter.
• An up-to-date version of the charter.
Conclusion to the Measure Phase
Tollgates of Measure
The measurement phase requires that the team has completed the following:
1. Selected process performance indicators.
- You know what needs to be measured.
- Many process improvement efforts struggle or fail merely because they’re not measuring the
right things.
2. Created the Data Collection Plan that includes (at a minimum):
- Type of data.
- Amount of data needed.
- How you’ll collect it.
- Created operational definitions.
3. Established and measured baseline performance.
- Established a target and a gap.
- Calculated Sigma.
The Next Phases of DMAIC
There are several possible outcomes as you conclude the measure phase:
1. You may discover the problem is much worse than originally believed.
- If this is the case, you may want to reconsider the scope of the project or ask for additional resources.
2. The data may suggest there’s not actually a problem.
- If so, you can stop the project and apply resources to another opportunity that needs improvement.
3. You will proceed to the next phases of DMAIC.