SYSEN 5300 Assignment 8 / Takehome Final Factorial Design at Two Levels and Response Surface Method
Lecture 19 Slides Application of Response Surface Methods.ppt
SYSEN 5300 (5310, 5320) - Systems Engineering and Six-Sigma for Systems Reliability and Quality
Course modules: Introduction; System Reliability (FMEA, Fault Tree); Six-Sigma & Statistical Control; Six-Sigma & Systems Improvement (DOE); Six-Sigma & Systems Improvement (RSM)
SYSEN5300 Lecture 23 Design of Experiments: Application of Response Surface Methods (RSM)
H. Oliver Gao
Overview
- Aspects of RSM
- Using RSM to improve a product design
- Simplification of a complicated response function by data transformation
- Using RSM to determine and exploit active and inert factor spaces for multiple-response data
- Using RSM to exploit inert canonical spaces
- Using RSM to move from empiricism to mechanism
Iterative experimentation to improve a product design
- An identical problem faced by different experimenters
- Different factors could have been chosen for the study
- Different ranges for the factors could have been selected
- Different choices could have been made for qualitative and blocking factors
- Different transformations for the factors might have been employed
- Different responses and their metrics might have been chosen
- Different models could have been considered
- Because these choices are somewhat arbitrary, conclusions from a single experiment are doubtful; an iterative sequence of experiments is preferable (scientific iteration tends to be self-correcting)
Design of a paper helicopter
- Use statistical design for scientific discovery in a real investigation—sequential unfolding of a problem.
- You need to experience this by performing your own experiments, using experimental design and discovering your own iterative path to a solution: "the art of investigation cannot be found just by playing with someone else's data."
- A prototype helicopter design: the objective was to find an improved design giving longer flight times.
Screening Experiment
- To get some idea as to which factors might be important for increasing flight times, a fractional factorial arrangement was used for testing.
Screening Experiment (cont’d)
- Explore whether the flight time can be increased, and by changing which factors along which path.
Screening Experiment (cont’d)
- The linear model for estimating the mean flight times
- Contour diagram
Steepest Ascent
- Construct a series of helicopters along the steepest ascent path: the factors were changed simultaneously in proportion to the coefficients of the fitted equation.
- That is, for every increase of 28 units in x2, x3 was reduced by 13 units and x4 by 8 units. The units were the scale factors l_scale = 0.875, L_scale = 0.875, and W_scale = 0.375, which were the changes in l, L, and W corresponding to a change of one unit in x2, x3, and x4, respectively.
- A helicopter with a 4-inch wing length l was first tested on the steepest ascent path and then additional helicopters were built along this path with wing length l increased by ¾-inch increments and the other dimensions adjusted accordingly.
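The step along the path of steepest ascent can be sketched numerically. This is a minimal illustration, not the course's own code: it uses the coefficients 28, -13, and -8 and the scale factors quoted above, and the function and dictionary names are hypothetical.

```python
# Coefficients of x2, x3, x4 in the fitted first-order model (from the slide).
coeffs = {"x2": 28.0, "x3": -13.0, "x4": -8.0}
# Scale factors: change in l, L, W (inches) per one coded unit (from the slide).
scales = {"x2": 0.875, "x3": 0.875, "x4": 0.375}
# Physical dimension controlled by each coded factor.
names = {"x2": "l", "x3": "L", "x4": "W"}


def steepest_ascent_step(delta_l=0.75):
    """Change in each dimension when wing length l grows by delta_l inches."""
    dx2 = delta_l / scales["x2"]  # step in coded units of x2
    step = {}
    for key, b in coeffs.items():
        # Coded steps are proportional to the fitted coefficients.
        dx = dx2 * b / coeffs["x2"]
        step[names[key]] = dx * scales[key]
    return step


step = steepest_ascent_step()
for dim, change in step.items():
    print(f"{dim}: {change:+.3f} in")
```

Under these assumptions, each 3/4-inch increase in wing length is paired with a reduction of roughly a third of an inch in body length and a much smaller reduction in body width.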
Steepest Ascent (continued)
- Data for flight helicopters built along the path of steepest ascent
Observation: among the 5 new designs, design 3 gave the longest average flight time, 347 centiseconds. This is an impressive improvement: flight times increased by more than 50%.
In the practical development of a manufactured product, if a new design this far ahead of the current competition had been discovered, management might well decide to "cash in".
What's next?
An even better design: a sequentially assembled composite design
- Since none of the qualitative factors tried so far seemed to produce any positive effects, it was decided for the present to fix these features.
- We explore further 4 helicopter dimensions: wing length l, wing width w, body length L, and body width W.
- In addition, a discussion with an engineer led to the suggestion of a better characterization of the wing dimensions: wing area A = lw and the length-to-width ratio R = l/w.
- A 2^4 factorial in A, R, W, and L was run with two added center points, with the expectation that, if necessary, additional runs could be added to the design to allow the fitting of a second-order model.
Helicopter data for the 2^4 factorial design
Normal plot of coefficients
- It is evident that some two-factor interactions now approach the size of the main effects.
Additional runs for a Central Composite Arrangement
Allow for the fitting of a second-order model. The added runs consisted of points placed at +2 and -2 units along each of the four axes.
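The layout of such a design can be sketched in coded units. This is a hedged illustration: the function name is hypothetical, and only the two center points mentioned with the factorial are generated, since the slides do not state how many center points accompanied the axial block.

```python
from itertools import product


def central_composite(k=4, alpha=2.0, n_center=2):
    """Return factorial, center, and axial runs of a composite design."""
    # Two-level factorial: every combination of -1 and +1.
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=k)]
    # Center points: all factors at their middle level.
    center = [[0.0] * k for _ in range(n_center)]
    # Axial points: +/- alpha along one axis, zero on the others.
    axial = []
    for i in range(k):
        for value in (-alpha, alpha):
            point = [0.0] * k
            point[i] = value
            axial.append(point)
    return factorial, center, axial


factorial, center, axial = central_composite()
print(len(factorial), len(center), len(axial))
```

With four factors this gives 16 factorial runs and 8 axial runs, which together supply the information needed to estimate the quadratic terms.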
Additional runs for a Central Composite Arrangement (continued)
- The estimated second-order model, allowing for possible mean differences between blocks.
- 4 linear coefficients on the second line
- 4 quadratic coefficients on the third line
- 6 two-factor interactions on the final lines
- To the right of each line is the estimated SE of the coefficients in that line
Additional runs for a Central Composite Arrangement (continued)
- ANOVA—goodness of fit of the model
- Residual MS = 9.7. The overall F ratio for the fitted second-degree equation (207.6/9.7 > 20) exceeds its 5% significance level, F_{0.05; 14, 14} = 2.48, by a factor of 8.6.
Canonical analysis
- The fitted second-order model goes as follows:
- It had seemed likely to the experimenters that a maximum might now occur at S. However, the positive coefficient (3.27) of X3^2 shows that the response surface almost certainly had a minimum at S in the direction of X3.
- It is therefore possible to move from point S in either direction along X3 to increase flight times.
Canonical analysis (cont’d)
- In terms of the centered variables, $X_3 = 0.52\tilde{x}_1 - 0.45\tilde{x}_2 - 0.45\tilde{x}_3 + 0.57\tilde{x}_4$.
- Thus, beginning at S, one direction of ascent along the X3 axis would be such that for each increase in $\tilde{x}_1$ of 0.52 units, $\tilde{x}_2$ would be reduced by 0.45 units, $\tilde{x}_3$ reduced by 0.45 units, and $\tilde{x}_4$ increased by 0.57 units. To follow the opposite direction of ascent, you would make precisely the opposite changes.
- Helicopters were now designed and constructed for 16 points along this axis.
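The canonical analysis can be checked numerically from the fitted second-order coefficients. The sketch below is an illustration under stated assumptions: the coefficient values are those of the slide's second-order model, the helper `solve` is a hypothetical pure-Python linear solver, and the Rayleigh quotient is used only as a quick check of the X3 coefficient rather than a full eigen-decomposition.

```python
# Fitted second-order model in x1..x4 (coefficients as given on the slides).
b0 = 372.06
b = [-0.08, 5.08, 0.25, -6.08]                      # linear terms
quad = [-2.04, -1.66, -2.54, -0.16]                 # pure quadratic terms
inter = {(0, 1): -2.88, (0, 2): -3.75, (0, 3): 4.38,
         (1, 2): 4.63, (1, 3): -1.50, (2, 3): -2.13}  # two-factor interactions

# Symmetric matrix B: quadratics on the diagonal, half the interactions off it.
B = [[0.0] * 4 for _ in range(4)]
for i in range(4):
    B[i][i] = quad[i]
for (i, j), c in inter.items():
    B[i][j] = B[j][i] = c / 2.0


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


# Stationary point S solves 2 B x = -b; the response there is b0 + b.x / 2.
x_s = solve(B, [-bi / 2.0 for bi in b])
y_s = b0 + 0.5 * sum(bi * xi for bi, xi in zip(b, x_s))

# Rayleigh quotient v'Bv / v'v along the X3 direction approximates the
# canonical coefficient of X3^2.
v = [0.52, -0.45, -0.45, 0.57]
Bv = [sum(B[i][j] * v[j] for j in range(4)) for i in range(4)]
lam3 = sum(vi * bvi for vi, bvi in zip(v, Bv)) / sum(vi * vi for vi in v)

print([round(x, 2) for x in x_s], round(y_s, 1), round(lam3, 2))
```

A positive value along X3, with the other canonical coefficients negative, is what identifies S as a minimum in that one direction and so as a saddle rather than a maximum.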
Canonical analysis (cont’d)
- Experimental Data Employing Canonical Factor X3
Canonical analysis (cont’d)
- Characteristics of helicopters along axis X3.
Helicopter Example Summary
- Key: the example encourages us to experience process improvement and discovery by employing our imagination in studies. We can test our ideas using different starting points, varying different factors, and so forth. Specifically:
  - Experience the catalysis of the scientific method obtained by the use of statistical methods
  - Factorial designs for screening
  - Follow an improvement trend with steepest ascent
  - Sequential assembly of a composite design by adding axial points and center points to a factorial
  - Unexpected surprise produced by canonical analysis
  - More than one good answer to a problem
- Other thoughts on improving the helicopter?
In this demonstration only a few of the almost limitless ideas for improving the helicopter were tested. The list of qualitative factors could be greatly extended, and there are endless simple ways in which the general configuration of the design could be changed:
  - Split wings to give the helicopter four blades
  - Other shapes of wings and body, and so on
One thing is certain: had the experimenters been possessed of hindsight, they could undoubtedly have reached this answer more quickly.
Simplification of a response function by data transformation
- Textile "lifetime" experiment: data from a 3^3 factorial design, with predicted values from 4 models
| Factors | Levels | Coding |
| l: length of specimen | 250, 300, 350 mm | x1=(l-300)/50 |
| A: amplitude of loading cycle | 8, 9, 10 mm | x2=A-9 |
| L: load | 40, 45, 50 g | x3=(L-45)/5 |
Simplification of a response function by data transformation (cont’d)
Model M1: second-degree equation by least squares
Model M2: ymax/ymin = 40.4, a wide range. Apply a data transformation so that the model might be simplified and made more relevant and interpretable, using the power transformation
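The power transformation can be written as a small function. This is a minimal sketch: it assumes the usual form, $(y^\lambda - 1)/\lambda$ for $\lambda \ne 0$ and $\log y$ for $\lambda = 0$, and the function name is hypothetical.

```python
import math


def power_transform(y, lam):
    """Power transformation of a positive response y for a given lambda."""
    if y <= 0:
        raise ValueError("the power transformation needs a positive response")
    if lam == 0:
        # Limiting case: the transformation becomes the logarithm.
        return math.log(y)
    return (y ** lam - 1.0) / lam


# The lambda = 0 case is the smooth limit of the general formula.
for lam in (1.0, 0.5, 1e-6, 0.0):
    print(lam, round(power_transform(40.4, lam), 4))
```

Because the transformed value varies continuously with lambda, a plot of t values against lambda can be scanned for the transformation that gives the simplest model.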
Simplification of a response function by data transformation (cont’d)
Lambda plot of t values
Simplification of a response function by data transformation (cont’d)
Model M3: in science, and particularly in engineering, power relations are not uncommon
Model M4:
Comparison of the 4 models
Conclusion: the much simpler models M2, M3, and M4 provide considerably better estimates than M1
Active and inactive factor spaces for multiple-response data
- Example: in the manufacture of a certain dyestuff, 3 responses were of concern: strength y1, hue y2, and brightness y3. The levels of these responses needed to be changed to meet varying customer requirements. How should the manufacturing process be manipulated?
- Six adjustable factors: polysulfide index x1, flux ratio x2, moles of polysulfide x3, reaction time x4, amount of solvent x5, and reaction temperature x6.
- The original experimenter knew RSM and ran a one-shot experiment containing 80 runs: a complete 2^6 factorial design with 12 axial points and 4 center points.
- A second-order model in six factors was fitted for each of the 3 responses. Not surprisingly, the results proved impossible to interpret, and the experiment was written off as an expensive failure.
- Later, more careful analysis showed that none of the second-order terms were detectably different from noise.
Normal Plots, 3 responses
(a) Strength
(b) hue
(c) brightness
Three of the 6 factors (x2, x3, and x5) showed no effects of any kind distinguishable from noise: they were locally inert.
Helping the subject matter specialist
- Many real applications with more than one response: explore to find conditions (combinations of factor levels) that provide desirable values for the responses
- To meet these requirements, we should try to discover, for each response, which factors are important (active) in driving it and which are inert.
- Information about the overlapping and nonoverlapping of the active and inactive factor spaces for separate responses can greatly simplify multiresponse problems.
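How non-overlapping active spaces simplify a multiresponse problem can be sketched with the dyestuff example. This is an illustration under loud assumptions: the three first-order equations are the least-squares fits as recovered from the slide equations, the target values of 20 for hue and 30 for brightness are illustrative, and the function names are hypothetical.

```python
# First-order fits in coded units (assumed from the slide equations).
def strength(x1, x4, x6):
    return 11.1 + 0.9 * x1 + 1.5 * x4 + 1.3 * x6


def hue(x1, x6):
    return 17.0 - 5.7 * x1 + 5.4 * x6


def brightness(x6):
    return 28.3 - 4.4 * x6


def settings_for(hue_target, brightness_target, x4_max=1.0):
    """Pick x6 from brightness, then x1 from hue, then push x4 for strength."""
    # Brightness depends on x6 alone, so it fixes x6.
    x6 = (28.3 - brightness_target) / 4.4
    # With x6 fixed, hue fixes x1.
    x1 = (17.0 + 5.4 * x6 - hue_target) / 5.7
    # x4 affects only strength, so set it at the top of its range.
    x4 = x4_max
    return x1, x4, x6


# Illustrative targets (assumptions, not values confirmed by the slides).
x1, x4, x6 = settings_for(hue_target=20.0, brightness_target=30.0)
print(round(x1, 3), x4, round(x6, 3), round(strength(x1, x4, x6), 2))
```

The point of the sketch is the order of the steps: a factor that is active for only one response (here x4) can be used to improve that response without disturbing the others.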
From empiricism to mechanism
- All models are approximations
- Often a mechanistic model has advantages because it may provide a physical understanding of the system (also fewer parameters, better fit)
- Sometimes an empirical model can suggest a mechanism.
- Example: find the conditions of temperature (T), concentration of one of the reactants (c), and reaction time (t) that maximize the yield of an intermediate product y. The factors are coded as below.
From empiricism to mechanism (cont’d)
- In the canonical equation only the coefficient of the first term was distinguishable from noise.
From empiricism to mechanism (cont’d)
- The canonical model offered a wide choice of operating conditions that could produce approximately the same max yield
- Other responses, such as cost and purity, might then be brought to more desirable levels
The mechanistic model
- Physical chemist Philip Youle suggested a mechanistic model
- The resulting surface closely resembles the empirical contour plot shown earlier
Slide equations: Application of Response Surface Methods

Screening Experiment: the linear model for estimating the mean flight times
$$\hat{y} = 223 + 28x_2 - 13x_3 - 8x_4,$$
where $x_2$, $x_3$, and $x_4$ are the significant factors of $l$, $L$, and $W$.

Additional runs for a Central Composite Arrangement: the estimated second-order model (estimated SEs in parentheses)
$$\begin{aligned}
\hat{y} = {} & 372.06 \\
& - 0.08x_1 + 5.08x_2 + 0.25x_3 - 6.08x_4 && (0.64) \\
& - 2.04x_1^2 - 1.66x_2^2 - 2.54x_3^2 - 0.16x_4^2 && (0.61) \\
& - 2.88x_1x_2 - 3.75x_1x_3 + 4.38x_1x_4 && (0.78) \\
& + 4.63x_2x_3 - 1.50x_2x_4 - 2.13x_3x_4 && (0.78)
\end{aligned}$$

Canonical analysis
- Position of the center S of the canonical system: $x_{1s} = 0.86$, $x_{2s} = -0.33$, $x_{3s} = -0.84$, $x_{4s} = -0.12$, at which $\hat{y}_s = 371.4$
- Shift from the design center O to S: $\tilde{x}_1 = x_1 - 0.86$, $\tilde{x}_2 = x_2 + 0.33$, $\tilde{x}_3 = x_3 + 0.84$, $\tilde{x}_4 = x_4 + 0.12$
- Rotation of axes:
$$\begin{aligned}
X_1 &= 0.39\tilde{x}_1 - 0.45\tilde{x}_2 + 0.80\tilde{x}_3 - 0.07\tilde{x}_4 \\
X_2 &= -0.76\tilde{x}_1 - 0.50\tilde{x}_2 + 0.12\tilde{x}_3 + 0.39\tilde{x}_4 \\
X_3 &= 0.52\tilde{x}_1 - 0.45\tilde{x}_2 - 0.45\tilde{x}_3 + 0.57\tilde{x}_4 \\
X_4 &= -0.04\tilde{x}_1 - 0.58\tilde{x}_2 - 0.37\tilde{x}_3 - 0.72\tilde{x}_4
\end{aligned}$$
- Canonical form:
$$\hat{y} = 371.4 - 4.66X_1^2 - 3.81X_2^2 + 3.27X_3^2 - 1.20X_4^2$$

Simplification of a response function by data transformation
- Model M1 (the $x_1x_2$ term is not recoverable from the slide):
$$\hat{y} = 551 + 660x_1 - 536x_2 - 311x_3 + 239x_1^2 + 276x_2^2 - 48x_3^2 - 236x_1x_3 + 143x_2x_3$$
- Power transformation:
$$y^{(\lambda)} = \begin{cases} \dfrac{y^{\lambda} - 1}{\lambda} & \text{when } \lambda \ne 0 \\ \log y & \text{when } \lambda = 0 \end{cases}$$
- Model M2, with $Y = \log y$ (SE of each coefficient 0.02):
$$\hat{Y} = 2.75 + 0.36x_1 - 0.27x_2 - 0.17x_3, \qquad \hat{y} = \operatorname{antilog} \hat{Y}$$
- Model M3: $y = \text{const} \times \xi_1^{\alpha}\xi_2^{\beta}\xi_3^{\gamma}$, or $\log y = \text{const} + \alpha\log\xi_1 + \beta\log\xi_2 + \gamma\log\xi_3$, where $\xi_1, \xi_2, \xi_3$ are the factors $l$, $A$, $L$ in their original units. Fitting this to the data gives
$$\hat{Y} = \text{const} + 4.95\log l - 5.65\log A - 3.5\log L, \qquad \hat{y} = \operatorname{antilog} \hat{Y}$$
- Model M4: $\hat{Y} = \text{const} + 5\log l - 5\log A - 3\log L$. The ratio $l/A$ is suggested to be critical; $A/l$ is actually the fractional amplitude of the load cycle. Thus $\hat{y}$ is a power function, $\hat{y} \propto (A/l)^{-5}L^{-3}$.

Comparison of the 4 models
- The average variance of the $n$ predictions $\hat{y}$ is $p\sigma^2/n$.

Active and inactive factor spaces for multiple-response data: least squares fitted equations (SEs in parentheses)
- For strength: $\hat{y}_1 = 11.1 + 0.9x_1 + 1.5x_4 + 1.3x_6$ (0.24)
- For hue: $\hat{y}_2 = 17.0 - 5.7x_1 + 5.4x_6$ (0.74)
- For brightness: $\hat{y}_3 = 28.3 - 4.4x_6$ (0.67)
- Now, to achieve brightness $y_3 = 30$ and hue $y_2 = 20$ with strength $y_1$ as high as possible

From empiricism to mechanism
- Coding of the factors: $x_1 = (T - 167)/5$, $x_2 = (c - 27.5)/2.5$, $x_3 = (t - 6.5)/1.5$
- Canonical equation: $\hat{y} = 59.9 - 3.8X_1^2$, where $X_1 = 0.75\tilde{x}_1 + 0.48\tilde{x}_2 + 0.46\tilde{x}_3$

The mechanistic model
- A reaction scheme with rate constants $k_1$ and $k_2$, in which $y$ is a function of $f(T, c, t) = \exp(-\beta/T + \ln c + \ln t)$
SYSEN5300FA18Lecture26CompareMultipleEntities&ANOVA.pptx
11/28/18
© 2018 by Oliver H. Gao and Wenqi Yi is licensed under CC BY 4.0.
SYSEN 5300 Systems Engineering and Six-Sigma for the Design and Operation of Reliable Systems
Lecture 26 Compare a Number of Entities and ANOVA
Dr. Oliver H. Gao and Dr. Wenqi Yi
Outline
Compare two entities
Compare a number of entities and ANOVA
Factorial design at two levels
Comparing Two Entities
Null hypothesis: Two means may be considered to be equal
Test/analysis:
t-test (unknown but equal variances); Z test for known variances; modified t-test for unknown and unequal variances
Experimental strategies:
Physical randomization
Randomized paired (block) comparison
Example 1
Example 2
A gardener conducted an experiment to discover whether a change in fertilizer mixture would result in improved tomato yield. 11 plants set out in a single row; 5 with standard fertilizer A and 6 with improved mixture B.
Need for Randomization in Example 1
The negative autocorrelation reduces the standard deviation by a factor of 0.7. Thus the reference distribution obtained from past data has a smaller spread than the corresponding scaled t distribution. Remedy: a randomized design.
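The factor of 0.7 can be reproduced from the lag-1 autocorrelation. This is a small sketch assuming the relation recovered from the slide equations, in which the variance of the average is $C\sigma^2/n$ with $C = 1 + \frac{2(n-1)}{n}\rho_1$, $n = 10$, and $\rho_1 = -0.29$; the function name is hypothetical.

```python
import math


def variance_factor(n, rho1):
    """Factor C by which lag-1 autocorrelation changes var(mean) = sigma^2/n."""
    return 1.0 + 2.0 * (n - 1) / n * rho1


C = variance_factor(n=10, rho1=-0.29)
sd_factor = math.sqrt(C)  # the standard deviation scales with sqrt(C)
print(round(C, 3), round(sd_factor, 2))
```

A positive autocorrelation would push the factor above 1 instead, which is why ignoring serial correlation can mislead in either direction.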
Physical Randomization in Example 2
11!/(5!6!) = 462 possible arrangements; 154 of the 462 provide differences greater than 1.69. Significance probability: 154/462 = 33%. No significant difference.
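The randomization count and the significance probability can be sketched in code. This is a hedged illustration: the function name is hypothetical, the toy yields are made up to exercise the code (the tomato data themselves are not reproduced here), and the function counts differences at least as large as the observed one.

```python
from itertools import combinations
from math import comb


def randomization_count(values, n_b, observed_diff):
    """Count label assignments whose mean(B) - mean(A) reaches observed_diff."""
    total = sum(values)
    n = len(values)
    count = 0
    arrangements = 0
    for idx in combinations(range(n), n_b):
        sum_b = sum(values[i] for i in idx)
        diff = sum_b / n_b - (total - sum_b) / (n - n_b)
        arrangements += 1
        if diff >= observed_diff - 1e-12:
            count += 1
    return count, arrangements


# Slide numbers: 11 plants, 5 with A and 6 with B.
print(comb(11, 5), round(154 / 462, 3))

# Hypothetical toy data, only to show the function at work.
toy = [1.0, 2.0, 3.0, 4.0]
count, arrangements = randomization_count(toy, n_b=2, observed_diff=2.0)
print(count, arrangements)
```

The significance probability is simply the count divided by the number of arrangements, with no distributional assumption beyond the random allocation itself.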
Example 3
10 boys’ shoes: amount of wear of the soles (standard material A and a cheaper one B)
Tests were run in pairs—each boy wore a special pair of shoes (one with A and the other with B, randomized)
Some boys scuffed their shoes more than others; however, for each boy, his two shoes were subjected to the same treatment.
Randomized Paired Comparison Design in Example 3
Increase precision by making comparisons within matched pairs of experimental material
By working with the 10 differences B-A most of the boy-to-boy variation could be eliminated
Randomization distribution: 2^10 = 1024 possible sign arrangements.
A difference of 0.41 is quite unusual (3 of 1024 differences), a probability below 0.5%: a significant increase in wear with B.
t-test?
Blocking and Randomization
A block is a portion of the experimental material that is expected to be more homogeneous than the aggregate.
By confining comparisons to those within blocks, greater precision is usually obtained because the differences between blocks are eliminated.
Pairs (blocks) in time and space
Block what you can and randomize what you cannot, to deal with unavoidable sources of variability
Comparison, Replication, Randomization, and Blocking in Simple Experiments
Conduct experiments to assess treatment A & B
Experiments should be comparative: modified and unmodified procedures should be run side by side
Genuine replication: variation among replicates can provide an accurate measure of errors
Blocking (pairing) should be used to reduce error
Randomization planned for homogeneous errors of both A and B
None of the above will necessarily alert you to the influence of bad values: look at the original data
Sensitive to violation of NIID
One- vs. Two-Sided Tests
Conventional significance level: somewhat convinced at the 5% level and fairly confident at the 1% level. Confidence Interval?
Confidence Interval for Differences in Means (Paired Design)
Example: boys' shoes. (1 - alpha) = 95% CI for B - A. The observed average difference in wear was 0.41, its standard error was 0.12, and there were nine degrees of freedom. The 5% level for such a t distribution is Pr(|t| > 2.262) = 5%. Thus
In general, the 1 - alpha CI for delta would be
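The paired-design interval can be worked through with the numbers quoted on the slide. This is a minimal sketch: the function names are hypothetical, and the critical value 2.262 is taken from the slide rather than computed.

```python
import math


def standard_error_of_mean(diffs):
    """Standard error of the average of paired differences."""
    n = len(diffs)
    d_bar = sum(diffs) / n
    s2 = sum((d - d_bar) ** 2 for d in diffs) / (n - 1)
    return math.sqrt(s2 / n)


def confidence_interval(estimate, se, t_crit):
    """Interval estimate +/- t_crit * se."""
    half_width = t_crit * se
    return estimate - half_width, estimate + half_width


# Slide values: average difference 0.41, standard error 0.12, 9 dof.
t0 = 0.41 / 0.12
low, high = confidence_interval(0.41, 0.12, 2.262)
print(round(t0, 1), round(low, 2), round(high, 2))
```

Since the interval excludes zero, it agrees with the significance test: the data are not compatible with equal wear for A and B.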
Confidence Interval for Differences in Means (Unpaired Design)
Example: tomato plants. (1 - alpha) = 95% CI for B - A.
In general, the 1 - alpha CI for delta would be
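For the unpaired design the interval uses a pooled variance estimate. The sketch below assumes the pooled-variance form recovered from the slide equations, uses the slide's values (difference 1.69, standard error 3.82, critical value 2.262 with 9 degrees of freedom) for the worked numbers, and uses made-up variances only to exercise the pooling function; all function names are hypothetical.

```python
import math


def pooled_variance(s2_a, n_a, s2_b, n_b):
    """Pooled estimate of a common variance from two samples."""
    return ((n_a - 1) * s2_a + (n_b - 1) * s2_b) / (n_a + n_b - 2)


def se_of_difference(s2_pooled, n_a, n_b):
    """Standard error of the difference between two sample means."""
    return math.sqrt(s2_pooled * (1.0 / n_a + 1.0 / n_b))


# Worked numbers from the slide: 1.69 +/- 2.262 * 3.82.
half_width = 2.262 * 3.82
interval = (1.69 - half_width, 1.69 + half_width)
print(round(half_width, 2), [round(v, 2) for v in interval])

# Hypothetical variances, only to show the pooling step.
s2 = pooled_variance(s2_a=2.0, n_a=5, s2_b=4.0, n_b=6)
print(round(s2, 3), round(se_of_difference(s2, 5, 6), 3))
```

Here the interval comfortably includes zero, consistent with the randomization result of no significant difference.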
Testing the Ratio of Two Variances
A sample of n1 observations is randomly drawn from a normal distribution with variance $\sigma_1^2$, and a second sample of n2 observations from a second normal distribution with variance $\sigma_2^2$.
Example: inexperienced chemist 1 ($s_1^2 = 0.183$, $\nu_1 = 12$) and experienced chemist 2 ($s_2^2 = 0.062$, $\nu_2 = 9$).
Compare a Number of Entities
Null hypothesis: All means may be considered to be equal
Test analysis: Analysis of variance (ANOVA): a generalization of the t-test used to compare two entities
Experimental strategies:
Completely randomized design
Randomized block design
Comparing a number of Entities Example Blood Coagulation Time
24 animals received four different diets A, B, C, and D. Animals were randomly allocated to the diets, and the testing was done in random order.
Question: Is there real difference between the mean coagulation times for the 4 diets? Analysis of Variance (ANOVA) table
Analysis of Variance (ANOVA) Table
Arithmetic breakup of deviation from grand mean=64
Entries in the ANOVA Table
Sum of Squares: SD, ST, SR
Degrees of Freedom (DOF)
Mean Squares: mT, mR
Geometry and the ANOVA Table
The 24 numbers in each of the Tables D, T, and R constitute vectors D, T, and R
The inner product of T and R is zero, so T is orthogonal to R
Since the vector D is the hypotenuse of a right triangle with sides T and R, by extending Pythagoras’ theorem to n dimensions, SD=ST+SR
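The decomposition can be verified directly. This is a sketch under a stated assumption: the coagulation times below are the commonly published values for this textbook example and reproduce the grand mean of 64 quoted on the slide, but they are not shown in the extracted slides and should be checked against the slide's table.

```python
# Assumed data: coagulation times for diets A-D (24 animals, grand mean 64).
diets = {
    "A": [62, 60, 63, 59],
    "B": [63, 67, 71, 64, 65, 66],
    "C": [68, 66, 71, 67, 68, 68],
    "D": [56, 62, 60, 61, 63, 64, 63, 59],
}

all_y = [y for ys in diets.values() for y in ys]
n, k = len(all_y), len(diets)
grand = sum(all_y) / n

# Build the three vectors: deviations D, treatment part T, residual part R.
D, T, R = [], [], []
for ys in diets.values():
    mean_t = sum(ys) / len(ys)
    for y in ys:
        D.append(y - grand)
        T.append(mean_t - grand)
        R.append(y - mean_t)

S_D = sum(d * d for d in D)
S_T = sum(t * t for t in T)
S_R = sum(r * r for r in R)
dot_TR = sum(t * r for t, r in zip(T, R))  # zero when T is orthogonal to R

m_T = S_T / (k - 1)   # treatment mean square
m_R = S_R / (n - k)   # residual mean square
F = m_T / m_R
print(S_D, S_T, S_R, round(dot_TR, 10), round(F, 2))
```

The F ratio is referred to an F distribution with k - 1 = 3 and n - k = 20 degrees of freedom.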
Exercise
Each of 21 student athletes, grouped into 3 teams A, B, and C, attempts to toss a basketball through a hoop. The number of successes is given. Are there real differences between three teams? Construct an ANOVA.
One Way ANOVA, an Additive Model
The underlying model
F-test
Graphical Checks (diagnostics) on Violation of Assumptions
Assumptions (additivity, IID errors, normality, constant variance): the ANOVA is quite robust (insensitive) to moderate nonnormality and to moderate inequality of group variances. However, it is sensitive to serial correlation if testing was not well randomized.
Graphical checks are used for examining: outliers (plotting residuals), serial correlation (randomization can nullify the serious effect of autocorrelation), constant variance across treatment groups, and systematic drift occurring during the experiment.
Graphical Checks (diagnostics) on Violation of Assumptions
Residual for each diet
Residual versus estimated values
Residual in time sequence
Randomized Block Design
By general randomization, the effect of noise is homogenized between treatment and error comparisons, which validates the experiment.
Example: penicillin yield
By randomly assigning the order in which the four treatments were run within each blend (block), validity and simplicity were maintained while blend differences were largely eliminated.
Randomized Block Design (cont’d)
R=D-B-T, the vectors R, B, and T are mutually orthogonal.
Randomized Block Design (cont’d)
Increase in Efficiency by Elimination of Block Differences
Advantage of using the randomized block arrangement: of the total sum of squares not associated with treatments or with the mean, almost half is accounted for by block-to-block variation.
If the experiment had been arranged on a completely randomized basis with no blocks, would the error variance have been smaller or larger?
With randomized block design these errors were considerably less: of the total of SD=560, SB=264 has been removed by blocks.
The randomized block design greatly increased the sensitivity of experiment and made it possible to detect smaller treatment differences.
Implications of the Additive Model
Diagnostic Checks
Latin Squares: more than one blocking component
Experiment example: test the feasibility of reducing air pollution by modifying a gasoline mixture with very small amounts of chemicals A, B, C, and D. These 4 treatments were tested with 4 different drivers and 4 different cars. Two blocking factors: drivers and cars.
The Latin square design was used to help eliminate from the treatment comparisons possible differences between the drivers and between the cars.
Latin Squares: more than one blocking component (cont’d)
Each treatment (A, B, C, or D) appears once in every row (driver) and once in every column (car). Randomization was used.
Advantage: a wider inductive basis for conclusion
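The defining property of a Latin square, and one simple way to randomize it, can be sketched as follows. This is an illustration only: the cyclic construction and the row/column shuffling are one common approach, not necessarily the one used for the experiment, and the function names are hypothetical.

```python
import random


def cyclic_latin_square(treatments):
    """Build a k x k Latin square by cyclically shifting the treatment list."""
    k = len(treatments)
    return [[treatments[(r + c) % k] for c in range(k)] for r in range(k)]


def is_latin_square(square):
    """True if every symbol appears once in each row and each column."""
    k = len(square)
    symbols = set(square[0])
    if len(symbols) != k:
        return False
    rows_ok = all(len(row) == k and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(k)} == symbols for c in range(k))
    return rows_ok and cols_ok


def randomize(square, rng):
    """Shuffle rows (drivers) and columns (cars); the property is preserved."""
    k = len(square)
    rows = list(range(k))
    cols = list(range(k))
    rng.shuffle(rows)
    rng.shuffle(cols)
    return [[square[r][c] for c in cols] for r in rows]


design = randomize(cyclic_latin_square(["A", "B", "C", "D"]), random.Random(1))
for row in design:
    print(" ".join(row))
```

Rows stand for drivers and columns for cars, so each treatment meets every driver and every car exactly once.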
Latin Squares: more than one blocking component (cont’d)
Conclusions? No convincing evidence for differences between the treatments, but the Latin square design has been effective in eliminating a large component of variation due to drivers
Graeco-Latin Squares
A Graeco-Latin square is a k by k pattern that permits the study of k treatments simultaneously with three different blocking variables, each at k levels. Example: one extra blocking variable in the car emissions experiment.
It could be used to eliminate possible differences between, say, days.
Hyper-Graeco-Latin Squares in Martindale Wear Tester
Martindale Wear Tester: a machine used for testing the wearing quality of types of cloth or other such materials. The recorded response is the weight loss suffered by the test piece in one machine cycle (rubbed against a standard grade of emery paper).
Four types of cloth (treatments) A, B, C, D are mounted in four specimen holders 1, 2, 3, 4. Each holder can be in any one of four positions P1, P2, P3, P4. Each emery paper sheet alpha, beta, gamma, delta was cut into four quarters.
Objective: (1) make accurate comparisons of the treatments; (2) understand the variability caused by the various factors: holders, positions, emery papers, and cycles
Hyper-Graeco-Latin Squares in Martindale Wear Tester (cont’d)
The design was effective both in removing sources of extraneous variation and in indicating their relative importance.
Because of the elimination of these disturbances, the residual variance was reduced by a factor of 8
We could detect much smaller differences between treatments
Hyper-Graeco-Latin Squares in Martindale Wear Tester (cont’d)
The F-stat=5.39 with 3, 9 DOF, significant at 2% level.
By using a design that makes it possible to remove the effects of many larger disturbing factors, differences between treatments were made detectable.
The analysis identified the large contributions to the total variation due to cycles and to emery papers. This suggested improvements which later led to changes in the design of the machine.
Hyper-Graeco-Latin Squares in Martindale Wear Tester (cont’d)
In graphical analysis, position P2 gives much less wear than the others, indicating a need for improvement
Balanced Incomplete Block Designs
Suppose the Martindale wear tester allowed only three samples to be included in each cycle, but you had 4 treatments A, B, C, and D to compare.
You have 4 treatments but a block size of 3, too small to accommodate all the treatments simultaneously: use a balanced incomplete block design.
Property: every pair of treatments occurs together in a block the same number of times.
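The balance property can be checked by counting pairs. The sketch below assumes the simplest such design for this case, taking every subset of 3 of the 4 treatments as a block; the variable names are hypothetical.

```python
from collections import Counter
from itertools import combinations

treatments = "ABCD"
# Four blocks of size 3: each block leaves out exactly one treatment.
blocks = [set(b) for b in combinations(treatments, 3)]

# Count how often each pair of treatments shares a block.
pair_counts = Counter()
for block in blocks:
    for pair in combinations(sorted(block), 2):
        pair_counts[pair] += 1

for pair, count in sorted(pair_counts.items()):
    print("".join(pair), count)
```

Equal counts for all pairs mean that every treatment difference is estimated with the same precision.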
Youden Squares: Doubly Balanced Incomplete Block Designs
Example: comparing 7 treatments in seven blocks of size 4 (e.g., test 7 types of cloth A, B, C, D, E, F, and G, when only 4 test pieces can be compared simultaneously in a single machine cycle)
Also had the opportunity to eliminate a second source of block variation, machine positions
Principles for Valid and Efficient Experiments
Make use of the specialist’s knowledge and experience. Statistical techniques are an adjunct, not a replacement, for special subject matter expertise
Involve the people responsible for operation, testing, and sampling
Be sure that everyone knows what it is they are supposed to do and try to make certain that the experiments are run precisely as required.
Use appropriate randomization so that the effect of noise on the treatment responses and on the residual errors is homogenized
Provide suitable statistical analysis, both computational and graphical
Slide equations: Compare a Number of Entities and ANOVA

Need for Randomization in Example 1
- The variance of $\bar{y}$ is not $\sigma^2/n$ but $C \times \sigma^2/n$, with $C = 1 + \dfrac{2(n-1)}{n}\rho_1$; here $n = 10$ and $\rho_1 = -0.29$.

Randomized Paired Comparison Design in Example 3
$$\bar{d} = \frac{\pm 0.8 \pm 0.6 \pm \dots \pm 0.3}{10}$$

One- vs. Two-Sided Tests
- Two-sided hypothesis testing: $H_0: \delta = \eta_B - \eta_A = 0$ vs. $H_1: \delta = \eta_B - \eta_A \ne 0$
$$t_0 = \frac{\bar{d} - \delta_0}{s_{\bar{d}}} = \frac{0.41 - 0}{0.12} = 3.4, \qquad \Pr(|t| \ge 3.4) = 0.8\%$$
- One-sided hypothesis testing: $H_0: \delta = \eta_B - \eta_A = 0$ vs. $H_1: \delta = \eta_B - \eta_A > 0$
$$t_0 = \frac{\bar{d} - \delta_0}{s_{\bar{d}}} = \frac{0.41 - 0}{0.12} = 3.4, \qquad \Pr(t \ge 3.4) = 0.4\%$$

Confidence Interval for Differences in Means (Paired Design)
$$\frac{|0.41 - \delta|}{0.12} < 2.262 \;\Rightarrow\; \delta = 0.41 \pm 0.27$$
$$\bar{d} \pm t_{\nu,\alpha/2} \times s_{\bar{d}}, \quad \text{where } s_{\bar{d}} = \frac{s_d}{\sqrt{n}}, \quad s_d^2 = \frac{\sum_{u=1}^{n}(d_u - \bar{d})^2}{n - 1}$$

Confidence Interval for Differences in Means (Unpaired Design)
$$t = \frac{(\bar{y}_B - \bar{y}_A) - \delta_0}{s\sqrt{1/n_B + 1/n_A}} = \frac{1.69 - \delta_0}{3.82}$$
- The t distribution has $(n_B - 1) + (n_A - 1) = 9$ degrees of freedom, and $\Pr(|t| > 2.262) = 0.05$. Thus
$$\frac{|1.69 - \delta_0|}{3.82} < 2.262 \;\Rightarrow\; \delta = 1.69 \pm 8.64$$
$$(\bar{y}_B - \bar{y}_A) \pm t_{\nu,\alpha/2} \times s\sqrt{\frac{1}{n_B} + \frac{1}{n_A}}, \quad \text{where } s^2 = \frac{(n_B - 1)s_B^2 + (n_A - 1)s_A^2}{(n_B - 1) + (n_A - 1)}$$

Testing the Ratio of Two Variances
- Hypothesis testing: null $H_0: \sigma_1^2 = \sigma_2^2$ vs. alternative $H_1: \sigma_1^2 > \sigma_2^2$
- Under $H_0$, $s_1^2/s_2^2$ is distributed as $F_{12,9}$
- $s_1^2/s_2^2 = 0.183/0.062 = 2.95$, and $\Pr(F \ge 2.95) = 6\%$. What does this mean?

One Way ANOVA, an Additive Model
$$y_{ti} = \eta + \tau_t + \varepsilon_{ti} = \eta_t + \varepsilon_{ti}$$
- $y_{ti}$: the $i$th observation in diet group $t$
- $\eta$: overall grand mean
- $\tau_t$: treatment effect, the deviation produced by treatment $t$
- $\varepsilon_{ti}$: associated error, assumed IID $(0, \sigma^2)$
- $E(m_T) = \sigma^2 + \sum_t n_t\tau_t^2/(k - 1)$ and $E(m_R) = \sigma^2$
- If there were no difference in the four treatments, i.e., under the null hypothesis $\tau_1 = \tau_2 = \tau_3 = \tau_4 = 0$, then both $m_T$ and $m_R$ estimate $\sigma^2$

F-test
- If it could be further assumed that the $\varepsilon_{ti}$ were normally distributed (NIID), then $m_T$ and $m_R$ would be distributed independently
- Under the null hypothesis that $\sum \tau_t^2 = 0$, the ratio $F = m_T/m_R$ would be distributed as $F_{3,20}$
- The numerator is signal plus noise; the denominator is noise alone

Implications of the Additive Model
- The underlying expected response model:
$$y_{ti} = \eta + \beta_i + \tau_t + \varepsilon_{ti} = \eta_{ti} + \varepsilon_{ti}, \quad \text{where } \eta_{ti} = \eta + \beta_i + \tau_t$$
- Additive: if treatment 3 provides an increment of 6 and block 4 increases the response by 5, the increase of both together would be 6 + 5 = 11
- If the block and treatment effects were not additive, an interaction would be said to occur between blocks and treatments
Lecture 16 Slides Multiple Regression and Comparing Two Entities(1).ppt
SYSEN 5300 (5310, 5320) - Systems Engineering and Six-Sigma for Systems Reliability and Quality
Course modules: Introduction; System Reliability (FMEA, Fault Tree); Six-Sigma & Statistical Control; Six-Sigma & Systems Improvement (DOE); Six-Sigma & Systems Improvement (RSM)
Introduction to Design of Experiments: Least Squares, Multiple Regression, and Why DOE
Why experiments?
Example 1: a process change was made. Is it an improvement? By how much?
Example 2: Car is rated at 30 mpg. Is the rating justified?
Example 3: Data are available on the performance of multiple machines. Do the machines perform alike?
Experimental Error
- When an operation or experiment is repeated under nearly the same conditions, the observed results are NEVER identical
- Experimental error: fluctuation that occurs from one repetition to another
- Sources of error include: measurement, analysis, sampling
- Awareness of the possible experimental error is essential in analysis of data AND planning the generation of the data (i.e., experiment design)
Multiple Regression
Multiple Regression Example (1)
- An investigator wants to determine the relationship of a key process output variable, product strength, to two key process input variables:
- Hydraulic pressure during a forming process
- Acid concentration
Multiple Regression Example (2)
- Some of the entries in this output are more important than others:
- The predictor and coef. columns describe the prediction model
- The p-values give the significance level for each model term (a term is typically judged significant when p <= 0.05)
- The coefficient of determination (R^2) is presented as R-Sq and R-Sq(adj). This value represents the proportion of the variability accounted for by the model. In this example, the model accounts for a very large percentage of the variability
Multiple Regression Example (3)
- In the analysis of variance portion of the output the F value is used to determine an overall P value for the model fit. In this case the resulting p value of 0.000 indicates a very high level of significance.
- The regression and residual sum of squares (SS) and mean square (MS) values are interim steps toward determining the F value
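The quantities named in such a regression output can be computed from scratch. This is a hedged sketch: the data are made up (they are not the strength data of the example), the function names are hypothetical, and the normal equations are solved with a small pure-Python elimination routine rather than a statistics package.

```python
def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_least_squares(X, y):
    """Least squares coefficients [b0, b1, ...] via the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(XtX, Xty)


def fit_summary(X, y, coef):
    """Return R-Sq, R-Sq(adj), and the overall F value."""
    n, p = len(y), len(coef)
    fitted = [coef[0] + sum(c * x for c, x in zip(coef[1:], r)) for r in X]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    ss_reg = ss_tot - ss_res
    r2 = 1.0 - ss_res / ss_tot
    r2_adj = 1.0 - (ss_res / (n - p)) / (ss_tot / (n - 1))
    f_value = (ss_reg / (p - 1)) / (ss_res / (n - p))
    return r2, r2_adj, f_value


# Hypothetical data: two inputs (pressure, acid concentration) and a response.
X = [(1, 1), (2, 0), (3, 2), (4, 1), (5, 3), (6, 2)]
noise = [0.1, -0.1, 0.1, -0.1, 0.05, -0.05]
y = [10 + 2 * p + 3 * a + e for (p, a), e in zip(X, noise)]

coef = fit_least_squares(X, y)
r2, r2_adj, f_value = fit_summary(X, y, coef)
print([round(c, 2) for c in coef], round(r2, 4), round(r2_adj, 4))
```

The regression and residual mean squares appear here only as intermediate steps toward the F value, which mirrors their role in the printed ANOVA table.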
Least Squares Estimation, an example (1)
Least Squares Estimation, an example (2)
Least Squares Estimation, an example (3)
Example: Multiple regression best subset analysis (1)
- Results from a cause-and-effect matrix lead to a passive analysis of factors A, B, C, and D on Thruput
- Plastic molding process: the thruput response might be shrinkage as a function of the input factors temp. 1, temp. 2, pressure 1, and hold time
- We'd like to create a model that provides a good estimate with the fewest terms
Example: Multiple regression best subset analysis (2)
- A best subsets computer regression analysis yielded
- From this output we note:
- R-Sq: look for the highest value when comparing models with the same number of predictors
- Adj. R-Sq: look for the highest value when comparing models with different numbers of predictors
- Cp: look for models where Cp is small and close to the number of parameters in the model (e.g., look for a model with Cp close to four for a three-predictor model that has an intercept constant); often we just look for the lowest Cp
- s: We want s, the estimate of the standard deviation about the regression, to be as small as possible.
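The Cp criterion can be made concrete with its usual definition. This sketch assumes the standard form of Mallows' Cp, $C_p = SSE_p/s^2 - (n - 2p)$, where $s^2$ is the residual mean square of the full model and $p$ counts the parameters including the intercept; the numbers are made up and the function names are hypothetical.

```python
def mallows_cp(sse_subset, mse_full, n, p):
    """Mallows' Cp for a subset model with p parameters (intercept included)."""
    return sse_subset / mse_full - (n - 2 * p)


def adjusted_r2(r2, n, p):
    """Adjusted R-Sq, which penalizes extra parameters."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p)


# Hypothetical numbers: 20 runs, full-model residual mean square of 2.0.
n, mse_full = 20, 2.0
# A three-predictor model (p = 4) whose SSE matches the noise level gives Cp = p.
cp_good = mallows_cp(sse_subset=32.0, mse_full=mse_full, n=n, p=4)
# A model that omits an important term has a larger SSE and a Cp well above p.
cp_biased = mallows_cp(sse_subset=60.0, mse_full=mse_full, n=n, p=3)
print(cp_good, cp_biased, round(adjusted_r2(0.90, n, 4), 4))
```

A Cp close to p suggests a subset model with little bias, which is why the guideline pairs a small Cp with closeness to the number of parameters.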
Example: Multiple regression best subset analysis (3)
- The regression equation for a 3-parameter model from a computer program is
Example: Indicator variables with covariate (1)
- Consider the data set, which has created indicator variables and a covariate.
- The covariate might be a continuous variable such as process temp. or dollar amount for an invoice
*
H. Oliver Gao *
*
Example: Indicator variables with covariate (2)
*
H. Oliver Gao *
*
Example: Binary logistic regression (1)
- Binary logistic regression is applicable when the response is pass or fail, and the inputs are continuous variables.
- Example: Ingots prepared with different heating and soaking times are tested for readiness to be rolled
*
H. Oliver Gao *
*
Example: Binary logistic regression (2)
- Heat would be considered statistically significant;
- Question: which levels are important?
*
H. Oliver Gao *
*
Example: Binary logistic regression (3)
- From the p chart on the right, it appears that heat at the 51 level causes a larger proportion of not-ready ingots
*
H. Oliver Gao *
*
Benefits to DOE
- Koselka (1996) lists the following applications
- Reducing the rejection rate of a touch-sensitive computer screen from 25% to less than 1% within months
- Maintaining paper quality at a mill while switching to a cheaper grade of wood
- Reducing the risks of misusing a drug in a hospital by incorporating a standardized instruction sheet with patient-pharmacist discussion
- Reducing the defect rate of the carbon-impregnated urethane foam used in bombs from 85% to zero
- Improving the sales of shoes by using an inexpensive arrangement of shoes by color in a showcase, rather than an expensive, flashy alternative.
- Reducing errors on service orders while at the same time improving response time on service calls
- Improving bearing durability by a factor of five.
*
H. Oliver Gao *
*
Residuals and Degrees of Freedom
In later applications you will encounter examples where, because several sample quantities must be calculated to replace unknown population parameters, several constraints are necessarily placed on the residuals.
When there are p independent linear constraints on n residuals, their sum of squares and the resulting sample variance and standard deviation are all said to have n-p DOF.
*
H. Oliver Gao *
*
Student’s t Distribution
*
H. Oliver Gao *
*
Sampling Distribution of a Sum and a Difference
*
H. Oliver Gao *
*
Random Sampling from a Normal Population
- A random sampling of n observations from a normal distribution
*
H. Oliver Gao *
*
The Chi-Square and F Distribution
- Random sampling from normal distr.
- Chi-square distr. from which you can derive the distribution of the sample variance
- F-distr. from which you can obtain the ratio of two sample variances
*
H. Oliver Gao *
*
Comparing Two Entities
- Comparing two entities experimentally to decide whether the differences are genuine (statistically significant) or merely due to chance.
*
H. Oliver Gao *
*
Comparing Two Entities (cont.)
- F 3.1
- F3.2 and 3.3
*
H. Oliver Gao *
*
Comparing Two Entities (cont.)
*
H. Oliver Gao *
*
Comparing Two Entities (cont.)
The negative autocorrelation reduces the std. of the average by a factor of 0.7. Thus the reference distr. obtained from past data has a smaller spread than the corresponding scaled t distribution.
*
H. Oliver Gao *
*
Randomized Design
- Example: A gardener conducted an experiment to discover whether a change in fertilizer mixture would result in improved tomato yield. 11 plants set out in a single row; 5 with standard fertilizer A and 6 with improved mixture B. How did he randomize?
- Fisher argued that physical randomization would make it possible to conduct a valid significance test without making assumptions of independent errors and normality. Why?
*
H. Oliver Gao *
*
Randomized Design (cont’d)
- There are 11!/(5!6!)=462 possible arrangements. 154 of the 462 arrangements give differences greater than the observed 1.69, so the significance probability is 154/462=33%: no significant difference.
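A randomization test of this kind can be sketched by enumerating every way of labelling 6 of the 11 plots as B. The yields below are hypothetical stand-ins, not the lecture's tomato data, and the function name is an assumption; arrangements that tie the observed difference are counted as "at least as large".

```python
from itertools import combinations

def randomization_test(a, b):
    """Return (observed difference, number of arrangements, significance probability)."""
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    total = sum(pooled)
    observed = sum(b) / n_b - sum(a) / n_a
    arrangements = 0
    at_least_as_large = 0
    # Relabel every possible set of n_b plots as "B" and recompute mean(B) - mean(A).
    for idx in combinations(range(len(pooled)), n_b):
        sum_b = sum(pooled[i] for i in idx)
        diff = sum_b / n_b - (total - sum_b) / n_a
        arrangements += 1
        if diff >= observed - 1e-12:  # small tolerance guards against round-off
            at_least_as_large += 1
    return observed, arrangements, at_least_as_large / arrangements

# Hypothetical yields (NOT the lecture's data): 5 plants with A, 6 with B.
yield_a = [20.1, 18.4, 22.3, 19.7, 21.0]
yield_b = [21.5, 23.0, 19.9, 22.8, 20.6, 24.1]
observed, arrangements, p_value = randomization_test(yield_a, yield_b)
print(arrangements, round(observed, 2), round(p_value, 3))
```

The count of arrangements is 462 for any 5/6 split, which matches the slide; the significance probability depends on the data supplied.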
*
H. Oliver Gao *
*
Randomized Design (cont’d)
- T-test
*
H. Oliver Gao *
*
Randomized Paired Comparison Design
- Increase precision by making comparisons within matched pairs of experimental material
- Example: 10 boys’ shoes: amount of wear of the soles (standard material A and a cheaper one B)
- Tests were run in pairs—each boy wore a special pair of shoes (one with A and the other with B, randomized)
- Some boys scuffed their shoes more than others; however, for each boy his two shoes were subject to the same treatment.
- By working with the 10 differences B-A most of the boy-to-boy variation could be eliminated
*
H. Oliver Gao *
*
Randomized Paired Comparison Design (cont’d)
- Null hypothesis: B=A
*
H. Oliver Gao *
*
Randomized Paired Comparison Design (cont’d)
- Randomization distribution: 2^10=1024 possible sign arrangements of the 10 differences.
- An average difference of 0.41 is quite unusual (only 3 of the 1024 differences exceed it): the probability is below 0.5%, indicating a significant increase in wear with B
- T-test?
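The paired randomization distribution can be sketched by flipping the sign of each within-pair difference, which is what swapping left and right shoes would do under the null hypothesis. The ten differences below are hypothetical, not the lecture's wear data, and the function name is an assumption.

```python
from itertools import product

def paired_randomization_test(diffs):
    """Return (observed mean difference, number of sign arrangements, significance probability)."""
    n = len(diffs)
    observed = sum(diffs) / n
    # Under H0 (B = A) each difference is equally likely to be +d or -d.
    means = [
        sum(sign * d for sign, d in zip(signs, diffs)) / n
        for signs in product((1, -1), repeat=n)
    ]
    at_least_as_large = sum(1 for m in means if m >= observed - 1e-12)
    return observed, len(means), at_least_as_large / len(means)

# Hypothetical B - A differences for 10 pairs (NOT the lecture's data).
differences = [0.7, 0.5, 0.2, -0.2, 0.9, -0.1, 0.4, 0.6, 0.3, 0.4]
d_bar, n_arrangements, p_value = paired_randomization_test(differences)
print(n_arrangements, round(d_bar, 2), round(p_value, 4))
```

With 10 pairs the enumeration has 2^10 = 1024 members, as on the slide; a paired t-test approximates the same tail probability without enumeration.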
*
H. Oliver Gao *
*
Blocking and Randomization
- A block is a portion of the experimental material that is expected to be more homogeneous than the aggregate.
- By confining comparisons to those within blocks, greater precision is usually obtained because the differences between blocks are eliminated.
- Pairs (blocks) in time and space
- Block what you can and randomize what you cannot, to deal with unavoidable sources of variability
*
H. Oliver Gao *
*
Comparison, Replication, Randomization, and Blocking in Simple Experiments
- Conduct experiments to assess treatment A & B
- Experiments should be comparative: modified and unmodified procedures should be run side by side
- Genuine replication: variation among replicates can provide an accurate measure of errors
- Blocking (pairing) should be used to reduce error
- Randomization planned for homogeneous errors of both A and B
- None of the above will necessarily alert you to the influence of bad values: look at the original data
- Sensitive to violations of the NIID (normally, independently, and identically distributed) assumption
*
H. Oliver Gao *
*
One- and Two-Sided Tests
Conventional significance level: somewhat convinced at the 5% level and fairly confident at the 1% level. Confidence Interval?
*
H. Oliver Gao *
*
Confidence Interval for Differences in Means (Paired Design)
Example: boys’ shoes. (1-alpha)=95% CI for B-A. The observed average difference in wear was 0.41, its standard error was 0.12, and there were nine DOF. The 5% level for such a t distribution is Pr(|t|>2.262)=5%. Thus
In general, the 1-alpha CI for delta would be
*
H. Oliver Gao *
*
Confidence Interval for Differences in Means (Unpaired Design)
Example: tomato plants. (1-alpha)=95% CI for B-A.
In general, the 1-alpha CI for delta would be
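The arithmetic behind both intervals is estimate ± t × standard error. The sketch below takes the critical value 2.262 (nine DOF, 5% two-sided) as given rather than computing it; the tomato values 1.69 and 3.82 are the observed difference and its standard error as they appear in the lecture's worked equation, and the helper name is an assumption.

```python
def t_interval(estimate, std_error, t_crit):
    """Two-sided confidence interval: estimate +/- t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width

T_CRIT_9_DOF = 2.262  # Pr(|t| > 2.262) = 5% with nine DOF, as quoted on the slide

# Paired design (boys' shoes): d-bar = 0.41, standard error 0.12.
shoes_low, shoes_high = t_interval(0.41, 0.12, T_CRIT_9_DOF)

# Unpaired design (tomato plants): difference 1.69, standard error 3.82.
tomato_low, tomato_high = t_interval(1.69, 3.82, T_CRIT_9_DOF)

print("shoes :", round(shoes_low, 2), round(shoes_high, 2))
print("tomato:", round(tomato_low, 2), round(tomato_high, 2))
```

The shoe interval excludes zero while the tomato interval contains it, which agrees with the significance conclusions reached by the randomization tests.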
*
H. Oliver Gao *
*
Testing the Ratio of Two Variance
- A sample of n1 observations is randomly drawn from a normal distr. with variance $\sigma_1^2$, and a second sample of n2 observations from a second normal distr. with variance $\sigma_2^2$
- Example: inexperienced chemist 1 and experienced chemist 2
*
Multiple regression: general model
- A general model includes polynomial terms in one or more variables, e.g., $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2 + \beta_5 x_1 x_2 + \varepsilon$, where the $\beta$'s are unknown parameters and $\varepsilon$ is random error. This full quadratic model of Y on x is of great use in DOE.
- For the situation without polynomial terms, where there are k predictor variables (or factors), the general model reduces to $Y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + \varepsilon$
- The object in multiple regression is to determine from the data the Least Squares Estimates (LSE) $(b_0, b_1, \dots, b_k)$ for the prediction equation $\hat{Y} = b_0 + b_1 x_1 + \dots + b_k x_k$, where $\hat{Y}$ is the predicted value of Y for given values of $x_1, x_2, \dots, x_k$
Least Squares Estimation, an example
- The slide's table shows a small illustrative set of data from an experiment to determine how the initial rate of formation y of an undesirable impurity depends on two factors: (1) the concentration $x_0$ of monomer and (2) the concentration $x_1$ of dimer
- The mean rate of formation y was zero when both $x_0$ and $x_1$ were zero, and over the relevant ranges of $x_0$ and $x_1$ the relationship was expected to be $y = \beta_0 x_0 + \beta_1 x_1 + \varepsilon$
- Now consider the equation for the sum of squares of the discrepancies between the data values and the values calculated from the model: $S(\beta) = \sum (y - \beta_0 x_0 - \beta_1 x_1)^2$
- LSE is to estimate the unknown coefficients by minimizing $S(\beta)$, so that $\hat{y} = b_0 x_0 + b_1 x_1$
Residuals and Degrees of Freedom
- $\sum (y_i - \bar{y}) = 0$ constitutes a linear constraint on the residuals $y_i - \bar{y}$
- The n residuals, and hence the sum of their squares, have n-1 DOF
Student’s t Distribution
- $z = (y - \eta)/\sigma \sim N(0, 1)$
- In practice $\sigma$ is almost always unknown. Suppose we substitute $\sigma$ with the sample std. s: $t = (y - \eta)/s$ is known to have a Student's t distribution (chemist W. S. Gosset, 1908)
Sampling Distribution of a Sum and a Difference
- Sum, $Y = y_A + y_B$: $E(Y) = E(y_A) + E(y_B) = \eta_A + \eta_B$; $V(Y) = V(y_A) + V(y_B) = \sigma_A^2 + \sigma_B^2$
- Difference, $Y = y_A - y_B$: $E(Y) = E(y_A) - E(y_B) = \eta_A - \eta_B$; $V(Y) = V(y_A) + V(y_B) = \sigma_A^2 + \sigma_B^2$
- The variance results hold for independent $y_A$ and $y_B$
Random Sampling from a Normal Population
- $y \sim N(\eta, \sigma^2) \Rightarrow \bar{y} \sim N(\eta, \sigma^2/n)$
- $\sum (y_u - \eta)^2/\sigma^2 \sim \chi^2_n$; $\sum (y_u - \bar{y})^2/\sigma^2 \sim \chi^2_{n-1} \Leftrightarrow (n-1)s^2/\sigma^2 \sim \chi^2_{n-1}$
- $\chi^2_\nu$ has mean $\nu$ and variance $2\nu$
The Chi-Square and F Distribution
- $s_1^2/\sigma_1^2 \sim \chi^2_{\nu_1}/\nu_1$ and $s_2^2/\sigma_2^2 \sim \chi^2_{\nu_2}/\nu_2 \Rightarrow \dfrac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} \sim F_{\nu_1, \nu_2}$, with $\nu_1 = n_1 - 1$, $\nu_2 = n_2 - 1$
Comparing Two Entities (autocorrelated data)
- The variance of $\bar{y}$ is not $\sigma^2/n$ but $C \times \sigma^2/n$, with $C = 1 + \dfrac{2(n-1)}{n}\rho_1$; here n = 10 and $\hat{\rho}_1 = -0.29$
Randomized Paired Comparison Design (randomization distribution)
- $\bar{d} = (\pm 0.8 \pm 0.6 \pm \dots \pm 0.3)/10$
Two-Sided Hypothesis Testing
- $H_0: \delta = \eta_B - \eta_A = 0$ vs. $H_1: \delta = \eta_B - \eta_A \neq 0$
- $t_0 = (\bar{d} - \delta_0)/s_{\bar{d}} = (0.41 - 0)/0.12 = 3.4$; $\Pr(|t| \geq 3.4) = 0.8\%$
One-Sided Hypothesis Testing
- $H_0: \delta = \eta_B - \eta_A = 0$ vs. $H_1: \delta = \eta_B - \eta_A > 0$
- $t_0 = (\bar{d} - \delta_0)/s_{\bar{d}} = (0.41 - 0)/0.12 = 3.4$; $\Pr(t \geq 3.4) = 0.4\%$
Confidence Interval for Differences in Means (Paired Design)
- Boys' shoes: $0.41 \pm 2.262 \times 0.12 \Rightarrow 0.41 \pm 0.27$
- In general: $\bar{d} \pm t_{\nu, \alpha/2} \times s_d/\sqrt{n}$, where $s_d^2 = \sum (d_u - \bar{d})^2/(n-1)$ and $\nu = n - 1$
Confidence Interval for Differences in Means (Unpaired Design)
- Tomato plants: $t = \dfrac{(\bar{y}_B - \bar{y}_A) - \delta_0}{s\sqrt{1/n_B + 1/n_A}} = \dfrac{1.69 - \delta_0}{3.82}$. The t distr. has $(n_B - 1) + (n_A - 1) = 9$ DOF, and $\Pr(|t| > 2.262) = 0.05$. Thus $1.69 \pm 2.262 \times 3.82 \Rightarrow 1.69 \pm 8.64$
- In general: $(\bar{y}_B - \bar{y}_A) \pm t_{\nu, \alpha/2} \times s\sqrt{1/n_B + 1/n_A}$, where $s^2 = \dfrac{(n_A - 1)s_A^2 + (n_B - 1)s_B^2}{(n_A - 1) + (n_B - 1)}$
Testing the Ratio of Two Variances
- Chemist 1: $s_1^2 = 0.183$ ($\nu_1 = 12$); chemist 2: $s_2^2 = 0.062$ ($\nu_2 = 9$)
- Hypothesis testing: Null $H_0: \sigma_1^2 = \sigma_2^2$ vs. Alternative $H_1: \sigma_1^2 > \sigma_2^2$
- Under $H_0$, $s_1^2/s_2^2$ is distributed as $F_{12,9}$: $\Pr(F \geq 0.183/0.062 = 2.95) = 6\%$. What does this mean?
Lec 17 Statistical Process Control--other useful Charts.ppt
*
Lecture 17 Statistical Process Control: Other Useful Charts
*
H. Oliver Gao *
*
Managing a Process
Monitoring, controlling, and improving a process
- Risks: risk of false alarm, risk of not detecting a process shift
- Costs: off-target products, sampling, corrective actions
This involves special circumstances not considered by the traditional variable and attribute control charts
*
H. Oliver Gao *
*
Five additional control charts
- Risk-based charts: explicitly manage the two risks of making wrong decisions
- modified limit charts: useful when it is uneconomical for frequent adjustment (high capability and adjustment cost)
- charts to detect small shifts: for rapid detection of small but sustained shifts (e.g., low capability process)
- short-run charts: the same process used to produce multiple products
- charts for nonnormal distri: departure from a normal distribution
*
H. Oliver Gao *
*
Two risks of making a wrong decision with control chart
α risk: Probability of concluding that a process is out of control when it is not. Leads to false alarms and wasted effort searching for a process shift that does not exist
β risk: Probability of concluding that a process is in control when it is not. Implies inability to detect process shifts when they have occurred.
- Both can be controlled by a proper selection of control limits and subgroup size
Risk-based control charts
*
H. Oliver Gao *
*
Control limits and risks
Narrower control limits?
*
H. Oliver Gao *
*
Subgroup size and risks
When n increases from 1 to 4:
The std. of the subgroup mean reduces by a factor of two
The control limits tighten and become half as wide
For a fixed alpha risk, increasing the subgroup size reduces the beta risk
*
H. Oliver Gao *
*
By properly selecting control limits and subgroup size, any desired alpha and beta risks can be obtained
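The trade-off between the two risks can be checked numerically under a normal model. The sketch below assumes limits at ±k standard errors of the subgroup mean, a single point outside the limits as the out-of-control rule, and a sustained mean shift expressed in units of the individual-value sigma; the function names are assumptions.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def alpha_risk(k):
    """False-alarm probability per subgroup for limits at +/- k standard errors."""
    return 2.0 * (1.0 - _STD_NORMAL.cdf(k))

def beta_risk(k, n, shift_sigmas):
    """Probability that one subgroup mean of size n stays inside the limits
    after the process mean has shifted by shift_sigmas * sigma."""
    z = shift_sigmas * n ** 0.5  # shift measured in standard errors of the mean
    return _STD_NORMAL.cdf(k - z) - _STD_NORMAL.cdf(-k - z)

for k, n in [(3, 1), (2, 1), (3, 4)]:
    print(f"k={k}, n={n}: alpha={alpha_risk(k):.2%}, beta(2 sigma shift)={beta_risk(k, n, 2):.0%}")
```

Narrowing the limits from 3 to 2 standard errors trades a larger alpha risk for a smaller beta risk, whereas raising n from 1 to 4 lowers the beta risk while the alpha risk stays fixed.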
*
H. Oliver Gao *
*
Risk-Based Chart
For a process mean chart, the control limits and subgroup size can be determined to meet any specified alpha and beta risks, based upon a “single point outside control limits” as the out-of-control rule
*
H. Oliver Gao *
*
Example
Approximate subgroup size for 3 sigma limit process mean charts
Beta is the probability of not detecting a shift in the first subgroup after the shift
For d<1.5, a typical n of 3-5 usually won’t detect the shift
With n=5, shifts greater than 1.5 sigma can be detected with…
Larger n is necessary to immediately detect smaller shifts
*
H. Oliver Gao *
*
Detecting sustained shifts
The probability that a shift will be detected on the kth subgroup following the shift is
The expected number of subgroups to detect a sustained shift
E.g., for n=4, the probability of detecting a 1.5 sigma shift in the first subgroup is 50%, in the second is 25%, in the third is 12.5%. Avg=2
Exercise: n=6, 1.5 sigma shift
Point: If a shift is detected in the kth subgroup, it may have occurred not just in the most recent interval, but much prior to that.
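If each subgroup independently misses the shift with probability β, the subgroup on which the shift is first detected follows a geometric distribution. A minimal sketch of that reasoning, using the slide's n=4, 1.5 sigma case where β = 50%; the function names are assumptions.

```python
def detection_probability(beta, k):
    """Probability that a sustained shift is first detected on the k-th subgroup."""
    return beta ** (k - 1) * (1.0 - beta)

def expected_subgroups(beta):
    """Mean number of subgroups needed to detect a sustained shift."""
    return 1.0 / (1.0 - beta)

beta = 0.50  # n = 4, 1.5 sigma shift, 3 sigma limits
first_three = [detection_probability(beta, k) for k in (1, 2, 3)]
print(first_three, expected_subgroups(beta))
```

The same two functions answer the n=6 exercise once the corresponding β has been worked out.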
*
H. Oliver Gao *
*
Product weight example
Sigma=1.074
For mean shift Delta=1.5 sigma, subgroup size=5
What is the beta risk?
How to interpret this? (sporadic vs. sustained shift)
What is the desirable subgroup size to detect a 1.5 sigma shift in one period with a beta risk of 10%?
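Both questions can be explored with the usual normal-theory sizing relation n = ((Z_{α/2} + Z_β)/d)^2, where d is the shift in sigma units. The sketch assumes α = 0.27% (3 sigma limits) and ignores the negligible chance of a signal on the far side of the shift; the function names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def subgroup_size(alpha, beta, d):
    """Subgroup size giving the requested alpha and beta risks for a shift of d sigma."""
    z_alpha_half = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    z_beta = _STD_NORMAL.inv_cdf(1.0 - beta)
    return ((z_alpha_half + z_beta) / d) ** 2

def beta_for_subgroup(alpha, n, d):
    """Beta risk in the first subgroup after a shift of d sigma, given n."""
    z_alpha_half = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return _STD_NORMAL.cdf(z_alpha_half - d * sqrt(n))

ALPHA = 0.0027
beta_n5 = beta_for_subgroup(ALPHA, n=5, d=1.5)
n_needed = subgroup_size(ALPHA, beta=0.10, d=1.5)
print(f"beta with n=5: {beta_n5:.2f}; n for beta=10%: {n_needed:.2f}")
```

The required n comes out a little above 8, and in practice it would be rounded to a whole number of units per subgroup.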
*
H. Oliver Gao *
*
X chart
Subgroup size fixed at 1
The beta risk is uncontrolled and is generally very large for a chart of individual values.
The X chart has a very limited ability to detect shifts rapidly
*
H. Oliver Gao *
*
Risk-based attribute charts
For alpha=0.3%, the sample size for p and u charts is approximately determined by
For attribute charts, very large sample sizes are required to achieve a meaningfully small beta risk.
E.g., p=0.05, a shift delta=0.02, beta=20% (Zβ=0.84), d=0.02/sqrt(0.05*0.95)=0.092: n=? (n ≈ 1742)
If a subgroup size of 100 is used, beta=?
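The p-chart sizing can be sketched with n = ((3 + Z_β)/d)^2 and d = Δ/sqrt(p(1 − p)), using the normal approximation to the binomial. Carrying full precision gives a somewhat larger n than the slide's 1742, which results from the rounded inputs Zβ = 0.84 and d = 0.092; the function names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def p_chart_sample_size(p, delta, beta, z_alpha=3.0):
    """Approximate subgroup size for a p chart to detect a shift of delta in p."""
    d = delta / sqrt(p * (1.0 - p))  # shift in units of the per-unit std. dev.
    z_beta = _STD_NORMAL.inv_cdf(1.0 - beta)
    return ((z_alpha + z_beta) / d) ** 2

def p_chart_beta(p, delta, n, z_alpha=3.0):
    """Approximate beta risk for a given subgroup size (normal approximation)."""
    d = delta / sqrt(p * (1.0 - p))
    return _STD_NORMAL.cdf(z_alpha - d * sqrt(n))

n_exact = p_chart_sample_size(p=0.05, delta=0.02, beta=0.20)
n_rounded_inputs = ((3.0 + 0.84) / 0.092) ** 2  # the slide's rounded values
beta_n100 = p_chart_beta(p=0.05, delta=0.02, n=100)
print(round(n_exact), round(n_rounded_inputs), round(beta_n100, 2))
```

With only 100 units per subgroup the approximate beta risk is close to 1, which illustrates why attribute charts need very large samples.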
*
H. Oliver Gao *
*
Modified control limit chart
- The usual Shewhart 3 sigma CL charts: distinguish between common and special causes; uneconomical when Cpk is high and the cost is high for corrective action
- Modified CL charts: reduce cost of corrective action while ensuring no out-of-specification product
- How? –by letting the process drift just high enough and just low enough before taking action
- Also known as acceptance control charts (Montgomery)
*
H. Oliver Gao *
*
Modified control limit chart (cont.)
- Sigma=0.002
- Shewhart CL: 0.006
- Cpk: 5.0
- Modified CLs have a different function: approximate economic action limits intended to signal the need for action
*
H. Oliver Gao *
*
Chart design
Determining economic action limits should explicitly consider
- The cost of sampling
- The cost of identifying and taking corrective actions
- The cost of off-target products
Montgomery charts provide approximate economic action limits, assuming
- The cost of identifying and correcting special causes >> the cost of off-target products
- The range chart is in control and the within-subgroup std. is constant
- A large # of obs. is available to estimate the within-subgroup std., so sigma is known
- The product characteristic is equally acceptable as long as it is anywhere within the SLs
- Montgomery charts cannot be designed in all cases (wider limits, permissible Cpk and sample size n)
*
H. Oliver Gao *
*
Chart design (cont.)
*
H. Oliver Gao *
*
Chart design (cont.)
*
H. Oliver Gao *
*
Chart design (cont.)
Minimum required Cpk to implement modified control limit charts
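A sketch of the modified-limit arithmetic, assuming the limits take the form USL − (Z_δ + Z_β/√n)σ and LSL + (Z_δ + Z_β/√n)σ with Z_δ = Z_β = 3 (both risks 0.0013). Requiring these limits to be at least as wide as Shewhart 3-sigma limits for a centered process then gives a minimum Cpk of Z_δ/3 + (Z_β + 3)/(3√n). The specification 1 ± 0.03 is inferred from sigma = 0.002 and Cpk = 5, and the function names are hypothetical.

```python
from math import sqrt

def modified_limits(lsl, usl, sigma, n, z_delta=3.0, z_beta=3.0):
    """Return (modified LCL, modified UCL) for subgroup means of size n."""
    margin = (z_delta + z_beta / sqrt(n)) * sigma
    return lsl + margin, usl - margin

def minimum_cpk(n, z_delta=3.0, z_beta=3.0):
    """Smallest Cpk for which the modified limits are no narrower than Shewhart limits."""
    return z_delta / 3.0 + (z_beta + 3.0) / (3.0 * sqrt(n))

# Assumed example values: spec 1 +/- 0.03, sigma = 0.002, subgroups of 4.
lcl, ucl = modified_limits(lsl=0.97, usl=1.03, sigma=0.002, n=4)
print(round(lcl, 3), round(ucl, 3), minimum_cpk(4))
```

For the default risks the requirement reduces to Cpk ≥ 1 + 2/√n, so a process with Cpk = 5 places no practical restriction on subgroup size.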
*
H. Oliver Gao *
*
Modified limit chart example
*
H. Oliver Gao *
*
Moving Average Control Chart
- X-bar chart with usual subgroup size can not easily detect small shifts in the mean
- To detect small shifts in the mean, use moving average (MA)
- X-bar chart: good for detecting large sporadic shifts
- MA chart: good for detecting small sustained shifts
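A moving average of span w smooths the last w subgroup means, so its limits are narrower than the X-bar limits by a factor of √w. The sketch assumes limits of the form center ± A2·R̄/√w, with √t in place of √w while fewer than w subgroups are available; the numbers in the limit call are illustrative inputs and the function names are assumptions.

```python
from math import sqrt

def moving_averages(xbars, w):
    """Moving average of the most recent w subgroup means (fewer at the start)."""
    result = []
    for t in range(1, len(xbars) + 1):
        window = xbars[max(0, t - w):t]
        result.append(sum(window) / len(window))
    return result

def ma_limits(center, a2, rbar, w, t):
    """Control limits for the moving average plotted at subgroup number t (1-based)."""
    span = min(t, w)  # only t subgroups are averaged while t < w
    half_width = a2 * rbar / sqrt(span)
    return center - half_width, center + half_width

print(moving_averages([1.0, 2.0, 3.0, 4.0, 5.0], w=3))
# Illustrative inputs: grand mean 3.66, R-bar 2.5, A2 = 0.577 (n = 5), span 4.
lower, upper = ma_limits(center=3.66, a2=0.577, rbar=2.5, w=4, t=10)
print(round(lower, 2), round(upper, 2))
```

Because successive moving averages share data, the plotted points are correlated, so run rules designed for independent points do not carry over directly.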
*
H. Oliver Gao *
*
MA Control Chart (example)
*
H. Oliver Gao *
*
Short-run control charts
- Short-run: a particular product is manufactured only for a short period of time
- Difficult to use conventional control charts
- Conversion from short- to long-run situation
Cutoff lengths:
Targeted 20 for A and 30 for B
*
H. Oliver Gao *
*
Short-run control charts
It would be beneficial if data from multiple products could be charted on the same control chart: select a statistic to plot that has a fixed distribution over time regardless of the product.
Individual product control chart
- Proliferation of control chart per process
- A longer time period is needed for meaningful CL
- Not applicable to few of a kind products
- Fragments the continuous running record of the process
*
H. Oliver Gao *
*
Short-run individual and MR charts
Process out of control
- Delta X and mR charts: if products differ only in terms of target values and the variability is constant from product to product, simply plot
Delta x = x - Ti, where Ti is the target value for product i
Centerline = 0, std. is sigma
Control limits for delta x = 0 ± 2.66 × (average mR)
*
H. Oliver Gao *
*
- Delta X and R charts: if products differ only in terms of target values and the variability is constant from product to product
- Z bar and W bar charts: variability also changes from product to product
- CV constant: a special case when each product has a different mean and standard deviation, but in such a manner as to keep the coefficient of variation CV constant. We can control chart X-bar/Ti, with mean equal to 1 and std. = CV/sqrt(n)
Short-run average and range charts
*
H. Oliver Gao *
*
Short-run variable charts
*
H. Oliver Gao *
*
Short-run attribute charts
*
H. Oliver Gao *
*
Charts for nonnormal distributions
- Situations where the characteristic of interest does not have a normal distribution
- E.g., Microbiological counts and particulate counts; time intervals between events; waiting times
- For nonnormal distributions, the 6 sigma limits do not enclose 99.73% of the pop.; the alpha risk changes; not a significant issue for X-bar charts
- But a big issue for charts of individual values
- Two approaches
- Fit the data to some known dist., then construct the centerline and CLs using its parameters
- Transform the dist. of X into normal distribution: Y=f(X)
*
H. Oliver Gao *
*
Charts for lognormal distribution
- Transform the dist. of X into normal distribution: Y=f(X)
- Y=Ln(X)
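The transformation approach can be sketched as follows: compute ordinary 3-sigma limits for Y = ln(X), then map them back to the original scale with exp. The data are hypothetical, the function name is an assumption, and the overall sample standard deviation of Y is used here for simplicity where a chart would normally use a short-term estimate.

```python
from math import exp, log
from statistics import mean, stdev

def lognormal_chart_limits(xs, k=3.0):
    """Return (LCL, centerline, UCL) on the original scale for lognormal data."""
    ys = [log(x) for x in xs]       # Y = ln(X) is treated as normal
    mu_y, sigma_y = mean(ys), stdev(ys)
    return exp(mu_y - k * sigma_y), exp(mu_y), exp(mu_y + k * sigma_y)

# Hypothetical positive, right-skewed measurements (e.g., particulate counts).
counts = [1.2, 0.8, 2.5, 1.1, 4.0, 0.6, 1.9, 3.2]
lcl, center, ucl = lognormal_chart_limits(counts)
print(round(lcl, 3), round(center, 3), round(ucl, 3))
```

The back-transformed limits are asymmetric about the centerline and the lower limit can never be negative, which suits count and waiting-time data.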
*
Control limits and risks (figure labels)
- 3 sigma limits: α risk = 0.27% (risk of false alarm); β risk = 84% (chance of not detecting a 2σ shift, Δ = 2σ)
- Narrower 2 sigma limits: α risk = 4.5%; β risk = 50% (chance of not detecting a 2σ shift)
Subgroup size and risks (figure labels)
- n = 1: α risk = 0.27%; β risk = 84% (Δ = 2σ)
- n = 4: α risk = 0.27%; β risk = 16% (chance of not detecting the same shift Δ)
Risk-Based Chart
- Control limits $= \bar{\bar{x}} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$
- To select subgroup size n: $UCL = \bar{\bar{x}} + Z_{\alpha/2}\,\sigma/\sqrt{n} = \bar{\bar{x}} + \Delta - Z_\beta\,\sigma/\sqrt{n} \Rightarrow n = \left(\dfrac{Z_{\alpha/2} + Z_\beta}{d}\right)^2$, where $d = \Delta/\sigma$; $\sigma$ is the short-term std. of individual values X and may be obtained from historical data
Detecting sustained shifts
- Probability of detection on the kth subgroup: $\beta^{k-1}(1 - \beta)$
- Expected number of subgroups: $\sum_{k=1}^{\infty} k\,\beta^{k-1}(1 - \beta) = \dfrac{1}{1 - \beta}$
Product weight example
- $5 = \left(\dfrac{3 + Z_\beta}{1.5}\right)^2 \Rightarrow Z_\beta = 0.354 \Rightarrow \beta = 37\%$
- For β = 0.1: $n = \left(\dfrac{3 + Z_{0.1}}{1.5}\right)^2 = \left(\dfrac{3 + 1.28}{1.5}\right)^2 = 8.14 \approx 8$
Risk-based attribute charts
- $n = \left(\dfrac{3 + Z_\beta}{d}\right)^2$, where $d = \Delta/\sqrt{p(1-p)}$ for the p chart (p denotes fraction defective) and $d = \Delta/\sqrt{u}$ for the u charts (u denotes average # of defects per product)
Modified control limit chart
- $\mu_U = USL - Z_\delta\,\sigma$, $\mu_L = LSL + Z_\delta\,\sigma$: even if the process mean drifts to $\mu_U$ or $\mu_L$, the prob. of out of spec. is $< \delta$
- The UCL is drawn such that, if the observed $\bar{x} < UCL$, we are 100(1-β)% sure that $\mu < \mu_U$
- The LCL is drawn such that, if the observed $\bar{x} > LCL$, we are 100(1-β)% sure that $\mu > \mu_L$
- So long as $\bar{x}$ is within LCL to UCL, we are sure that $\mu_L \leq \mu \leq \mu_U$, hence prob. of out of spec. $< \delta$
Chart design
- If $\delta$ and $\beta$ are both 0.0013, then $Z_\delta = Z_\beta = 3$
- Modified UCL $= USL - 3\sigma(1 + 1/\sqrt{n})$
- Modified LCL $= LSL + 3\sigma(1 + 1/\sqrt{n})$
Minimum required Cpk for the modified limits to be wider than Shewhart limits
- Let the centerline be $\mu_0$, closer to USL; then $C_{pk} = (USL - \mu_0)/(3\sigma)$
- Modified UCL $\geq$ Shewhart UCL: $USL - (Z_\delta + Z_\beta/\sqrt{n})\,\sigma \geq \mu_0 + 3\sigma/\sqrt{n}$ gives $C_{pk} \geq \dfrac{Z_\delta}{3} + \dfrac{Z_\beta + 3}{3\sqrt{n}}$
- For $\delta = \beta = 0.0013$: $C_{pk} \geq 1 + 2/\sqrt{n}$
Modified limit chart example
- Step 1. Ensure basic assumptions (correction cost, range chart in control, equally satisfactory within specification, sigma can be precisely estimated)
- Step 2. Calculate the Cpk index. Here spec. is 1 + or - 0.03, the mean is centered, $\sigma = \bar{R}/d_2 = 0.002$, Cpk = 5
- Step 3. Select the values of $\delta$ and $\beta$ (here both 0.0013), $Z_\delta = Z_\beta = 3$
- Step 4. Compute the minimum subgroup size so that Cpk ≥ required minimum Cpk. Here Cpk = 5 > $1 + 2/\sqrt{n}$ for all n: no restrictions on subgroup size. What if Cpk = 1 or 2?
- Step 5. Compute the modified control limits. If n = 4, then Modified UCL $= USL - 3\sigma(1 + 1/\sqrt{n}) = 1.021$ and Modified LCL $= LSL + 3\sigma(1 + 1/\sqrt{n}) = 0.979$. Note that the Shewhart limits are 1 ± 0.006
- Step 6. Select an appropriate sampling interval based upon the usual considerations
- Step 7. Implement the modified control limit chart: corrective action and screening
Moving Average Control Chart
- $M_t = (\bar{x}_t + \bar{x}_{t-1} + \dots + \bar{x}_{t-w+1})/w$
- $\bar{X} \sim N(\mu, \sigma^2/n) \Rightarrow M_t \sim N(\mu, \sigma^2/(nw))$
- Control limits for MA chart $= \bar{\bar{x}} \pm 3\sigma/\sqrt{nw}$; for n > 1, $\bar{\bar{x}} \pm A_2\bar{R}/\sqrt{w}$; for n = 1, $\bar{x} \pm 2.66\,\overline{mR}/\sqrt{w}$
- Essentially the control limits become narrower by a factor of $\sqrt{w}$ compared to X-bar or X charts
MA Control Chart (example)
- A MA chart with span w = 4: n = 5, $\bar{R} = 2.5$, $\bar{\bar{x}} = 3.66$, $A_2 = 0.577$
- For $t \geq w$: $\bar{\bar{x}} \pm A_2\bar{R}/\sqrt{w} = 3.66 \pm 0.577(2.5)/\sqrt{4} = 3.66 \pm 0.72$
- For $t < w$: $\bar{\bar{x}} \pm A_2\bar{R}/\sqrt{t}$
Short-run control charts
- $Z = (X - \mu)/\sigma \sim N(0, 1)$, fixed over time
- Z-bar chart: $\bar{Z} = \dfrac{\bar{x} - T_i}{\sigma_i/\sqrt{n}} = \dfrac{\bar{x} - T_i}{\bar{R}_i/(d_2\sqrt{n})}$ for the ith product
- Delta X-bar chart: $\Delta\bar{x} = \bar{x} - T_i$; Control Limits $= 0 \pm A_2\bar{R}$
- CV constant: plot $\bar{X}/T_i$
Charts for lognormal distribution
- $Y = \ln(X)$: $\mu_Y = \ln(\mu_X) - \tfrac{1}{2}\ln\!\left(1 + \sigma_X^2/\mu_X^2\right)$, $\sigma_Y^2 = \ln\!\left(1 + \sigma_X^2/\mu_X^2\right)$
- $UCL_X = e^{\mu_Y + 3\sigma_Y}$
- Exercise: $\mu_X = 1$, $\sigma_X = 3$, UCL = ?
Lec 15 Statistical Process Control--CI and PI.ppt
Lecture 15 Statistical Process Control: Process Capability
H. Oliver Gao *
Process Capability (PC) vs. Stability (PS)
- PC: ability to meet product specifications
- A capable process: all product predicted to be within specifications
- Capability can not be determined without knowing product specifications
- PS: a process is only influenced by common causes
- PS: product specifications not necessary for judging PS
H. Oliver Gao *
Stability and Capability
- A stable process: a constant and predictable distribution over time
- Capable (prediction within specifications)
- Not capable
- Unstable process: impossible to predict
- Stability is a prerequisite for defining capability.
- A process
- Stable and capable
- Stable and incapable
- Unstable but potentially capable
- Unstable and incapable
Improvement action
H. Oliver Gao *
Goal and Outline for PC
Quantification of process capability for both stable and unstable processes, in terms of capability and performance indices
- Methods for capability indices (CI) and confidence intervals
- Connection between a CI and tolerance interval
- Six sigma goal (meaning and rationale)
- Application of CI: setting goals, assessing process, identifying improvement actions
H. Oliver Gao *
Capability and performance indices
- Capability indices: measure what a stable process would be capable of. Two capability indices (Cp, and Cpk)
- Performance indices: measure the current performance of the process regardless of whether it is stable or not. Two performance indices (Pp, Ppk)
H. Oliver Gao *
Cp Index
Basic assumptions
- Specification is two-sided
- Process is perfectly centered in the middle of the specification
- Process is stable
- Process is normally distributed
H. Oliver Gao *
Cp Index (cont.)
Examples: car garage, lane width
The Cp index can be improved by widening specification width or reducing short-term variability
H. Oliver Gao *
Cpk Index
Practical modification of Cp (relax the two-sided and center assumptions)
Negative to positive infinity. Improve by: widening specification; reducing short-term variability; and changing the process mean
H. Oliver Gao *
Pp Index
Pp Index measures the performance of the process without assuming it to be stable.
Basic assumptions
- Specification is two-sided
- Process is perfectly centered in the middle of the specification
- Process is normally distributed
H. Oliver Gao *
Ppk Index
Measure the current performance of the process without assuming a two-sided specification or stable centered process
Negative to positive infinity. Improve by: widening specification; reducing special cause variability; reducing common cause variability; and changing the process mean
H. Oliver Gao *
Relationships between Cp, Cpk, Pp, and Ppk
- CI and PI assumptions
- One-sided spec: only Cpk and Ppk, Cpk > Ppk
- Two-sided spec:
- Cp>Cpk > Ppk and Cp>Pp > Ppk
- Pp × Cpk = Cp × Ppk
- Pp>Cpk if off-centering is more severe than instability; Pp<Cpk if instability is more severe than off-centering
- Ppk=Cpk and Pp=Cp for stable process, Ppk=Pp and Cpk=Cp for centered process
H. Oliver Gao *
Relationships between Cp, Cpk, Pp, and Ppk (cont.)
- Example: process unstable and not centered, with current Ppk=1
- Stabilization improves the process to Cpk=?
- Further, if centered, process improved to Cp=?
- Alternatively, if the process first gets centered, Pp=?; then further stabilized, Cp=?
- In this example: Cpk=Pp
H. Oliver Gao *
Estimating CI and PI
- Point Estimates
- Process mean
- Short-term standard deviation
- Long-term standard deviation
- Subgrouped data
Pooled within-subgroup std.
H. Oliver Gao *
Estimating CI and PI (cont.)
- Individual data: data are available as n individual values collected over time
- Data sources: X-mR chart, short-term capability studies, and process validation
H. Oliver Gao *
Example 1 (Cpk and Ppk)
- Declared weight: 250 grams, single-sided lower specification limit
- Estimate
- Calculate Cpk and Ppk (the process has a single-sided lower spec. limit: 250 grams)
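With a single-sided lower limit, both indices are the distance from the mean to the limit divided by three standard deviations; only the sigma estimate differs. The sketch uses the estimates from the lecture's worked values for this example (mean 253.66, short-term sigma 1.074, total sigma 1.098); the function name is an assumption.

```python
def capability_index(mean, sigma, lsl=None, usl=None):
    """Cpk (with short-term sigma) or Ppk (with total sigma).

    For a two-sided specification the index is the smaller of the two
    one-sided values; for a one-sided specification pass only one limit.
    """
    candidates = []
    if usl is not None:
        candidates.append((usl - mean) / (3.0 * sigma))
    if lsl is not None:
        candidates.append((mean - lsl) / (3.0 * sigma))
    if not candidates:
        raise ValueError("at least one specification limit is required")
    return min(candidates)

process_mean = 253.66   # grand mean, grams
sigma_short = 1.074     # R-bar / d2
sigma_total = 1.098     # overall sample standard deviation

cpk = capability_index(process_mean, sigma_short, lsl=250.0)
ppk = capability_index(process_mean, sigma_total, lsl=250.0)
print(round(cpk, 2), round(ppk, 2))
```

Cpk and Ppk are close here because the short-term and total sigma estimates nearly coincide, which is what a stable process should show.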
H. Oliver Gao *
Example 1 (Cpk and Ppk, continued)
H. Oliver Gao *
Example 2 (process capability and performance indices)
- The x-bar and R chart of the data in the table on the right indicates that the process is in control. Now we are interested in calculating the capability and performance indices:
- Specification limits: Lower: 0.500; Upper: 0.900
- Calculate the process capability and performance indices
H. Oliver Gao *
Example 2 (process capability and performance indices)
The following should be noted:
- Process capability and process performance metrics are noted to be almost identical.
- Calculations for short-term variability were slightly larger than long-term variability, which is not reasonable because short-term variability is a component of long-term variability. Using range is not as statistically powerful as using the following
H. Oliver Gao *
Confidence Intervals for CI and PI (1)
- A process capability index calculated from sample data is an estimate of the population process capability index. It is highly unlikely that the true population index is the exact value calculated.
- A confidence interval adds a probability range for the true population value given the results of the sample, including sample size.
H. Oliver Gao *
Confidence Intervals for CI and PI (2)
- Uncertainties in estimated CI and PI:
- Cp and Pp: short-term variability is not known precisely
- Cpk and Ppk: added uncertainty from unknown process mean
- Uncertainties quantified by confidence intervals (a function of the DOF available to estimate the index, wider for Cpk and Ppk)
H. Oliver Gao *
Confidence Intervals for CI and PI (3)
- Approximate confidence intervals for Cp, Pp, and for Cpk or Ppk close to 1
- We need more than 100 to 200 observations to get a reasonable estimate of CI and PI. Even then, a 10% uncertainty
H. Oliver Gao *
Confidence Intervals for CI and PI (example)
- Product weight example: 22 subgroups of size 5, calculate the confidence interval (one-sided and two sided) of Cpk=1.14. How do we interpret them?
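The interval for this example can be sketched with the normal approximation Cpk ± Z·sqrt(1/(9N) + Cpk²/(2ν)), where N is the total number of observations and ν the degrees of freedom behind the short-term sigma. For 22 subgroups of size 5, N = 110 and ν = 22 × (5 − 1) = 88; Z = 1.96 (two-sided) and 1.64 (one-sided) are the usual 95% values. The function name is an assumption.

```python
from math import sqrt

def cpk_half_width(cpk, n_total, dof, z):
    """Approximate half-width of a confidence interval for Cpk (or Ppk)."""
    return z * sqrt(1.0 / (9.0 * n_total) + cpk ** 2 / (2.0 * dof))

k_subgroups, n_per_subgroup = 22, 5
n_total = k_subgroups * n_per_subgroup        # N = 110
dof_short = k_subgroups * (n_per_subgroup - 1)  # nu = 88
cpk_hat = 1.14

two_sided = cpk_half_width(cpk_hat, n_total, dof_short, z=1.96)
one_sided = cpk_half_width(cpk_hat, n_total, dof_short, z=1.64)
lower, upper = cpk_hat - two_sided, cpk_hat + two_sided
lower_bound = cpk_hat - one_sided
print(round(lower, 2), round(upper, 2), round(lower_bound, 2))
```

Even with 110 observations the two-sided interval spans roughly ±0.18, which is the point of the slide's remark that 100 to 200 observations still leave noticeable uncertainty.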
H. Oliver Gao *
Connection with Tolerance Intervals
Process validation: a tolerance interval can be constructed to contain 100(1-p)% of the population with 100(1-a)% confidence, based on n observations over a relatively short time.
k depends on a, p, and n
Validation passes if tolerance interval is within specification limits.
Example: It is a common practice in some industries to construct a 95/95 tolerance interval, meaning that we are 95% sure that 95% of the population is within the constructed tolerance interval.
Process is validated if this tolerance interval is within spec. limits
In the context of process validation, there is a connection between the tolerance interval and the lower bound on the Cpk or the lower bound on the Ppk
H. Oliver Gao *
Connection with Tolerance Intervals (continued)
Two-sided specification:
two-sided tolerance interval is inside the specification interval. The limiting case: when one tolerance limit exactly matches the corresponding specification limit or when the tolerance interval exactly matches the specification interval. In both cases, we are 100(1-α)% sure that no more than 100p% of the product is outside the specification. This means that the distance between the estimated process mean and the nearest specification limit must be at least σZp/2. Thus, we are 100(1-α)% sure that
One-sided specification:
H. Oliver Gao *
Validation Acceptance and Minimum Cpk
Process validation: The process is validated with 100(1-a)% confidence provided that 100(1-p)% of the pop is enclosed inside the specification interval:
For process validation, calculating min Cpk is better than using tolerance intervals:
- Min Cpk allows for an assessment of the goodness of the process on a continuous scale
- Max fraction defective can be predicted
- Allows for examination of process stability and centering
H. Oliver Gao *
Six Sigma Goal
A stable process: Cp=Pp; Cpk=Ppk.
What should be the targets for the Cp and Cpk indices?
What does a Cpk of one mean? Is this acceptable?
- The consequence of a characteristic being outside spec. (e.g., for safety characteristic, the risk of 0.3% is not acceptable; for a minor degradation characteristic, Cpk=1 is reasonable)
- The # of key product characteristics that control the total performance of the product. (e.g., for 10 and 100 independent characteristics, each with Cpk of one, the probabilities that the system will perform well are 97.3% and 76.3%, not acceptable)
- The closeness of the estimated Cpk to the population Cpk (confidence limits)
- Deviation from the continuous stability assumption (the process needs to be designed to a Cpk greater than one in order to achieve a Cpk of one in practice)
Better than 99.73% of the individual characteristic values are within specification or no more than three individual values out of 1000 are expected to be outside specifications.
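The system-level figures quoted above follow from multiplying the per-characteristic yields, assuming the characteristics are independent and each is centered with Cpk = 1 (99.73% within specification). A minimal sketch, with a hypothetical function name:

```python
def system_yield(per_characteristic_yield, n_characteristics):
    """Probability that all independent characteristics are within specification."""
    return per_characteristic_yield ** n_characteristics

YIELD_AT_CPK_1 = 0.9973  # fraction within +/- 3 sigma for a centered normal process

yield_10 = system_yield(YIELD_AT_CPK_1, 10)
yield_100 = system_yield(YIELD_AT_CPK_1, 100)
print(f"{yield_10:.1%} {yield_100:.1%}")
```

The rapid decay with the number of characteristics is one reason the targets for individual characteristics are set well above Cpk = 1.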
H. Oliver Gao *
Six Sigma Goal (cont.)
- A six sigma goal: design Cp>2 and manufacturing Cpk>1.5
H. Oliver Gao *
Planning for improvement (if and how)
- What is the current performance?
- What should be the process capability targets?
- What improvement actions are necessary?
The Cp, Cpk, Pp, and Ppk indices permit an assessment of the process stability, centering, and capability.
H. Oliver Gao *
Six categories of processes
Based on CI and PI (assume capability targets to be six-sigma, i.e., Cp>2; Ppk>1.5)
Stable and capable
Stable and potentially capable
Stable and incapable
Unstable but capable
Unstable but potentially capable
Unstable and incapable
Capability and performance indices:
$C_p = \dfrac{\text{Specification width}}{\text{Process width}} = \dfrac{USL - LSL}{6\sigma_{short}}$
$P_p = \dfrac{USL - LSL}{6\sigma_{total}}$, with $C_p \ge P_p$
$P_{pk} = \dfrac{USL - \text{Mean}}{3\sigma_{total}}$ or $P_{pk} = \dfrac{\text{Mean} - LSL}{3\sigma_{total}}$
Estimates for k subgroups of size n:
$\mu \approx \bar{x}$; $\sigma_{short} \approx \bar{R}/d_2$, $\bar{s}/c_4$, or $s_w$; $\sigma_{total} \approx s_{total} = \sqrt{\sum (x_i - \bar{x})^2/(nk - 1)}$ (all data)
Estimates for n individual values:
$\mu \approx \bar{x}$; $\sigma_{short} \approx \overline{mR}/1.128$, with $(n-1)$ DOF; $\sigma_{total} \approx s_{total} = \sqrt{\sum (x_i - \bar{x})^2/(n - 1)}$ (all data), with $(n-1)$ DOF
Cpk and fraction defective p:
$C_{pk} \ge Z_p/3$ and $C_{pk} \ge Z_{p/2}/3$
Tolerance interval:
$\bar{x} \pm ks$
Example:
$\mu \approx \bar{x} = 253.66$; $\sigma_{short} \approx \bar{R}/d_2 = 1.074$; $\sigma_{total} \approx s_{total} = 1.098$
$C_{pk} = \dfrac{253.66 - 250}{3(1.074)} = 1.14$
$P_{pk} = \dfrac{253.66 - 250}{3(1.098)} = 1.11$
Confidence intervals for the indices:
For the $C_p$ index: $C_p \pm Z_{\alpha/2}\, C_p \sqrt{\dfrac{1}{2 n_{short}}}$
For the $P_p$ index: $P_p \pm Z_{\alpha/2}\, P_p \sqrt{\dfrac{1}{2 n_{total}}}$
For the $C_{pk}$ index: $C_{pk} \pm Z_{\alpha/2}\, C_{pk} \sqrt{\dfrac{1}{9 N C_{pk}^2} + \dfrac{1}{2 n_{short}}}$
For the $P_{pk}$ index: $P_{pk} \pm Z_{\alpha/2}\, P_{pk} \sqrt{\dfrac{1}{9 N P_{pk}^2} + \dfrac{1}{2 n_{total}}}$
where $n_{short}$ and $n_{total}$ are the degrees of freedom used to estimate $\sigma_{short}$ (for $C_p$ and $C_{pk}$) and $\sigma_{total}$ (for $P_p$ and $P_{pk}$), and N is the total number of observations. E.g., for k subgroups of size n, $N = nk$, $n_{short} \approx k(n-1)$, and $n_{total} = nk - 1$.
For 95% confidence, $Z_{\alpha/2} = 1.96$. For a one-sided confidence interval, replace $Z_{\alpha/2}$ with $Z_{\alpha}$ (e.g., for a 95% CI, $Z_{\alpha} = 1.64$).
1. The two-sided 95% CI for $C_{pk}$ is
$1.14 \pm (1.96)(1.14)\sqrt{\dfrac{1}{9(110)(1.14)^2} + \dfrac{1}{2(88)}} = 1.14 \pm 0.18 = 0.96$ to $1.32$
We are 95% sure that the true performance index is between 0.96 and 1.32.
2. The 95% lower bound for $C_{pk}$ is
$1.14 - (1.64)(1.14)\sqrt{\dfrac{1}{9(110)(1.14)^2} + \dfrac{1}{2(88)}} = 1.14 - 0.15 = 0.99$
We are 95% sure that the true performance index is greater than 0.99.
Example:
The process control chart yields: subgroup sample size 5, $\bar{x} = 0.7375$, $\bar{R} = 0.1906$
A short-term standard deviation: $\sigma_{\bar{R}/d_2} = \bar{R}/d_2 = 0.1906/2.326 = 0.0819$
$C_p = \dfrac{USL - LSL}{6\sigma_{\bar{R}/d_2}} = \dfrac{0.9 - 0.5}{6(0.0819)} = 0.8136$
$\sigma_{\bar{s}/c_4} = \bar{s}/c_4 = 0.07627613/0.94 = 0.0811$
The long-term standard deviation: $s_{sample} = \sqrt{\dfrac{\sum_{i=1}^{80}(x_i - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{\sum_{i=1}^{80}(x_i - 0.7375)^2}{80 - 1}} = 0.0817$
Using this standard deviation estimate, we can determine
$P_p = \dfrac{USL - LSL}{6 s_{sample}} = \dfrac{0.9 - 0.5}{6(0.081716)} = 0.8159$
$P_{pk} = \min\left[\dfrac{USL - \bar{x}}{3 s_{sample}}, \dfrac{\bar{x} - LSL}{3 s_{sample}}\right] = \min[0.6629, 0.9688] = 0.6629$
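A sketch of a normal-approximation confidence interval for Cpk, using the form Cpk ± Z · Cpk · sqrt(1/(9·N·Cpk²) + 1/(2·ν)), where N is the total number of observations and ν is the degrees of freedom behind the short-term sigma estimate. The inputs (Cpk = 1.14, N = 110, ν = 88) are illustrative, and the function name is hypothetical. The code uses exact normal quantiles rather than the rounded 1.96 and 1.64.

```python
from math import sqrt
from statistics import NormalDist


def cpk_confidence_interval(cpk, n_obs, dof, confidence=0.95, two_sided=True):
    """Approximate confidence limits for Cpk (normal approximation)."""
    alpha = 1.0 - confidence
    tail = alpha / 2 if two_sided else alpha
    z = NormalDist().inv_cdf(1.0 - tail)
    half_width = z * cpk * sqrt(1.0 / (9 * n_obs * cpk**2) + 1.0 / (2 * dof))
    return cpk - half_width, cpk + half_width


low, high = cpk_confidence_interval(1.14, n_obs=110, dof=88)
print(f"two-sided 95% CI: {low:.2f} to {high:.2f}")

lower_bound, _ = cpk_confidence_interval(1.14, n_obs=110, dof=88, two_sided=False)
print(f"one-sided 95% lower bound: {lower_bound:.2f}")
```

For a one-sided bound, only the lower limit of the returned pair is meaningful.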
SYSEN5300 Lec 14 Statistical Process Control--control Charts.ppt
SYSEN 5300 (5310, 5320) - Systems Engineering and Six-Sigma for Systems Reliability and Quality
Lecture 14 Six Sigma and Statistical Process Control: Control Charts
H. Oliver Gao *
Motivation for Control Charts
- Data are often collected over time
- General descriptive data summaries (e.g., mean, standard deviation, histogram) do not preserve the time dimension, such as a time trend
- Control chart: one way to plot data over time
Outline
- Role of control charts
- Basic principles behind determining control limits
- Formulae to design the most commonly used variable and attribute control charts
- Out-of-control rules to detect special causes
- Key success factors for implementing effective charts
Role of Control Charts
- No two products are exactly alike: there is variability in the process
- Common (normal) causes: stable and predictable variation
- Special causes: unstable and unpredictable variation
- To improve product uniformity: reduce the special and common causes of variation or reduce their effects
- Redesign product and process
- Ensure the operation of the process
Confusion between common and special causes of variation is expensive and leads to counterproductive corrective actions.
Control Charts (Shewhart)
- A graphical method to distinguish between common and special causes of variation
- Old way of quality control: quality is defined as meeting specifications, with inspection after production and no effort to improve toward the ideal product targets.
- A better way: prevention strategy based on an understanding of the process, the causes of variability, and the nature of actions necessary to reduce variation
Process and process quality
- All products and services are the result of some process.
- Process quality: the degree to which product and service performance characteristics are consistently on target.
Cause of variation in a process
- People (operators, training, and experience)
- Machines (machine-to-machine difference, wear, maintenance)
- Methods (temperature control)
- Materials (lot-to-lot and within-lot differences)
- Environment (ambient temp., humidity)
- Time to place an order; the measurement systems
Common and Special causes
Common causes of variation
- A part of the normal operation of the process and are constantly present
- Short-term variability, large in number and small in effect
- Process in statistical control, output predictable within limits.
- A stable process: constant mean, std, and distribution over time
Special causes of variation
- Not always present or not always present to the same degree
- Large change in the output, long-term (because of a new cause, or larger than usual change in a key common cause)
- Process out of control or unstable: unpredictable change in the mean, variance, or shape of the distribution of output attributes, so prediction is impossible
Improvement Actions
Two ways:
- Reduce common causes or their effects: change the process itself
- Reduce special causes or their effects: identify and remove special causes so that the process is executed as designed
Two mistakes (confusion between common and special causes is costly):
- Ascribing variation to a special cause when it is the result of a common cause: overadjustment (reacting to noise)
- Ascribing variation to a common cause when it is the result of a special cause: ignoring a signal
Examples of mistakes
A stable process with no adjustment
A stable process with adjustment
An unstable process with no adjustment
It is therefore necessary to be able to distinguish between the two types of variation.
A control chart is a graphical method used to distinguish between common cause variation and special cause variation.
The role of a control chart is to help us identify the presence and nature of special causes.
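The cost of adjusting a stable process can be illustrated with a small simulation. This is a sketch under stated assumptions: the process output is normal noise around a setting, and the "adjustment" rule (an assumption, not taken from the slides) moves the setting by the full deviation of the last output from target. Names and parameters are illustrative.

```python
import random
from statistics import pvariance


def simulate_process(n_points, adjust, target=0.0, sigma=1.0, seed=1):
    """Simulate a stable process, optionally compensating after every point."""
    rng = random.Random(seed)
    setting = target
    outputs = []
    for _ in range(n_points):
        value = setting + rng.gauss(0.0, sigma)  # common cause variation only
        outputs.append(value)
        if adjust:
            # Overadjustment: react to noise by shifting the setting.
            setting -= value - target
    return outputs


left_alone = simulate_process(20000, adjust=False)
over_adjusted = simulate_process(20000, adjust=True)
variance_ratio = pvariance(over_adjusted) / pvariance(left_alone)
print(f"variance ratio (adjusted / left alone): {variance_ratio:.2f}")
```

Under this rule each output equals the current noise minus the previous noise, so the variance roughly doubles even though no special cause is present.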
Logic of control limit
- Control Limit distinguishes a control chart from a simple plot of data over time
- Control Limit: a bound between common and special cause variation
- Logic of control limit (example)
Process Mean Control Chart
- Is the process mean constant over time?
- Center line depends on the purpose of control chart
- To check whether the process is stable: use the grand mean of all data
- To check whether the mean is on target: use the targeted value
Control Limit
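Symmetry between UCL and LCL
3-sigma distance from the center line (a sample mean has less than a 0.3% chance of falling outside the 3-sigma control limits)
3 sigma of the sample mean = $3\sigma/\sqrt{n}$, where $\sigma$ is the standard deviation of individual values and $n$ is the subgroup size
Computing sigma (it should reflect only the common cause variability of the process): use the within-subgroup variance
Control limits = $\bar{x} \pm 3\sigma_{short}/\sqrt{n}$, where $\bar{x}$ is the grand mean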
Discussion: mean control limit vs. specification limit (for individual values)
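A minimal sketch of the control limit calculation for an average and range chart with subgroups of five. It uses the standard tabulated constants for n = 5 (A2 = 0.577, D3 = 0, D4 = 2.114, d2 = 2.326), where A2·R-bar is the shortcut for 3σ_short/√n. The data and function name are hypothetical.

```python
from statistics import mean

# Standard control chart constants for subgroup size n = 5.
A2, D3, D4, D2 = 0.577, 0.0, 2.114, 2.326


def xbar_r_limits(subgroups):
    """Return (LCL, center, UCL) for the mean and range charts."""
    xbars = [mean(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    grand_mean = mean(xbars)
    r_bar = mean(ranges)
    return {
        "xbar": (grand_mean - A2 * r_bar, grand_mean, grand_mean + A2 * r_bar),
        "range": (D3 * r_bar, r_bar, D4 * r_bar),
        "sigma_short": r_bar / D2,  # within-subgroup (common cause) sigma
    }


# Hypothetical subgroups of five measurements each.
subgroups_demo = [
    [10.1, 9.9, 10.0, 10.2, 9.8],
    [10.0, 10.3, 9.9, 10.1, 10.2],
    [9.7, 10.0, 10.1, 9.9, 9.8],
]
limits = xbar_r_limits(subgroups_demo)
print("mean chart (LCL, CL, UCL):", [round(v, 3) for v in limits["xbar"]])
print("range chart (LCL, CL, UCL):", [round(v, 3) for v in limits["range"]])
```

These limits apply to subgroup means, not to individual values, which is why they are narrower than limits on individuals and should not be compared with specification limits.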
Variable control charts
- Apply when the characteristic of interest is measured on a continuous scale.
- The average and range
- The average and standard deviation
- The individual and moving range
Two types of control charts: variable control charts and attribute control charts.
Average and range chart
- Product weight example (weight control)
Average and standard deviation chart
- The standard deviation is a better estimate of within-group variability than the range, so the mean and standard deviation chart is preferred over the mean and range chart.
Comparison
Individual and moving range chart
- Only one measurement per subgroup (within-subgroup variance, i.e., short-term variation, is represented by the difference between successive values, the moving range mR)
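A sketch of the limits for an individuals chart, assuming the short-term sigma is estimated as the average moving range divided by 1.128 (d2 for pairs of successive values), which gives limits of the mean ± 2.66 times the average moving range. The data and function name are hypothetical.

```python
from statistics import mean


def individuals_chart_limits(values):
    """Return (LCL, center, UCL) for a chart of individual values."""
    # Moving range: absolute difference between successive values.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    center = mean(values)
    sigma_short = mr_bar / 1.128
    return center - 3 * sigma_short, center, center + 3 * sigma_short


# Hypothetical individual measurements in time order.
values_demo = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8]
lcl, center, ucl = individuals_chart_limits(values_demo)
print(f"LCL = {lcl:.3f}, CL = {center:.3f}, UCL = {ucl:.3f}")
```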
Individual and moving range chart
Design Variable Control Chart
summary
Comparison for Product Weight Data
Attribute (count data) control chart
- Characteristic is measured on a discrete scale (e.g., # occurrences, # defects): loss in discrimination, increase in subgroup size, reduction in ability for continuous improvement
Selecting Attribute Control Chart
Fraction defective (p) chart
- The number of defectives has a binomial distribution (i.e., 1. each of the n items tested is classified into only two categories, defective and not defective; 2. the probability p of a defective item is constant for every item)
- If X represents the number of defectives in n items, then the probability of finding x defectives in n items is $P(x) = \dfrac{n!}{x!(n-x)!}\, p^x (1-p)^{n-x}$, with mean $\mu = np$ and standard deviation $\sigma = \sqrt{np(1-p)}$
Inventory accuracy example
To assess inventory accuracy, a simple data collection scheme is used: check 100 items each week and record the number of misplaced items.
How do we design a p chart for this?
Procedure to design a p chart
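Obtain the number misplaced ($x_i$) and the number checked ($n_i$)
Calculate the fraction misplaced, $p_i = x_i/n_i$
Calculate the centerline: $\bar{p}$ = (total # misplaced)/(total # checked)
Under binomial assumptions, the standard deviation of $p_i$ is $\sqrt{\bar{p}(1-\bar{p})/n}$, where $n$ is the number checked in the subgroup
3-sigma control limits for the p chart = $\bar{p} \pm 3\sqrt{\bar{p}(1-\bar{p})/n}$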
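The procedure above can be sketched as follows. The weekly counts are hypothetical, and clamping the limits to the range 0 to 1 is a common convention added here as an assumption. The function name is illustrative.

```python
from math import sqrt


def p_chart_limits(misplaced, checked):
    """Return the centerline and per-subgroup (LCL, UCL) for a p chart."""
    p_bar = sum(misplaced) / sum(checked)  # total misplaced / total checked
    limits = []
    for n in checked:
        sigma = sqrt(p_bar * (1 - p_bar) / n)  # binomial std of a fraction
        limits.append((max(0.0, p_bar - 3 * sigma), min(1.0, p_bar + 3 * sigma)))
    return p_bar, limits


# Hypothetical data: misplaced items found in 100 items checked each week.
misplaced_demo = [4, 6, 5, 3, 7]
checked_demo = [100] * 5
p_bar, p_limits = p_chart_limits(misplaced_demo, checked_demo)
fractions = [x / n for x, n in zip(misplaced_demo, checked_demo)]
signals = [not (lo <= p <= hi) for p, (lo, hi) in zip(fractions, p_limits)]
print(f"centerline = {p_bar:.3f}, limits = {p_limits[0]}, signals = {signals}")
```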
Control limits for attribute charts
p chart for inventory data
Defect per product (u) chart
- Assumption: # of defects per product follows Poisson distribution
- E.g., # of accidents per month at a manufacturing facility
Procedure to design a u chart
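Obtain the number of defects (accidents) $x_i$ and the corresponding sample size $n_i$
Calculate $u_i = x_i/n_i$. For this example, $n_i = 1$, so $u_i = x_i$
Calculate the centerline, $\bar{u}$ = (total observed defects (accidents))/(total sample size)
For the Poisson distribution with mean $\lambda$, the standard deviation is $\sqrt{\lambda}$, so the 3-sigma control limits for the u chart are $\bar{u} \pm 3\sqrt{\bar{u}/n_i}$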
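A sketch of the u chart calculation under the Poisson assumption. The monthly accident counts are hypothetical, each month is treated as a sample of size one, and the lower limit is clamped at zero by convention. The function name is illustrative.

```python
from math import sqrt


def u_chart_limits(defects, sizes):
    """Return the centerline and per-subgroup (LCL, UCL) for a u chart."""
    u_bar = sum(defects) / sum(sizes)  # total defects / total sample size
    limits = []
    for n in sizes:
        three_sigma = 3 * sqrt(u_bar / n)  # Poisson: variance equals the mean
        limits.append((max(0.0, u_bar - three_sigma), u_bar + three_sigma))
    return u_bar, limits


# Hypothetical data: accidents per month, one "unit" (month) per sample.
accidents_demo = [2, 4, 3, 5, 1, 3]
sizes_demo = [1] * 6
u_bar, u_limits = u_chart_limits(accidents_demo, sizes_demo)
print(f"centerline = {u_bar:.2f}, LCL = {u_limits[0][0]:.2f}, UCL = {u_limits[0][1]:.2f}")
```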
Interpreting control charts
- Control charts provide more information regarding process instability: identify and provide information about the nature of special causes
- Pattern of points test for a special cause
Tests for the chart of averages
Sporadic shift or beginning of sustained shift
Early warning of a shift
Early warning of sustained shift
Upward or downward trend
Subgroups from different distributions
Sustained shift
Gauge is faulty, variability decreased, wrong control limits
A systematic factor: alternate machines, suppliers, or operators
Key factors for successful control charts
- Key characteristics to control (e.g., clarity of TV picture, time for 0-60 acceleration)
- Rational subgroup (e.g., sources of variation only include common cause variability)
- Proper control chart (good understanding of the process), control limits, subgroup size, and sampling interval
- Control chart redesign (review and update)
- Corrective actions (in case of rule violation)
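Design questions for an $\bar{X}$ chart: center line (where?); upper and lower control limits (how far from CL?); subgroup size (how many?); sampling period (how long?)
3 sigma of the sample mean: $3\sigma/\sqrt{n}$, where $\sigma$ is the std of individual values
Control limits: $\bar{x} \pm 3\sigma_{short}/\sqrt{n}$, where $\bar{x}$ is the grand mean
Under stability, $X \sim NID(\mu, \sigma)$; the $\bar{X}$ chart monitors the process mean, and the R chart monitors the process variability
($\bar{X}$ and R) chart:
Control limits for the mean: $\bar{x} \pm 3\sigma_{short}/\sqrt{n}$; simpler one: $\bar{x} \pm A_2\bar{R}$
Control limits for the range: $LCL = D_3\bar{R}$; $UCL = D_4\bar{R}$
($\bar{X}$ and S) chart:
Control limits for the mean: $\bar{x} \pm A_3\bar{s}$
Control limits for the std: $LCL = B_3\bar{s}$; $UCL = B_4\bar{s}$
(X and mR) chart:
X chart control limits: $\bar{x} \pm 3\sigma_{short} = \bar{x} \pm 2.66\,\overline{mR}$, where $\sigma_{short} = \overline{mR}/d_2 = \overline{mR}/1.128$
Binomial distribution:
$P(x) = \dfrac{n!}{x!(n-x)!}\, p^x (1-p)^{n-x}$; $\mu = np$; $\sigma = \sqrt{np(1-p)}$
p chart:
$\bar{p}$ = (total # misplaced)/(total # checked)
Standard deviation of $p_i$: $\sqrt{\bar{p}(1-\bar{p})/n}$
Control limits: $\bar{p} \pm 3\sqrt{\bar{p}(1-\bar{p})/n}$
Poisson distribution:
$P(x) = \dfrac{e^{-\lambda}\lambda^x}{x!}$ for $x = 0, 1, 2, 3, \ldots$; $\mu = \lambda$; $\sigma^2 = \lambda \Rightarrow \sigma = \sqrt{\lambda}$
u chart:
$\bar{u}$ = (total observed defects (accidents))/(total sample size)
Control limits for the u chart: $\bar{u} \pm 3\sqrt{\bar{u}/n_i}$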