Application 2 – Annotated Bibliography
International Journal of Performability Engineering Vol. 6, No. 6, November 2010, pp. 531-546.
© RAMS Consultants
Printed in India
* Corresponding author’s email: nschneid@nps.navy.mil
Successful Application of Software Reliability: A Case Study
NORMAN F. SCHNEIDEWIND
Fellow of the IEEE
2822 Raccoon Trail
Pebble Beach, California 93953 USA
(Received on July 30, 2009, revised on May 3, 2010)
Abstract: The purpose of this case study is to help readers implement or improve a
software reliability program in their organizations, using a step-by-step approach based on
the Institute of Electrical and Electronics Engineers (IEEE) and American Institute of
Aeronautics and Astronautics (AIAA) Recommended Practice for Software Reliability,
released in June 2008, supported by a case study from the NASA Space Shuttle.
This case study covers the major phases that the software engineering practitioner
needs in planning and executing a software reliability-engineering program. These phases
require a number of steps for their implementation. These steps provide a structured
approach to the software reliability process. Each step will be discussed to provide a good
understanding of the entire software reliability process. Major topics covered are: data
collection, reliability risk assessment, reliability prediction, reliability prediction
interpretation, testing, reliability decisions, and lessons learned from the NASA Space
Shuttle software reliability engineering program.
Keywords: software reliability program, Institute of Electrical and Electronics Engineers
and American Institute of Aeronautics and Astronautics Recommended Practice for
Software Reliability, NASA Space Shuttle application
1. Introduction
The IEEE/AIAA recommended practice provides a foundation on which
practitioners and researchers can build consistent methods [1]. This case study will
describe the software reliability engineering (SRE) process and show that it is important
for an organization to have a disciplined process if it is to produce high reliability
software. To accomplish this purpose, an overview is presented of existing practice in
software reliability, as represented by the recommended practice [1]. This will provide the
reader with the foundation to understand the basic SRE process. The Space Shuttle Primary
Avionics Software System will be used to illustrate the SRE process.
The reliability prediction models that will be used are based on some key definitions
and assumptions, as follows:
Definitions
Interval: an integer time unit t of constant or variable length defined by t-1 <t <t+1, where
t>0; failures are counted in intervals.
Number of Intervals: the number of contiguous integer time units t of constant or variable
length represented by a positive real number.
Operational Increment (OI): a software system comprised of modules and configured from
a series of builds to meet Shuttle mission functional requirements.
Time: continuous CPU execution time over an interval range.
Assumptions
1. Faults that cause failures are removed.
2. As more failures occur and more faults are corrected, remaining failures will be
reduced.
3. The remaining failures are "zero" for those OI's that were executed for extremely
long times (years) with no additional failure reports; correspondingly, for these
OI's, maximum failures equals total observed failures.
1.1 Space Shuttle Flight Software Application
The Shuttle software represents a successful integration of many of the computer
industry's most advanced software engineering practices and approaches. Beginning in the
late 1970's, this software development and maintenance project has evolved one of the
world's most mature software processes applying the principles of the highest levels of the
Software Engineering Institute's (SEI) Capability Maturity Model (the software is rated
Level 5 on the SEI scale) and ISO 9001 Standards [2]. This software process includes
state-of-the-practice software reliability engineering (SRE) methodologies.
The goals of the recommended practice are to: interpret software reliability
predictions, support verification and validation of the software, assess the risk of
deploying the software, predict the reliability of the software, develop test strategies to
bring the software into conformance with reliability specifications, and make reliability
decisions regarding deployment of the software.
Reliability predictions are used by the developer to add confidence to a formal
software certification process comprised of requirements risk analysis, design and code
inspections, testing, and independent verification and validation. This case study uses the
experience obtained from the application of SRE on the Shuttle project, because this
application is judged by NASA and the developer to be a successful application of SRE
[6]. These SRE techniques and concepts should be of value for other software systems.
1.2 Reliability Measurements and Predictions
There are a number of measurements and predictions that can be made of reliability
to verify and validate the software. Among these are remaining failures, maximum
failures, total test time required to attain a given fraction of remaining failures, and time to
next failure. These have been shown to be useful measurements and predictions for: 1)
providing confidence that the software has achieved reliability goals; 2) rationalizing how
long to test a software component (e.g., testing sufficiently long to verify that the measured
reliability conforms to design specifications); and 3) analyzing the risk of not achieving
remaining failures and time to next failure goals [6]. Having predictions of the extent to
which the software is not fault free (remaining failures) and whether a failure is likely to
occur during a mission (time to next failure) provides criteria for assessing the risk of
deploying the software. Furthermore, fraction of remaining failures can be used as both an
operational quality goal in predicting total test time requirements and, conversely, as an
indicator of operational quality as a function of total test time expended [6].
The various software reliability measurements and predictions can be divided into the
following two categories to use in combination to assist in assuring the desired level of
reliability of the software in mission critical systems like the Shuttle. The two categories
are: 1) measurements and predictions that are associated with residual software faults and
failures, and 2) measurements and predictions that are associated with the ability of the
software to complete a mission without experiencing a failure of a specified severity. In
the first category are: remaining failures, maximum failures, fraction of remaining failures,
and total test time required to attain a given fraction of remaining failures. In
the second category are: time to next failure and total test time required to attain a given
time to next failure. In addition, there is the risk associated with not attaining the required
remaining failures and time to next failure goals. Lastly, there is operational quality that is
derived from fraction of remaining failures. With this type of information, a software
manager can determine whether more testing is warranted or whether the software is
sufficiently tested to allow its release or unrestricted use. These predictions provide a
quantitative basis for achieving reliability goals [2].
1.3 Interpretations and Credibility
The two most critical factors in establishing credibility in software reliability
predictions are the validation method and the way the predictions are interpreted. For
example, a "conservative" prediction can be interpreted as providing an "additional margin
of confidence" in the software reliability, if that predicted reliability already exceeds an
established "acceptable level" or requirement. It may not be possible to validate
predictions of the reliability of software precisely, but it is possible with "high confidence"
to predict a lower bound on the reliability of that software within a specified environment.
If historical failure data were available for a series of previous dates (and there
is actual data for the failure history following those dates), it would be possible to compare
the predictions to the actual reliability and evaluate the performance of the model. Taking
this approach will significantly enhance the credibility of predictions among those who
must make software deployment decisions based on the predictions [9].
1.4 Verification and Validation
Software reliability measurement and prediction are useful approaches to verify and
validate software. Measurement refers to collecting and analyzing data about the observed
reliability of software, for example the occurrence of failures during test. Prediction refers
to using a model to forecast future software reliability, for example failure rate during
operation. Measurement also provides the failure data that is used to estimate the
parameters of reliability models (i.e., make the best fit of the model to the observed failure
data). Once the parameters have been estimated, the model is used to predict the future
reliability of the software. Verification ensures that the software product, as it exists in a
given project phase, satisfies the conditions imposed in the preceding phase (e.g.,
reliability measurements of mission critical software components obtained during test
conform to reliability specifications made during design) [5]. Validation ensures that the
software product, as it exists in a given project phase, which could be the end of the
project, satisfies requirements (e.g., software reliability predictions obtained during test
correspond to the reliability specified in the requirements) [5].
Another way to interpret verification and validation is that it builds confidence that
software is ready to be released for operational use. The release decision is crucial for
systems in which software failures could endanger the safety of the mission and crew (i.e.,
mission critical software). To assist in making an informed decision, software risk analysis
and reliability prediction are integrated and provide stopping rules for testing. This
approach is applicable to all mission critical software. Improvements in the reliability of
software, where the reliability measurements and predictions are directly related to mission
and safety, contribute to system safety.
2. Implementing a Software Reliability Engineering Program
In broad terms, implementing a software reliability program is a two-phased
process. It consists of (1) identifying the reliability goals and (2) testing the software to see
whether it conforms to the goals. The reliability goals can be ideal (e.g., zero defects) but
should have some basis in reality based on tradeoffs between reliability and cost. The
testing phase is more complex because it involves collecting raw defect data and using it
for assessment and prediction.
The following are major SRE steps in the recommended practice, keyed to the phases
of the software development life cycle (not necessarily in chronological order):
2.1 State the Reliability Criteria (requirements analysis phase)
This might be stated, for example, as “no failure that would result in loss of life or
mission”.
2.2 Collect Fault and Failure Data (testing and operations phase)
For each system, there should be a brief description of its purpose and functions and
the fault and failure data, as shown below. The Days # field could instead be recorded in
hours or minutes, as appropriate. Code the Problem Report Identification to indicate
Software (S) failure, Hardware (H) failure, or People (P) failure.
• System Identification
• Purpose
• Functions
• Days # (since start of test)
• Problem Report Identification
• Problem Severity
• Failure Date
• Module with Fault
• Description of Problem
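The data items above could be captured in a simple record structure. The following is a minimal sketch, not the Shuttle project's actual schema: the class name, field names, the convention that the Problem Report Identification begins with S, H, or P, and the 30-day interval length are all assumptions made for illustration. The system-level Purpose and Functions descriptions are omitted for brevity.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FailureRecord:
    """One row of the fault and failure log (hypothetical field names)."""
    system_id: str
    days_since_test_start: int   # "Days #"; could be hours or minutes instead
    problem_report_id: str       # assumed to begin with S, H, or P
    severity: int                # 1 (most severe) to 5
    failure_date: date
    module_with_fault: str
    description: str

    @property
    def source(self) -> str:
        """Return 'S', 'H', or 'P' taken from the problem report identification."""
        code = self.problem_report_id[:1].upper()
        if code not in {"S", "H", "P"}:
            raise ValueError(f"unknown source code in {self.problem_report_id!r}")
        return code


def failures_per_interval(records, interval_days=30):
    """Count software failures per interval; days 0-29 fall in interval 1."""
    counts = {}
    for rec in records:
        if rec.source != "S":
            continue  # hardware and people failures are not counted here
        interval = rec.days_since_test_start // interval_days + 1
        counts[interval] = counts.get(interval, 0) + 1
    return counts
```

Counting failures by interval in this way yields the kind of per-interval failure counts that the reliability model takes as input.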
2.3 Establish Problem Severity Levels (requirements analysis phase)
Use a problem severity classification, such as the following:
1. Loss of life, loss of mission, abort mission.
2. Degradation in performance.
3. Operator annoyance.
4. System ok, but documentation in error.
5. Error in classifying a problem (i.e., no problem existed in the first place).
Note: Not all problems result in failures.
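A severity classification of this kind can be encoded as an enumeration so that failure reports can be filtered consistently. This is a sketch only; the member names and the helper function are hypothetical labels for the five levels listed above.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Problem severity levels; lower numbers are more severe."""
    LOSS_OF_LIFE_OR_MISSION = 1
    PERFORMANCE_DEGRADATION = 2
    OPERATOR_ANNOYANCE = 3
    DOCUMENTATION_ERROR = 4
    MISCLASSIFIED_NO_PROBLEM = 5


def at_least_as_severe(severity, threshold):
    """True if `severity` is as severe as, or more severe than, `threshold`."""
    return Severity(severity) <= Severity(threshold)
```

For example, `at_least_as_severe(1, 2)` selects a level 1 problem when the analysis is restricted to levels 1 and 2.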
2.4 Develop Reliability Assurance Criteria (requirements analysis phase)
Two criteria for software reliability levels will be defined. Then these criteria will
be applied to the risk analysis of mission critical software. In the case of the Shuttle
example, the "risk" represents the degree to which the occurrence of failures does not meet
required reliability levels, regardless of how insignificant the failures may be. Although it
may be counterintuitive to include minor failures in reliability assessments, in reality,
doing so provides a conservative lower bound on assessment. That is, the actual reliability
is highly unlikely to be lower than the assessment.
Next, a variety of equations that are used in reliability prediction and risk analysis
will be defined and derived, including the relationship between time to next failure and
reduction in remaining failures. Then it is shown how the prediction equations can be used
to integrate testing with reliability and quality. An example is shown of how the risk
analysis and reliability predictions can be used to make decisions about whether the
software is ready to deploy. Note that these equations are based on the model in [9] because
this model is used on the Shuttle and is one of the models recommended in the
recommended practice [1]. Other models could be used, such as those in [9].
If the reliability goal is the reduction of failures of a specified severity to an
acceptable level of risk [7], then for software to be ready to deploy, after having been
tested for time t, it must satisfy the following criteria:
1) Predicted mean number of remaining failures r(t) < rc, (1)
where rc is a specified critical value, and
2) predicted mean time to next failure TF(t) > tm, (2)
where tm is mission duration.
For systems that are tested and operated continuously like the Shuttle, t, TF (t), and tm
are measured in execution time. Note that, as with any methodology for assuring software
reliability, there is no guarantee that the expected level will be achieved. Rather, with these
criteria, the objective is to reduce the risk of deploying the software to a "desired" level.
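Criteria (1) and (2) can be expressed as a simple decision check. The sketch below assumes that the predictions r(t) and TF(t) have already been obtained from a reliability model; the function name and the numerical values are hypothetical.

```python
def ready_to_deploy(r_t, r_c, tf_t, t_m):
    """Apply criteria (1) and (2) to predictions made after testing for time t.

    r_t  : predicted mean remaining failures r(t)
    r_c  : specified critical value for remaining failures
    tf_t : predicted mean time to next failure TF(t)
    t_m  : mission duration, in the same time unit as tf_t
    """
    criterion_1 = r_t < r_c      # remaining failures below the critical value
    criterion_2 = tf_t > t_m     # software predicted to outlast the mission
    return {
        "criterion_1": criterion_1,
        "criterion_2": criterion_2,
        "ready": criterion_1 and criterion_2,
    }


# Hypothetical predictions: 0.6 remaining failures, TF(t) of 12 intervals, 8-interval mission.
decision = ready_to_deploy(r_t=0.6, r_c=1.0, tf_t=12.0, t_m=8.0)
print(decision["ready"])
```

Both inequalities are strict, matching the criteria as stated: a prediction exactly equal to the threshold does not satisfy the criterion.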
2.5 Apply the Remaining Failures Criterion (testing phase)
Criterion (1) sets the threshold on remaining failures that must be satisfied in order to
deploy the software (i.e., no more than a specified number of failures).
If it is predicted that r(t) ≥ rc, then the process is to continue to test for a time t' > t
that is predicted to achieve r(t') <rc, using the assumptions 1 and 2 that more failures will
be experienced and more faults will be corrected so that the remaining failures will be
reduced by the quantity r(t) - r(t'). If the developer does not have the resources to satisfy
the criterion or is unable to satisfy the criterion through additional testing, the risk of
deploying the software prematurely should be assessed. It is known that it is impossible to
demonstrate the absence of faults [3]; however, the risk of failures occurring can be
reduced to an acceptable level, as represented by rc. This scenario is shown in Figure 1. In
case A, r (t) <rc is predicted and the mission begins at t. In case B, r (t) ≥ rc is predicted
and the mission would be postponed until the software is tested for time t' when r (t')<rc is
predicted. In both cases criterion 2) would also be required for the mission to begin.
Figure 1: Remaining Failures Criterion Scenario
2.6 Apply the Time to Next Failure Criterion (testing phase)
Criterion 2 specifies that the software must survive for a time greater than the
duration of the mission. If TF (t) ≤ tm is predicted, the software is tested for a time t' that
is predicted to achieve TF (t') > tm, using assumptions 1 and 2 that more failures will be
experienced and faults corrected, so that the mean time to next failure will be increased by
the quantity TF (t’) -TF (t). Again, if it is infeasible for the developer to satisfy the criterion
for lack of resources or failure to achieve test objectives, the risk of deploying the software
prematurely should be assessed. This scenario is shown in Figure 2.
Figure 2: Time to Next Failure Criterion Scenario
[Figures 1 and 2 (timelines): in case A the sequence is Start Test, then End Test and Begin
Mission at t, then End Mission after mission duration tm; in case B testing continues
(Continue Test) from t to t' before End Test and Begin Mission.]
In case A, TF (t) > tm is predicted and the mission begins at t. In case B, TF (t) ≤ tm is
predicted, and in this case the mission would be postponed until the software is tested for
time t' when TF (t') > tm is predicted. In both cases criterion 1) would also be required for
the mission to begin. If neither criterion is satisfied, the software is subjected to additional
inspection and testing, to remove more faults, until the desired level of risk is achieved.
2.7 Make a Risk Assessment (pre deployment or launch phase)
Reliability Risk pertains to executing the software of a mission critical system where
there is the chance of injury (e.g., astronaut injury or fatality), damage (e.g., destruction of
the Shuttle), or loss (e.g., loss of the mission) if a serious software failure occurs during a
mission. In the case of the Shuttle, where the occurrence of even trivial failures is rare, the
fraction of those failures that pose any reliability risk is too small to be statistically
significant. As a result, in order to have an adequate sample size for analysis, all failures
(of any severity) over the entire 20-year life of the project have been included in the failure
history database for this analysis. Therefore, the risk criterion metrics to be discussed for
the Shuttle quantify the degree of risk associated with the occurrence of any software
failure, no matter how insignificant it may be. As mentioned previously, this approach
provides a conservative lower bound to reliability predictions.
As an example, the Schneidewind Software Reliability Model (other software
reliability models could be used as well) is used to compute a parameter: fraction of
remaining failures as a function of the archived failure history during test and operation
[6]. The prediction methodology uses this parameter and other reliability quantities to
provide bounds on total test time, remaining failures, operational quality, and time to next
failure that are necessary to meet defined Shuttle software reliability levels.
The test time t can be considered a measure of the degree to which software
reliability goals have been achieved. This is particularly the case for systems like the
Shuttle where the software is subjected to continuous and rigorous testing for several years
in multiple facilities, using a variety of operational and training scenarios (e.g., by the
contractor in Houston, by NASA in Houston for astronaut training, and by NASA at Cape
Canaveral). In Figure 3, t is interpreted as an input to a risk reduction process, and r (t)
and TF (t) as the outputs, with rc and tm as risk thresholds of reliability that control the
process.
Figure 3: Risk Reduction Process
[Figure 3 diagram: Total Test Time t is the input to the Risk Reduction process, the
Reliability Measures r(t) and TF(t) are its outputs, and rc and tm are the Risk Criteria Levels.]
While it must be recognized that test time is not the only consideration in developing
test strategies and that there are other important factors, such as the consequences for
reliability and cost in selecting test cases [11], nevertheless, for the foregoing reasons, test
time has been found to be strongly positively correlated with reliability growth for the
Shuttle [9].
2.8 Evaluate Remaining Failures Risk (pre deployment or launch phase)
To obtain the mean value of the risk criterion metric (RCM) in equation (4), first,
the mean remaining failures must be predicted in equation (3).
$$r(t) = \frac{\alpha}{\beta}\,\exp\left[-\beta\,(t - (s - 1))\right] \qquad (3)$$
Then, the mean value of the risk criterion metric (RCM) for criterion 1 is formulated
as follows:
RCM r(t)= (r(t) - rc) / rc = (r(t) / rc) - 1 (4)
Equation (4) is plotted in Figure 4 as a function of t for rc = 1, for the Shuttle software
release OID (a software system comprised of modules and configured from a series of
builds to meet Shuttle mission functional requirements), where positive, zero, and negative
values of the RCM correspond to r (t) > rc, r (t) = rc, and r (t) < rc, respectively.
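Equations (3) and (4) can be evaluated directly once the model parameters have been estimated. The sketch below uses hypothetical parameter values (α = 0.5, β = 0.05, s = 1), not the Shuttle estimates, purely to show the sign change of the RCM as test time increases.

```python
import math


def remaining_failures(t, alpha, beta, s):
    """Equation (3): predicted mean remaining failures after test time t."""
    return (alpha / beta) * math.exp(-beta * (t - (s - 1)))


def rcm_remaining_failures(r_t, r_c):
    """Equation (4): risk criterion metric for remaining failures."""
    return (r_t - r_c) / r_c


# Hypothetical parameters, chosen for illustration only.
ALPHA, BETA, S = 0.5, 0.05, 1

for t in (20, 40, 50, 60):
    r_t = remaining_failures(t, ALPHA, BETA, S)
    rcm = rcm_remaining_failures(r_t, r_c=1.0)
    print(f"t={t:3d}  r(t)={r_t:6.3f}  RCM={rcm:+.3f}")
```

With these assumed parameters the RCM is positive at t = 40 and negative at t = 50, so the transition from the high risk region to the low risk region falls between those two intervals.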
Figure 4: RCM for Remaining Failures, (rc = 1), OID
In Figure 4, these values correspond to the following regions: above the X-axis
predicted remaining failures are greater than the specified value; on the X-axis predicted
remaining failures are equal to the specified value; and below the X-axis predicted
remaining failures are less than the specified value, which could represent a "safe"
threshold or in the Shuttle example, an "error-free" condition boundary. In the example it
can be seen that at t = 80 the risk transitions from the high risk region to the low risk
region.
[Figure 4 plot: RCM r(t) versus Total Test Time (30 Day Intervals); CRITICAL region
r(t) > rc above the X-axis, r(t) = rc on the X-axis, DESIRED region r(t) < rc below the X-axis.]
2.9 Evaluate Time to Next Failure Risk (pre deployment or launch phase)
The mean value of the risk criterion metric (RCM) for criterion 2 is formulated as
follows:
RCM TF (t) = (tm - TF (t)) / tm = 1 - (TF (t) / tm) (5)
Equation (5) is plotted in Figure 5 as a function of test time t for tm = 8 thirty-day
intervals, for OID, where there is high risk for TF(t) < tm. Once TF(t) > tm, the risk is low.
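The time to next failure RCM can be computed in the same way as the remaining failures RCM. In this sketch the prediction TF(t) is supplied as an input, because its prediction equation is not reproduced in this section; the function names and the "BOUNDARY" label are hypothetical, while CRITICAL and DESIRED follow the region labels used in Figure 5.

```python
def rcm_time_to_next_failure(tf_t, t_m):
    """Risk criterion metric for time to next failure: (tm - TF(t)) / tm."""
    if t_m <= 0:
        raise ValueError("mission duration must be positive")
    return (t_m - tf_t) / t_m


def risk_region(rcm):
    """Map an RCM value to a region: positive is high risk, negative is low risk."""
    if rcm > 0:
        return "CRITICAL"
    if rcm < 0:
        return "DESIRED"
    return "BOUNDARY"


# Hypothetical predictions for a mission lasting 8 intervals.
for tf_t in (4.0, 8.0, 16.0):
    rcm = rcm_time_to_next_failure(tf_t, t_m=8.0)
    print(tf_t, rcm, risk_region(rcm))
```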
Figure 5: RCM for Time to Next Failure (tm = 8 days) OIC
3. Make Reliability Predictions (test and operations phases)
In order to support the reliability goal and to assess the risk of deploying the
software, various reliability and quality predictions are made during the test phase to
validate that the software meets requirements. For example, suppose the software
reliability requirements state the following: 1) ideally, after testing the software for time t,
the mean predicted remaining failures shall be less than one; 2) if the ideal of 1) cannot be
achieved due to cost and schedule constraints, mean time to next failure, predicted after
testing for time t, shall exceed the mission duration; and 3) the risk of not meeting 1) and
2) shall be assessed.
3.1 Additional Risk Evaluation (test and operations phases)
In addition to remaining failures and time to failure risk, which have already been
discussed, various other predictions are made in order to provide a comprehensive
assessment of risk. These predictions are based on the Schneidewind Software Reliability
Model [1, 8, 9, 10]. Again, other models recommended in the Recommended Practice for
Software Reliability [1] could be used. The Statistical Modeling and Estimation of
Reliability Functions for Software (SMERFS) [4] tool is used to support predictions.
In the following equations, parameter α is the failure rate at the beginning of interval s;
parameter β is the negative of the derivative of failure rate divided by failure rate (i.e.,
relative failure rate); t is test time or the last interval of observed failure data; s is
the starting interval for using observed failure data in parameter estimation that provides
the best estimates of α and β and the most accurate predictions [8]; X_{s-1} is the observed
failure count in the range [1, s-1]; X_{s,t} is the observed failure count in the range [s, t];
and X_t = X_{s-1} + X_{s,t}. Failures are counted against operational increments (OIs).
[Figure 5 plot: RCM TF(t) versus Total Test Time (30 Day Intervals); DESIRED region
TF(t) > tm, CRITICAL region TF(t) < tm.]
Cumulative Failures: When estimates are obtained for the parameters α and β, with s as
the starting interval for using observed failure data, the predicted failure count in the range
[1,t] is obtained (i.e., cumulative failures) [6]:
$$F(t) = \frac{\alpha}{\beta}\left[1 - \exp\left(-\beta\,(t - s + 1)\right)\right] + X_{s-1} \qquad (6)$$
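Equation (6) translates directly into code. The following sketch uses hypothetical parameter values; the argument name `x_s_minus_1` stands for the observed failure count in the range [1, s-1].

```python
import math


def cumulative_failures(t, alpha, beta, s, x_s_minus_1):
    """Equation (6): predicted cumulative failures in the range [1, t]."""
    return (alpha / beta) * (1.0 - math.exp(-beta * (t - s + 1))) + x_s_minus_1


# Hypothetical values: alpha = 0.5, beta = 0.05, s = 3, and 4 failures observed before s.
for t in (2, 10, 20, 40):
    print(t, round(cumulative_failures(t, 0.5, 0.05, 3, 4), 3))
```

At t = s - 1 the exponential term is 1, so the prediction reduces to the failures already observed before interval s.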
Figure 6 provides risk reduction in the sense that the predicted cumulative failures
provide an upper bound on the actual failures (i.e., there is assurance that the actual
failures will not exceed the predicted values). In addition, risk is mitigated by the fact that
the predictions increase at an increasing rate. Also shown in this figure is the mean relative
error (MRE) between actual and predicted values. The MRE is high because the
predictions are consistently higher than actual values.
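The comparison of actual and predicted values can be sketched as follows. The text does not give the MRE formula, so the definition used here (the mean of |actual - predicted| / actual) is an assumption, as are the function names and the sample data.

```python
def mean_relative_error(actual, predicted):
    """Assumed definition: mean of |actual - predicted| / actual over all points."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def is_upper_bound(actual, predicted):
    """True if every prediction is at least as large as the observed value."""
    return all(p >= a for a, p in zip(actual, predicted))


# Hypothetical cumulative failure counts at successive intervals.
actual = [2, 4]
predicted = [3, 5]
print(mean_relative_error(actual, predicted), is_upper_bound(actual, predicted))
```

A model that consistently over-predicts, as in this example, has a nonzero MRE even though its predictions still serve as an upper bound.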
Figure 6: Total Test Time and Remaining Failures vs. Fraction Remaining Failures, OIA
Maximum Failures: Let t→∞ in equation (6) and obtain the predicted failure count
in the range [1,∞] (i.e., maximum failures over the life of the software):
$$F(\infty) = \frac{\alpha}{\beta} + X_{s-1} \qquad (7)$$
Applying equation (7), the predicted maximum failures = 18.4706. Thus, there is low
risk that the actual cumulative failures will exceed this value.
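As a consistency check on the equations above, the remaining failures of equation (3) equal the difference between the maximum failures of equation (7) and the cumulative failures of equation (6). The short derivation below uses only symbols already defined in the text.

```latex
\begin{align*}
F(\infty) - F(t)
  &= \left(\frac{\alpha}{\beta} + X_{s-1}\right)
   - \left(\frac{\alpha}{\beta}\left[1 - e^{-\beta (t - s + 1)}\right] + X_{s-1}\right) \\
  &= \frac{\alpha}{\beta}\, e^{-\beta (t - (s - 1))} \\
  &= r(t)
\end{align*}
```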
Fraction of Remaining Failures: If equation (3) is divided by equation (7), fraction of
remaining failures, predicted at time t is obtained:
p(t)= r(t) /F(∞) (8)
According to the manager of Shuttle software development, equation (8) is an
excellent management tool for providing confidence that the software is ready to deploy,
as the fraction of remaining failures becomes minuscule with increasing testing, as Figure 7
attests [5].
Figure 7: Operational Quality (Fraction Fault Removal) vs. Total Test Time, OIA
Operational Quality: The operational quality of software is the complement of p(t). It is
the degree to which software is free of remaining faults (failures), using the assumption 1
that the faults that cause failures are removed. It is predicted at time t as follows:
Q (t) = 1-p (t) (9)
This risk metric is useful because some software engineers and managers would
prefer to see things in a positive light -- quality growth. Figure 7 demonstrates that after t =
100 the improvement in quality becomes minuscule, and the cost to remove additional
faults would be significant. Thus this figure provides metrics for risk assessment and a
stopping rule for when to terminate testing.
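Equations (8) and (9) can be computed from the same model parameters. The sketch below is illustrative only: the parameter values are hypothetical and are not the estimates behind Figure 7.

```python
import math


def fraction_remaining_failures(t, alpha, beta, s, x_s_minus_1):
    """Equation (8): p(t) = r(t) / F(infinity)."""
    r_t = (alpha / beta) * math.exp(-beta * (t - (s - 1)))   # equation (3)
    f_max = alpha / beta + x_s_minus_1                       # equation (7)
    return r_t / f_max


def operational_quality(t, alpha, beta, s, x_s_minus_1):
    """Equation (9): Q(t) = 1 - p(t)."""
    return 1.0 - fraction_remaining_failures(t, alpha, beta, s, x_s_minus_1)


# Hypothetical values: alpha = 0.5, beta = 0.05, s = 1, no failures observed before s.
for t in (0, 40, 80, 120, 160):
    print(t, round(operational_quality(t, 0.5, 0.05, 1, 0), 4))
```

The printed values rise quickly at first and then flatten, which is the diminishing-returns behavior that motivates a stopping rule for testing.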
Total Test Time to Achieve Specified Remaining Failures. The predicted test time
required to achieve a specified number of remaining failures at t, r (t), is obtained from
equation (3) by solving for t:
$$t = \frac{1}{\beta}\left[\beta\,(s - 1) - \log\left(\frac{r(t)\,\beta}{\alpha}\right)\right] \qquad (10)$$
Equation (10) is another risk reduction metric based on the concept that the
predicted test time to achieve a specified number of remaining failures reveals how much
test time and effort would be required to achieve various levels of risk, as represented by
specified remaining failures, as shown in Figure 8, where, naturally, the test time and cost
become very high in order to achieve significant reductions in risk.
3.2 Interpret Software Reliability Predictions (pre deployment or launch phase)
Successful use of statistical modeling in predicting the reliability of a software
system requires a thorough understanding of precisely how the resulting predictions are to
be interpreted and applied [9]. The Shuttle software (430 KLOC) is frequently modified,
at the request of NASA, to add or change capabilities using a constantly improving
process.
Figure 8: Launch Decision: Remaining Failures vs. Total Test Time, OIA
Each of these successive versions constitutes an upgrade to the preceding software
version. Each new version of the software (designated as an Operational Increment, OI)
contains software code that has been carried forward from each of the previous versions
("previous-version subset") as well as new code generated for that new version ("new-
version subset"). We have found that by applying a reliability model independently to the
code subsets we can obtain satisfactory composite predictions for the total version [9].
It is essential to recognize that this approach requires a very accurate code change
history so that every failure can be uniquely attributed to the version in which the defective
line(s) of code were first introduced. In this way, it is possible to build a separate failure
history for the new code in each release. To apply SRE to a software system, it should be
broken down into smaller elements to which a reliability model can be more
accurately applied. This approach has been successfully applied to predict the reliability of
the Shuttle software for NASA [9].
3.3 Use Software Reliability Tools (test and operations phases)
It is infeasible to do large-scale reliability prediction by hand. Therefore, there are
software reliability tools available to make the model predictions easier to achieve. The
Statistical Modeling and Estimation of Reliability Functions for Software (SMERFS) is a
software package available for this purpose [4]. However, it is important for the user to
understand the capabilities, applicability, and limitations of such tools.
[Figure 8 plot: r = Remaining Failures versus t = Total Test Time Until Launch (30 Day
Intervals). Example: r = 0.6 at t = 52.]
4. Lessons Learned
Several important lessons have been learned from the experience of twenty years in
developing and maintaining the Shuttle software, which you could consider for adoption in
your SRE process:
1) No one SRE process method is the "silver bullet" for achieving high reliability.
Various methods, including formal inspections, failure modes analysis, verification
and validation, testing, statistical process control, risk analysis, and reliability
modeling and prediction must be integrated and applied.
2) The process must be continually improved and upgraded. For example, recent
experiments with software metrics have demonstrated the potential of using metrics as
early indicators of future reliability problems. This approach, combined with
inspections, allows many reliability problems to be identified and resolved before
testing.
3) The process must have feedback loops so that information about reliability
problems discovered during inspection and testing is fed back not only to
requirements analysis and design for the purpose of improving the reliability of future
products but also to the requirements analysis, design, inspection and testing
processes themselves. In other words, the feedback is designed to improve not only
the product but also the processes that produce the product.
4) Given the current state-of-the-practice in software reliability modeling and
prediction, practitioners should not view reliability models as having the ability to
make highly accurate predictions of future software reliability. Rather, software
managers should interpret these predictions in two significant ways: a) providing
increased confidence, when used as part of an integrated SRE process, that the
software is safe to deploy; and b) providing bounds on the reliability of the deployed
software (e.g., high confidence that in operation the time to next failure will exceed
the predicted value and the predicted value will safely exceed the mission duration).
5. Conclusions
We showed how software reliability predictions can increase confidence in the
reliability of mission critical software such as the NASA Space Shuttle Primary Avionics
Software System. These results are applicable to other mission critical software.
Remaining failures, maximum failures, total test time required to attain a given fraction of
remaining failures, and time to next failure were shown to be useful reliability
measurements and predictions for: 1) providing confidence that the software has achieved
reliability goals; 2) rationalizing how long to test a piece of software; and 3) analyzing the
risk of not achieving remaining failure and time to next failure goals. Predictions
of the extent to which the software is not fault free (remaining failures) and of whether it is
likely to survive a mission (time to next failure) provide criteria for assessing the risk of
deploying the software. Furthermore, the fraction of remaining failures can be used as both an
operational quality goal in predicting total test time requirements and, conversely, as an
indicator of operational quality as a function of total test time expended.
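The dual use of the fraction of remaining failures can be sketched as follows. The sketch assumes that the fraction is the predicted remaining failures divided by the predicted maximum failures, and that predicted remaining failures are supplied as a caller-provided function of total test time. The function names are hypothetical, and the exponential curve in the example is only a stand-in, not the model used in this case study.

```python
import math


def fraction_remaining(remaining_failures, maximum_failures):
    """Fraction of remaining failures, assumed here to be remaining / maximum."""
    if maximum_failures <= 0:
        raise ValueError("maximum_failures must be positive")
    return remaining_failures / maximum_failures


def required_test_time(remaining_at, maximum_failures, goal, t_max, step=1.0):
    """Smallest test time on a grid at which the fraction meets the goal.

    remaining_at: hypothetical function mapping total test time to predicted
    remaining failures. Returns None if the goal is not met by t_max.
    """
    t = 0.0
    while t <= t_max:
        if fraction_remaining(remaining_at(t), maximum_failures) <= goal:
            return t
        t += step
    return None


if __name__ == "__main__":
    max_failures = 10.0

    def remaining(t):
        # Stand-in decay curve, for illustration only.
        return max_failures * math.exp(-0.1 * t)

    # Goal given, test time predicted.
    print(required_test_time(remaining, max_failures, goal=0.05, t_max=100.0))
    # Test time given, quality indicated.
    print(fraction_remaining(remaining(20.0), max_failures))
```

The first call reads the fraction as a goal and returns the test time needed to reach it; the second reads it as an indicator of quality for the test time already expended.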
Software reliability engineering is a tool that software managers can use to provide
confidence that the software meets reliability goals.
References
[1]. IEEE/AIAA P1633™, Recommended Practice on Software Reliability, June 2008.
[2]. Billings C., J. Clifton, B. Kolkhorst, E. Lee, and W.B. Wingert. Journey to a Mature
Software Process. IBM Systems Journal 1994; 33 (1): 46-61.
[3]. Dijkstra E. Structured Programming, Software Engineering Techniques. eds. J. N.
Buxton and B. Randell, NATO Scientific Affairs Division, Brussels 39, Belgium April
1970 : 84-88.
[4]. Farr W. and O. Smith. Statistical Modeling and Estimation of Reliability Functions for
Software (SMERFS) Users Guide. NAVSWC TR-84-373, Revision 3, Naval Surface
Weapons Center, Revised September 1993.
[5]. IEEE Standard Glossary of Software Engineering Terminology, IEEE Std 610.12-1990.
The Institute of Electrical and Electronics Engineers, New York, New York, March 30,
1990.
[6]. Keller T., N. Schneidewind, and P. Thornton. Predictions for Increasing Confidence in
the Reliability of the Space Shuttle Flight Software. Proceedings of the AIAA
Computing in Aerospace 10, San Antonio, TX, March 28, 1995: 1-8.
[7]. Schneidewind N. Reliability Modeling for Safety Critical Software, IEEE Transactions
on Reliability March 1997; 46(1):88-98.
[8]. Schneidewind N. Software Reliability Model with Optimal Selection of Failure Data.
IEEE Transactions on Software Engineering November 1993;19(11):1095-1104.
[9]. Schneidewind N. and T. Keller. Application of Reliability Models to the Space Shuttle.
IEEE Software July 1992; 9(4):28-33.
[10]. Schneidewind N. Analysis of Error Processes in Computer Software. Proceedings of the
International Conference on Reliable Software, IEEE Computer Society, 21-23 April
1975:337-346.
[11]. Weyuker E. Using the Consequences of Failures for Testing and Reliability Assessment,
Proceedings of the Third ACM SIGSOFT Symposium on the Foundations of Software
Engineering, Washington, D.C., October 10-13, 1995:81-91.
Bibliography
1. Boehm B. Software Risk Management: Principles and Practices. IEEE Software
January 1991; 8(1): 32-41.
2. Dalal S. and A. McIntosh. When to Stop Testing for Large Software Systems with
Changing Code. IEEE Transactions on Software Engineering April 1994; 20(4):
318-323.
3. Dalal S. and A. McIntosh. Some Graphical Aids for Deciding When to Stop
Testing. IEEE Journal on Selected Areas in Communications February 1990;
8(2):169-175.
4. Ehrlich W., B. Prasanna, J. Stampfel, and J. Wu. Determining the Cost of a
Stop-Test Decision. IEEE Software March 1993; 10(2):33-42.
5. Keller T. and N. Schneidewind. A Successful Application of Software Reliability
Engineering for the NASA Space Shuttle. Software Reliability Engineering Case
Studies. International Symposium on Software Reliability Engineering,
Albuquerque, New Mexico, November 4, 1997: 71-82.
6. Leveson N. Software Safety: What, Why, and How. ACM Computing Surveys
June 1986; 18(2):125-163.
7. Lyu M. (Editor-in-Chief), Handbook of Software Reliability Engineering.
Computer Society Press, Los Alamitos, CA and McGraw-Hill, New York, NY,
1995.
8. Musa J. and A. Ackerman. Quantifying Software Validation: When to Stop
Testing? IEEE Software May 1989; 6(3):19-27.
9. Musa J., A. Iannino, and K. Okumoto. Software Reliability:
Measurement, Prediction, Application. McGraw-Hill, New York 1987.
10. Nikora A., N. Schneidewind, and J. Munson. Practical Issues In Estimating Fault
Content And Location In Software Systems. Proceedings of the AIAA Space
Technology Conference and Exposition, Albuquerque, NM, Sep 29-30, 1999.
11. Nikora A., N. Schneidewind, and J. Munson. IV&V Issues in Achieving High
Reliability and Safety in Critical Control Software. Final Report, Volume 1 –
Measuring and Evaluating the Software Maintenance Process and Metrics-Based
Software Quality Control, Volume 2 – Measuring Defect Insertion Rates and
Risk of Exposure to Residual Defects in Evolving Software Systems, and Volume
3 – Appendices, Jet Propulsion Laboratory, National Aeronautics and Space
Administration, Pasadena, California, January 19, 1998.
12. Nikora A., N. Schneidewind, and J. Munson. IV&V Issues in Achieving High
Reliability and Safety in Critical Control System Software. Proceedings of the
Third International Society of Science and Applied Technologies Conference on
Quality in Design, Anaheim, California, March 12-14, 1997: 25-30.
13. Schneidewind N. Measuring and Evaluating Maintenance Process Using
Reliability, Risk, and Test Metrics. IEEE Transactions on Software Engineering
November/December 1999; 25(6): 768-781.
14. Schneidewind N. Software Validation for Reliability. Wiley Encyclopedia of
Electrical and Electronics Engineering, John G. Webster, editor, John Wiley &
Sons, Inc., 1999;19: 607-618.
15. Schneidewind N. Reliability Modeling for Safety Critical Software. IEEE
Transactions on Reliability March 1997; 46(1):88-98.
16. Singpurwalla N. Determining an Optimal Time Interval for Testing and
Debugging Software. IEEE Transactions on Software Engineering April 1991;
17(4): 313-319.
17. Voas J. and K. Miller. Software Testability: The New Verification. IEEE
Software May 1995; 12(3):17-28.
Norman F. Schneidewind, Ph.D., is Professor Emeritus of Information Sciences in the
Department of Information Sciences and the Software Engineering Group at the Naval
Postgraduate School. He is now doing research and publishing in software reliability and
metrics with his consulting company Computer Research. Dr. Schneidewind is a Fellow of
the IEEE, elected in 1992 “for contributions to software measurement models in reliability
and metrics, and for leadership in advancing the field of software maintenance”. In
2001, he received the IEEE Reliability Engineer of the Year award from the IEEE
Reliability Society. In 1993 and 1999, he received awards for Outstanding Research
Achievement by the Naval Postgraduate School.
Dr. Schneidewind was selected for an IEEE USA Congressional Fellowship for
2005 and worked with the Committee on Homeland Security and Government Affairs,
United States Senate, focusing on homeland security, cyber security, and privacy. In
March 2006, he received the IEEE Computer Society Outstanding Contribution Award
for “outstanding technical and leadership contributions as the Chair of the Working Group
revising IEEE Standard 982.1”.
He is the developer of the Schneidewind software reliability model that was used by
NASA to assist in the prediction of software reliability of the Space Shuttle, by the Naval
Surface Warfare Center for Tomahawk cruise missile launch and Trident software
reliability prediction, and by the Marine Corps Tactical Systems Support Activity for
distributed system software reliability assessment and prediction. This model is
recommended by the IEEE and the American Institute of Aeronautics and Astronautics
Recommended Practice for Software Reliability. In addition, the model is implemented in
the Statistical Modeling and Estimation of Reliability Functions for Software (SMERFS)
software reliability modeling tool.