Short Answer (250 words)
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 1
The Establishment and Management of a
Software Quality Assessment and Prediction Program *
Richard E. Nance James D. Arthur
Systems Research Center and
Department of Computer Science Virginia Polytechnic Institute & State University
Blacksburg, Virginia 24061-0251
30 June 1997
* Work supported by the Joint Logistics Commanders through the Systems Research Center under contract number
N60921-89-D-A239 B029
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 2
ABSTRACT
The philosophy and rationale for a software quality measurement program is described within the context of the Objectives/Principles/Attributes framework developed by the authors in the late 1980s. Using statistical indicators, such a program relies heavily on automated data collection in both code and document examination. Technical issues such as assessment versus prediction, the relationships among objectives, principles and attributes, factors affecting software quality, and the process/product interaction are addressed. Management issues are covered as well, including the launching of a successful program, presentations to different groups, and different but consistent messages based on interest and needs. Watchwords and cautions for those newly involved with software quality measurement are sprinkled throughout the Handbook. CR Categories: Keywords:
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 3
TABLE OF CONTENTS
1. INTRODUCTION
1.1 Software Measurement: Why, What, How and When
1.2 Frequent and Hidden Pitfalls
1.3 The Rationale of an Holistic Approach
1.4 Background Efforts Contributing to the Handbook
1.4.1 An Evaluation of Software Development
Methodologies
1.4.2 Software Quality Assessment, Prediction and
Validation: A Four Year, On-Site Investigation
1.5 Purpose and Scope of the Handbook
2. FOUNDATIONS FOR A COMPREHENSIVE APPROACH TO
SOFTWARE QUALITY ASSESSMENT
2.1 Establishing a Quality Focus
2.2 The Objectives/Principles/Attributes (OPA) Framework
2.3 Software Quality Indicators
2.3.1 Establishing a Basis for Measuring the
Unmeasurable
2.3.2 Applying the Social Indicator Concept to
Software Quality Measurement
2.3.3 Measuring Characteristics of Process and
Product
3. AN OPA MEASUREMENT PROGRAM
3.1 Accommodating the Organizational Process Model
3.2 Measurement Program Responsibilities
3.2.1 Organizational
3.2.2 Project Management
3.2.3 Software Management
3.3 Derivation of Software Quality Indicators
3.3.1 A Systematic Definitional Procedure
3.3.2 Product and Process Differences
3.3.3 Warnings and Watchwords
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 4
3.4 Interpretation of SQI Results
3.5 Decisions and Actions
4. PROCESS MEASUREMENT
4.1 The Inherent Difficulty of Process Measurement
4.2 Indicator Derivation and Distinctions
4.2.1 Requirements Volatility
4.2.1.1 Detection of Requirements Defects
4.2.1.2 Reverification
4.2.1.3 Reduction of Cohesion
4.2.1.4 Addition of Coupling
4.2.1.5 Disruption in Traceability
4.2.2 Software Development Folders (Files)
4.2.2.1 An Historical Repository
4.2.2.2 An Evolutionary Record
4.2.3 Software Quality Assurance Infrastructure
4.2.4 Process Stability
4.2.5 Product Test
4.3 Interpretation, Decisions and Actions
5. PRODUCT MEASUREMENT: DOCUMENTATION
5.1 Motivation and Distinctions
5.2 The Characteristics of High Quality in Documentation
5.2.1 Principles Leading to High Quality
5.2.2 Attributes of High Quality Documentation
5.3 Measurement Approaches
5.3.1 A Manual Procedure
5.3.2 An Automated Procedure
5.3.3 A Combined Approach
5.4 Interpretation, Decisions and Actions
5.4.1 Beginning a Document Quality Measurement
Program
5.4.2 Integrating Process and Document
Measurement
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 5
6. PRODUCT MEASUREMENT: CODE
6.1 Indicator Distinction and Derivation
6.2 Attributes of Code Quality
6.2.1 Cohesion
6.2.1.1 An Indicator of Cohesion: The Use of
Block Statements
6.2.1.2 Other Indicators of Cohesion
6.2.2 Complexity
6.2.2.1 An Indicator of Complexity: Mixing the
Order of Parameters in a Call Statement
6.2.2.2 Other Indicators of Complexity
6.2.3 Coupling
6.2.3.1 An Indicator of Coupling: The Use of
Strucutred Data Types as Parameters
6.2.3.2 Other Indicators of Coupling
6.2.4 Ease of Change
6.2.4.1 An Indicator of Ease of Change: The Use
of Symbolic Constants
6.2.4.2 Other Indicators of Ease of Change
6.2.5 Readability
6.2.5.1 An Indicator of Readability: Use of GOTOs
6.2.5.2 Other Indicators of Readability
6.2.6 Traceability
6.2.6.1 An Indicator of Traceability: Use of
Comments Referencing Project Documents
and “Who Called Me?”
6.2.6.2 Other Indicators of Traceability
6.2.7 Well-defined Interfaces
6.2.7.1 An Indicator of Well-defined Interfaces:
The Use of Parameterless Procedures
6.2.7.2 Other Indicators of Well-defined
Interfaces
6.2.8 Attributes of Code Quality: A Summary
6.3 The Automated Collection of Code Indicators
6.4 Interpretation, Decisions and Actions
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 6
7. AN EXAMINATION OF INDICATOR VALUES AND THEIR
AGGREGATION
7.1 Using and Interpreting Indicator Values
7.2 Integrating Code, Documentation and Process Quality
Indicators in a Software Quality Measurement Program
8. PREDICTION AND ASSESSMENT: DIFFERENCES IN GOALS
8.1 Effects of Process Models
8.1.1 The Waterfall Model
8.1.2 The Domain-Dependent Life-Cycle Model
8.1.3 Boehm’s Spiral Model
8.2 Software Reuse
8.3 Sustaining Responsibilities and Quality Measurement
8.3.1 Initial Assessment
8.3.2 Continuing Assessment
8.4 The Quality Database and Validity of Indicators
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 7
LIST OF FIGURES AND TABLES
Figure 2.1 Illustration of the Relationship among Objectives, Principles, Attributes in the
Software Development Process
Figure 2.3 Linkages Among the Objectives, Principles and Attributes
Figure 2.3 Exploiting Both Process and Product Indicators
Figure 2.4 Measurement Scale
Figure 3.1 The Software Production Task
Figure 5.1 Principle and Attribute Relationships for Document Quality
Table 5.1 Criteria for Measurement of document Quality Suggested by the OPA Framework
Table 5.2 Examples of Documentation Quality Indicators (DQI’s) (DQI = Document
Attribute/ Document Property)
Table 5.3 Document Quality Indicators Automated in DOCALYZE
Table 5.4 Checklist for Diagnosing Possible Problems with Document Quality
Measurement
Figure 7.1 Integrating Code, Documentation and Process Quality Indicators in a Software
Quality Measurement Program
Figure 8.1 The Software Development Life Cycle for DoD-STD-2167 (from [Lavender
1988, p. 18])
Figure 8.2 Gidding’s Domain Dependent Software Life-Cycle Model (from [Giddings 1984,
p. 431])
Figure 8.3 The Spiral Model [Boehm 1986]
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 8
1. INTRODUCTION
Within the software engineering community we are beginning to see the emergence of models
and assessment procedures that focus on software quality measurement and the capability of an
organization to produce a quality product [Betz and O’Neill 1990; Jones 1986; Humphrey 1989].
While certainly a step in the right direction, these procedures describe where we would like to be,
but provide little assistance in how to get there. The material presented in this handbook
describes not only the destination, i.e. a process supporting software quality assessment and
prediction, but a road map outlining how to get there.
The objectives of this handbook are three-fold:
(1) to provide guidance to persons within a software development or software support
organization who are interested in establishing a measurement program that includes software quality prediction and assessment;
(2) to offer guidance to those persons employing software quality measurement for the purpose
of increasing the effectiveness and efficiency of their activities, and (3) to share lessons learned during the research and application of software quality
measurement, with the hope that improvement can be achieved through a broader recognition of common problems and a deeper understanding of the fundamental issues in designing, implementing and supporting software systems.
1.1 Software Measurement: Why, What, How and When
To set the objectives for software measurement and to focus on the approach taken to achieve
these objectives, we pose four crucial questions followed by brief answers.
Why does an organization pursue the establishment of a measurement program?
More specifically, what goals underlie the establishment of a measurement program? The goals
touted most often are those of increased productivity, enhanced product quality and an improved
development process. An examination of these goals from an organizational perspective reveals
their "bottom line" characteristics in terms of an organizational "buy-in" and cost impact.
Increased productivity is measured on both the individual and group levels, often related to
short-term goals focusing on the speedy development of individual software units and usually
expressed in terms of staff-hours (or staff-days) and cost savings. The ability to claim enhanced
quality, however, is often elusive because proposed measures of quality are typically subjective
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 9
and controversial. Currently no universally accepted scale, standard or procedure exists to either
measure or compare/contrast quality. The claim of enhanced quality is often associated with
more intermediate- or long-term goals of an organization, and like productivity, quality is
expressed in terms of organizational costs, but within time frames measured in months rather
than hours or days. An improved process more directly reflects an organizational view and is
almost always tied to the longer-term goals. Because "pay-off" is only realizable in the long
term, e.g. years, the costs associated with improving a development process are not so easily
determined.
Based on an examination of the three goals in the paragraph above, one can begin to understand
and appreciate why the achievement of one particular goal might be more emphasized than
another. In particular, increased productivity is most often emphasized by management because
the benefits are more quickly realizable. Establishing a continuous process improvement
program has lower priority because of the significant "up front" costs and delayed (although
substantial) benefits. More importantly, this reasoning suggests why an abundance of
measurement programs stressing increased productivity exists relative to programs focusing on
enhanced quality and process improvement. Ironically, to achieve a process that consistently
produces a quality product, process improvement must be a first priority. Enhanced quality, and
to a lesser degree, increased productivity, are the consequents of continuous process
improvement.
What is software quality?
The meaning of software quality can vary, depending on a person's (or a group's) perspective.
For the software engineer, quality characteristics are often stated in terms of attributes associated
with individual software components, e.g., high code cohesion and low coupling among
modules. On the other hand, a project manager views product quality as related to the
achievement of project level objectives, e.g., maintainability and reliability. Because software
quality must reflect characteristics of the product as a whole, and not just components thereof,
we maintain that the proper framework for expressing product quality must ultimately
accommodate the project manager's perspective, i.e., that associated with project level
objectives. Work by McCall, et. al., identifies an initial set of 13 quality factors and discusses
their relationship to product characteristics and measures supporting their assessment [McCall et
al. 1977]. Based on a survey of current literature and focusing on product development from a
software engineering perspective, we have identified the seven (7) most widely accepted project-
level objectives attributable to software quality: adaptability, correctness, maintainability,
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 10
portability, reliability, reusability and testability [Arthur and Nance 1987]. We prefer the term
“objective” to factor because each represents a project-level characteristic that is sought.
Notably missing from the above list is any mention of cost, schedule, and efficiency. This
purposeful omission stems from the fact that cost, schedule and efficiency are not objectives of
the software engineering process, but constraints which are imposed at the systems engineering
level. Effectively, cost, schedule and efficiency are "givens" which bound the limits of
flexibility afforded to the software developers in producing a quality product. For example,
performance requirements might dictate that intermodule communication be implemented
through global variables rather than through parameter passing. In achieving maintainability, a
system whose modules communicate through parameter passing is certainly preferred to one
where intermodule communication relies primarily (or even partially) on the use of global
variables.
Clearly, cost, schedule and efficiency can and do impact product quality. For example, if a
project is behind schedule, management is more likely to accept developmental "short-cuts" that
can adversely impact the quality of a product. Moreover, if meeting the schedule is viewed as a
quality criterion, one can easily (and incorrectly) surmise that because a project is on schedule, a
quality product is being produced.
How is software quality measured?
An effective software quality measurement program cannot be developed using semi-related
measures combined in an ad hoc, unnatural fashion. To the contrary, measures of quality, and
the framework within which they are applied, require a realistic characterization of the software
development process. In particular, both the definition and application of software quality
measures must be guided by a systematic process that recognizes inherent linkages between the
software development process and the achievement of software engineering desirables. The
Objectives/Principles/Attributes (OPA) Framework provides the basis and rationale for such a
process [Arthur and Nance 1990]. More specifically the OPA Framework advances the
following rationale for software development:
• a set of project level objectives should be identified from those characterizing software quality,
• to achieve those objectives, certain principles are employed that govern the process by which software is created, and
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 11
• adherence to a process governed by those principles should result in a process and a product (programs and documentation) that possess attributes considered desirable and beneficial.
In effect, the OPA Framework characterizes the raison d'être for software engineering; that is, it
embodies the rationale and justification for software engineering. The objectives represent those
desirable claims about the project in total, e.g. the extent to which the product is maintainable
and reliable. The software engineering principles, e.g., information hiding and structured
programming, stipulated by the development methodology, express how the activities of the
development process are performed to achieve the stated project objectives. Quality attributes of
the product, like low inter-module coupling and high code cohesion, result from a process
governed by the use of specified principles.
Following the rationale induced by the OPA Framework, the identification and synthesis of
software quality measures must reflect product properties attesting to the extent to which defined
attributes are present in, or absent from, product components. Pairwise linkages (attribute to
principle and principle to objective) are then employed to propagate property/attribute measures
to the objectives level, resulting in product quality information attuned to project level
objectives.
When is software quality measured?
To predict product quality, measurement must begin early in the software development life
cycle. Typically, measurement begins with the measurement of software requirements.
Requirements volatility is a prime candidate for predicting the quality of the final product.
Further, additional process-oriented measures are applied throughout the life-cycle phases.
Product quality assessment entails an examination of the product (code and documentation).
Within the classical waterfall life-cycle model [Sommerville 1992, pp. 5-10], assessment is
performed after the coding phase. If the development process follows an incremental model
[Sommerville 1992, pp. 109-110], assessment can be performed on pieces of the product, from
which those attendant measurement values can be used to predict the quality of components yet
to be developed. Clearly, an instrumented development process is essential to the prediction of
software quality.
1.2 Frequent and Hidden Pitfalls
The intent of this section is to outline several of the most common pitfalls encountered in
establishing and managing a software quality assessment program. Many of the problems
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 12
presented reflect a "common sense" approach to software quality assessment. Other difficulties
are derived from experience. While we do not consider the set to be comprehensive in scope, it
does represent many of the more common mistakes .
(1) Starting too big: Today, knowledgeable software people recognize the need for and utility
of a software quality assessment program. Because such programs impact many
organizational entities, the committee tasked with establishing a measurement program is
often, by necessity, quite large. Many good ideas are advanced, all having sound
justification for inclusion. What often emerges from a committee, however, is a formidable
design that, realistically, is unmanageable, cost-prohibitive and would take years to
implement. Organizations attempting to implement such an ambitious program tend to end
up with only an incomplete subset of the identified metrics. More often than not, the
measures that are captured are the less critical ones thus failing to reflect adequately a
comprehensive view of software quality. The end result is that users develop a mistrust for
the existing measures and a tendency to discount any future endeavors to develop even a
scaled-down metrics program.
When designing a metrics program, start small! Focus the efforts and resources on a
specific process element or sub-organization, and achieve success. Use the knowledge
gained from this limited effort to expand the measurement program, demonstrating the
benefits and successes of the initial efforts.
(2) Collecting too much data: A natural tendency is to collect data "because it is there", or
because it might be useful in the future. As stated by Basili and Weiss [1985], such an
approach often: (a) leads to volumes of useless data consuming large amounts of disk
storage, (b) obscures the real value of essential data element and (c) promotes an
inadvertent omission in the collection of other important data elements. Moreover, an
undesirable ramification of indiscriminate data collection is the error-prone "massaging" of
existing data elements to produce a substitute for a data element that is not collected.
Rarely does a retrospective examination of data collected without a clear purpose reveal
those crucial insights supporting effective quality assessment and process control.
The Objectives/Principles/Attributes Framework discussed in [Arthur and Nance 1991] and
the Goal/Question/Metric paradigm outlined in [Basili and Rombach 1988] offer focused
approaches to data collection. In particular, both encourage the identification and
collection of essential data elements based on a thorough investigation of what is to
measured and why. That is, first identify the measurement goals or purposes. Use these to
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 13
derive more concrete characteristics. And finally, determine computational forms and
supporting data elements to measure those characteristics.
(3) Supplying the proper level of information for management decision making: Management
exists at different levels. Consequently, software quality reports must be tailored to reflect
the concerns at each level. For example, the software engineer focuses on producing
program units that are highly cohesive and easily readable. To him or her, reports
summarizing quality measures in terms of desirable software engineering attributes on a
per unit basis are most helpful. On the other hand, the project manager is concerned with
the achievement of project level objectives such as reliability. Providing the project
manager with a report that details unit-level characteristics provides minimal (if any)
beneficial insights into behavior which is of direct and immediate concern. Conversely,
providing the software engineer information about the project as a whole does not meet that
individual’s pressing needs. Accordingly, an effective reporting process is one that
supplies information reflecting the proper perspective across multiple levels of
management. Moreover, that information should be consistent; i.e. problems indicated at a
high level should be explained by examination of more detailed information.
(4) Misusing information designed for quality measurement: At the program unit level, error
density and person-hours spent to correct defects are prime indicators of the reliability and
maintainability that unit. One might also infer from such information that the software
engineer who produces a unit having a high incidence of errors is lacking in the necessary
skills to produce a reliable or maintainable product. Such inferences have the appearance
of being logically sound, but the appearance is misleading. In the above situation, the
software engineer might be reacting to a set of requirements that are constantly changing --
producing a quality program is beyond his or her control. Accordingly, one must recognize
that software quality measures are developed with one goal in mind -- to measure quality.
The use of quality measures for any other purpose, such as judging individual competence,
should be strongly discouraged. Such activities paint a negative picture of quality
measures and can have a detrimental impact on quality assessment by motivating the
reporting of incomplete or invalid data.
(5) Relying too much on a manual process: Software quality measurement is a time-
consuming activity which requires a dedicated commitment by the conducting personnel.
Data collection, the most demanding activity associated with quality measurement, can
exact an inordinately high cost in terms of personnel time and effort if it is not supported by
automation. At its best, data collection (and validation) is still an error-prone, tedious
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 14
process. Our experiences underscore the absolute necessity of a (semi-)automated process
to assist in the data collection process [Arthur and Nance 1987]. In support of such a
process, data elements must be defined as objectively as possible, and in a manner that
facilitates their automated collection. Moreover, computation of individual metric values,
aggregation of these values, and reporting procedures supporting quality analysis should be
automated to the extent possible.
(6) Using a single metric to measure quality: Quality is reflected in a software product
through the combination of many distinct characteristics. For example, code cohesion,
module coupling and readability all contribute to the project level, software quality
objective of maintainability. All too often, however, we hear of quality measurement
programs that base their assessment process exclusively on one or two complexity
measures. Prevalent among these are McCabe's Cyclomatic Complexity Measure [McCabe
1976], Interface Design Metrics [Zage 1995] and the ration of open to closed software
trouble reports. Such metrics reflect aspects of quality, but each focuses on a single facet.
For example, McCabe's Cyclomatic Complexity Measure computes the unique number of
paths through a module. While an excessive number of paths does indicate potential
problems, it reflects only one of the many facets of software quality.
In summary, our experience suggests that the set of software quality measures should be:
• based on a clear statement of realizable measurement goals,
• sufficient in number to provide complementary and contrasting data,
• used to report results that are meaningful and consistent to software engineers, software managers and project managers, and
• automated to the extent possible within the software evolutionary process and recognizing the desirables listed above.
1.3 The Rationale of an Holistic Approach
Today, software development processes and practices are being structured to improve the
likelihood of achieving goals and objectives set forth in systems engineering and software
engineering. Most approaches include a well-defined sequence of activities that embody
requirements definition, design, implementation and unit/integration/system testing. Associated
with each of these is a structured process or methodological approach that outlines how one
carries out each activity. These activities, structured processes and methodologies have evolved
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 15
over time and reflect a wealth of experience. Prudence dictates that if these activities, processes
and methodologies do represent through lessons learned a better approach to software
development, then we should exploit their contributions in the design and implementation of an
effective software quality assessment program. The rationale described below reflects an holistic
approach to software quality assessment, one that capitalizes on the advantages of a structured
approach to software development.
Underlying any major software development effort is a set of methods and procedures for
applying them during development activities. While some methods may be more explicitly
defined than others, the influence of implicit procedures is significant. More formally, these
methods and procedures constitute a methodology. The methods and procedures of a software
development methodology emphasize and should prioritize project level software engineering
objectives. Examples of such objectives are maintainability, reliability, adaptability and
reusability. To achieve such objectives the methodology should identify the proper set of
principles to be used in the development process. These process principles also form the basis
by which one specifies the environment tools required to support the development process.
Adherence to a process governed by these principles results in a product exhibiting desirable and
beneficial attributes. In effect, a natural set of linkages relate the use of particular principles in
the development process to the achievement of individual objectives, and subsequently, to the
manifestation of the product attributes.
The above observations reflect the rationale behind the Objectives/Principles/Attributes (OPA)
Framework and serves as a basis for the establishment of an effective software quality
assessment program. The OPA Framework and the underlying rationale succinctly express the
guiding motivations that link project to process and process to product. Our approach to
software quality assessment, and that outlined in this Handbook, recognizes the utility of those
linkages in defining a systematic procedure for evaluating the quality of software. That is, we
identify product properties that are definitively related to the presence (or absence) of beneficial
attributes. These attribute/property pairs (or software quality indicators -- SQIs) form the basis
for metric definition and measurement, and provide evidence attesting to the existence of
desirable software engineering attributes in the product. SQI values are propagated through the
linkages and aggregated at the principles level to provide a characteristic reflection of the
development process. The aggregated values are again propagated along the linkages relating
principles to objectives and further aggregated to form a picture that depicts the extent to which
project level objectives are being achieved. This assessment can be conducted during the
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 16
development period to predict the quality of the forthcoming product or applied to an existing
product to assess the quality attained.
1.4 Background Efforts Contributing to the Handbook
The knowledge and experience conveyed in this Handbook is drawn from two applied research
efforts. The first, extending over three years, culminated with the development and application
of a procedural approach to evaluating software development methodologies. Refinements of
the methodology evaluation procedure have led to a more comprehensive approach to the
assessment and prediction of software quality, subsequently subjected to a longitudinal
validation effort over a five-year period.
1.4.1 An Evaluation of Software Development Methodologies
The evolution of the OPA Framework and, to lesser extent, experiences that have helped to
shape the recommendations outlined in this Handbook have their roots in an effort focusing on
the evaluation and comparison of two distinct software development methodologies [Arthur and
Nance 1987]. That comparison is based on assessing, from a software engineering perspective,
the adequacy and effectiveness of the development methodologies.
Methodological adequacy is defined as the degree to which a methodology can support the
achievement of stated project-level goals and objectives. Fundamental to gauging adequacy is a
clear statement in the methodology outlining: (a) the primary software engineering objectives,
and (b) the (process) principles one uses to achieve those objectives. To implement an effective
measurement program the identification of expected (product) attributes resulting from the
application of such principles would also be necessary. The methodology might state, however,
that no measurement program is planned. Consequently, we base an assessment of
methodological adequacy on a "top down" comparison relating: (1) how well the methodological
objectives correspond to stated project level goals and objectives, (2) the extent to which the
methodology emphasizes those principles linked to the achievement of stated objectives, and (3)
if identified, the comparison of targeted product attributes, either implied or stated in the
assessment of quality and acceptance decisions, to those emphasized by the governing principles
[Arthur and Nance 1990].
In contrast, the effectiveness of a methodology is defined as the degree to which a methodology
produces the desired results identified in the objectives stated by the development methodology.
Recognizing and employing the relationships described above, i.e., the achievement of objectives
through the use of proper development principles and the subsequent realization of desirable
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 17
attributes in the product, assessing the effectiveness of a methodology begins with a "bottom-up"
examination of the product (code and documentation) for the presence or absence of desirable
attributes. Observing the extent to which attributes are present in the product provides a basis for
inferring the use of software engineering principles in the development process. In turn, this
information enables one to claim the achievement of stated software engineering objectives.
1.4.2 Software Quality Assessment, Prediction and Validation: A Four Year, On-Site
Investigation
In October of 1990 the JLC-CRM began funding a four-year effort that focused on:
(1) the refinement of product quality indicators underlying an OPA approach to assessment,
(2) the development of automated document and code analyzers to extract pertinent data elements supporting the computation of software quality measures,
(3) the identification of a suitable Ada-based software development project by which the OPA approach could be validated,
(4) on-site process instrumentation and data collection supporting quality assessment, and
(5) a validation study to determine the effectiveness of an OPA approach to software quality assessment and prediction.
Over a four-year period VTSRC personnel worked closely with project personnel to set up and
carry out the validation study. Intentionally, we maintained a low profile to minimize the impact
of our presence on the development effort and any perturbation of the resultant statistical study.
The effort included attending code walkthroughs and design reviews, meetings with the sponsor,
with the developing team and an IV&V team, the identification of critical process activities,
process instrumentation, data collection, metric computation and statistical analysis. The results
of that investigation are forthcoming in a report to the JLC-CRM and to the organization with
which the study was done.
1.5 Purpose and Scope of the Handbook.
As stated earlier, the intent of this Handbook is to provide guidance in the establishment and
management of a software quality assessment and prediction process. That guidance is based on
lessons learned during the development and refinement of the OPA Framework for quality
assessment, and on the knowledge gained from actually implementing an assessment and
prediction "program." The material outlined in this Handbook is descriptive as well as
prescriptive in nature, and is intended to support the project manager, software manager and
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 18
software engineer in their efforts to establish and manage a software quality measurement
program. The OPA Framework approach describe herein is not intended as the "end-all, be-all".
However, our firm belief is that the approach described herein reflects a sound comprehensive
basis for establishing a software quality assessment program that demonstrates the best current
knowledge.
Chapter 2 of this Handbook presents the OPA Framework and its coupling with the Software
Quality Indicator concept to produce a sound, holistic approach to the measurement of software
quality. It relates the approach to the goals of prediction and assessment, and notes the influence
of the process model on the application. Chapter 3 examines the "theoretical" guidance
presented in Chapter 2 in the context of the realities of organizational capabilities, limitations
and constraints. Specific guidance is provided in the derivation of Software Quality Indicators,
and some cautions are shared from past experience. Chapter 4 explains the application of the
measurement program to the activities of development and maintenance, noting the distinctive
characteristics of process examination. The important role of the softwaqe quality assurance
function is addressed. Chapter 5 relates the application of the measurement program to
documentation artifacts, noting the particular capabilities and limitations of measuring document
quality. Chapter 6 focuses on code assessment as an integral part of quality assessment. Similar
to our discussions of process and documentation, we strive to point out both the capabilities and
limitations associated with code assessment. The challenge of integrating indicator values to
produce a complete picture is treated briefly in Chapter 7. Chapter 8 attempts to place assessment
and prediction in perspective. In particular, it focuses on: (a) the effects of process models on
measurement, (b) the differences in using measurement in sustainment (maintenance) or
development and (c) the view and management of measurement data as a corporate asset.
2. FOUNDATIONS FOR A COMPREHENSIVE APPROACH TO SOFTWARE QUALITY ASSESSMENT
The objectives of this section are three-fold:
(1) to motivate the necessity of establishing a software quality assessment program based on fundamental software engineering concepts,
(2) to present the OPA Framework with Software Quality Indicators (SQIs) as a sound comprehensive approach to quality measurement, and
(3) to relate process and product measures to the goals of prediction and assessment.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 19
2.1 Establishing a Quality Focus.
To determine the rainfall for any particular day one could periodically measure humidity and
temperature for the selected day and then, based on established physical laws, compute how
much rain one would have expected to have fallen. Similarly, one might advance conjectures
about the quality of a product based on influencing measures like conformance to schedule,
productivity, and cost estimates. In both cases, the substitute measures are "somewhat related"
to the stated measurement goal, but lack that definitive connection which directly relates the
measurement process to the final objective. For example, a more accurate process to measure
rainfall is to place a measurement device outside to directly collect and document the amount of
rain that actually falls. Moreover substitute measures can often lead to false conclusions.
Schedule slippage, for example, can and often does adversely impact product quality. Can one
draw the reasonable conclusion then that if a project is on schedule, quality is present in the
product? The obvious answer is, “No.”
In concert with the above observations, we offer the following guidance as preliminary steps to
establishing a software quality assessment program:
(1) identify software quality as the major goal of the underlying measurement process, and
(2) define measures that
• are objective, • directly related to software quality, and • reflect inherent characteristics of the software engineering process.
Emphasizing quality measures that reflect characteristics of the software engineering process is
of particular importance because it focuses attention on the domain from which such measures
are extracted. More specifically from a software engineering perspective, software quality is not
about efficiency, scheduling, cost or even functionality. These are systems engineering
objectives that place constraints on the software engineering process, and subsequently, on the
achievement of software quality goals. For example, to achieve mandated efficiency levels the
software engineer might employ the use of global variables for inter-module communication.
Although necessary from the efficiency viewpoint, the use of global variables for inter-module
communications is detrimental to maintainability. Similar examples can be cited for schedule,
cost and functionality. Recognition of the differences between systems and software
engineering goals is crucial, and the achievement of specific software engineering objectives is
often constrained by the "givens" established at the higher systems engineering level. In turn,
such recognition enables one to focus attention on the identification and definition of measures
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 20
derived from the trends and artifacts of the software engineering process which more accurately
reflect product quality.
The remainder of Chapter 2 expands on the guidance provided above by identifying and
describing a framework that characterizes the software engineering process, while serving as the
focusing agent for measuring software quality. Software quality indicators (SQI's) are also
presented. SQI's play an integral role in the definition of quality measures reflecting the
presence (or absence) of desirable product attributes. Additionally, we discuss the impact an
established (or proposed) process model can have on the identification and definition of quality
measures, and finally, distinguish between the measurement goals of assessment and prediction.
2.2 The Objectives/Principles/Attributes (OPA) Framework
The rationale of the Objectives/Principles/Attributes (OPA) Framework [Arthur and Nance
1990] is briefly described in Chapter 1; more detail is given here. As illustrated in Figure 2.1,
the framework enunciates definitive linkages among project level objectives, software
engineering principles, and desirable product attributes, advancing the following rationale for
software development:
• a set of objectives can be defined that correspond to project level goals and objectives,
• achieving those objectives requires adherence to certain principles that characterize the process by which the product is developed, and
• adherence to a process governed by those principles should result in a product that possesses attributes considered to be desirable and beneficial.
Underlying this rationale is a natural set of relations, depicted in Figure 2.2, that link individual
objectives to one or more principles, and each principle to one or more attributes. For example,
to achieve maintainability one might employ the principle of information hiding in the
development process. In turn, employing information hiding will result in a product that exhibits
a well-defined interface.
The OPA Framework differs from other structurally similar frameworks, e.g. McCall's
Factor/Criteria/Metric [McCall et al. 1977] and Basili's Goal/Question/Metric [Basili and
Rombach 1988], in that all OPA measures are linked to project-level objectives through software
engineering principles that guide the software development process. Analogically, principles
function like a fulcrum, providing the supporting capability reflected in the software attributes to
lift the product in attaining the designated objectives. More specifically, principles
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 21
• provide the foundational definition of the desired or proper process for developing software, and
• enable one to reason about and identify those activities that contribute to or adversely impact the software development process.
Main tainab ili ty Co rrectness Reu sability Testability Reliability Portability Ad aptability
OBJECTIVES
Hierarch ical Deco mp osition Fun ction al Decompositio n In fo rmatio n Hiding Stepwise Refinement Stru ctu red Pro gramming Life-Cycle Verification Co ncurrent Docu men tation
PRINCIPLES
Reduced Coupli ng Enh anced Cohesion Red uced Complexity Well-Defined Interfaces Read ability Ease of Chan ge Traceability Visibility of Behav io r Early Error Detectio n
ATTRIBUTES
ATTRIBUTES
DOCUMENTATION PROGRAMS
PropertiesProp erties
(+)
OBJECTIVES
PROCESS
PRODUCT
PRINCIPLES
PROJECT
Figure 2.1: Illustration of the Relationship Among Objectives, Principles, Attributes in the Software Development Process
How does one determine if, and to what extent, a product possesses desirable attributes? The
answer lies in the observation of product properties, i.e. observable characteristics of the product.
For example, the use of global variables indicates that a module interface is not well-defined
[Dunsmore and Gannon 1980, p. 149]. More specifically, the number of global variables used
relative to preferable forms of communications, e.g. parameter passing, indicates the extent to
which the interface is ill-defined.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 22
Adaptability
Co rrectness
Main tainab ility
Portab ility
Reliab ility
Reusab ility
Testability
In fo rmatio n Hiding
Con curren t Docu men tation
Fu nctio nal Decompositio n
Hierarchical Decompositio n
Life Cycle Verification
Stepwise Refinement
Stru ctu red Programming
Coh esio n
Complexity
Cou plin g
Early Error Detectio n
Ease o f Chang e
Read ability
Traceability
Visib ility o f Behavior
Well-Defined Interfaces
OBJECTIVES PRINCIPLES ATTRIBUTES
Figure 2.2: Linkages Among the Objectives, Principles and Attributes
Implementing an effective quality measurement program mandates a systematic approach that
reflects the best current software engineering practices. We recommend an approach that
embraces the OPA Framework as a basis. Through its property/attribute pairs and linkages
relating attributes to principles and principles to objectives, the OPA Framework supports a well-
defined, systematic approach to examining product and process quality. The OPA Framework
definitively links the achievement of software engineering objectives to the use of specific
principles and the use of such principles to the realization of desirable attributes in the product.
Subsequently, by observing product properties to determine the extent to which desirable
attributes are present in the product, one can determine the extent to which particular principles
are governing the development process and, in turn, the extent to which stated software
engineering objectives are achieved.
Moreover, guided by an OPA characterization of the software (both the artifacts and the
development or sustainment process), one can analyze and examine relationships in the
interpretation of quality measures. For example, if one observes a value indicating a low degree
of achievement for a software engineering objective (not consistent with expectations), then
contributing principles are examined (based on the defined linkages among objectives and
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 23
principles) for anomalous values. Similarly, the linkages among principles and attributes point
to candidate attributes to be examined to identify the contributing source(s). Finally the
attribute/property relations enable the identification of the most prominent process or product
characteristic(s) influencing the original objective value. The identification of an anomalous
value for an attribute/property pair indicates the misuse (or omission) of a critical software
engineering principle. The points where this principle is most utilized in the process become the
prime candidates for attention. With appropriate reporting one can also determine if the
offending product component(s) are isolated or the problem is widespread.
2.3 Software Quality Indicators
The OPA Framework and its enunciated rationale binds measurement and measurement
interpretation to a realistic characterization of how software is actually produced. Below, we
describe the concept of Software Quality Indicators (SQI's) that reflect an OPA perspective and
provide a sound basis on which quality measures are defined.
2.3.1 Establishing a Basis for Measuring the Unmeasurable
"Software quality factors," "software quality metrics," and "software quality indicators" -- are all
terms used in the conviction that the quality of the software product should be measurable, at
least in a relative sense. In a paper by Kearney, et. al., [1986] the authors issue a rather
compelling criticism of the inadequate basis for measuring software complexity and of the
shortcomings of experimental research intended to support complexity metrics. We share the
opinions of Kearney and his colleagues, and propose the use of statistical indicators as the basis
for scalar determination of product and process characteristics. The motivation for using
statistical indicators of software quality stems from the qualified successes in applying them to
unmeasurable economic and social concepts. This motivation, as well as extension of the
applicable theory to the derivation of software quality indicators, is described below
Both economic and social indicators are based on the premise that directly unmeasurable
qualitative conditions can be indirectly assessed by measurable quantitative characteristics. The
economic indicators of a "good or improving economy" are routinely discussed in business news.
Social indicators like "safe streets" are often cited as contributing elements of policy decisions.
Meier and Brudney provide an instructive definition for social indicators that serves as the
foundation for our definition of software quality indicators [Meier and Brudney 1981, pp. 95-
96]:
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 24
An indicator is a variable that can be measured directly and is linked to a concept through an operational definition. An operational definition is a statement that tells the analyst how a concept will be measured.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 25
Two important characteristics of social indicators are stressed by Carley [1981, p. 2]:
• Social indicators are "surrogates" that do not stand by themselves -- a social indicator must always be related back to the unmeasurable concept for which it serves as a proxy.
• Social indicators are concerned with information which is conceptually quantifiable, and must avoid dealing with information which cannot be expressed on some ordered scale.
The parallels which can be drawn between the concept of social indicators and that of software
quality indicators are: (1) both attempt to measure the directly unmeasurable through the use of
surrogate (or substitute) measures that are directly observable, and (2) an undeniable relationship
must exist between the surrogate measure and the concept being measured.
2.3.2 Applying the Social Indicator Concept to Software Quality Measurement
The concept of software quality indicators is a natural extension of the use of statistical
indicators in the social sciences. The need arises from the fact that certain characteristics cannot
be measured directly and require surrogate measures in order to obtain quantitative assessment
[Carmines and Zeller, 1979, pp. 9-11]. An example in software is the measurement of cohesion,
which cannot take a simple direct form; thus, the need exists to define an indicator that can
reflect either desirable (high) or undesirable (low) cohesion in a software component. Multiple
indicators can perform confirming and contrasting roles to permit a “hardening” of the softness
typically associated with this indirect form of measurement.
Software quality indicators are embodied in the OPA Framework through attribute/property
relationships. For example, an intangible attribute of the development process, like early error
detection, can be indirectly assessed through measurable properties, like the changing of
requirements after the software specification review. For clarification purposes we note that our
use of the term "Software" in "Software Quality Indicators" is not intended to be restrictive, but
applicable to both process and product quality indicators.
A Software Quality Indicator (SQI) is a variable whose value can be determined through direct analysis of product or process characteristics, and whose evidential relationship to one or more attributes is undeniable [Arthur and Nance, 1987, p. 25].
Crucial in this working definition is that
• the value is directly measurable through the analysis of the software development process or products of that process, e.g., programs and documentation, and
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 26
• SQIs are always attribute/property pairs denoting undeniable relationships, and indicative of the presence or absence of one or more attributes.
Consider, for example, an SQI based on code analysis: coupling through the use of structured data
types (CP/SDT). The property in this SQI is the use of structured data types, and the attribute is
coupling. One can argue that the use of a structured data type as a parameter argument has a
detrimental impact on module coupling. That is, structured data types allow the consolidation of
data items perceived to be related in a given context. When passed as a parameter, however,
rarely is every data item in the structure accessed by the calling module. Consequently, these
extraneous items, from the perspective of the calling module, unnecessarily increase the coupling
between the calling and called modules. [Troy and Zweben, 1981, p. 115]. A candidate measure
for this coupling is the ratio of the number of structured data types used as parameters relative to
the total number of parameters.
Structured Data Types Passed as Parameters/Coupling = # of SDT in Parameter List | Parameter List | where | Parameter List | is the number of parameters in the parameter list (the cardinality function).
Note that: (a) the value is directly measurable, (b) the SQI is an attribute/property pair, (c) the
relationship described between the use of structured data types and coupling is undeniable (and
intuitive), and (d) the stated SQI can indicate the presence (or absence) of coupling between two
modules.
To summarize, we want to measure quality in terms of characteristics set forth in the OPA
Framework, i.e., project-level objectives, process principles, and desirable product attributes.
Product attributes, although still not directly measurable, are significantly less abstract than
process principles and project objectives, and serve as the basis on which software quality
indicators are defined. More specifically, we identify process and product properties that: (a) are
directly measurable and, (b) undeniably reflect the presence (or absence) of specific process and
product attributes. In turn, these measures are propagated along the linkages defined by the OPA
Framework, yielding subsequent measures reflecting the proper use of process principles and the
achievement of stated software engineering objectives.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 27
Assuming that valid quality indicators can be formed from quantifiable characteristics of the
process, code and documentation, then automatic or semi-automatic (human assisted) procedures
can be developed to assess software quality [Nance and Arthur 1994].
2.3.3 Measuring Characteristics of Process and Product
Because software evolution begins with requirements specification activities and continues
throughout the life of the product (including attendant maintenance activities), SQIs must
embrace both process and product measures, and ideally, must admit to at least semi-automatic
computation. As illustrated in Figure 2.3, we propose the use of SQIs throughout the product
software life cycle. Initially SQI measures must reflect process characteristics because little, if
any, product is available. As development continues and products become more readily
available, SQI measures should expand correspondingly to reflect product characteristics.
Preliminary work in the SQI domain suggests that process, documentation and code indicators
are needed [Arthur, et al., 1991, p. 5].
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 28
Requirements
Analysis
Coding and
Unit Testing
CSC Integration
and Testing
Design
Time
Assessment of
PROCESS
Assessment of
PRODUCT
IDEAL
Maintenance Deployment and
Figure 2.3: Exploiting both Process and Product Indicators
Measuring Quality Through an Accumulation of Evidence
Software quality measurement should not be based on a single measure. If such a measure
attempts to incorporate many aspects of quality, it becomes unwieldy and unintuitive [Gaffney
and Cruickshank 1980]. If it focuses on a single product or process characteristic, e.g.,
McCabe's Cyclomatic Complexity Measure, then pertinent information is inappropriately
constrained, providing only a limited view of product quality. The SQI determination, embedded
within the OPA Framework, however, is predicated on the exploitation of multiple measures,
each attesting to the presence or absence of particular attributes in the product. OPA embraces
the philosophy that demonstrating that software possesses a desired attribute (or does not) is not
a proof exercise; rather, it resembles an exercise in civil litigation in that evidence is gathered to
support both contentions (the presence or absence) and weighed on the scales of comparative
judgment [Nance et al. 1986 and Nance and Arthur 1994]. As illustrated in Figure 2.4, measures
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 29
reflecting the absence of an attribute provide values in the -5 to zero range; measures attesting to
the presence of a desirable span the range of zero to +5. For example, if we consider the extent
to which a product exhibits a well-defined interface, the use of global variables for inter-
procedural communication has a detrimental impact. The use of parameterized calls, on the
other hand, supports such a contention. Hence, for any given product attribute the aggregation of
multiple confirming and contrasting measures yields one value in the range [-5,+5] indicating the
degree to which a desirable attribute is present or absent in the product. Note that values falling
in the designated range [-0.5,0.5] might occur because evidence of both presence and absence is
detected or because no evidence is available (which results in a zero).
Figure 2.4
Absence Presence
None or
Offsetting
-5 50
Measurement Scale
In effect, the SQI approach offers four substantial advantages over the single metric approach to
software quality measurement: (1) multiple measures, (2) measures which confirm or refute the
existence of a quality attribute, (3) a relative measurement scale reflecting consistency of
judgment, and (4) measures that are simple and intuitive. Sections 3.3 and 3.4 outline systematic
processes for defining and interpreting software quality indicators.
2.4 Influences of the Process Model
Within an established development process, well-defined procedures and guidelines serve as the
basis for structured activities supporting product development while emphasizing specific
organizational goals. Among organizations such goals usually emphasize similar objectives, i.e.,
producing a quality product on time and within budget; their development processes, however,
often vary in approach and magnitude. For example, one organization's process might employ
the conventional waterfall approach, while another might follow an incremental approach guided
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 30
by critical path analysis. Although the OPA Framework with the SQI concept embedded is
defined independently of any particular software development methodology, its application must
be tempered by the realities of the prevailing process model underlying the development effort.
In effect, the process model and attendant activities prescribe artifacts and timing, i.e., the focus
of measurement.
Consider, for example, an organization that employs an incremental approach to software
development, and desires only to examine code for quality characteristics. One possible
approach is to analyze each code unit when it is first placed under configuration management
(CM). While such an approach meets its intended objectives, i.e., providing the software
engineer and program manager with quality-related information, it constrains the measurement
process to focus primarily on code assessment, and correspondingly, on those activities related to
placing code under CM.
Clearly, a better picture of quality could be obtained if assessment includes an examination of the
design document before coding begins and a tracking of software trouble reports (STRs) written
against the code after it is placed under CM. Nonetheless, practical considerations, such as
limited resources and implementation deadlines, often dictate sub-optimal quality assessment
procedures. Similarly, particulars of the development process can, and do, define when and
where measurement activities are feasible. In effect, tradeoffs must be made to balance the
benefits of additional quality assessment (and prediction) with the organizational costs associated
with collecting and computing such information.
Crucial to the above observations is that, in establishing a measurement program, one must
balance needs with cost and practicality. To do so, one first examines the process model to
determine where each necessary data element can be obtained, and then, based on organizational
constraints and priorities and on the practicality of being able to collect the requisite data
elements, one identifies those data collection points that yield the most "bang for the bucks.”
Once the appropriate "where, when and what" are determined, the OPA Framework offers an
appealing approach to establishment of an effective measurement program. More specific
discussion of the effects of the process model on the application of the OPA Framework is given
in Chapter 8 (Section 8.1).
2.5 Establishing Measurement Goals: Assessment or Prediction
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 31
Within the framework of software quality measurement two complementary concepts exist:
quality assessment and quality prediction. Quality assessment entails an examination of the
product for characteristics deemed desirable and beneficial after the product is developed.
Quality prediction, on the other hand, focuses on the examination of artifacts that enables one to
infer, with confidence, the extent (or probability) that a product will possess desirable quality
characteristics before development is completed.
In establishing a measurement program, an a priori determination of the purpose is necessary:
assessment, prediction, or both. Such determination is crucial because process instrumentation
can differ dependent on the purpose. In particular, assessment requires an examination of the
product, while prediction focuses on an examination of process artifacts. Product code and
documentation are examples of the former; software development folders and process trends
exemplify the latter. Our experience has shown that predictive measurement, while having the
greatest potential for controlling quality, is the more difficult and costly of the two to achieve.
Predictive measurement requires process artifacts which are the hardest to identify and collect
because: (1) they are non-standard and often amorphously defined, and (2) no two development
processes are identical, making the direct application of procedures developed by others difficult,
awkward and at best only partially effective. Recalling our admonition against trying to do too
much (Section 1.2), we suggest that the start of a measurement program adopt assessment as the
initial purpose, but with the understanding that both assessment and prediction form the ultimate
goal.
3. AN OPA MEASUREMENT PROGRAM
As discussed in Section 2 the OPA and SQI concepts are coupled to provide a structured
approach to building an effective software measurement program. Such a program is sufficiently
flexible to accommodate the diverse particulars of many prominent development methodologies
and the attendant process activities. In this section we examine that "theoretical" guidance in the
context of organizational capabilities, limitations and constraints.
3.1 Accommodating the Organizational Process Model
Within a software development organization, an effective quality assessment and prediction
program must meet the needs and requirements dictated by two distinctly different organizational
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 32
components. The first component is represented by staff members who are directly engaged in
performing measurement activities. To support their needs the foundation on which the quality
assessment program is built must: (a) accommodate the identification and definition of necessary
and sufficient measurement activities that, with minimal effort, can be integrated into the
existing development process, and at the same time (b) provide an overall framework that
permits the aggregation of characteristic values into meaningful measures. As described in
Sections 2.2 through 2.4, the utilization of the SQI definition within the OPA Framework offers
an appealing approach to providing both (a) and (b) above.
The second component is characterized by multiple levels of management, who need to make
decisions based on information provided by the assessment program. In particular, within a
software development organization the scope and impact of decisions correspond closely with
the level of management responsibility. Top-level executives, for example, are responsible for
resource allocation, cost containment, and profit margins. They make decisions which affect
company viability. While producing a quality product as a whole is certainly one of their goals,
they do not, as a rule, concern themselves with which particular quality objective is (or is not)
met. On the other hand, the prime responsibility of the project manager is to produce a quality
product on time and within budget; the extent to which individual quality objectives are (or are
not) being achieved directly impacts that prime responsibility. For example, if a desired
reliability level is lacking, additional testing and software rework become necessary, which in
turn, extends the estimated completion date. The process by which quality objectives are
attained is the responsibility of the software manager. Often in concert with an organizational
software management function, the software manager within a project is responsible for the
effectiveness of the development environment and the support staff. In effect, each level of
management must make decisions based on reports and information that: (a) are tailored to
reflect the needs of each specific management level and (b) present a consistent picture of quality
across all management levels. Through its directed attention to software engineering objectives,
principles and attributes, and its enunciation of linkages that interrelate them, the OPA
Framework provides a basis for producing a consistent picture of product quality across multiple
levels of management.
3.2 Measurement Program Responsibilities
The division of responsibilities for software production, technical support, process conformance
and product quality must be clearly defined, understood and accepted in establishing and
operating a software quality program. Based on our experiences we view those responsibilities
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 33
as falling into three distinct categories: organizational, project management, and software
management.
3.2.1 Organizational
Overall, the responsibility for supporting an assessment program is an organizational-wide
concern. Support is derived from and based on the desire to evolve toward a more mature
software development process that provides the feedback necessary to enhance the development
practices during and after major software development efforts. Driving this desire is a
fundamental understanding that a better software development process translates into a higher
quality product being produced, and subsequently, increased corporate profit.
Recognizing the relationships between systems engineering and software engineering is a first
step toward enabling an effective software development process. As illustrated in Figure 3.1, a
critical aspect of that relationship is that system engineering objectives often form constraints on
the software engineering process. For example, efficiency goals, which are often erroneously
viewed as software engineering objectives, are actually constraints on the software engineering
process imposed at the systems engineering level. They are “a given” and must be viewed as one
of the several boundaries that the software engineering process cannot compromise. The early
work by McCall, et. al., illustrates this fact by showing that efficiency has an adverse effect on
all quality factors but one [McCall et al. 1977]. Other prominent system engineering objectives
that constrain the software engineering process are cost and schedule.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 34
Systems Requirements
Systems Engineering
Subsystems
Software Requirements
Software Engineering
Requirements (Documentation) Design (Documentation) Source Code and Documentation Object File Load File
Figure 3.1 The Software Production Task
Within an organization the body that is most responsible for ensuring the production of a quality
product is Software Quality Assurance (SQA). In effect, SQA is the "heart and soul" of any
measurement program. SQA has the responsibility to monitor development activities, sample
product quality, identify problem areas and, when necessary, initiate corrective actions.
Although independence is a necessary element, SQA cannot (and should not) act as an isolated
organizational unit. They must work closely with project- and program-level managers, and be
proactive in identifying problems; yet strive to define solutions through collaborative efforts.
3.2.2 Project Management
From the perspective of software quality the project manager (PM) derives his or her goals from
those stated by the organization. The PM is assisted by SQA in the identification and resolution
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 35
of quality problems. The PM perspective of software quality, however, often differs from that of
higher-level management and SQA. In particular, higher-level management and SQA view
product quality as a single, all encompassing characteristic. The PM, on the other hand, is
concerned with individual elements comprising product quality, e.g. maintainability, reliability,
adaptability and so forth. While still high-level concepts, these software engineering objectives
provide a more concrete foundation for establishing the extent to which software quality has (or
has not) been achieved. Ideally, the PM receives product quality reports reflecting an assessment
outlining the extent to which desirable software engineering objectives are being achieved.
These reports are produced by SQA and/or the software manager.
When an unacceptable quality level is noted, the PM confers with both SQA and the software
manager to identify: (a) the problem source, and (b) process changes needed to correct the
current deficiency and to prevent further such occurrences. The PM then directs the software
manager to effect the appropriate changes in the development process. The latitude of a PM to
direct changes in the process for that project differs widely among organizations. In some cases,
concurrence of SQA is required. The responsibility for changes in the software development or
sustainment process would require the approval of SQA without question. We believe that the
OPA Framework, through its set of linkages relating the achievement of objectives to the use of
process principles and the embodiment of attributes in the product, provides a systematic and
natural approach to the identification of necessary process changes.
Relative to project management, SQA is (or should be) viewed as an independent entity which
plays a supportive role in producing a quality product by providing feedback attesting to both
process and product quality. Configuration Management (CM), on the other hand, is a powerful
process-oriented tool that is directly controlled by the PM and promotes the attainment of
product quality through software version control. In effect, as software units are baselined, they
are placed under CM; any subsequent access to or modification of baselined units must follow an
established set of guidelines defined by the PM. Such guidelines ensure a managed environment
where, when problems do surface, the appropriate personnel are informed, and subsequently,
problem resolution is an intentional rather than ad hoc action.
3.2.3 Software Management
SQA and CM provide information which the project manager uses to direct actions focused on
producing a quality product. One of the tasks of the software manager is to ensure that those
directed actions are carried out in a satisfactory manner. Additionally, the software manager
continuously monitors the software development process in an attempt to recognize potential
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 36
problems and initiate corrective actions before product quality suffers irreparably. To this end,
the software manager must have a firm understanding of the methodology being applied and the
implications of that methodology relative to: (a) the structure and composition of the
development environment, and (b) the expertise and training needs of development personnel.
As emphasized within the OPA approach to software development, the methodology being
employed should state its primary objectives: these include maintainability, reliability,
correctness and so forth. Linked to those objectives are software development principles that
must be employed in the development process to achieve the stated objectives. In turn, these
principles form a basis for deriving necessary environment requirements. For example,
functional decomposition requires a tool that supports the specification of functional abstractions
and their subsequent decomposition and refinement. Based on a comparison among
methodological principles and environment tools, the software manager can determine the
adequacy of the development environment for supporting the process defined by the
methodology within the process model.
Similarly, the methodology, development process and environment tools define expected
personnel expertise levels. By comparing a profile of expected expertise levels with the current
profile of staffing capabilities, the software manager can: (a) determine if additional personnel
training is needed, (b) identify the critical elements that the training effort must address, and (c)
initiate the training process before those capabilities are needed in the development process.
In effect, the software manager is process focused. He/she must have: (a) a fundamental
understanding of how the development process is expected to operate, (b) knowledge of
personnel expertise, (c) access to process and product quality reports indicating the extent to
which quality is (or is not) being achieved, and (d) the recognition of how to address and correct
problems as they surface. Our experience indicates that viewing software development from an
OPA-like perspective provides that necessary insight to realize each of the above qualifications.
3.3 Derivation of Software Quality Indicators
Developing measures of software (code) quality has been a continuous challenge in computer
science and software engineering. A literature survey reveals that numerous metrics are
suggested for measuring software, most often the characteristic examined is “complexity.” Some
well documented metrics include Halstead’s Software Science [1977], McCabe’s Cyclomatic
Number [1976], and Henry and Kafura’s Information Flow [1981]. A major criticism of many
of these metrics is the lack of a “clear specification of what is being measured” [Kearney, et
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 37
al.1986, p.1050]. Another author notes that software metrics should “empirically and intuitively
describe software behavior;” yet this capability is missing from most metrics [Ejiogu 1987, p.
61]. Today, these concerns are even more critical in light of the expanded complexity of
currently proposed software systems.
3.3.1 A Systematic Definitional Procedure
A first step in addressing such criticisms, and the eventual evolution toward a controlled
software quality assessment process, is recognition that the design of a quality assessment
program, and in particular the definition of SQIs, must follow a systematic path of development.
More specifically that systematic procedure must
• naturally relate the definition of quality measures to both product and process characteristics,
• provide linkages among multiple measures to support a meaningful aggregation of values and the synthesis of information which is tailored to the management hierarchy, and
• capture the fundamental relationships between indicator measures and product quality to - promote an understanding of quality implications at both the technical and managerial
levels, and - facilitate reasoning about alternative problem solutions.
Our experiences indicate that the formulation of a systematic procedure must be based on a
foundation that directly relates product properties and process activities to the achievement of
project-level software engineering objectives and to the presence (or absence) of quality
attributes induced in the product. The steps outlined below define a systematic procedure for
developing a set of quality indicators.
(1) Identify Appropriate Process/Product Properties: Step one focuses on the identification of accepted characteristics of the development activity and product properties that influence (contribute to or adversely impact) product quality.
For example, the use of appropriate indentation within a program is acknowledged as an aid to understanding the program structure and its execution behavior.
(2) Determine the Impact of the Property: Step two provides a description relating the presence (or absence) of a property to its specific impact on the achievement of quality.
The use of global variables violates the principle of information hiding, and thereby, because of the potential "ripple effect", unduly complicates any maintenance activity applied to the offending module.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 38
(3) Identify the OPA Entity Affected: Step three links the identified property to a specific software engineering attribute. That linkage is determined relative to the property's impact on software quality. A positive or negative impact is also determined at this time.
Clearly, the use of global variables has a detrimental impact on the attribute of well-defined interfaces. More specifically, the use of global variables breaks down a module's interface structure by exposing its communication mechanism to any module that has access to the global communication variable.
(We note that a single property can be linked to more than one attribute. Consequently, the remaining steps in the definitional process, i.e., steps 4 - 7, necessarily differ for each distinct attribute association.)
(4) Provide a Rationale for Linking the Property to the Attribute: Step four is most crucial because it provides the justification that definitively links each property to a specific attribute. In particular, the justification describes why and how the property affects the attribute to which it is paired.
A well-defined interface is one that restricts information access to only designated communication partners. Additionally, a well-defined interface promotes the exchange of only the minimum information necessary to support a module's function. The use of global variables expands the accessibility (and the potential modification) of "restricted" information to any module having "visibility" to that global variable.
(5) Define the Measurement Approach: The measurement approach descriptively relates the existence of observable process/product properties to their impact on the identified attribute. Additionally, it provides a justification as to why those particular properties are chosen and outlines how they are to be used in the formulation of a metric.
To measure the extent to which a well-defined interface is present in a given module, one must examine all possible forms of inter-module communication mechanisms, e.g. parameter passing and the use of global variables, and then formulate a metric that measures the impact of global variables relative to the impact of all other communication forms used.
(6) Define the Metric(s): Using the measurement approach as a guide, one or more metrics are defined that reflects the impact of the identified property on its related attribute.
For the use of global variables relative to well-defined interfaces we define the following metric:
Let GVU be the number of global variables uniquely used for communication
purposes, and PRM be the number of unique parameters used in procedure calls. For any given module we define the impact of the use of global variables on the
existence of well-defined interfaces to be GVU / (PRM + GVU)
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 39
(7) Define the Indicator: The definition of the indicator is formulated separately from that of
the metric to impose the chosen measurement scale. By scaling indicator values from -5 to +5, we stipulate that -5 denotes a poorly defined
iterface, 5 represents a well-defined interface, and 0 implies that we cannot pass judgment. Because the use of global variables can only have a detrimental impact on well defined
interfaces we define this particular indicator, denoted WDI:UGV , to be: WDI:UGV = 0 - 5 * (GVU / (PRM + GVU))
Although the example used above pertains to the evaluation of code quality, the identical
systematic procedure is applied to the definition of documentation and process quality indicators.
3.3.2 Product and Process Differences
Clearly, both process and product measures play important roles in assessing software quality.
An examination of each, however, reveals distinctive differences between the two. Those
differences are crucial, and must be recognized and exploited in the definition of software quality
indicators. From a process perspective, trends and (non-deliverable) process artifacts are
excellent sources of information for judging product quality. An example of the former is the
characterization of open and closed software trouble reports over time; the use of software
development folders to trace requirements and design changes to their manifestation in the code
is an example of the latter. The (deliverable) product, on the other hand, can be more directly
examined for properties that indicate the existence of desirable attributes. For example, one
measure of readability is the complementary use of structured constructs and code indentation.
Effectively, distinguishing among process trends, process artifacts, and code and documentation
characteristics permits the identification and partitioning of quality characteristics which, in turn,
leads to a more focused approach to defining SQIs.
3.3.3 Warnings and Watchwords
The systematic approach to defining SQIs derives its power from a structured process that
employs "divide and conquer" strategies that enable one to focus on individual elements of
quality while maintaining a project-level perspective. Because such an approach easily
generalizes, one must exercise caution when defining the systematic procedure to ensure that
inappropriate "quality measures" are not introduced. As emphasized in Section 2.1, productivity,
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 40
cost and schedule are not elements of software engineering quality. Their inclusion can lead to
misinformation and misuse of reported results.
3.4 Interpretation of SQI Results
As outlined above, SQIs are defined through a systematic procedure that assumes the existence
of natural linkages relating the use of proper software engineering principles in the development
process to the achievement of desirable project-level objectives and to the presence (or absence)
of beneficial attributes in the process and product. (Section 2.2 asserts the validity of such an
assumption.) Further, assuming an underlying framework like OPA, the SQI measures are
computed at the property/attribute level. That is, specific process and product properties are
examined, followed by computed measures relating each property to the existence of an attribute.
Through established linkages, these measures are aggregated and propagated through the
principles, to the objectives level.
Interpretation of these computed and aggregated measures can be initiated at any of three distinct
levels: at the objectives level, at the principles level, or at the attributes level. For example,
suppose that the project manager observes that a particular software engineering objective shows
an unexpected low score. Through defined linkages this anomaly can be traced to the ineffective
use of specific process principles. Because the effectiveness of a development process relies on
the proper use of stated principles, the software manager is informed of the low value and must
question whether: (a) there is a deficiency in the development methodology, i.e., not specifying
the appropriate principle, (b) the support personnel are inadequately trained in the use of the
methodology, or (c) the environment lacks the proper tools to support the development principles
enunciated by the methodology. The conclusion as to the source of the low value for a principle
does not halt the corrective assistance. A software engineer seeking to rectify the problem can
continue the examination by following the principle-to-attribute linkage to determine the
ramifications of the problem, e.g., the use of global variables to support inter-module
communication. The ability to trace and explain the basis for a score on an objective such as
reliability in terms of interface definition at the attribute level and to further decompose the
defined relationship to reveal properties contributing to the unacceptable value represents a
powerful capability for software quality control.
Utilizing the linkages in the other direction, i.e. from attributes through principles to objectives,
is also extremely effective because it provides for the propagation of measures that present an
inclusive and consistent picture of quality from the technical level of the software engineer
(attributes) to the project management level (objectives).
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 41
3.5 Decisions and Actions
As evidenced from the above discussions, relying on a model of software development that is
consistent with accepted practices and which embraces a definitive approach to reasoning about
software quality paves the way for informed decisions and actions that tend to minimize adverse
impacts. In particular, an examination of propagated values within a framework that links the
process principles to project objectives and to process/product attributes enables one to recognize
that a problem is emerging and provides the mechanism to determine where and why such a
problem is surfacing.
Additionally, such a framework encourages the investigation of alternative solutions by directly
linking indicator measures to product quality through intuitive arguments. This characteristic
invites management and technical personnel to question the "whys" of conventional wisdom and
the "what ifs" of proposed changes.
4. PROCESS MEASUREMENT
Considering the three components contributing to software quality: process, documentation and
quality, process is clearly the most challenging and the most resistant to automated measurement.
Yet, measurement of process properties is essential to the goal of predictive use in software
quality control.
4.1 The Inherent Difficulty of Process Measurement
We use the term "software evolution" to emphasize the fact that major systems are expected to
function over long deployment periods and subjected to major functional changes as the systems
in which they are embedded change or as new technology influences application improvements.
No clear boundary exists between the completion of development and the inception of
maintenance (better termed as "life-cycle support"). The software truly must evolve; change is
an ever-present requirement. The activities necessary to effect successful evolutionary software
demand assessment, review and revision following a structured, well-defined and
conscientiously executed approach; i.e. a process measurement program. Unfortunately, process
activities exhibit a challenge to the measurement of quality which surpasses that inherent in
product measurement.
The difficulty of process measurement stems from three sources:
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 42
(1) the lack of universal acceptance of methodological techniques for software development
and maintenance forces the adaptation of measurement procedures to the evolutionary process model in use within the organization,
(2) the evaluation of process activities requires an active, concurrent assessment rather than the
retrospective analysis often possible with software products, and (3) the tempting use of obvious statistical properties can lead to superficial characterization
that is misleading and counter-productive.
As an example of the third difficulty listed above, consider the challenge to extract an indicator
of software quality from a design review. A readily apparent statistic is the number of action
items generated in the review. However, on more extensive reflection, we might question the
true cause for the review of software component A producing twice as many action items as
reported in the review for software component B. Is the quality of A markedly less than that of
B or is the review of A more thorough and more revealing (higher quality) than that of B? Only
by direct observation of both reviews, coupled with comparison with the reviews of other
software components and the attention to and disposition of action items could an answer be
given as to the source of the apparent disparity.
A frequent reaction to the difficulty of process measurement is to omit process assessment at the
project level, perhaps using something akin to the SEI Capability Maturity Model [Humphrey
1990] for a high-level organizational picture. This picture provides a snapshot at one point in
time of a continuously changing landscape. The outcome is likely to provide little guidance for
process improvement at the project level. The Capability Maturity Model (CMM) provides an
estimate of organizational potential, but gives little direction for revising and restructuring
project-level activities. Recognizing this limitation, a guidebook to process measurement with
the intent of recognizing measurement principles and using them to evaluating and controlling
the software evolution within an organization is a recent product of the SEI [Florac et.al. 1997].
Note that while the process assessment in the OPA Framework is restricted to its influence on
product quality and within a project focus, the evaluation for process improvement in the SEI
Guidebook takes an organizational perspective with the primary goal of process improvement.
The CMM can serve a useful role in laying out the process model for software evolution within
the organization. This model can take a very general form, permitting alternative
methodological approaches or can be specific and constraining, dictating a single methodology.
However, the process model should be documented, most likely in the organizational Software
Engineering Manual, and actively used in process auditing and employee training. The
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 43
Guidebook, probably used by the SQA group, can provide a basis for measurement to improve
the software process across an organization, but the investment in such a program requires major
management commitments beyond what is entailed in the OPA Framework.
4.2 Indicator Derivation and Distinctions
An advantage of the statistical indicator concept is the ability to employ a number of indicators
and to use them for confirming or contrasting purposes. Unfortunately, at this juncture in
software engineering, the understanding of the relationships between process activities and
quality has yet to reach the level so that this advantage can be utilized. We simply do not know
enough about the effect of process actions on consequent product (software) quality to suggest
numerous measures with any degree of confidence. Our experience has led to the identification
of five root sources of product quality imparted from process activities: (1) requirements
volatility, (2) use of software development folders, (3) the software quality assurance (SQA)
infrastructure, (4) process stability, and (5) testing policies, procedures and performance. In the
description of each source below, we follow the systematic definitional procedure presented in
Section 3 to explain the resulting process indicator.
4.2.1 Requirements Volatility
One of the few issues finding widespread agreement among those working in software
engineering is that requirements definition is an exceedingly difficult task. The degree to which
requirements are clearly specified, complete, and measurable determines the success of any
software development or maintenance task. A "measurable" requirement is one that admits no
uncertainty in the decision as to whether the requirement is met. Every requirement should be
measurable, for from the requirements definition comes the test specifications and procedures, by
which conformance with requirements can be judged and the decision on product acceptance can
be made.
Agreement on the difficulty of requirements definition is accompanied by admission that
requirements will change during the evolution of the software system. The "freezing" of
requirements, remains an unachievable ideal in almost every project; requirements changes are
inevitable. However, change must be controlled to avoid the potentially chaotic condition where
software testing cannot proceed because test specification cannot keep pace with requirements
changes.
Process measurement of requirements volatility attempts to assess the degree to which
requirements changes can reduce the quality of software products. Note that changes can affect
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 44
software design in ways that increase the dependencies (coupling) among components or reduce
the cohesion within a single component. Requirements changes can also produce cancerous
effects that worsen unless treated and become increasingly debilitating as detection and treatment
are delayed.
4.2.1.1 Detection of Requirements Defects
Requirements defects are errors in the specifications of software requirements. In an ideal
project, where no requirements changes are made following the initial allocation (Software
Specification Review (SSR) in DOD-STD-2167A), any problem traced back to requirements
must be caused by a defect; i.e. an error in the statement of the requirement, a missing
requirement, or an ambiguity leading to different interpretations. The development process
should be structured to expose such defects as early as possible, not permitting defects to
propagate beyond requirements specification into the design activities. Defects corrected during
the design activities can force respecification and redesign that introduces the strong potential for
loss of quality because decisions are now constrained by prior choices that might not have
appeared suitable if considered during the earlier stages.
The OPA attribute affected is Early Error Detection, a process characteristic. Reviews,
inspections and walkthroughs are intended to reveal the presence of requirements defects,
preferably during SSR but hopefully well before such defects are incorporated in a program. The
longer defects go undetected, the greater the potential for decisions to be made on
misinformation, consequently, the lower the quality of the developing product.
An indicator reflecting the capability of the software evolution process to support Early Error
Detection should include both the number of defects and the length of time that each defect has
persisted. Of course, some balancing in terms of size is needed; most apparent is the number of
requirements. In this case the total of allocated and derived requirements is used.
The formulation of the indicator is straightforward but open to question. As with all process
indicators, our principal purpose is to suggest the property desired for measurement, the impact
of that property on the process and the rationale and measurement approach. While we present
an example, adoption of a specific formulation is left to the user so that process variations,
project scheduling effects, and instrumentation capabilities or limitations can be accommodated.
Consider the length of the project to be divided into suitable time periods, indexed by k = 1,2, ...,
T, where T represents the current value of time at which the computation of the indicator value
is being made. Let
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 45
Defects(k) = The number of requirements defects discovered in month k since SSR.
Requirements(k) = The number of allocated and derived requirements for the software component by month k.
F(k) = Defects(k) / Requirements(k) T
Then NF = �6��F(k) * 2 k-1 T = 1, 2, ...,
k=0
Early Error Detection / Requirement Defects Detection = max [( 5 - 10 * NF), -5]
This formulation of the indicator applies increasing weight as requirements defects are
discovered later in the development cycle. This weight is intended to reflect the increasing
severity of late discovery, forcing changes in earlier decisions or compromises that detract from
the software design.
4.2.1.2 Reverification
A second effect stemming from requirements changes is the need to reverify the correctness of
design decisions as new decisions must be made. The addition of requirements following
System Design Review (SDR) forces the reexamination of prior design decisions from
preliminary design to the point in the development or maintenance process where the change is
made. The deletion of requirements mandates a similar form of reexamination. Such
reexaminations, especially if the changes come in the latter stages, are prone to lack the rigorous
attention given during the original activities. The software evolutionary process is challenged to
motivate and enforce the same degree of rigor.
Consider also that the detection of errors during reverification has lost the advantage realized by
correction before proceeding further with actions based on incorrect decisions. In some cases the
domino effect of the actions based on incomplete or incorrect requirements specifications can
cause extensive redesign accompanied by delays and unplanned costs.
Measurement of process capability to minimize reverification is based on two problem sources:
(1) the magnitude of the requirements changes, and (2) the timing of changes. The process
should be structured to support recognition of requirements defects either during requirements
definition (which is ideal but difficult) or during the stage where the design decisions related to
the changed requirements are made. However, the process should also be structured to support
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 46
reexamination of process decisions from any point where the change is introduced. The example
formulation shown below synthesizes both sources.
Let f = (number of added, deleted and modified requirements) / total number of requirements
Early Error Detection / Requirements Volatility = max [ 5 - (1 + f) p , -5] (Note 00 { 0)
{with p = 0 if prior to SDR, 1 after SDR but prior to SSR, 2 after SSR prior to PDR, 3 after PDR prior to CDR, and 4 after CDR.
4.2.1.3 Reduction of Cohesion
Requirements changes following the initial allocation can force a grouping of poorly matched
software components. Even a single requirement that falls beyond the functional focus can lead
to loss of cohesion; i.e. the restricted scope (single function) that promotes the unity of a
software component. Functional cohesiveness promotes ease of understanding and simplifies the
design and implementational tasks. Even the deletion of a requirement can contribute to a loss of
cohesion for a component. Using a database example, the negation of the modify requirement in
the case where both read and modify are initially specified can obviate the need for more
elaborate locking mechanisms applied to the read operations.
The adverse effects of requirements changes, be they additions, deletions or modifications,
potentially degrade the functional congruity of a software component. The more prevalent the
changes within the process, almost irrespective of stage, the higher the potential for loss of
cohesion. Changes that lead to loss of cohesion can prove especially damaging for testing, for
test procedures based on original specifications might include deleted sections or inapplicable
procedures caused by functional modifications.
Requirements changes exert a negative (inhibiting) effect on the OPA attribute Cohesion.
Measurement of the degree of the effect should include all three forms of change: addition,
deletion and modification. A particular project might choose to weight the forms differently, for
example asserting that deletions are not so disruptive on cohesion as additions. Weightings
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 47
based on the lateness of change might also appear warranted. The formulation employed here
includes no weighting based on form or time.
Cohesion / Requirements Volatility = 5 - 10 * (n + m) / N
where n = number of added and deleted requirements
m = number of modified requirements, and
N = number of original requirements + m + n.
4.2.1.4 Addition of Coupling
Requirements decomposition, following either a hierarchical or a functional perspective, are
adversely affected by changes. Addition or deletion of requirements can lead to data sharing and
communication that couples components beyond the degree necessary in the initial specification.
For example, the deletion of the requirement for sonar data to be passed from component A to B
might be accomplished by substitution of the source of that data within B without actually
altering the message contents sent from A. Consequently, A and B are coupled unnecessarily by
the unused sonar data items.
In some cases added requirements can strain the existing design boundaries, forcing an
accommodation through linkages that are neither natural nor efficient. Adding computation
without requiring new data is rare, but in such instances no increase in coupling occurs. The
more typical requirement change imposes additional data, as input or output, and the result is
often an added dependency that might have been avoided, or led to the avoidance of an existing
dependency, if that requirement had been stated properly or not excluded in the initial
requirements specification.
Coupling is the OPA attribute affected by requirements changes that force an expansion in
component dependencies or the addition of linkages among existing components. The effects
can be broad and subtle, creating a "ripple effect" moving through the software product (code
and documentation) like an epidemic. The later the change, the more likely it leads to
modifications of early design decisions, forcing changes to be made at higher levels of the
decomposition tree that filter down to the lower levels. The process that recognizes the effects of
such changes early can avoid compounding the unnecessary coupling through successive
branches in the decomposition tree.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 48
The decomposition tree provides the basis for the process indicator computation:
Coupling / Requirements Volatility = 5 - 10 * r / R
where r = number of lowest level components
affected by changes
R = total number of components in the
decomposition tree beginning with
the highest level component affected
Example: The decomposition tree shown below shows components experiencing requirements
changes in bold. The highest level component affected is &�; the total number (R) is 6; and the
number at the lowest level (r) is 3.
���������������������������������������������������������������&��
���������������������������������������������������������������_���?�
������������������������������������������������������&����&����&��
����������������������������������������������������������?��������������?�
��������������������������������������������������&�����&���&������&��
����������������������������������������������������������������?������������_���?�
������������������������������������������������������&�������&������&���&���
������������������������������������������������������_���?�������
��������������������������������������������&���&���&���
�
The value computed for Coupling/Requirements Volatility for this example is 0.
4.2.1.5 Disruption in Traceability
Traceability is the ability to follow the path of a functional capability or a non-functional
stipulation from the original requirements specification through successive levels of design to the
final implementation in code. Thus traceability is an attribute deemed exceedingly important by
the development management and even more important by the life-cycle support agent. Adding
or modifying requirements after the initial allocation (the System Design Review in DOD-STD-
2167A) forces changes in design documents, possibly in program design and code that must be
enabled and even mandated by the software development process. Such changes often
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 49
necessitate a backtracking from the current stage of development to prior stages. A natural
tendency is to forego the needed revisions of prior specification documents in the haste to move
on with changes. The consequence is often a disruption in the documented path from
requirements to their realization in code.
The first requirements baseline is established at the System Design Review (SDR), using the
terminology of DOD-STD-2167A. Allocation of new requirements after this point may lead to
incomplete or inaccurate documentation since the tendency is to find the least troubling point to
accommodate the added functionality. The deletion of requirements can prove even more
disruptive to traceability because the tendency is to drop the requirement at the point where it is
no longer applicable. The correct approach, which is to document a requirement deletion at both
the inception and termination points, preserving the design steps taken in the intervening period,
is likely to be viewed as "needless busy work."
Traceability loss should be weighted by two factors: (1) the significance of the requirements
change, and (2) the time period between SDR and the introduction of the change. Both are
reflected in the indicator formulation below.
Let s = significance of each requirements change and s � [ 0, 2 ]
then Traceability / Requirements Volatility = max [ 5 - s * 2p , -5 ]
{with p = 0 if prior to SDR, 1 after SDR but prior to SSR, 2 after SSR prior to PDR, 3 after PDR prior to CDR, and 4 after CDR.
4.2.2 Software Development Folders (Files)
The use of a software development folder that contains all data related to a software component
is a broadly recommended, if not universal, practice. Some organizations have opted for a
machine-readable version, referring to it as a "software development file." Actually, the file
version can be cumbersome unless all data, including notes, correspondence, meeting minutes,
instructions, etc. , are in machine-readable form. Thus, a loose-leaf binder often serves the
folder function. Unfortunately, graphical description, which is progressively preferred, can
prove troublesome with a binder or in machine-readable form. Figures or diagrams larger than
the standard page size can be accomodated in a fold-out but the result can be bulky. Reduced
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 50
photocopies can be difficult to read. Machine-readable file versions of figures or diagrams can
be difficult to comprehend and to translate if necessary.
4.2.2.1 An Historical Repository
The presence of a software development folder (SDF) indicates that someone has become
convinced of the utility of maintaining a current history of the component's development. To
serve well as a documented history, however , the SDF must contain either all data relevant to
the project or pointers (references) to where such data can be found. Sections of the SDF
typically correspond to the major stages in a development process. Changes in requirements
should be described together with the effect incurred in the processing of the change. All
significant events, such as design reviews, code walkthroughs, unit test completions, etc., should
be included. Project memoranda that affect the component in any way, schedule, cost or quality,
should be included. Project management data that affect the quality of the product should be
included, such as schedules, milestone charts, certifications, etc. The compilation of SDFs from
multiple projects to form an organizational quality database, with access structured to respond to
queries concerning quality issues and problems rather than project cocerns, is discussed in
Section 8.4.
While missing data exemplify the most common deficiencies in the evolutionary process,
inaccuracies and inconsistencies can prove the most sinister. A process that includes audits, as
well as self-checking, should eliminate these frustrating contributors to poor quality.
Measurement of the use of a SDF "simply" involves the documentation that all data is included,
adequately presented, and the document is used effectively. Missing elements, or the failure to
recognize the need for an element, reduce the effectiveness of a SDF. If a responsibility of the
software quality assurance organization is to audit the SDF, then compliance with that audit
requirement (both degree and frequency) needs to be examined. Attention must be given to the
critical points of product review and approval, but the measurement focus should be on the
process. For example, noting the outcome of a design review in terms of the number of action
items generated can reveal something about the quality of the product (a high number could
mean that deficiencies in product quality are reflected) or the quality of the process (a low
number could be evidence that the review activity is not functioning properly). Of equal
importance for measuring the quality of the process is that the resolution of the action items be
accomplished as specified and within the stipulated time requirements.
The example given above, measurement of review outcomes, extends to walkthroughs,
inspections, critical reviews, etc. Much of the interest seemingly is attached to the product, but
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 51
the more important issues surround the process activities and the degree to which they are
functioning as specified in the Software Engineering Manual or the project Software
Development Plan. Judgments of the process activities necessitate involvement; i.e. the
individual must be there at the time extracting the subtle, inexplicit data that cannot be obtained
retrospectively that is often possible with product measurement. Subjectivity cannot be avoided
in process assessment, both in the determination of the existence of a factor and the degree of its
effect. The example used below illustrates this point. 9
Visibility of Behavior / Use of Software Development Folders = max [5 - 6� C(j) , -5 ]
j = 1
with j indexing over the values defined in the table below
j C(j) Brief Description of Deficiency ----- --------- ---------------------------------------------
1 10 No SDF or equivalent
2 1 - 3 Late creation of the SDF
3 1 - 3 No log or entry requirement
4 1 - 3 Missing sub-component data (could apply to COTS or NDI)
5 1 - 5 Missing phase documentation (e.g., no preliminary design)
6 1 - 5 Missing test documentation
7 1 - 5 Missing verification or test results
8 1 - 3 Missing project management data
9 1 - 5 SDF contains inaccuracies
4.2.2.2 An Evolutionary Record
In addition to providing the source of all data related to a software component, the Software
Development Folder furnishes the links between the various evolutionary versions of the
software component and between various sub-components within a single version. The analogy
with a genealogical tree is useful here: the composition of a family in one generation is shown as
a horizontal description, but also the relations among generations are depicted in a vertical
representation. The horizontal view of a software component gives a "picture" at a point in time
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 52
-- a status description, but only when complemented with the vertical view does one have the
knowledge needed to understand the evolution of the component to the current state.
Traceability is an attribute that is a major concern for both developing and maintaining (life-
cycle support) organizations. While the SDF may contain all data to reveal the current status of a
software component, thus projecting a very high degree of currency, it needs to also provide a
complete description of the relations among the various generational data items. The data
supplied in the SDF to support traceability can vary depending on the methodology employed.
For example, the estimation of risk and the subsequent effect of the risk computations are key to
tracing the development decisions using rapid prototyping. The logging of significant events,
particularly related to sponsor (customer) interaction, can indicate much about the evolution of a
software component irrespective of methodology.
Especially important in the deployment (life-cycle support) phase of a system is the recording of
software measurement values. Tracing the effect of changes, stemming from functional,
corrective or perfective sources, provides a revealing picture of the effect of maintenance on
software quality. The continuation of high quality or significant increases in the quality of
deficient components should be a major goal of the Life-cycle Support Agent (LSA), and tracing
the effects of changes on quality can be a convincing argument for the value added by the LSA
organization.
Because of the differences in methodology, no specific formulation of an indicator is given as an
example. In developing an indicator, we suggest a factor scoring similar to that used above. If a
prototyping methodology is used, then the capability to explain changes based on customer
reactions is important. Risk estimation is also a major contributor to tracing the evolutionary
path of a software component. A "waterfall" methodology tends to place traceability data in the
results of reviews or inspections. However, communications between sponsor and the
developing organization should not be ignored. The scoring should be based on the presence of
such data, its completeness, and the documentation of its effect on consequent decisions,
particularly changes. Note that ancestral documents, or sources beyond the developing
organization in the case of COTS or NDI software, should be included if necessary or at least
referenced with an accessible location specified.
4.2.3 Software Quality Assurance Infrastructure
The concern for high quality software should pervade the project activities and persist
throughout the duration of development and deployment. To that end, a Software Quality
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 53
Assurance (SQA) group should strive in concert with and support of the project personnel.
However, quality should be goal number 1 for all involved. Important differences exist between
the quality concerns of the SQA group and those of development or maintenance personnel.
Such differences are not confined to schedule and budget issues affecting quality. In short,
product quality is a primary project goal but is a secondary goal of the SQA group. Process
quality is, and must be, the primary goal for the SQA group. The proper conduct of process
measurement must consider these differences.
The correct organizational location of the SQA group is reporting to a supervisor outside the
project management structure and preferably at a level commensurate with the top-level line
(project) management in the organization. However, at the working level, the SQA tasks must
be supportive of development or maintenance personnel, while assuring the integrity of the
process and searching for improvements in the process. The measurement activity must be
geared to these objectives as well. The quality assurance representative(s) should be viewed by
project personnel as belonging to the team. Achieving this view is not only the responsibility of
the organizational leadership but also the project leadership. A project manager should perceive
compliance with quality standards to be beneficial and convey that attitude openly and readily.
Software developers and maintainers should evince concern that the SQA role is discharged
properly at all walkthroughs, reviews, inspections, etc.
The major impact of the SQA infrastructure is to enhance the understanding of and appreciation
for the process definitional and procedural requirements: the visibility of behavior . The
indicator formulation shown below exemplifies the various factors lying within the SQA sphere
of influence that can enhance this visibility.
Visibility of Behavior / SQA Infrastructure is computed by:
loop j:= 1 to 10 SQA := SQA + Q(j) endloop SQA: = SQA - 5
with one point added for each of the quality factors defined below that is met:
Q(1): Software quality recognized as responsibility at project level
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 54
Q(2): Software quality recognized as responsibility within the organization.
Q(3): SQA responsibilities, etc. described in organization's software process documentation (Software Engineering Manual, etc.).
Q(4): SQA responsibilities, etc. described in project level documentation with clear identification of any deviations from organizational requirements. Q(5): A SQA group independent of the project manager exists within the organization. Q(6): SQA attendance at code walkthroughs, reviews, etc., is required and the requirement is observed. Q(7): SQA attendance at design walkthroughs, reviews, etc. (both preliminary and detailed) is required and followed. Q(8): SQA administration (direct) or audit (indirect) of the configuration manage- ment function is prescribed in organization or project description of the software development process. Q(9): SQA approval at one level of review is mandatory before a subsequent review can be scheduled. Q(10): SQA audit of Software Development Folders can occur at any time on request.
4.2.4 Process Stability
An obvious determinant of software quality is the stability of the evolutionary process that
creates the products. Process instability, clearly recognized as detrimental during development,
can be even more destructive during the life-cycle support phase because the investment of time
and training to gain productivity in the maintenance of a software system typically exceeds that
required for development productivity by a factor of four. (More needs to be learned concerning
an existing system, most likely created by others, than for a system being created.)
Measurement of process stability (or equivalently, instability) is by nature subjective in degree
(some assessment of the severity of the effect is required); however, the existence of the source
of instability should be devoid of subjectivity. Five potential sources of instability are readily
categorized by changes in:
(1) personnel responsible for a software component,
(2) target hardware platform(s),
(3) target language or language translator(s),
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 55
(4) software environment supporting the effort, and
(5) project and organizational management or policy.
This is an area of process measurement where the cause for quality degradation can be attributed
directly to persons outside the evolutionary process (those not having responsibility for design or
implementation). If the nature of instability is seen as a sensitive issue, we suggest that several
persons be polled, perhaps in questionnaire fashion, simply to identify the source(s). While
identification of source might be through a "committee decision," we recommend that the degree
judgment be made by an individual so as to promote consistency. Having the "committee"
provide explanation or justification in their source identification could render the degree
judgment more informed.
Instabilities can lead to an inordinate number of errors, making it difficult to detect and remove
all of them. Further, an unstable process can undermine the error detection capabilities
embodied in the process. The simple formulation of an indicator below allows severity
judgments in three weights for each of the five categories above. 5
Early Error Detection / Development Instability = 5 - 6�I(j) , �
������������������������������������������������������������������������������������������� j=1
{where I(j) = 0 with no change 1 with one change 2 with two or more changes
While this formulation of the indicator seems reasonable, or at least defensible, it can be attacked
as unreflective of the situation where one type of instability, say that caused by personnel
turnover, is so great as to undermine the process quality irrespective of other influences. Thus,
this example should be used with caution; it could prove inadequate.
4.2.5 Product Test
Verification, validation and testing are terms associated with assuring the "goodness" of the
product. The activities described by each term differ, and understanding the nature of these
differences is a prerequisite to effective measurement. Verification applies to assurance of the
accurate transformation from one program specification to another: all functions are preserved
and the degree of abstraction is sufficient to enable assurance that functional behavior is
appropriately described. Validation is the assurance that the product meets the specified
functional requirements. Notice that inspections, walkthroughs and reviews are typically
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 56
associated with verification; testing activities, particularly system acceptance testing is
associated with validation.
While entire books have been written about testing principles and techniques, little has been
published about measurement of the testing process for assessing the inherent quality. Certainly,
the existence of necessary test documentation such as a test model and a test requirements
guideline is indicative of an adequate understanding of testing needs. Additionally, an
examination of recognized artifacts -- a test plan, test specification document, test procedures,
and test results -- provides some potential for assessment of how well the process is being
executed. Integrating the assessment of testing adequacy with that of test effectiveness is a
relatively unexplored topical area.
Although we offer no example of an indicator for measurement of testing quality, we encourage
some assessment to be done. Current research in this area seems likely to produce beneficial
results.
4.3 Interpretation, Decisions and Actions
Measurement of the quality of the software evolutionary process is acknowledged as difficult in
the first paragraph of this section. A first estimate, which in reality is an assessment of process
potential, is provided by the SEI Capability Maturity Model (CMM) [Humphrey 1990]. Given
the proclivity for requiring a CMM assessment as part of contract competition, a value is likely
to be available. However, this is an organizational value that can be affected significantly by
factors specific to a given project. The Software Engineering Manual (SEM), also an
organizational document, provides another source of data for process assessment. The process
measurement guidebook produced by SEI [Florac et.al. 1997] is intended for on-going
assessment for management and improvement and the results of such assessments would be an
excellent initial source for project-related insight.
If a Software Development Plan (SDP) is required (and we believe that one is desirable), then the
SDP becomes the first project-specific data source. Process assessment based on CMM, SEM,
and SDP with no other documents is quite speculative. Clearly, the existence of values from on-
going assessments is a valuable resource. However, the absence of such data should not be a
cause for despair; the alternatives could help to identify major problems early when corrective
action is most easily applied and with the avoidance of late changes. The degree and force of
such actions need to be governed by the speculative nature of the data. From this point, project
deliverables provide more definitive data for process assessment, and hopefully the confidence in
decisions based on such data increases throughout the remainder of the project.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 57
5. PRODUCT MEASUREMENT: DOCUMENTATION
Measuring the quality of documentation, while a less formidable problem than the quality of the
process, offers a major challenge. Documentation measurement can be approached with varying
degrees of syntactic and semantic analysis. The simplest approach is to note whether the
document exists, irrespective of its contents. Examination of the contents leads to syntatic
considerations: structure, format, correspondence. Semantic analysis in the attempt to
understand and exploit meaning, especially using automated techniques, represents an
exceedingly difficult problem.
5.1 Motivation and Distinctions
Although the term "software" is often used as a synonym for "program," recognition that
documentation is a component as important as code has existed for decades. We consider
software to include both the executable specification (the code component, which includes
internal documentation) and the non-executable specifications (the external documentation)
necessary to generate a product that is both usable and useful. With that view, the assessment of
document quality is mandatory for predicting software quality. Internal documentation is
assessed in concert with code, for the quality of comment statements, block descriptions, and
header information is dependent on relationships with the execution behavior represented. The
assessment of external documentation quality must recognize characteristics that are somewhat
independent of the executable behavior and frequently far in advance of its specification. This
chapter explores the measurement of quality in external documentation.
That the quality of documentation is important needs no justification. Consider the external
documents produced and delivered for a major software system: requirements definition, design
specifications, test documents, system and user manuals, and the list seems to increase with time.
Clearly, the collective number of symbols in external documents exceeds that in executable form
by a wide margin. The absence of quality in these components of the system would impose a
tremendous burden on both developers and the future sustainers (maintainers). Such a condition
could even prove chaotic over time.
The measurement of document quality is motivated also by the need to predict quality at points
in the development process which still admit detection and correction of problems. Changing
and ambiguous requirements, imprecise design specifications, or incomplete test procedures lead
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 58
to incorrect, costly and unmaintainable programs. Eliminating problems early reduces the effort
and higher cost of removing them later.
5.2 The Characteristics of High Quality in Documentation
The question, "What characterizes high quality in technical documentation?" could be answered
simply as anything that promotes achievement of one or more of the seven objectives identified
in Chapter 2. However, that answer only begs the question repeated once for each of the seven
objectives; e.g. What in a given document promotes the achievement of maintainability (or
correctness, reliability, portability, etc.)? Consequently, answers that convey more instructive
guidance are needed.
5.2.1 Principles Leading to High Quality
From the Objectives level, we move to the Principles level, and here Concurrent Documentation
assumes a major role. Concurrent Documentation embodies the management of document
creation throughout the software evolution so that at any arbitrarily selected point the document
set would present a faithful representation of the product status. The principle can be refined
into two subprinciples:
• Currency: the updating of documents to assure that changes made during development or
maintenance are captured as they are made. • Controlled Augmentation: the exercise of strict control over document modification.
Abstraction plays the second major role in document production. This centrally important
principle in its application also takes two subordinate forms:
• Formal Organization: application of a standard or guideline to the production of the
document set.
• Reification: the treatment of abstract components as reality in themselves.
The second subprinciple above, described by Lehman [1993], is relatively unknown as a term,
but the application is quite familiar to practicing software engineers. Reviews and inspections of
design documents, for example, are purposed toward understanding the behavior of a software
system created as specified to detect ambiguities or mistakes that can arise in a subsequent
specification.
Following the OPA Framework, employment of the principles above induces attributes desirable
in specifications generated in the evolving software product.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 59
5.2.2 Attributes of High Quality Documentation
Among the nine attributes defined in the OPA Framework, three are central in the measurement
of document quality: Readability, Ease of Change, and Traceability. Readability can be
attributed to two characteristics: Recognizability, a physical property, and Comprehensibility, an
issue encompassing meaning, representation of meaning and clarity of expression. The former is
much simpler to assess than the latter, a claim that is quickly appreciated as Document Quality
Indicators (DQI's) are defined.
The physical characteristics of a document include such things as: type and size of font, format,
medium, etc. Such characteristics might be classified as message-independent; i.e. unrelated to
the meaning in the writing but conveying the ease with which the written message is recognized.
In contrast with Recognizability, the Comprehensibility of the document depends on non-
physical characteristics, typically associated with style, organization, use of figures and tables,
proper grammar and structure, choice of vocabulary, and other features that assist a reader in
understanding the message.
Ease of Change and Traceability can be applied jointly and explained more fully with respect to
Recoverability. The ability to retrieve data quickly without limitation to sequential search is
included in Accessibility. Locational aids such as indices, references, and a table of contents
promote this ability. Poor accessibility inhibits the capability for tracing relationships, either
within or among documents. Similarly, effecting changes throughout the document set is
impeded in the absence of such aids.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 60
Accuracy can be easily confused with Currency. However, the latter reflects the expeditious
manner in which changes to documents are made; a change at one specification level should
precede consequent changes at other levels by a minimum lag time. The implication of time is
not present with Accuracy, which embodies either the correctness of a value or the difference
magnitude between two values.
The degree to which all needed information is provided embodies the attribute of Completeness.
Missing sections, unresolved references and partial descriptions are examples of deficiencies in
Completeness. The argument that incompleteness is nearly impossible to detect until post-
deployment is frequently heard; however the verification, validation and testing procedures
should have this purpose.
Consistency is the invariant use of definitions, concepts and values within and among all
members of the document set. As the project matures, Consistency tends to become increasingly
more difficult. Controls over software changes through project standards, adherence to a
methodology, and configuration management procedures are techniques for minimizing
inconsistencies.
Recoverability expresses the capability for reconstructing a specification path. Such a path could
be in a generational activity (development or maintenance) but not yet in executable (coded)
form or could exist in varying degrees that support or inhibit traceability and ease of change.
Illustration of the principle and attribute relationships for application of the OPA Framework to
documentation is shown in Figure 5.1. Examining this figure, especially in comparison with
other published works on documentation, might cause one to ask where "usability" is placed.
The usability of documentation is often cited as a primary goal or characteristic. Unfortunately,
the preoccupation with usability has reached a point where no consensus on the meaning of the
term can be reached. Moreover, we contend that usability must be an attribute, not an objective,
and the attributes identified above: accessibility, accuracy, completeness and consistency taken
together subsume usability.
5.3 Measurement Approaches
Assessment of document quality is complicated by the inherently subjective nature of some key
characteristics, e.g. comprehensibility and completeness. Nevertheless, a program that lacks
document measurement cannot hope to support predictive correction prior to code production.
We view this purpose as central to the measurement role.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 61
Approaches to the measurement of document quality can take three forms: (1) manual, (2)
automatic, or (3) combined (hybrid). While no completely automated approach has been
published, the acceptance and use of highly structured document standards, such as DOD-STD-
2167A, enables the feasible application of automated techniques. The automation of document
measurement, at least in part, is necessary to extend measurement to the complete document set.
Manual techniques must be restricted to samples from the document set; otherwise, the cost is
prohibitive. The combined manual and automatic approach is probably the most feasible
approach for the near term.
Documentation measurement has three goals: (1) objectivity, although some degree of
subjectivity is inevitable, (2) representation, the sample documents should accurately reflect the
characteristics of the document set , and (3) efficiency, too costly a procedure is likely to be
abandoned. Developing or revising the measurement program component for documentation
should proceed with an admission that the result must be a compromise dictated by the relative
importance of each of these goals.
** D
R A
F T
** N O
T F
O R
D IS
T R
IB U
T IO
N O
R A
T T
R IB
U T
IO N
** D R
A F
T **
6 2
.
Currency Controlled Augmentation
Concurrent Documentation
Figure 5.1 Principle and Attribute Relationships for Document Quality
Maintainability Correctness Reusability TestabilityPortability AdaptabilityReliability
Formal Organization
Reification Abstraction
Accuracy Completeness ConsistencyAccessibility Recoverability
Ease of Change Traceability
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 63
5.3.1 A Manual Procedure
A manual procedure where the evaluation is rendered as a single value, such as a real value in
the interval from -5 to +5, or one from the set [ poor, fair, good, excellent ], can provide some
recognition of potential problem spots when used to compare document components. However,
the procedure is not very helpful in suggesting how improvements can be realized. If a manual
procedure is employed, then the identification of explicit criteria on which document quality are
judged is preferable to the single-value labeling. The criteria suggested by the OPA Framework
are described above and summarized in Table 5.1. For each criterion (document property), an
indicator should be defined that can be employed in a human examination. Examples are given
in Table 5.2.
Attribute in OPA Framework
Document Attribute Property Examined
Physical condition of components (poor legibility, too small font size)
Ability to convey meaning clearly
Needed documents can be consulted expeditiously
Minimum lag time between changes at all representation levels
All needed data items and realtionships are represented
Invariant use of all definitions, concepts, etc.
Capability to reconstruct a specification path
Recognizability
Comprehensibility
Accessibility
Accuracy
Completeness
Consistency
Recoverability
Readability
Ease of Change
Traceability
Table 5.1 Criteria for Measurement of Document Quality Suggested by the OPA Framework
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 64
The examiner should review the selected documents with a checksheet, recording a score for
each indicator. (Note that a Document Quality Indicator (DQI) is an attribute/property pair in
conformance with the definition given in Section 2.) Use of or conversion to the -5 to +5 scale
of measurement is necessary if the attribute values are to be propagated upward to form principle
and objective computations.
Documentation Attribute Documentation Property Property Measurement
Accessibility Accuracy Completeness Comprehensibility Consistency Recognizability Recoverability
Completeness of Table of Contents (TOC) Code Utilization Documentation Format Appropriateness Invariance of Concept Adequacy of Print Reverse Path Reconstruction
Degree to which all important topics are contained in TOC Degree to which existing code components are necessary to fulfill design specifications Degree to which software components required by standard or project guidelines are present Suitability of presentation style or layout (use of charts, graphs, tables) Extent to which the meaning of each requirement is preserved through all specifications to date Degree to which physical display techniques reflect knowledge of human needs and limitations Extent to which claimed functionality of software component can be substantiated in prior specifications
Table 5.2 Examples of Documentation Quality Indicators (DQI's) (DQI = Document Attribute / Document Property)
5.3.2 An Automated Procedure
The DOCALYZE document quality analyzer demonstrates an automated application to a subset
of the properties created in the seminal study. The application of DOCALYZE requires that: (1)
the documentation deliveries follow DOD-STD-2167A and (2) the implementation is in Ada.
While only 11 of the 32 proposed DQI's could be automated and only ten prove usable, the
ability of that subset to match the assessment of expert assessors is demonstrated, giving
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 65
encouragement to the prospect for totally automated analysis [Dorsey 1992]. Table 5.3 shows
the 11 indicators included in DOCALYZE to give an indication of the extent to which automated
analysis can be employed.
Accuracy with which index cites locatios of terms
Use of alphabetical ordering format
Degree to which necessary terms are contained in the glossary
Proportion of proper (needed) references
Accuracy of section titles appearing in TOC
Accuracy of TOC in citing section locations and titles
Degree to which important topics contained in documentation are included in TOC
Proportion of missing references Frequency of TBD/TBS use within the document
Degree to which acronym use is well-defined and consistent
Degree to which key words are used consistently throughout a document set
Locational Accuracy of Index
Order of Glossary
Glossary Completeness
Appropriateness of References
Correctness of Table of Contents (TOC)
Locational Accuracy of TOC
Completeness of TOC
Missing/Incorrect References
To Be Defined/To Be Specified Frequency
Acronym Usage
Keyword Context Consistency
Accessibility
Completeness
Comprehensibility
Documentation Attribute
Document Property
Property Measurement
Table 5.3 Document Quality Indicators Automated in DOCALYZE
The ability to automate indicators related to specific attributes more than others is apparent in
Table 5.3, where seven of the 11 pertain to Accessibility. The rather "mechanical" nature of
Accessibility makes it more amenable to automated analysis than Recognizability for example.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 66
A deficiency of the current DOCALYZE prototype -- no analysis of inter- document properties --
must be removed to address Consistency and Recoverability in a satisfactory fashion.
The approach taken in DOCALYZE is statistical in nature. Frequency counts form the values of
selected indicators. An alternative approach that could be followed is to build an expert system
that would apply automatically the criteria characterizing document quality in the opinion of a
human documentation specialist. One caution here is that the predictive purpose of document
quality assessment mandates that the expert's judgment be validated; i.e. the expert demonstrate
the ability to recognize documentation characteristics that do affect software quality based on
post-deployment results.
5.3.3 A Combined Approach
Progressing to fully automated analysis of document quality should be a goal; but the hurdles to
achieving this goal are not insignificant. We advise a strategy framed within practical
limitations:
• Identify from the literature, or define based on experience, those document characteristics
which capture quality as your organization understands and values it. • Following the systematic procedure in Section 3.3, construct Document Quality
Indicators for those characteristics which admit definition. (Do not force entries in the definitional template; the current understanding of some characteristics is not sufficient for a DQI definition.)
• For each DQI, assess the importance of the indicator in conveying the quality of the
document (high, moderate, low) and estimate the effort (time and/or dollars) to achieve automated application (high, moderate, low).
• Examining the value pair (importance, effort) for each DQI, consider first those with
(high, high) for inclusion in the manual procedure. Follow with those with (high, moderate), then (high, low), adding indicators for manual use when possible.
• Consider the resulting set of DQI's in the manual procedure. If the set is felt sufficient
for quality prediction, then establish the set as the standard or base. If the set is too large, reduce the set by deleting DQI's with value pair (high, low), followed by (high, moderate), then (high, high) if necessary. If too few indicators comprise the set, then repeat the examination above, beginning with the (moderate, high) values and progressing downward in effort.
The intent of the above strategy is to develop a manual procedure, which will require training to
apply, while at the same time laying out a plan for progressive automation of the DQI
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 67
measurement. Thus, having completed the manual procedure set of DQI's, the excluded
indicators should be reviewed to develop a schedule for automating those deemed necessary.
The basis for selecting DQI's in the above strategy should assure that members of the manual set,
if candidates for automation, are among the last. The strategy preserves a stable manual
procedure for as long as possible.
5.4 Interpretation, Decisions and Actions
The interpretation of DQI values, especially in their integration with the values of process
indicators, is highly dependent on the maturity of the organization, in particular in its history
with software measurement. Limiting the discussion to document quality, an initial exercise
should be to identify those problems which appear to be most prevalent in causing low or
unacceptable DQI values. Corrective fixes are then developed for these problems, including
example excerpts reflecting the absence and presence of the attribute. This instructive material
should be included in a specific chapter of the Software Engineering Manual devoted to
document design, generation and review. If the writing problems are particularly widespread,
this chapter or an expanded version could be created as a separate pamphlet readily accessible
during process activities in which documentation is created.
5.4.1 Beginning a Document Quality Measurement Program
Document quality measurement, if preceded by process measurement, initiates with the first
qualified deliverable (likely to be the requirements document) and begins the transition to
product-oriented assessment augmenting the process measurement. Some variation in values for
process and document indicators should be expected, especially with a new measurement
program. Such variation could cause the conclusion that the two are measuring quality in very
different ways. Despite expecting some variation, an examination of the process and the
products delivered to that time is in order. Potential problems (effects) seen in indicator values
are listed along with the possible causes in Table 5.4. The suggested order of check is simply a
guess based on which causes seem more likely, lacking any project-specific information.
A distinct advantage of the use of Software Quality Indicators within the OPA Framework is the
potential for confirming information gained through the application of multiple indicators.
When indicator values, grouped in pairs or triples, do not relate to support an interpretation, then
the measurement procedure should be examined to assess the cause of the unexpected results.
That our intuition is sometimes faulty should be acknowledged, but assurance that such is the
case is a recognized responsibility.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 68
5.4.2 Integrating Process and Document Measurement
Augmenting the information provided by Process Indicators with DQI results gives a much
stronger basis for decisions regarding the quality prediction. Software components with notably
lower values need to be examined closely. As the document products continue to be generated
the information support becomes increasingly stronger. Decisions on inspection* schedules
should consider the information rendered by the SQI's; re-inspections may be warranted if low
quality persists. Regressive inspections, i.e. inspection of previously inspected specifications to
include corrections and improvements, should be mandatory actions. If software quality
measurement is to derive the maximum benefit for the project, then the predictive capability of
both process and document indicators must be employed in the decisions made.
Table 5.4 Checklist for Diagnosing Possible Problems with Document Quality Measurement
Large variations in a single indicator over the set of software components Large variations among Process Indicators for single software component Document Quality Indicators score much higher than Process Indicators Document Quality Indicators score much lower than Process Indicators Indicators appear to cluster in the interval (0.4, 0.6)
Different experience levels among component creators Varying knowledge of methodology Differing experience levels with environment tools (utilities) among component creators Inadequate training for some personnel Process quality may not be reflected adequately by Process Indicators DQI values appear inflated by low denominator values (component size typically) Process Indicator may not apply because of methodology or project guidelines Natural tendency is for SQI's to cluster near "0"
a,b,c b,d,a f,g,e c,e,d g,h
Problems (Effectswith SQI's
SuggestedOrder of Check
Potential Cause of Problems (Effects)
1.
2.
3.
4.
5.
(a)
(c)
(b)
(d)
(e)
(f)
(g)
(h)
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 69
That differences exist among software-intensive projects is undeniable. The choice of
methodology for software evolution can lead to differences. Yet, those differences may exert
little influence on software quality because of the approach taken by the OPA Framework.
Functional influences are minimized; the focus is where it should be: the engineering of high
quality software. Consequently, we believe in maintaining project measurement histories that
promote the comparison of current values with past results. The key consideration for inter-
project comparison of measures is to avoid the cases where other factors render the comparisons
meaningless. Comparisons of indicator values within a project are obviously a plus, but
comparisons between projects, with some attention to avoiding the "poor mixes," can be
extremely informative. The project measurement history for software quality, i.e. the quality
database described in Section 8.4, should be regarded as a corporate asset, and used in the
planning and management of subsequent projects.
6. PRODUCT MEASUREMENT: CODE
Developing measures of software (code) quality has been a long-term challenge in computer
science and software engineering. A survey of articles in the literature reveals that many metrics
are available for measuring software. Some well documented metrics include Halstead's
Software Science [1977], McCabe's Cyclomatic Number [1976] and Henry and Kafura's
Information Flow Metric [1981]. A major criticism of using any one of these metrics in isolation
is that it provides a snapshot of only one particular aspect of code quality. For example,
McCabes's Cyclomatic Complexity reports the number of paths through a program. While this
particular characteristic is certainly related to program quality, it says nothing about the many
other contributors to code quality, e.g., inter-module coupling, the use of structured
programming constructs, code commenting, and so forth. Even if one employs a collection of
existing metrics, each developed in relative isolation and narrowly focused on particular
characteristics, drawing a conclusion from disparate measures is indeed difficult.
The objective of this section is to outline and explain one process for identifying and defining
measures of code quality. In the following subsection a systematic procedure for deriving
measures based on quality attributes is presented. This procedure outlines a process for: (1)
identifying language structures supporting the development of code which exhibits quality
characteristics, and (2) formulating measures that reflect the intended benefits of including such
structures in the language. Section 6.3 presents an overview of what are considered to be the
attributes of code quality. The outline includes the description of an abbreviated, yet structured,
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 70
indicator derivation for each attribute. Finally, the last subsection presents a discussion that
focuses on the distinctive capabilities and limitations in measuring code quality.
6.1 Indicator Distinction and Derivation
Section 3.3.1 introduces a systematic procedure for defining software quality that is applicable to
the definition of process, documentation and code quality indicators. The following set of steps
are intended to focus that process on the definition of code quality indicators. In particular, the
steps presented below build on those outlined in Section 3.3.1 by incorporating additional
information conditioned on the specific implementation language. The first three steps describe
how one uses a language definition and rationale to identify potential structures supporting
software quality engineering. Steps 4-5 extend the systematic procedure described in steps 4-7
in Section 3.3.1 demonstrtating the specific application to Ada. All combined, the steps outlined
below justify, strengthen and further substantiate the utility of quality indicators based on
definitive relationships between software engineering attributes and observable properties of the
code. Although we describe the procedure for Ada, the steps are easily generalized to
accommodate any structured, imperative language.
Step 1: Identifying, Categorizing and Classifying Crucial Language Components
The initial task in defining a procedure for assessing the quality of a language product is to:
• identify those language components deemed necessary and crucial to the assessment process, and then • formulate a categorization scheme that permits a language to be examined at both the individual and aggregated component levels.
In concert with this approach, the initial categorization scheme employs partitioning criteria
proposed by Ghezzi and Jazayeri [1982] and Wichmann [1984]; that is, the partitioning of
language components along specific functional boundaries. In particular, an Ada program can be
viewed as possessing data types, statement level control structures, and unit level control
structures:
• Data Types
- Strings - Record Discriminants
• Statement Level Control Structures
- Partial Array Assignments - Exit Statements
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 71
- Named Loops with Exits - Block Structures
• Unit Level Control Structures
- Subprograms Default Parameters, Name Overloading, Parameter Passing
- Packages Specification Body
- Generics - Tasking
Concurrency Specification - Exception Handling.
We recognize that the above categorization does not cover all Ada-specific language
components, but stress that the intentions are to examine only those that are most prominent from
a software engineering perspective. Bundy [1990] offers a more detailed explanation of
identifying, categorizing, and classifying Ada language constructs with respect to software
quality assessment within the OPA Framework.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 72
Step 2: Understanding the Rationale for Component Inclusion
Before employing code analysis as part of a software quality procedure, one must acquire a firm
understanding of why particular language constructs have been included in a language
definition. In some cases, the rationale might simply be that a specific capability is needed, e.g.,
looping. From the perspectives of software engineering and software quality assessment,
however, the rationale for including constructs like generics, packages, and block structures that
are purported to support desirable product design and development capabilities is a particular
focus. For Ada, the language designers have provided the Rationale for the Design of the Ada
Programming Language [ADAR84]. Published papers describing research and development
efforts and books describing usage techniques provide additional insights into the proposed uses
of Ada language components. Using packages as a representative example, the next paragraph
outlines the type of information used in synthesizing an adequate understanding for including
particular language elements in the definition of Ada.
According to [ADAR84] packages are one mechanism through which the programmer can group
constants, type declarations, variables, and/or subprograms. The intent is that the programmer
uses packages to group related items. From a software engineering perspective, this particular
use of packages is appealing because it promotes code cohesion [Ross 1986]. Packages are also
a powerful tool in supporting the specification of abstractions. The ability to localize
implementation details and to group related collections of information is a prerequisite for
defining abstract data types in a language. Again, from a software engineering perspective, the
capability to specify abstract data types and to force the use of predefined operations to modify
data structures promotes reliability, portability, and maintainability.
Step 3: Assessing Component Importance from a Software Engineering Perspective
To exploit the OPA Framework one must determine each individual component's contribution to
the achievement of desirable software engineering objectives, its support in the use of accepted
software engineering principles, and/or its ability to induce desirable software engineering
attributes in the resulting product. Important in the OPA Framework is the impact of a
component on product quality -- it can be beneficial or detrimental. For example, operator
overloading generally enhances program readability [Wichmann 1984a, Ghezzi and Jazayeri
1982]. If used indiscriminately, however, it can have the opposite effect [Ghezzi and Jazayeri
1982 ].
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 73
From an Ada standpoint, the literature abounds with citations attesting to the "software
engineering goodness" of Ada language constructs. In particular, Ada packages are extremely
important in achieving a high quality software product. Ada packages support four definitional
abstractions: named collections of declarations, subroutine libraries, abstract state machines, and
abstract data types. One particular abstraction, abstract data types, is fundamental to supporting
the software engineering principle of information hiding [ADAR84]. That is, packages defining
abstract data types provide the type declaration for an abstract data type and methods for
manipulating the data type. What is hidden from the user is the sequence of coded instructions
supporting the manipulative operations. Also, the user is forced to modify the abstract data type
through the specified operations. This form of information hiding is particularly beneficial when
maintenance is required because it tends to minimize the "ripple effect" that change can have.
As also discussed by Booch [1983, 1987], packages are crucial in supporting modularity,
localization, reusability, and portability, all of which are highly supportive of software
engineering objectives.
Step 4: Identifying the Impact of Component Usage on Desirable Software Engineering
Attributes
In the third step described above language components are associated with rather abstract
software engineering characteristics such as maintainability, reliability, information hiding, and
modularity. The fourth step is to identify: (a) how each language construct can be used (or
misused) during software development, (b) which attributes are affected, and (c) how they are
affected. Within the OPA Framework the fourth step is crucial because it relates the use of each
language construct to the impact it has on one or more of the (less abstract) software engineering
attributes. This fourth step is illustrated below by considering the impact of packages relative to
selected software engineering attributes.
As a basis, we examine the four proposed uses of packages in linking package properties to
software engineering attributes. For example, packages that contain only type declarations
indicate code cohesion [Ross 1986]. The other three proposed uses are packages to define
abstract data types, packages to define abstract state machines and packages to define
subprogram units. Although all four of these uses induce desirable attributes in the developed
product (see [Gann, Katz and Basili 1986, Embley and Woodfield 1988, Booch 1987],
respectively), the improper use of packages can also have a negative impact on the desirable
product attributes. For example, the use of packages to group type declarations has diminishing
returns when too many type declarations are exported. This misuse hinders ease of change
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 74
because program units must be unnecessarily checked for possible impacts caused by changes to
declaration packages.
Consider as a detailed illustration of the above, the use of packages to define abstract data types
(we refer to such packages at ADT packages). The benefits (relative to the inducement of
desirable software engineering attributes) of ADT packages are enhanced cohesion (functional
and logical), a well-defined interface via the ADT, and enhanced ease of change for program
units "withing" the ADT package. The improved cohesion results from the grouping of the
ADT declarations and access operations within one package. A well-defined ADT interface is
achieved by using the package specification to house the subprogram specification for each ADT
and then using private or limited private types to restrict access to the ADT. From a different
perspective, because of the capabilities provided by packages, the use of ADTs has additional
beneficial effects in terms of reduced code complexity and improved readability. Without
further elaboration, it suffices to say that the definition of ADTs through packages embraces the
use of abstractions that hide superfluous details from the ADT user.
Step 5: Identifying Properties, Defining Indicators, and Formulating Measures and Metrics
The fourth step of the metric development procedure describes the impact that component uses
and abuses have on the software attributes. Step 5 identifies and formally links product
properties (language elements) to software engineering attributes, and building on the
relationship between the attribute/property pair, defines a measurement approach, supporting
metric(s) and an indicator. These activities are considered as a single step in the derivation of
code indicators because they are so intrinsically related. To illustrate Step 5 of the development
procedure, the remainder of this section focuses on the identification of properties indicative of
the presence of the attribute cohesion relative to the use of packages in defining groups of
subprograms.
With attention focused on a single attribute, the process begins with the identification of
properties that indicate the presence or absence of that attribute. In the cohesion example, the
task is to identify characteristics that a cohesive package would exhibit. One such characteristic
is the utilization of subprograms defined within a package. In particular, each program unit that
“withs” the package of subprograms utilizes a percentage of the subprograms. A very low
utilization suggests that the subprograms grouped by the package are not as closely related (or
functionally cohesive) as they should be. A very high utilization suggests that the subprograms
are closely related or functionally cohesive.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 75
The description presented in the previous paragraph suggests: (a) selecting an attribute, (b)
identifying the property (or properties) that supports an evaluation of the selected attribute, (c)
formulating a measurement approach, and (d) defining the metric and corresponding indicator.
In particular, the attribute/property pair is the “definition of packages that export subprograms
relative to its positive impact on code cohesion.” Hence, to effectively measure the cohesiveness
of packages that export subprograms, one must examine the utilization of the subprograms by
“withing” units. Intuitively, if the subprograms are sufficiently related, any unit that “withs” the
package should use a majority of the subprograms. The indicative metric, calculated on a per
package basis, is given with the following formula:
• "Withs" to a Sub Package
Sub Package Utilization
= package subprograms
referenced
(total # of "withs") * (# of subprograms in the package specification)
(Note: Sub Package refers to a package that exports subprograms)
The associated indicator is:
COH/DPES = 5 * Sub Package Utilization.
In summary, the OPA Framework provides a formal basis for defining a software quality
assessment procedure that includes both code and document products. We have focused on Ada
for the examples of quality in code components. Using the set of steps described above we have
defined 66 Ada code indicators: eight are based on data type information, 12 exploit properties of
statement level structures, and 46 reflect characteristic assessments of unit level constructs such
as packages, subprograms and so forth. (For additional detail see [Bundy 1990). For all 66
indicators, a prototype Ada code analyzer (Adalyzer) and report generator (RGEN) provide the
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 76
necessary automation for code analysis, data collection, indicator computation, and reporting of
results.
6.2 Attributes of Code Quality
Producing code that is maintainable, reliable and/or reusable is a highly desirable goal.
Nonetheless, because such concepts represent abstract amalgamations of many distinct quality
characteristics, directly measuring their existence is often difficult, if not impossible. For
example, we know that the use of structured data types promotes maintainability through code
cohesion [Conway, et. al. 1976]. On the other hand, if those same data structures are used as
parameters in procedure calls, excessive inter-module coupling is likely to be introduced through
the passing of extraneous information that is part of the data structure but unused by the called
procedure. This use of structured data types has an adverse impact on maintainability [Troy and
Zweben 1981].
The paradox illustrated above is a direct reflection of trying to measure quality relative to a
concept that is too abstract. Note, however, that the beneficial impact of using structured data
types is related to a cohesive quality imparted to the code. Similarly, but with a negative effect is
the adverse impact of using structured data types that introduces unnecessary inter-module
coupling. If one views the impact of structured data types relative to these attributes of code
quality rather than project-level objectives, no paradox exists.
The following subsections enumerate and discuss those code attributes that support product
quality assessment from an individual module perspective. The work by Dandekar [1987] and
Bundy [1990] serves as the basis for the discussion.
6.2.1 Cohesion
Cohesion is defined to be the degree to which the tasks performed by a single program module
are functionally related. A cohesive module is a collection of statements and data items that are
treated as a whole because they reflect actions and data stores focused on the proper execution of
a single function. Through linkages to the software engineering principles of abstraction,
information hiding and stepwise refinement, cohesion can be traced as one contributor to the
achievement of project-level objectives such as maintainability, reliability and reusability.
Although seven types of cohesion have been identified [Stevens, Myers and Constantine 1974],
three are most prominent in reflecting the degree to which a module is cohesive: functional
cohesion, sequential cohesion and communicational cohesion. A functionally cohesive module
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 77
is one in which every processing element is an integral part of, and is essential to, the
performance of a single function. Sequential cohesion is characterized by distinct processing
elements within a module that combine to form a linear chain of successive or sequential
transformations of the data. On the other hand, modules which exhibit communicational
cohesion possess code constructs which simply share information in the process of computing an
intended function. In terms of module engineering, functional cohesion is the most desirable,
followed by sequential and communicational cohesion, respectively.
Understanding the implications of cohesion relative to the achievement of project-level
objectives is crucial. Just as crucial, however, is establishing a firm understanding of how the
various types of cohesions differ and are reflected in module code -- this understanding provides
the basis for identifying measurable code properties attesting to the presence or absence of
module cohesion.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 78
6.2.1.1 An Indicator of Cohesion: The Use of Block Statements
Block statements, e.g. begin/end , while and for are control structures that are provided by
most imperative languages. By binding code together through the use of blocking statements,
code cohesion is enhanced. This conjecture is supported by the observation that statements
enclosed in block structures usually define specific functions (or sub-functions) related to the
task of the encompassing module.
To measure the extent to which the use of block statements promotes code cohesion, we assume
that the ideal module has all of its code enclosed in blocking structures, and then measure that
percentage actually placed within such structures. In an attempt to adjust for artificial (or
unnatural) cases where: (a) significant amounts of code are enclosed in a single blocking
statement, or (b) small amounts of code are enclosed in many block structures, we utilize an
additional normalization component based on the expected number of block structures (derived
from an average across all modules).
In defining the metric we have elected to measure only code that is enclosed in the outermost
(level 0) blocking structures. Effectively, all nested blocking structures and their respective code
are counted as single lines of code within a single blocking structure.
Metric definition:
Let BS denote Block Structure at Level 0
Avg SLOC per BS =
Total SLOCs in all modules
Total number of BSs in all modules
Expected Number of BSs = Total SLOC in Module
Avg SLOC per BS
Metric Formula = Total SLOC Enclosed by BSs
Total SLOC of Module
Expected Number of BSs
Number of BSs in Module *
Proposed Indicator:
COH:UOBS = -5 + 10*(Metric Formula)
In the above computation we do constrain the value of the metric formula to be between 0 and 1.
Consequently the range of the indicator value is between -5 and 5 with 5 denoting the highest
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 79
cohesion and -5, the least value for cohesion. A 5 is achieved if the module under consideration
has at least the number of expected block structures at level 0 and all of its code is enclosed by
those block structures.
6.2.1.2 Other Indicators of Cohesion
Additional indicators of cohesion include:
• Use of switches as parameters (-) • Use of controls structures (+) • Multiple entry points in a module (-) • Multiple exit points from loops (-) • Multiple exit points from modules (-) • Number of calling modules (+) • Excessive number of called modules (-) • Modularization (+) • Definition of declaration packages (+) (Ada specific) • Definition of packages that export subprograms (+) (Ada specific).
Indicators labeled with a "+" signify that they denote a beneficial (or positive) impact of a
property on an attribute. Conversely, a "-" inidcates a negative impact.
6.2.2 Complexity
We define complexity as the degree or complication of a system or system component,
determined by such factors as the number and intricacy of interfaces, the number and intricacy
of conditional branches, the level of nesting, the types of data structures, and other system
characteristics. In some sense complexity is an abstract measure of work associated with
understanding a software component. All participants in software development are subject to
mistakes - programmers, analysts and designers. The reason behind their mistakes is the
significant complexity of the proposed problem. Or, from a different perspective, mistakes occur
because of the limited capacity of humans to understand complexity.
Factors affecting complexity are: (a) the amount of information that must be understood
correctly, (b) the accessibility of information, and (c) the structure of information [Yourdon and
Constantine 1978]. The "amount of information" corresponds directly to the number of
statements or arguments that are presented to the software engineer at one time. For software,
this factor is related to the size of a program module. The "accessibility of information" refers to
the availability of information about a software component that enhances the understanding
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 80
necessary for writing or interpreting the code correctly. For example, code comments and
development documentation reduce module complexity. Finally, "structure information"
captures the impact that presentation format has on complexity. For example, code presented in
a linear fashion, rather than nested, is less complex. Similarly, information is less complex if
represented in a positive form rather than in a negative one. Both of these concepts are directly
applicable to writing and understanding code.
Because the attribute of complexity is the most difficult to define, for illustrative purposes we
include instances of each of the three factors in its definition. This inclusion also underscores the
absolute necessity of having a firm understanding of: (a) what constitutes module complexity
and (b) what measurable module properties indicate reduced or excessive complexity.
6.2.2.1 An Indicator of Complexity: Mixing the Order of Parameters within a Call
Statement
Within languages like Ada the programmer has the option to pass parameter values according to
defined position or by naming the parameter value to correspond to the formal name in the called
module. The use of named parameters is beneficial when one elects to employ default values
defined by the called module. This capability can be abused, however, if the programmer elects
to pass parameters in a sequence that differs from that defined in the formal parameter list of the
called module. The reordering of parameters is possible through the use of named parameters.
Nonetheless, such decisions contributes to confusion and incomprehensibility, and thereby, add
to the complexity of the module and to the overall program.
The proposed measurement approach examines each call statement with a given module and
compares the order of each parameter list with the ordering specified by the formal argument list
defined by the called module. If any of the parameter lists differ from the defined formal
argument list, then complexity is being increased. This measure is to be tempered relative to the
total number of call statements in the module.
Metric Definition:
Proportion of Call Statements with Reordered Parameters=
Total Number of Call Statements with Parameter List whose Ordering Differs from the Corresponding Formal Argument List
Proposed Indicator:
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 81
COM/MOPCS = -5 * Proportion of Call Statements with Reordered Parameters
Note that the range of the proposed indicator, COM/MOPCS, is between -5 and 0. This range
reflects the fact that the presence of call statement with reordered parameters can only contribute
to the complexity of a module. Effectively, the absence of such call statements, i.e., an indicator
value of 0, contributes nothing to enhancing or diminishing module complexity.
6.2.2.2 Other Indicators of Complexity
Given the attention directed toward program complexity, the various forms suggested and the
numerous metrics advocated, the recognition of many properties that can be tied to this attribute
is not surprising. No doubt, others beyond those identified in our work could be selected.
Additional indicators of complexity include:
• Use of control structures (+) • Excessive nesting of control structures (-) • Use of dynamic structures (-) • Use of meaningful names for modules and variables (+) • Use of GOTOs (-) • Use of negative compound booleans (-) • Use of block comments (+) • Program length (-) • Use of embedded alternate language (-) • Use of code indentation (+) • Use of recursive code (-) • Multiple entry and exit points for module (-) • Use of both default parameters and positional notation (-) (Ada specific) • Definition and use of default parameters for stable values (+) (Ada specific) • Use of parameter notation (+) (Ada specific) • Mixing the order of parameter lists (-) (Ada specific) • Use of both positional and name notation in a single module call (-) (Ada specific) • Definition of packages that are never "withed" (-) (Ada specific) • Use of record discriminants (+) (Ada specific) • Use of exception handlers (+) (Ada specific).
6.2.3 Coupling
Coupling is defined as the measure of the interdependence among modules in a computer
program. Coupling results when an element of code references a location in memory which is
not defined in the encompassing module. In more general terms, coupling occurs when elements
of code in two distinct modules reference the same location in memory. Two common situations
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 82
that give rise to coupling are: (a) the use of global variables and (b) the use of parameters to
share information.
As a rule one strives to reduce coupling. The interdependence among modules can adversely
affect the modifiability of a module through what is known as the "ripple effect". The ripple
effect occurs when one makes a change in module A and inadvertently introduces changes in
other modules. For example suppose that modules A and B share data through a globally
defined integer variable C . Suppose further that the type of C is changed to real to accommodate
algorithmic changes in module A. The impact of that change will "ripple" down to all other
modules that reference C, e.g., module B. The minimization of coupling is a direct result of
applying the principle of information hiding, for example. Applying other principles can affect
coupling as well.
The degree of coupling among modules is primarily related to three factors: the type of
dependent connection, the size of the connection and the type of communication permitted
through the communication. Connection types can be partitioned into two categories: (1)
connections that address or refer to a module as an entity by its name, i.e., passing information
through a procedure call, and (2) connections that refer to internal elements of a module, e.g. the
use of non-local (or global) variables for sharing information. Connections of type 1 are more
effective in reducing coupling. Connection size refers to the amount of information that is
passed (shared) between two modules -- the more the information, the higher the coupling.
Finally, the third factor recognizes the difference between information in the form of data versus
control. Data coupling is coupling introduced by sharing simple information such as
computational items. Control coupling stems from the sharing of information through which
decisions are made in a subordinate (or superordinate) module. Typically, such control
information is termed as a "flag" or "switch". Coupling associated with the passing of data
information is less severe than that caused by the passing of control information.
6.2.3.1 An Indicator of Coupling: The Use of Structured Data Types as Parameters
As stated earlier the use of structured data types can have a beneficial or detrimental impact
depending on the form of use. In particular, when data structures are passed as parameters,
seldom is every data element needed by the receiving module. In effect, those extraneous data
elements introduce unnecessary inter-module coupling.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 83
To capture the impact that the use of structured data parameters has on inter-module cohesion,
one examines each parameter list to determine what percentage of parameters are, in fact,
structured data elements. That is, for all calls within any given module the proposed measure
relates the actual number of structured data type parameters to the total number (structured and
primitive data elements).
Metric Definition:
Proportion of Structured Data Types Used as Parameters=
Number of Structured Data Type Parameters
Total number of Parameters
Proposed Indicator:
COUP:STDP = -5 * Proportion of Structured Data
Types Used as Parameters
Note that the values for the indicator COUP/STDP range between -5 to 0, indicating that the use
of structured data types as parameters can have only a detrimental impact on coupling as an
attribute affecting software quality.
6.2.3.2 Other Indicators of Coupling
Excessive nter-module coupling is often cited as reflecting poor design. Clearly, a code indicator
of poor design provides information a posteriori. Before taking the action of reviewing the
design, one would want to have additional evidence that such a costly decision is warranted.
Additional indicators of coupling include:
• Use of global variables (-) • Use of switches as parameters (-) • Use of parameterless procedure calls (-) • Multiple entry points to a module (-) • Number of calling modules (-) • Types of parameters
- Control (-) - Data (+)
• Use of rendezvous with in/out or in and out parameters (-) (Ada specific) • Relying on upper level modules to handle raised exceptions (-) (Ada specific).
6.2.4 Ease of Change
We define ease of change as the ease with which software accommodates enhancements or
extensions. Ease of change is often referred to by other authors as: (a) expandability -- the ability
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 84
to provide for the expansion of data or program function), (b) or changeability -- the ease with
which the logic of a program can be changed.
In general, code is never a static object -- over time, changes, extensions, enhancements and
expansions are required. Changes, and in particular the ease of change, must be viewed from
two related perspectives: (1) when the code is first produced and (2) when the code is being
modified.
When a system is first being developed, one should design with change in mind. Where
appropriate, for example, data dictionaries and data base management systems should be
exploited. Data dictionaries provide common naming conventions and data descriptions which
are then combined to form the record structures used by all code modules. Modular design helps
isolate and encapsulate individual functions, thereby minimizing the scope and impact of
changing any one particular function. During the coding phase, the use of sound software
engineering principles such as information hiding, structured programming and stepwise
refinement are critical to supporting ease of change.
When a code unit is being modified, the changes should not adversely impact the ease of future
change. This latter point is particularly crucial for those systems that are expected to have a long
lifetime and to evolve over an extended period of time. More specifically, maintenance
personnel must focus on changes that are controlled and carefully monitored, while adhering to
those same principles used in the initial product development.
6.2.4.1 An Indicator of Ease of Change: The Use of Symbolic Constants
If a constant is used several times in a module, it is preferable to define that value as a symbolic
constant and then refer to its associated identifier wherever the constant value is needed. Such
an approach has two major advantages over using constant values in every appearance: (1) it
permits one to associate a meaningful name with a value, e.g., EOF for cntrl-D, and (2) if the
value needs to be changed at a later point in time, a single change of value is sufficient.
To measure the impact of the use of symbolic constants relative to ease of change, two pieces of
information must be considered: (1) the number of symbolic constants currently being used in
the module, and (2) the potential use of additional symbolic constants. The latter can be
determined by counting the number of non-symbolic constants that are used multiple times
within the given module. Relative to ease of change, the ideal module always uses symbolic
constants in lieu of multiple references to non-symbolic constants.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 85
Metric Definition:
Proportional use of Symbolic Constants=
Number of Symbolic Constants Defined
Number of Distinct Uses of Multiple References to Non-Symbolic Constants
+ Number of Symbolic Constants Defined
Proposed Indicator:
EOC/USC = 5 * Proportional use of Symbolic Constants
The range of values for the indicator, EOC/USC, is [0, 5]. With this particular indicator the use
of symbolic constants in a module can only have a beneficial impact on ease of change, hence,
the consequent values are zero (0) or greater.
6.2.4.2 Other Indicators of Ease of Change
Additional indicators of ease of change include:
• Use of dynamic structures (+) • Use of modularization (+) • Use of global variables (-) • Number of modules called (-) • Definition of declaration packages (+) (Ada specific) • Insufficient decomposition of declaration packages (-) (Ada specific) • Definition of packages that export subprograms (+) (Ada specific) • Multiple instantiations of generic units (+) (Ada specific).
6.2.5 Readability
Readability can be defined as the difficulty in understanding the function(s) of a software
component and how that functionality is realized by that software component. We strive for
code that is readable, i.e., understandable. The more readable a code module, the easier it is to
understand what function(s) the code unit performs and how that function(s) is realized in the
code itself.
Many factors affect code readability. Most prominent among such factors are those related to
comments and code structure. Code comments, for example, can be placed at the beginning of a
module to provide a natural language description of its functionality. Comments can also be
interspersed throughout the code to provide insights into how the functionality is being realized.
Obviously, in the absence of comments, maintenance personnel must rely on somewhat cryptic
code to determine functionality and module structure. Even more detrimental, however, is the
presence of incorrect comments. Incorrect (or inaccurate) comments are often an artifact of
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 86
modifying code without appropriately changing the corresponding comments. Such comments
are likely to mislead maintenance personnel and cause the code to be misinterpreted.
Code structure includes several items that promote readability. For example, code indentation is
useful to represent nested conditionals and loops; that is, conditional or loops that are
"subordinate" to other conditionals or loops. The use of structured programming constructs, the
avoidance of negative compound conditionals, and the separation of an IF statement's true-part
and else-part all represent aspects of a structured approach to coding which promotes readability.
6.2.5.1 An Indicator of Readability: Use of GOTOs
As stated above code structure plays an important role in the readability of software modules.
That structure can assume either physical characteristics, like code indentation, or logical ones,
e.g., the use of GOTOs. GOTOs first appeared in FORTRAN in the late 1950s and have been
incorporated (although reluctantly so) in many languages since. GOTOs are an unconditional
runtime branch to another location in the program. This language "feature" represents the
antithesis of structured programming concepts. GOTOs destroy code readability by introducing
unstructured control within a module. Inordinate use of GOTOs results in what is known as
"spaghetti code."
Because the detrimental impact of GOTOs is absolute, measuring its effect is relatively easily.
We propose a simple metric that decreases a module's readability factor in direct proportion to
the number of GOTOs found.
Metric Definition:
Extent of GOTO Usage = 2 * Number of GOTOs
Proposed Indicator:
READ:GOTO = Max (-5, 0 - Extent of GOTO Usage)
Because of the absolute value associated with this proposed indicator, its value is constrained to
the range [ -5, 0] . While this penalty might seem harsh, consensus in the software engineering
community supports such a view.
6.2.5.2 Other Indicators of Readability
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 87
Additional indicators of readability include:
• Use of control structures (+) • Use of symbolic constants (+) • Multiple statements on one line (-) • Use of GOTOs (-) • Use of meaningful names for routines and variables (+) • Multiple exit points from loops (-) • Modules exceeding one printed page in length (-) • Use of block comments (+) • Excessive use of single line comments (-) • Use of parenthesized expressions (+) • Overloading of subprogram names (-) (Ada specific) • Use of units which "with" packages that export subprograms (+) (Ada specific) • Definition of tasks (-) (Ada specific) • Mixing order of parameter lists (-) (Ada specific) • Use of parameters with name notation (+) (Ada specific).
6.2.6 Traceability
Traceability is defined as the software (code and documentation) attribute that provides a link
from requirements to the implemented program [Arthur 1985]. In other words it is the ease in
retracing the complete history of a software component from its current state to its design
inception.
As a system evolves conventional wisdom mandates the verification of requirements to design
and design to code. Traceability characteristics, captured in development documents and
artifacts such as the software development folder, are indispensable in the verification process.
Similarly, the capability to trace code elements to their originating requirements supports an
evaluation process that ensures the necessity for inclusion of all such elements.
In addition to supporting the development process, traceability is a critical component in the
execution of software maintenance. In particular, proposed code and design changes must be
thoroughly researched to determine their impact on other systems elements and the overall set of
currently defined requirements.
Clearly, design documents that include traceability matrices and references to other documents
assist in providing traceability. References to called procedures and code comments provide
additional traceability artifacts.
6.2.6.1 An Indicator of Traceability: Use of Comments Referencing Project Documents
and "Who Called Me"
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 88
Measuring traceability based on code attributes relies primarily on comments that reference: (a)
project documentation, and (b) other calling modules. Unlike previously defined indicators,
assessing traceability through comment references relies more on content analysis rather than
data item counts. Consequently, the data collection is, by necessity, a time consuming, manual
process. Nonetheless, well placed code comments provide the necessary links relating sections
of code to corresponding design and requirement specifications. Moreover, comments that
describe calling sequences, and in particular, those which describe "who calls me" provide
traceability to higher-level modules, and hence, to higher-level functions supported by the
module being examined.
Similar to the immediately preceding indicator, measuring traceability through the use of use of
appropriate comment references relies primarily on a simple count, that is, the number of
references to project documentation and to calling modules.
Metric Definition:
Number of Comment References = Number of References to Project Documentation +
Number of References to Calling Modules
Proposed Indicator:
TRAC/NOCR = Min (Number of Comment References, 5 )
6.2.6.2 Other Indicators of Traceability
Traceability is an easily understood attribute. Consequently, little confirmation through other
properties is either needed or possible. Additional indicators of traceability include:
• Consistency in the use of variable names in code and documentation (+) • Organizational consistency between code and documentation. (+).
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 89
6.2.7 Well-defined Interfaces
An interface is defined as a shared boundary where autonomous systems interact (communicate)
with each other [Dale and Orshalick 1983]. While interfaces exist in many different forms, e.g.,
hardware/hardware, hardware/software and software/humanware, relative to software quality
assessment we focus our attention primarily on those interfaces that support communication
among software components. Subsequently, a well-defined interface can be defined as the
definitional clarity and completeness of a shared boundary between a pair of software
components.
Guidelines for a well-defined interface suggest that a module's interface should be defined and
used so that:
(1) all communication passes through the interface, (2) parameters lists are kept small, and (3) inputs are separated from the outputs [Marca 1984].
The first guideline requires that: (a) each input and output be definitively identified and
described in detail, (b) all requisite data be passed when a module is invoked, and (c) all results
are passed back to the calling module when the called module completes execution. The most
common violation of the first guideline is the use of global variables for the sharing of
information among modules. The second guideline underscores the necessity of keeping
parameter lists small. Stevens, for example, suggests that the number of parameters should not
exceed three or four -- effectively, minimizing the number of parameters tends to improve the
clarity and simplicity of the interface [Stevens 1981]. Finally, the third guideline stipulates (or
implies) that "in", "out" and "in/out" parameters be easily distinguishable. Languages such as
Pascal and Ada support and enforce such classification.
6.2.7.1 An Indicator of Well-Defined Interfaces: The Use of Parameterless Procedures
The interface between two modules is defined by the data elements passed between them. An
explicit indication and description of those data elements is crucial to the creation of a well-
defined interface. Nonetheless, modules are often invoked using parameterless procedure calls.
Implicit in such invocations is that communications are occurring through the use of global or
non-local variables. Consequently, the information exchanged between the calling and the called
module is difficult to determine; hence, the interface characteristics between the two modules are
obscured.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 90
To measure the impact of parameterless call on well-defined interfaces we require two data
elements: (1) the number of parameterless calls, and (2) the number of calls that have
parameters. Relative to well-defined interfaces the latter count is used to temper the negative
effect of the former.
Metric Definition:
Proportion of Parameterless Calls = Number of Parameterless Calls
Number of Calls with Parameters
Number of Parameterless Calls+
Proposed Indicators:
WDI:UOPC = -5 * Proportion of Parameterless Calls
6.2.7.2 Other Indicators of Well-defined Interfaces
Additional indicators of well-defined interfaces include:
• Use of global variables (-) • Use of structured data types as parameters (-) • Use of excessive number of parameters (-) • Definition of default parameters (+) (Ada specific) • Definition of packages that export subprograms (+) (Ada specific).
6.2.8 Attributes of Code Quality: A Summary
In summary, seven attributes of code quality are described above. The intent of this discussion is
to
• present a sample of those characteristics that contribute to code quality, • discuss code quality relative to measures that employ observable, concrete data elements,
and • promote the thesis that "observable properties of the code exist that attest to the presence
and absence of attributes such as cohesion, coupling, ease of change, readability, complexity, traceability and well-defined interfaces."
Those observable properties, coupled with individual attributes, form the basis for defining code
indicators. The OPA Framework links computed indicator values to the use of appropriate
principles in the development process, and finally, the employment of principles to the
achievement of project-level objectives. Interpreted within the OPA Framework, these
indicators support decision processes at the software engineering, program management and
project management levels.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 91
6.3 Automated Collection of Code Indicators
Experience has shown that manually collecting data to compute code indicator values is labor-
intensive and error prone. The OPA approach utilizes many contrasting and confirming SQIs.
Partly because of the number but also because of our original intent, the data items required to
compute the SQIs are simple and easily obtained. Consequently, automated data collection is
facilitated, as is the metric computation associated with each SQI for code. Currently, we are
using a prototype Ada code analyzer (ADALYZE) which analyzes code based on its compilation
order and extracts the necessary data items to compute 66 code indicators.
6.4 Interpretation, Decision and Actions
As stated in Section 2.3 the OPA Framework stresses assessment through the employment of
multiple confirming and contrasting indicators of quality. Each indicator focuses on the
measurement of one particular characteristic of quality. Taken together, the full set of indicator
measures provide an accumulation of evidence attesting to the extent to which product quality
has been achieved. The multiple indicator approach does not rely exclusively on any single
measure of quality. Moreover, it recognizes the fact no two modules are expected to exhibit the
same quality characteristics.
In general, code assessment provides an "after the fact" judgment of quality. Nonetheless, it can
provide crucial insights supporting the development of better quality products in the future. If
one's development process is based on an incremental approach, those insights can have an
immediate beneficial impact.
From a code perspective, software quality assessment is focused at the unit level. That is, each
module is individually examined for quality attributes. If a particular attribute for that module
receives a low rating, the software engineer examines the indicators contributing to the
aggregated attribute value, looking for the indicator (or indicators) that contributes to the low
score. Because each indicator is defined to include the rationale behind its measurement, the
software engineer is guided through a reasoned explanation as to why the module is judged to be
inadequate. Based on that explanation, a preliminary set of actions can be formulated to correct
the inadequacies. Since the definition of each indicator includes all code properties contributing
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 92
to its assessment, the preliminary set of corrective actions can be focused and refined to address
the precise characteristic(s) contributing to the module's low quality rating.
In addition to providing the software engineer with the crucial insight necessary to identify and
correct software quality problems, the definitional structure of each indicator promotes additional
education. Through periodic examination of the indicator characteristics, the software engineer
is continually exposed to those basic elements and activities that contribute to the development
of a quality product.
7. AN EXAMINATION OF INDICATOR VALUES AND THEIR
AGGREGATION
The OPA Framework relies on indicators, defined and developed in the tradition of social
science research. Indicators are surrogates that are directly measurable; whereas the attributes
are concepts that do not admit to direct measurement. The value and benefits derived from a
software measurement program accrue in time as experience enables the user to gain confidence
in the values and to recognize the signals embodied in them.
A well designed quality measurement program relies on instrumentation throughout the
development or maintenance process. Confirming and contrasting indicators can present a more
holistic picture than selected metrics in isolation. Presentations using aggregated values and
trend accumulations in graphs or tables can be especially effective.
7.1 Using and Interpreting Indicator Values
Recall that an indicator measurement reflects an attribute and property relationship that is
described as an attribute/property pair in Section 2. More specifically, the indicator value
portrays the degree to which the property provides evidence that the attribute is present in the
product or process. Some attributes can give evidence only of the absence of an attribute
(computation of their value is limited to the range 0 to -5). Others can support only the presence
of an attribute, ranging from 0 to +5. Still others give evidence of both the absence and the
presence and can take on the full range of values [-5, +5].
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 93
The employment of software quality indicators within the OPA Framework results in measures
that are singularly focused; that is, each captures one aspect of software quality. Similarly, each
aggregated measure is associated with a single attribute, principle or objective. Relative to the
interpretation of these measures, we offer a few helpful hints of guidance.
1. An indicator value in the interval around "0" (roughly -0.5 to +0.5) implies that either: (a) no
significant basis for judging quality is found, or (b) equal evidence supporting both beneficial
and detrimental judgments of quality is found. These two cases can be distinguished by
examining the variability among values contributing to the aggregate measure. High
variability in itself is a warning that should be heeded.
2. Each indicator has a recognized "critical component" that if absent renders the indicator non-
computable. For example, a property that is used in the denominator of an automated metric
computation and whose absence results in a zero (0) value is deemed a critical component.
In this situation, continuing with the metric computation results in a "divide by zero error."
Subsequently, during the computation of each indicator, if a critical component is absent, that
indicator is assigned a default value of "0", i.e., no basis for judgment exists. During the
interpretation process one must be aware that an abundance of "default zeros" included in
aggregate values can obscure quality assessment by lowering potentially high measures of
quality and by raising potentially low measures.
3. The OPA Framework requires the users of the procedure to define what quality means by the
prioritization of objectives on any given project. Objectives can be competitive and mutually
inhibitive as well as mutually supportive. A failure to assign priorities to objectives
represents an abdication of responsibility for product and process quality. Further, the
interpretation of quality is subject to serious misgiving in the absence of a clear statement of
what represents quality for a given project.
4. The OPA Framework enables only relative measures of quality; anyone claiming to have a
procedure for an absolute measurement is immediately suspect. Sufficient differences exist
among software projects that drawing comparisons across projects from entirely different
organizations should be done only with great caution and clear understanding that
homogeneity in the statistical sense is not likely to allow such a comparison. Only with a
database compiled of several (or many) projects can a single organization hope to deal with
valid statistical comparisons across projects.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 94
5. Because no database of experience yet exists, interpreting quality measures in an absolute
sense must be avoided. For example, the absolute difference in the quality of a module or
system that receives a score of 1.7 versus one that receives a 2.4 is unknown. What is
known, however, is that the latter module or system is considered to have evidence
supporting the claim that its quality exceeds that of the former.
7.2 Integrating Code, Documentation and Process Quality Indicators in a Software
Quality Measurement Program
The initiation of any project is a period of excitement and high expectations. For those
administering a software quality management program based on the OPA Framework, specific
tasks must be accomplished.
• Key project personnel must be briefed (possibly educated) on the premises of the
Framework and the procedures to be used.
• Decisions on measurement points, procedures and responsibilities must be made.
• Reporting frequency and format should be agreed upon so that the measurement group
maintains the needed independence and feedback is provided for process
improvement.
The roles of the three types of indicators (code, documentation and process) are illustrated in a
variation of Figure 2.3, which is given below as Figure 7.1. Keep in mind that Figure 7.1 is an
illustration of a “typical” software project and represents no specific development or
maintenance effort. Nevertheless, we find it instructive in portraying the supportive roles as they
change with time and project phasing.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 95
Requirements Analysis
High-Level Design
Low-Level Design
Code &
Unit Test
Integ Test
Accept. Test &
Display
Maintenance
Docume ntation
Process
Code
Figure 7.1 The Importance of Indicator Sources to Software Prediction and Assessment in the OPA Framework
Initially, the reliance on documentation as a source of data is by necessity; nothing else exists
beyond a Software Engineering Manual (SEM) or a Software Development Plan (SDP). The
former provides information on how well the organization understands the needs for producing
quality software; the latter should layout the specific process to be followed for the given project.
Figure 7.1 shows the accumulative sources of indicator information among the three to take
distinctly different shapes.
.
x� Documentation data represent the only sources initially and the data increase from the
early level (SEM and SDP) throughout the development phases with the increase
occuring at a smaller rate during the in-service support period.
• Process data, derived primarily from the SEM and SDP, increase also but at a lesser
rate than document data. (Recall that document quality is judged on format, presentation
and content; while process quality focuses more narrowly on content.)
• Toward the end of the detailed design phase, code begins to emerge and rapidly
becomes a major source, continuing throughout the coding and testing phases but never
at the rate seen during code and unit test.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 96
The consequences to be drawn from Figure 7.1 are that the Process Indicators, with some support
from Document Quality Indicators, provide the basis for prediction during the development
project when changes can be made without exorbitant cost. The quality assessment relies
heavily on the products of the code creation, unit test and integration testing phases. In a few
cases, indicators may serve their usefulness and be discarded. In other cases, an indicator value
may be updated as newer data becomes available. Also, we have the indicator that continues to
be used but whose importance may diminish simply because of new data emerging, giving rise to
new indicators.
An organization might choose to follow a pattern of weighting indicators based on the
perceptions of their importance in either prediction or assessment. A measurement program that
provides an automated follow-up scheme, assigning weights based on past predictive capability,
is not difficult to envision. Clearly, the integration and interpretation of indicator values is likely
to differ exceedingly among organizations, but, hopefully, within an orgainization, a consistent,
accepted procedure is followed.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 97
8. PREDICTION AND ASSESSMENT: DIFFERENCES IN GOALS
This Handbook is intended to provide guidance in the creation of a software quality
measurement program utilizing the OPA Framework and Software Quality Indicators. Two
points of emphasis throughout the description are: (1) the evolutionary nature of large, complex
software systems, and (2) the project perspective supported by software measurement. Both are
important in their influences on the presentation. Large software systems, be they "mission-
critical," also termed "embedded systems," "real-time systems," "time-critical systems," or
"automated information systems" (AIS), exhibit complexities that can overwhelm the typical
software engineer, systems engineer or project manager. Measurement, or the more popular term
"software metrics," offers a means of dealing with those problems that arise in the creation of
any large, complex system. The project perspective can apply equally to mission-critical or AIS
systems, although within DoD, the term is associated more with the former. Indicative of the
project perspective is the focus on objectives, and for software-intensive projects, the
prioritization among software engineering objectives.
Coupling the evolutionary nature of large, complex systems with the project perspective
introduces another perception: the diminution of the distinction between development and
sustainment (maintenance). An evolutionary system passes through alternating periods of
emphasis on creation (adding of functionality) and consolidation (correcting and perfecting the
functional form) several times during its lifetime. Unless imposed by contractual arrangements,
the distinction of the initial development activities from the subsequent sustaining and
supporting activities is of little importance. Differences in the application of the OPA
Framework or implications in its use can stem in part from three sources: (1) the process
model(s) used over the life of the system, (2) the limitation to assessment without prediction, and
(3) contractual constraints for separating development and sustainment.
8.1 Effects of Process Models
While the OPA Framework originated with no underlying process model influence, use of the
procedure must be patterned to the model governing the development process. This patterning or
tailoring does not detract from the benefits of measurement, but does alter to some degree the
cost and scheduling of measurement activities.
8.1.1 The Waterfall Model
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 98
The version of the Waterfall Model shown in Figure 8.1, taken from Lavender [1988] is that on
which DOD-STD-2167 was based and serves well for our discussion of software quality
measurement in such a process. Note that only the development process is shown in the model.
Within a development process described by the classical Waterfall Model, the prioritization of
objectives should precede the requirements definition. Process measurement could initiate even
earlier through examination of a SEI Capability Maturity Model evaluation, a Software
Engineering Manual (a Software Standards and Procedures Manual), or a Software Development
Plan. If a Software Configuration Management Plan and Software Quality Evaluation Plan are
generated, then both should be examined also. Process measurement following generation of the
Software Requirements Specification (SRS) and the Interface Requirements Specification (IRS)
focuses on attributes induced from two principles: Abstraction and Concurrent Documentation.
Process and document assessment are conducted concurrently, and the documentation products
begin to assume a more predominant role. The complementary relationship over time between
process and document assessment is illustrated earlier in Figure 2.3 and also in Figure 7.1.
We advocate the application of process assessment prior to the review points shown in Figure 8.1
for each of the software components undergoing review. Discussion of process difficulties
should be an agenda item for each review. Comparison of the reviews for each software
component, conducted by the SQA representative for the project, can point to common process
difficulties and potential improvements.
Clearly, the expected predictive capability should improve as the project matures. The combined
process and product data increases, providing a more accurate reflection of the quality.
Nevertheless, changes in requirements for example can cause major redesign that could affect
the quality of a specific software component and infuse uncertainty into the predictions.
8.1.2 The Domain-Dependent Life-Cycle Model
Pictured in Figure 8.2, this model offered by Giddings [1984] is similar to the Waterfall Model in
its portrayal of a sequential succession of stages. However, the culmination of the activities with
a prototype that is the subject of experimentation that leads to feedback for restatement of
requirements and repetition of the cycle. The Domain-Dependent Model corresponds more
faithfully to the characterization of "evolutionary" software; no clear distinction is made between
development and sustainment.
The OPA Framework would be applied from the initial specification of requirements throughout
each succeeding phase. As with the Waterfall Model, an organization gains experience with the
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 99
Framework and, through comparison with prior projects, recognizes potential early problems in
time to take corrective actions. A major responsibility of the OPA Framework is to gauge
any
** D
R A
F T
** N O
T F
O R
D IS
T R
IB U
T IO
N O
R A
T T
R IB
U T
IO N
** D R
A F
T **
1 0
0
Software Requirements
Analysis
Preliminary Design
Detailed Design
Coding and
Unit Testing
CSC Integration and Testing
CSCI Testing
Functional Baseline
Allocated Baseline
Developmental Configuration
Product Baseline
SSR PDR CDR TRR FCA
PCA
Figure 8.1 The Software Development Life Cycle for DoD-STD-2167 (from [Lavender 1988, p. 181])
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 101
major shifts in quality as the project progresses. An experienced organization would set
quantitative values that would govern test and release decisions. Note that little is explicitly
stated about testing -- unit versus integration or the point of system acceptance testing. A
software quality measurement program could play a major role in both test and configuration
management decisions.
Observation
requirements
Abstraction
validation
Specifications
verification
Prototype
verification
Experimentation
validation
Figure 8.2 Gidding’s Domain Dependent Software Life-Cycle Model
(from [Giddings 1984, p.431]).
8.1.3 Boehm's Spiral Model
The Spiral Model [Boehm 1986], shown in Figure 8.3, portrays the cyclic nature of a process
that assures requirements validation before production of the delivered system. If the rapid
prototyping technique is strictly applied, then no quality measurement should be applied until the
delivered system detailed design is initiated. (With rapid prototyping the definition of each
prototype --design and code -- is discarded, and only the knowledge gained is conserved to apply
to the design of the next prototype.) Application of the OPA Framework following the
operational prototype would take a course similar to that used in the Waterfall Model.
If the evolutionary prototyping approach is invoked, then quality measurement would begin at
some point in the prototype development but not necessarily with the first. More likely, that
prototype judged to provide required functionality in the form to be used in the delivered system
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 102
would be the point where measurement activities would initiate. (Some would argue that this is
the operational prototype in both cases, but others would argue to the contrary.)
Cumulative Cost
Review Commitment
partition
Risk ana- lysis
Progress through steps
Risk analysis
Risk analysis
Risk analysis
Prototype Prototype
Prototype
Operational prototype
1 2
3
Requirements plan Concept of operation
Life-cycle plan Software
requirements
Requirements validation
Develop- ment plan
Integration and test
plan
Plan next phases
Software product design
Design validation and verification
Detailed design
Code
Unit test
Integration and test
Acceptance testImplemen-
tation
Develop, verify next-level product
Determine objectives, alternatives, constraints
Evaluate alternatives, identify, resolve risks
Simulations, models, benchmarks
Figure 8.3: The Spiral Model [Bohem 1986]
8.2 Software Reuse
Reusability is an objective in the OPA Framework. Software reuse -- a major benefit of the
object-oriented programming paradigm -- is touted as realizing significant savings in cost and
time. Reused code is also viewed as having higher reliability. What guidelines should be
followed in applying the OPA Framework to existing software, often described as "Commercial-
Off-The-Shelf" (COTS) or "Not Developed Internally" (NDI)?
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 103
At first glance, the measurement task appears to be one of assessment only: analyze the code.
No prediction is possible nor is it warranted. We can know very little about the process activities
leading to the candidate components for reuse. Such a view is fraught with danger.
Two points should be understood with regard to the measurement of quality of COTS or NDI
software: (1) unless both function and technology are essentially unchanged, the proportion of
cases where code can be directly adapted and used without change is very low, and (2) the key to
the degree of adaptation required for reuse is the match between the intended purpose of the
reused software and its original purpose. Thus, the importance of measuring the supporting
documentation -- the requirements document, design specifications, user manuals, etc. -- cannot
be over-emphasized. Certainly, the code can be analyzed, and the automatic analysis of code
alone seems a viable and inexpensive strategy. However, subjecting only the code to
measurement provides a potentially deceptive result: an initial component with very high quality
may have to undergo such transformation, perhaps unassisted by the supporting documentation,
that the final result bears little resemblance in its quality profile to the original.
Development documentation and test results may be difficult to obtain for COTS and NDI
software, but every effort should be made to do so. At the least, an effort should be made to
obtain surrogate information; i.e. deployment defect data, customer reactions, user reviews, etc.
While the measurement of document quality is likely to be manual, slow and costly, the
confidence in the conclusions regarding software quality should be considerably enhanced.
8.3 Sustaining Responsibilities and Quality Measurement
Common with many DOD systems is the contractual separation of the development and
sustainment activities. Also known as "maintenance" or "post-deployment support," sustainment
includes those activities that have been categorized as software maintenance in four forms by
Swanson [1976]:
• corrective: the correction of faults in a software system, • perfective: the improvement in functionality of a software system to meet current
requirements better, • preventive: the addition of facilities to make a software system more robust, and • adaptive: the addition of functionality to a software system to accommodate changing
requirements.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 104
Some authors prefer to lump perfective and preventive together to list only three forms.
Hopefully, the sustaining organization has been involved closely during the development
activities. In the best of worlds, the OPA Framework has been employed during development,
and the transition of measurement responsibilities to the sustaining organization is simply a part
of the larger transfer. The remainder of this section describes the practical problems and possible
alleviating actions if that is not the case, or, worse yet, no measurement program has been in
effect.
8.3.1 Initial Assessment
Two important questions for the sustaining organization are: (1) What is the expected life of the
delivered software system? and (2) How extensive is the document set delivered as part of the
software system? In general, the longer the system life, the greater the need for added
functionality (adaptive maintenance) and the more extensive the document set required to
support changing requirements effectively, expeditiously, and with acceptable costs. For a long-
lived software system, the quality of the delivered product is determined by the quality of the
documentation more than the quality of the code. Many have learned, and continue to learn, this
lesson the hard way.
Applying the OPA Framework to the delivered product -- the document set and the code --
provides an initial characterization of the system. While little can be said in absolute terms,
unless one has experience in applying the OPA Framework to past systems, the relative values of
the document and code indicators help to identify potential trouble points -- those components
that represent the "soft" spots. These data can assist in personnel assignments: put the best
people on the worst components. If the evaluation reveals that the documentation is not current
(a very typical occurrence), then the decision to reverse engineer the detailed design specification
could be a viable option. The intent in the initial assessment is to enable the sustainment
organization to decide if the delivered system is "a pig in a poke" and, if so, begin to separate the
bacon from the wrapping.
8.3.2 Continuing Assessment
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 105
A quality measurement program can be more beneficial in the sustainment phase than in
development. Some may be surprised by such an assertion, but the truth lies in the ability to
combat the natural tendency of a software system to lose quality. Deteriorating quality stems
from the potential loss of design integrity (clean partitioning of functionality), design congruity
(interfaces permitting well defined communications) and the introduction of errors as changes
are made. By measuring the proposed changes, the sustaining organization can hopefully
maintain or even improve the quality assessment. Functional changes (adaptive maintenance)
can be evaluated in terms of their effect on system quality. The price paid for added
functionality in terms of reduced quality is a strong argument that is seldom available for current
systems.
8.4 The Quality Database and Validity of Indicators
Assessment in measurement refers to the establishment of “what is.” Assessment is applied to
the existing entity, and quality assessment is motivated by the desire to say “how good is this
widget.” Prediction, on the other hand, is applied to that which is in the making. By examining
certain characteristics, can we predict the quality of the finished product? While both purposes
are important in the measurement of quality, assessment is the easier, but still is exatremely
difficult.
We agree with a number of authors [Arthur 1993; Ashley 1995; and Dunn 1990] who prefer not
to define quality but seek to describe it in ways that promote its measurement. We view the
perception of quality much like the perception of beauty: quality lies in the eyes of the beholder
or, recognizing the scale of commerce in software, the expectations of the customer. We contend
that quality cannot be measured in absoute terms, but being able to assess (or predict) the quality
of product X as superior to that of product Y is the key issue in an acquisition decision. Thus,
we rely on relative measures of quality and each organization must establish the procedures to
capture and preserve quality measures so as to build a database that represents a major
organizational asset.
A quality measurement database evolves through successive projects and products, permitting
the comparison and contrast with prior efforts and the prediction of future achievements based on
past data. But, does that mean that software quality indicators, once adopted, should be used for
all future projects? The all too obvious answer is “No.” The use of an SQI should be a specific
decision at the inception of each project. At the conclusion of each project each SQI should be
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 106
reviewed as to its utility in the project and if the utility is judged marginal, an explanation of the
contributing factors should be given. This is a responsibility for the Software Quality Assurance
(SQA) group within the organization.
Periodically, the SQA group, as part of its auditing function, should review the SQIs used on
projects in the past to evaluate their validity. This evaluation should occur more frequently if the
organization mandates a either a uniform set or a core set of indicators. This validation could
take several forms, ranging from face validity -- subjective judgment by an expert team -- to a
statistical approach that is quantitative. The depth of the validation and the consequent cost is
dependent on the cost of the measurement program, the contractual requirements between
customers and the organization and the commitment of the organization to software quality. For
a thorough and comprehensive description of validation and verification techniques in
simulation, which has struggled with this issue much longer than software engineering, see
[Balci 1994].
REFERENCES
ADARF 1984.Rationale for the Design of the ADA Programming Language, Minneapolis, MN: Honeywell Systems and Research Center.
Arthur, J.D., and R.E. Nance. 1987. Developing an automated procedure for evaluating software development methodologies and associated products. Technical report SRC-87- 007, Department of Computer Science and Systems Research Center, Virginia Tech, Blacksburg, Virginia. Arthur, J.D., and R.E. Nance. 1990. A framework for assessing the adequacy and effectiveness of software development methodologies. In Proceedings of the Fifteenth Annual Software Engineering Workshop, Process Improvement Session, Greenbelt, Maryland. Arthur, J.D., Nance, R.E., Bundy, G.N., Dorsey, E.V., and Henry, J. 1991. Software quality measurement: Validation of a foundational approach. Technical report SRC-91-002, Systems Research Center and Department of Computer Science, Virginia Tech, Blacksburg, Virginia. Arthur, L.J. 1985. Measuring Programmer Productivity and Software Quality. New York: John Wiley.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 107
Arthur, L.J. 1993. Improving Software Quality: An Insider’s Guide to TQM. New York: John Wiley. Ashley, N. 1995. Measurement as a Powerful Software Management Tool.. Berkshire: McGraw-Hill Book Company Europe. Balci, O. 1994. Validation, verification, and testing techniques throughout the life cycle of a simulation study. Annals of Operations Research 53: 121-173. Basili, V.R., and D.M. Weiss. 1985. A methodology for collecting valid software engineering data. IEEE Transactions on Software Engineering 11:2:157-168. Basili, V.R., and H.D. Rombach. 1988. The TAME Project: Towards improvement-oriented software environments. IEEE Transactions on Software Engineering 14:6:759-773. Booch, G. 1983. Software Components with Ada. Menlo Park, CA: The Benjamin/Cummings Publishing Company. Booch, G. 1987. Software Components with Ada. Menlo Park, CA: The Benjamin/Cummings Publishing Company. Bundy, G.N. 1990. Assessing software quality in ADA-based products with the Objectives, Principles Attributes Framework. Master’s thesis, Department of Computer Science, Virginia Tech, Blacksburg, Virginia. Carmines, E.G., and R.A. Zeller. 1979. Reliability and Validity Assessment, Quantitative Applications in the Social Sciences. J.L. Sullivan (ED), Beverly Hills, CA: Sage Publications. Carley, M. 1981. Social Measurement and Social Indicators. George Allen and Unwin. Conte, S.D., Dunsmore, H.E., and V.Y. Shen. 1986. Software Engineering Metrics and Models. CA: Benjamin/Cummings Publishing Company. Cruickshank, R., and J. Gaffney. 1980. Indicators for software design assessment. IBM FSD Final Report on Creative Development, Task 91. Dale, N., and D. Orshalick. 1983. Introduction to Pascal and Standard Design. Lexingon, MA: D.C. Heath & Company. Dandekar, A.V. 1987. A procedural approach to the evaluation of software developmen methodologies. Master’s thesis, Department of Computer Science, Virginia Tech, Blacksburg, Virginia. Dorsey, E.V. 1992. The automated assessment of computer software documentation quality using the Objectives/Principles/Attributes framework. Master’s thesis, Department of Computer Science, Virginia Tech, Blacksburg, Virginia.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 108
Dunn, R.H. 1990. Software Quality: Concepts and Plans. Englewood Cliffs, NJ: Prentice-Hall, Inc. Dunsmore, H.E., and J.D. Gannon. 1980. Analysis of the effects of programming factors on programming effort. Journal of Systems and Software 1:2:1044-1050. Ejiogu, L.O. 1987. The critical issues of software metrics--Part 0. Perspectives on software measurements. SIGPLAN Notices 22:3:59-64. Embley, D.W., and S.N. Woodfield. 1988. Assessing the quality of abstract data types written in Ada. In Proceedings: 10th International Conference on Software Engineering, 144-153. Florac, W.A.. R.E. Park and A.D. Carleton, Practical Software Measurement: Measuring for Process Management and Improvement, Guidebook CMU/SEI-97-HB-003, April 1997. Gannon, J.D., E.E. Katz, and V.R. Basili. 1986. Metrics for Ada packages: An initial study. Communications of the ACM 29:7:616-623. Ghezzi, C., and Mehdi Jazayeri. 1982. Programming Language Concepts. New York: John Wiley & Sons, Inc. Halstead, M.H. 1977. Elements of Software Science. New York: Elsevier North-Holland, Inc. Henry, S., and D. Kafura. 1981. Software structure metrics based on information flow. IEEE Transactions on Software Engineering 7:5:510-518. Humphrey, W.S. 1990. Managing the Software Process. SEI Series in Software Engineering, Reading, MA: Addison-Wesley. Kearney, J.K., R.L. Sedlmeyer, W.B. Thompson, M.A. Grey, and M.A. Adler. 1986. Software complexity measurement. Communications of the ACM 29:11:1044-1050. Lavender, R.G. 1988. The explication of process-product relationships in DoD-STD-2167 and DoD-STD-2168 via an augmented data flow diagram model. Master’s thesis, Department of Computer Science, Virginia Tech, Blacksburg, Virginia. Marca, D. 1984. Applying Software Engineering Principles. Boston, MA: Little, Brown and Company. McCabe, T.J. 1976. A complexity measure. IEEE Transactions on Software Engineering 2:4:308-320. McCall, J.A., P.K. Richards, and G.F. Walters. 1977. Factors in software quality. Technical report RADC-TR-77-369, Rome Air Defense Center. Meier, K., and J. Brudney. 1981. Applied Statistics for Public Administration. Duxbury Press.
** DRAFT ** NOT FOR DISTRIBUTION OR ATTRIBUTION ** DRAFT** 109
Nance, R.E., Arthur, J.D., and A.V. Dandekar. 1986. Evaluation of software development methodologies. Technical report SRC-86-010, Systems Research Center and Department of Computer Science, Virginia Tech, Blacksburg, Virginia. Nance, R.E., and J. D. Arthur. 1994. Software quality measurement : Assessment, prediction and validation. The Sixth Annual Software Technology Conference, Track 2: Software Testing, Measurement and Inspection. Salt Lake City, Utah. Nance, R.E., and J.D. Arthur. 1994. Software quality measurement: Assessment, prediction, and validation. Technical report SRC-94-006, Systems Research Center and Department of Computer Science, Virginia Tech, Blacksburg, Virginia. Ross, D.L. 1986. Classifying Ada packages. Ada Letters 6:4:53-65. Stevens, W.P. 1981. Using Structured Design. New York: John Wiley & Sons, Inc. Stevens, W.P., Myers, G.J., and L.L. Constantine. 1994. Structured design. IBM Systems Journal 13:2:115-139. Troy, D.A., and S.H. Zweben. 1981. Measuring the quality of structured design. Journal of Systems and Software 4:2:113-120. Wichmann, B.A. 1984a. Is Ada too big? A designer answers the critics. Communications of the ACM 27:2:98-103. Wichmann, B.A. 1984b. A Comparison of Pascal and Ada. In Comparing and Assessing Programming Languages. Englewood Cliffs, NJ: Prentice-Hall, Inc. Yourdon, E., and L.L. Constantine. 1978. Structured Design. New York: Yourdon, Inc. Zage, W.M., Zage, D.M., and C. Wilburn. 1995. Avoiding metric monsters: A design metric approach. Annals of Software Engineering 1:43-56.