The New Software Engineering

This book is licensed under a Creative Commons Attribution 3.0 License

Sue Conger

For any questions about this text, please email: [email protected]

The Global Text Project is funded by the Jacobs Foundation, Zurich, Switzerland

This edition was scanned and converted to text using Optical Character Recognition. We are in the process of converting this edition into the Global Text Project standard format. When this is complete, a new edition will be posted on the Global Text Project website and will be available in a variety of formats upon request.

http://creativecommons.org/licenses/by/3.0/

THE NEW SOFTWARE ENGINEERING

CONTENTS

CHAPTER 1 OVERVIEW OF SOFTWARE ENGINEERING

Introduction
Software Engineering
Applications
    Application Characteristics
    Application Responsiveness
    Types of Applications
    Applications in Business
Project Life Cycles
    Sequential Project Life Cycle
    Iterative Project Life Cycle
    Learn-as-You-Go Project Life Cycle
Methodologies
    Process Methodology
    Data Methodology
    Object-Oriented Methodology
    Semantic Methodologies
    No Methodology
User Involvement in Application Development
Overview of the Book
    Applications
    Project Life Cycles
    Part I: Preparation for Software Engineering
    Part II: Project Initiation
    Part III: Analysis and Design
    Part IV: Implementation and Operations
Summary

PART I PREPARATION FOR SOFTWARE ENGINEERING

CHAPTER 2 LEARNING APPLICATION DEVELOPMENT

Introduction
How We Develop Knowledge and Expertise
    Learning
    Use of Learned Information
    Expert/Novice Differences in Problem Solving
How to Ease Your Learning Process
Application Development Case
    History of the Video Rental Business
    ABC Video Order Processing Task
    Discussion
Summary

CHAPTER 3 PROJECT MANAGEMENT

Introduction
Complementary Activities
    Project Planning
    Assigning Staff to Tasks
    Selecting from Among Different Alternatives
Liaison
    Project Sponsor
    User
    IS Management
    Technical Staff
    Operations
    Vendors
    Other Project Teams and Departments
Personnel Management
    Hiring
    Firing
    Motivating
    Career Path Planning
    Training
    Evaluating
Monitor and Control
    Status Monitoring and Reporting
Automated Support Tools for Project Management
Summary

CHAPTER 4 DATA GATHERING FOR APPLICATION DEVELOPMENT

Introduction
Data Types
    Time Orientation
    Structure
    Completeness
    Ambiguity
    Semantics
    Volume
Data Collection Techniques
    Single Interview
    Meetings
    Observation
    Temporary Job Assignment
    Questionnaire
    Document Review
    Software Review
Data Collection and Application Type
    Data Collection Technique and Data Type
    Data Type and Application Type
    Data Collection Technique and Application Type
Professionalism and Ethics
    Ethical Project Behavior
    Ethical Reasoning
Summary

PART II PROJECT INITIATION

CHAPTER 5 ORGANIZATIONAL REENGINEERING AND ENTERPRISE PLANNING

Introduction
Conceptual Foundations of Enterprise Reengineering
Planning Reengineering Projects
Reengineering Methodology
    Identify Project Sponsor
    Assign Staff
    Scope the Project
    Create a Schedule
    Identify Mission Statement
    Gather Information
    Summary of the Architectures
    Translating Information into Architecture
    Architecture Analysis and Redesign
    Implementation Planning
Enterprise Analysis Without Organization Design
Automated Support Tools for Organizational Reengineering and Enterprise Analysis
Summary

CHAPTER 6 APPLICATION FEASIBILITY ANALYSIS AND PLANNING

Introduction
Definition of Feasibility Terms
Feasibility Activities
    Gather Information
    Develop Alternative Solutions
    Evaluate Alternative Solutions
    Plan the Implementation
    Evaluate Financial Feasibility
    Document the Recommendations
Automated Support Tools for Feasibility Analysis
Summary

PART III ANALYSIS AND DESIGN

Introduction
Application Development as a Translation Activity
Organizational and Automated Support
    Joint Application Development
    User-Managed Application Development
    Structured Walk-Throughs
    Data Administration
    CASE Tools
Summary

CHAPTER 7 PROCESS-ORIENTED ANALYSIS

Introduction
Conceptual Foundations
Summary of Structured Systems Analysis Terms
Structured Systems Analysis Activities
    Develop Context Diagram
    Develop Data Flow Diagram
    Develop Data Dictionary
Automated Support Tools
Summary

CHAPTER 8 PROCESS-ORIENTED DESIGN

Introduction
Conceptual Foundations
Definition of Structured Design Terms
Process Design Activities
    Transaction Analysis
    Transform Analysis
    Complete the Structure Chart
    Design the Physical Database
    Design Program Packages
    Specify Programs
Automated Support Tools for Process-Oriented Design
Strengths and Weaknesses of Process Analysis and Design Methodologies
Summary

CHAPTER 9 DATA-ORIENTED ANALYSIS

Introduction
Conceptual Foundations
Definition of Business Area Analysis Terms
Business Area Analysis Activities
    Develop Entity-Relationship Diagram
    Decompose Business Functions
    Develop Process Dependency Diagram
    Develop Process Data Flow Diagram
    Develop and Analyze Entity/Process Matrix
Software Support for Data-Oriented Analysis
Summary

CHAPTER 10 DATA-ORIENTED DESIGN

Introduction
Conceptual Foundations
Definition of Information Engineering Design Terms
Information Engineering Design
    Analyze Data Use and Distribution
    Define Security, Recovery, and Audit Controls
    Develop Action Diagram
    Define Menu Structure and Dialogue Flow
    Plan Hardware and Software Installation and Testing
Automated Support Tools for Data-Oriented Design
Summary

CHAPTER 11 OBJECT-ORIENTED ANALYSIS

Introduction
Conceptual Foundations of Object-Oriented Analysis
Definition of Object-Oriented Terms
Object-Oriented Analysis Activities
    Develop Summary Paragraph
    Identify Objects of Interest
    Identify Processes
    Define Attributes of Objects
    Define Attributes of Processes
    Perform Class Analysis
    Draw State-Transition Diagram
Automated Support Tools for Object-Oriented Analysis
Summary

CHAPTER 12 OBJECT-ORIENTED DESIGN

Introduction
Conceptual Foundations
Definition of Object-Oriented Design Terms
Object-Oriented Design Activities
    Allocate Objects to Four Subdomains
    Draw Time-Order Event Diagram
    Determine Service Objects
    Develop Booch Diagram
    Define Message Communications
    Develop Process Diagram
    Develop Package Specifications and Prototype
What We Know and Don't Know from OOA and OOD
Automated Support Tools for Object-Oriented Design
Summary
Appendix: Unix/C++ Design of ABC Rental

CHAPTER 13 SUMMARY AND FUTURE OF SYSTEMS ANALYSIS, DESIGN, AND METHODOLOGIES

Introduction
Comparison of Methodologies
    Information Systems Methodologies Framework for Understanding
    Humphrey's Maturity Framework
Comparison of Automated Support Environments
Research Relating to Analysis, Design, and Methodologies
Business and Technology Trends that Impact Application Development
    Legacy Systems
    Repositories and Data Warehouses
    Client/Server
    Multimedia
    Globalization
Summary

CHAPTER 14 FORGOTTEN ACTIVITIES

Introduction
Human Interface Design
    Conceptual Foundations of Interface Design
    Develop a Task Profile
    Option Selection
    Functional Screen Design
    Presentation Format Design
    Field Format Design
Conversion
    Identify Current and Future Data Locations
    Define Attribute Edit and Validate Criteria
    Define Data Conversion Activities and Timing
    Select and Plan an Application Conversion Strategy
    ABC Conversion Strategy
User Documentation
    Mix of On-Line and Manual Documentation
Automated Support Tools for Forgotten Activities
Summary

PART IV IMPLEMENTATION AND MAINTENANCE

Introduction

CHAPTER 15 CHOOSING AN IMPLEMENTATION LANGUAGE

Introduction
Characteristics of Languages
    Data Types
    Data Type Checking
    Language Constructs
    Modularization and Memory Management
    Exception Handling
    Multiuser Support
Nontechnical Language Characteristics
Comparison of Languages
    SQL
    Focus
    BASIC
    COBOL
    Fortran
    C
    Pascal
    PROLOG
    Smalltalk
    Ada
Programming Language Evaluation
    Language Matched to Application Type
    Language Matched to Methodology
Automated Support for Program Development
Summary

CHAPTER 16 PURCHASING HARDWARE AND SOFTWARE

Introduction
Request for Proposal Process
    Develop and Prioritize Requirements
    Develop Schedule and Cost
    Develop Request for Proposal
    Manage Proposal Process
    Evaluate Proposals and Select Alternatives
Informal Procurement
Contents of RFP
    Vendor Summary
    Required Information
    Schedule of RFP Process
    Description of Selection Processes
    Vendor Response Requirements
    Standard Contract Terms
Hardware
    Functionality
    Operational Environment
    Performance
Software
    Needs
    Resources
    Performance
    Flexibility
    Operating Characteristics
RFP Evaluation
    General Evaluation Guidelines
Automated Support Tools for Evaluation
Summary

CHAPTER 17 TESTING AND QUALITY ASSURANCE

Introduction
Testing Terminology
Testing Strategies
    Black-Box Testing
    White-Box Testing
    Top-Down Testing
    Bottom-Up Testing
    Test Cases
Matching the Test Level to the Strategy
Test Plan for ABC Video Order Processing
    Test Strategy
    Unit Testing
    Subsystem or Integration Testing
    System and Quality Assurance Testing
Automated Support Tools for Testing
Summary

CHAPTER 18 CHANGE MANAGEMENT

Introduction
Designing for Maintenance
    Reusability
    Methodology Design Effects
    Role of CASE
Application Change Management
    Importance
    Change Management Procedures
    Historical Decision Logging
    Documentation Change Management
Software Management
    Introduction
    Types of Maintenance
    Reengineering
Configuration Management
    Introduction
    Types of Code Management
    Configuration Management Procedures
Automated Tools for Change Management
    Collaborative Work Tools
    Documentation Tools
    Tools for Reverse Engineering of Software
    Tools for Configuration Management
Summary

CHAPTER 19 SOFTWARE ENGINEERING AS A CAREER

Introduction
Emerging Career Paths
Careers in Information Systems
    Level of Experience
    Job Type
Planning a Career
    Decide on Your Objective
    Define Duties You Like to Perform
    Define Features of the Job
    Define Features of the Organization
    Define Geographic Location
    Define Future-Oriented Job Components
    Search for Companies That Fit Your Profile
    Assess the Reality of Your Ideal Job and Adjust
Maintaining Professional Status
    Education
    Professional Organizations
    User Organizations
    Accreditation
    Read the Literature
Automated Support Tools for Job Search
Summary

APPENDIX CASES FOR ASSIGNMENTS

Abacus Printing Company
AOS Tracking System
The Center for Child Development
Course Registration System
Dr. Patel's Dental Practice System
The Eagle Rock Golf League
Georgia Bank Automated Teller Machine System
Summer's Inc. Sales Tracking System
Technical Contracting, Inc.
XY University Medical Tracking System

Glossary
Index

PREFACE

As we move toward the 21st century, the techniques, tools, technologies, and subject matter of applications development are changing radically. Globalization of the workplace is impacting IS development as well, by pressuring organizations to strive for competitive advantage through automation, among other methods. Strategic IS, reusable designs, downsizing, right-sizing, multimedia databases, and reusable code are all discussed in the same breath. Methodologies are being successfully coupled to computer-aided software engineering environments (CASE); yet object-oriented methodologies, which are being touted as the panacea for all problems, have not yet been fully automated ... or even fully articulated. Few if any tools, methods or techniques address the needs of artificial intelligence and expert system development, which are currently driven by the program language being used for development. New technologies for true distribution of processing are maturing, and integration across hardware and software platforms is the major IS concern in multiple industries [Computerworld, 10/15/90].

IS professionals must be jacks-of-all-trades as never before, but there is also increased demand for domain experts who are intimately familiar with all aspects of a particular business area, such as money transfer in banking. It is difficult for any one person to be both expert and generalist. But there are many systems developers (I call them software engineers) who do possess these attributes. Today's ideal software engineer is familiar with the alternatives, trade-offs and pitfalls of methodologies (notice the plural form), technologies, domains, project life cycles, techniques, tools, CASE environments, hardware, operating systems, databases, data architectures, methods for user involvement in application development, software, design trade-offs for the problem domain, and project personnel skills. Few professionals acquire all these skills without years of experience including both continuing education and variations in project assignments, company type, and problem type. This book attempts to discuss much of what should be the ideal software engineer's project-related knowledge and theoretical background in order to facilitate and speed the process by which novices become experts.

The goal of this book, then, is to discuss project planning, project life cycles, methodologies, technologies, techniques, tools, languages, testing, ancillary technologies (e.g., database), and computer-aided software engineering (CASE). For each topic, alternatives, benefits and disadvantages are discussed.

For methodologies, one major problem is that most writing on methods of development concentrates on what the analyst does. It is up to the individual instructor and/or student to develop the how knowledge. Yet, the what knowledge is easy and takes very little time to learn. If I say, "The first step in object-oriented methodology is to make a list of objects," that sounds like a simple step. I may understand what I'm to do, but not how to do it. This book is intended to shed some light on the how information. One technique used to facilitate the learning process is to develop the same case problem in each methodology, highlighting the similarities, differences, conceptual activities, decision processes, and physical representations. Another technique is to provide cases in the appendix that can be used throughout the text for many assignments, thus allowing the student to develop a detailed understanding of the problem and of how it is expressed in different methodologies and using different techniques.

A related problem in software engineering texts is that little information is available on current research and future directions. Information systems development is a 30-year-old activity that is beginning to show some signs of maturity, but is also constantly changing because the type of systems we automate is constantly changing. Research in every area of software development, from enterprise analysis through reengineering 20-year-old systems, is taking place at an unprecedented rate. Moreover, the landscape of system development will change radically in the next 20 years based on the research taking place today. This text attempts to highlight and synthesize current research to identify future directions.

Many software engineering texts never discuss problems attendant with methodologies. This text attempts to discuss methodologies in the context of their development and how they have evolved to keep pace with new knowledge about system development. Both useful and not-so-useful representation techniques will be identified. The book may be controversial in this regard, but at least the knowledge that there are problems with methods should remove some of the prevailing attitudes that there are right and wrong ways to complete everything. Unfortunately, no methodology is complete enough to guarantee the same results from two different analysts working independently, so interpretations differ. I try to identify my interpretations and generalizations throughout the text.

The book is case-oriented in several ways. First, a sample project is described, designed, and implemented using each of the techniques discussed. Second, cases for in-class development are provided. Third, cases for homework assignments are also provided. Research on learning has revealed that we learn best through practice, analysis of examples, and more practice. For each topic, an example of both acceptable and unacceptable deliverables is provided, with discussion of the relative merits and demerits of each. Through repeated use of different cases, students will learn both the IS topics and something about problem domains that will carry over into their professional lives.

Finally, this text has a bias toward planning, analysis, and design activities even though the entire life cycle is discussed. This bias is partly due to practical and space limitations; however, it is also because of the realities of changing software engineering work. CASE promises to remove much of the programming from business application development by automating the code generation process. Although languages are discussed, the discussion focuses on how to choose the correct language for an application based on language characteristics, rather than on how to program in the language.

The audience for this text includes business, computer information systems, and computer science students. The courses for which this text is appropriate include software engineering, advanced system analysis, advanced topics in information systems, and IS project development. Computer software engineering is moving away from a concentration on developing the perfect program to a realization that even perfect programs never work in isolation. Program connections are significantly more important than individual program code. Thus, even computer scientists are recognizing a need for methodologies, techniques for system representation, and language selection.

The text was originally planned to accommodate either quarter or semester classes. I have taught this material in both. While the written material is longer than anticipated, I believe the book can be covered in one quarter because there are usually more contact hours with students. One of my goals was a book that did not require much additional outside material to supplement the text; I hope this goal was met. Much of the bulk is explaining the how processes in Chapters 7-12, and these should be covered in class to discuss alternatives, possible flaws in my thinking, and so on. If programming is also included in the course, I suggest development of a two-quarter (or semester) sequence that includes software engineering through system design in the first course and the remaining subjects in the second course.

Every school seems to offer courses on "Advanced Topics in Systems Development" or "Advanced Systems Analysis" or "IS Development Project" that frequently use no book because nothing covers all the desired topics. This book attempts to provide for these courses. Advanced systems analysis and development courses all tend to concentrate on alternatives during the design process from which decisions must be made. The typical systems analysis course might discuss one technique for each major topic area: enterprise modeling, data modeling, process modeling, program design. That alternatives are available is certainly mentioned, but there is simply not enough time to teach all topics, nor are students able to assimilate much information about alternatives without becoming hopelessly confused. Advanced courses try to broaden the knowledge base of students with discussions of alternatives in each area. Even in these courses, without a hands-on orientation and concrete examples to use for reference, the number of topics and alternatives is necessarily limited. The use of a single case throughout the text, together with cases for home/school work practice, should broaden the number of topic areas that can be covered adequately in a one-semester course.

ACKNOWLEDGMENTS

No textbook is published without the involvement of many people and I would like to acknowledge those who have helped bring this book to fruition. I am grateful, first, to my husband Dave and my daughter Katie, who have put up with haphazard meals and an absent-minded wife and mother for a long time. Baby-sitters were especially important when I commuted four hours a day. I thank Elaine Black, Lis Nielsen, Sarah Cropley, Louise Shipman, Jacquie Draycott, Ellen Crawford, and Angela Moore.

Also, I wish especially to thank Peter Keen for his unfailingly good advice and uplifting moral support. I have never before worked with someone so free with great ideas. Frank Ruggirello, who actually got me moving and enlisted the supportive and helpful reviewers, played a special part in the project. I want to thank the reviewers, who put up with my typos and grammar long enough to read about the ideas I am attempting to convey. Their comments have materially enhanced the final quality of this book. These reviewers include: Donald R. Chand, Bentley College; Dale D. Gust, Central Michigan University; Lavette Teague, California State Polytechnic University-Pomona; Jon A. Turner, New York University; Douglas Vogel, University of Arizona; Connie E. Wells, Georgia State University; J. Christopher Westland, University of Southern California; and Susan J. Wilkins, California Polytechnic University-Pomona. My thanks for the helpful and supportive comments.

Next, the Wadsworth "family" has been supportive throughout the work, including Kathy Shields, Rhonda Gray, Tamara Huggins, Peggy Mehan, Greg Hubit, and Janet Hansen. Martha Ghent, the copy editor, deserves special mention. Having never worked through the copy process before, I had no idea what was done. Martha was easy to work with and taught me how to improve both my writing and my punctuation.

Friends and colleagues, who have given me anecdotes, support, ideas, and comments, were invaluable. The friends who have materially contributed to this project include Peter Keen, Connie Wells, Judy Wynekoop, Irene Auerbach, Chung Pin Chuang, Karen Loch, Kuldeep Kumar, Scott Owen, Iris Vessey, Nancy Russo, Alex Heslin, Paul Haldeman, Marty Fraser, Eph McLean, Ross Gagliano, Jim Senn, Mike Palley, Dorothy Dologite, Ronnie Wilkes, Jong Kim, Seok Jung Yoon, Dennis Strouble, Mary Alexander, Ted Stohr, and the many student 'guinea pigs' (mine and others) from Georgia State University, Baruch College (CUNY), University of Texas-Arlington, University of Dallas, and New York University. Thank you all.

Finally, I would like to thank you, the reader, for buying this book and taking the trouble to read even a portion of it. If you should disagree with my reasoning or find errors or omissions that should be corrected, I would be grateful for suggestions and correspondence.

Sue Conger
Dallas, Texas

CHAPTER 1

OVERVIEW OF SOFTWARE ENGINEERING

INTRODUCTION

Businesses around the world depend more and more on software in the very basics of their operations. U.S. firms alone have 100 billion lines of program code in use today. This code cost $2 trillion to create and costs $30 billion a year to maintain. The typical Fortune 1000 company maintains 35 million lines of code. Quality of software design and quality of business service are increasingly linked. We take for granted the everyday convenience we gain from reservation, telephone, automated teller, and credit card authorization applications. We can take these conveniences for granted until they 'crash' or have a 'bug.' Software engineers (SEs) developed those systems. The engineering skills they apply to developing applications go far beyond the writing of good programs. The skills SEs need are to deploy and manage the data, software, hardware, and communications business assets of a corporation. These computer-related assets now account for almost half of all U.S. business investment.

Software engineers are skilled professionals who can make a real difference to business profitability. The word professional is key here. Software development is notoriously difficult to manage; software projects are routinely over budget and behind schedule. Computer programmers are legendary for their lack of understanding of, or interest in, business. SEs who are professionals are more likely to manage and deliver a quality project on time and within budget. One goal of this text is to challenge you to set high standards for personal excellence: to become a professional and to make a difference.

This chapter introduces you to the book and the topics to be covered in more detail in later chapters. The objectives of this chapter are to: (1) review what you might already know, (2) give you a vocabulary for discussing applications, and (3) introduce the topics of this text. Use this chapter to learn basic definitions and to begin building a mental picture of how different approaches to software engineering work. You will learn the details in later chapters.

Software engineering is the systematic development, operation, maintenance, and retirement of software. Software engineers (SEs) have a mental 'tool kit' of techniques to use in developing applications. As students of information systems, you know bits and pieces of the tool kit. This text will show you how to use the tools together, and will add to what you already know. For instance, you should already know data flow diagrams (DFDs). DFDs are one of many tools, including new diagrams such as process hierarchies, process dependencies, and object diagrams. No one tool is ideal or complete. The SE knows how to select the tools, understanding their strengths and weaknesses. Most of all, an SE is not limited to a single tool he or she tries to force-fit to all situations.

Software engineering is important because it gives you a foundation on which to develop a career as an information systems development professional. At the end of the course, you will understand a variety of approaches to analyzing, designing, programming, testing, and maintaining information systems in organizations. You will know the alternatives for developing applications, and you will know how and when to select from among them. You will be able to compare and contrast methodology differences and will know the major computer-aided software engineering (CASE) tools that support each methodology. Finally, you will have an appreciation of the roles of software engineers and how they work with project managers in application development.

In the next section, you will learn what it means to be a software engineer. Then, a framework for discussing applications will help you categorize characteristics, technologies, and types of applications in business organizations. The next several sections guide you through alternatives for overall management of the application development process. The last section briefly outlines the remaining chapters of the book. Along the way, major terms are highlighted in bold print and defined so you can begin to form a mental picture of the alternative approaches to software engineering work.

SOFTWARE ENGINEERING

This conversation might be overheard in a manager's office:

Consultant Manager: "All right, Mary, tomorrow you start work on the rental processing application we are developing for ABC's Video Company. Mary, you are the project manager. Are you ready?"

Mary: "Yes, our first job is to find out more about the application. Then, Sam and I will decide our approach to development and the documentation that is needed. ABC's manager, Vic, is willing to provide us with whatever we need. Then, we will complete a feasibility analysis and ... "

Mary is describing the first steps used by a modern software engineer in the development of a computer-based application. Software is the sequences of instructions in one or more programming languages that comprise a computer application to automate some business function. Engineering is the use of tools and techniques in problem solving. Putting the two words together, software engineering is the systematic application of tools and techniques in the development of computer-based applications.

A software engineer is a person who applies a broad range of application development knowledge to the systematic development of application systems for organizations. Software engineers used to think of their job as conscientious development of well-structured computer programs. But, as the field evolved, systems analysis as a task appeared along with systems analysts, the people who perform that task. Now, there is a proliferation of techniques, tools, and technologies to develop applications. Software engineers' jobs have evolved to now include evaluation, selection, and use of specific systematic approaches to the development, operation, maintenance, and retirement of software. Development begins with the decision to develop a software product and ends when the product is delivered. Operations is the daily processing that takes place. Maintenance encompasses the changes made to the logic of the system and programs to fix errors, provide for business changes, or make the software more efficient. Retirement is the replacement of the current application with some other method of providing the work, usually a new application.

Fundamental skills of software engineers include

1. How to identify, evaluate, choose, and implement an appropriate methodology¹ and CASE tools
2. How and when to use prototyping
3. How and when to select hardware, software, and languages
4. How to manage activities associated with configuration management, planning, and control of the development process
5. How to select computer languages and develop computer programs
6. How and which project testing techniques to apply
7. How to choose and use software maintenance techniques
8. How to evaluate and decide when to retire applications

1 Technically, the term methodology means 'the study of methods.' In information systems work, the term is colloquially accepted to mean a collection of tools and techniques used to represent an application's requirements. We use the Information Systems (IS) form of the term meaning 'collections of tools and techniques.' CASE software automates the use of the tools and techniques.

NEW YORK BANK

In 1970, NY Bank wanted to be first in the New York market with an automated teller machine (ATM) system. The bank contracted with a large computer vendor to build custom ATM software using the vendor's equipment. Because telecommunications technology was in its infancy at the time, and distributed processing did not exist when the system was installed in 1971, the two ATM locations used small, local computers to record transactions. The computers did not communicate with each other. Nor could they check customer balances to verify availability of funds for transactions.

Within one month of the opening of the ATMs, one customer had, in one 24-hour period, withdrawn $200,000 from the two machines. The customer's balance in his checking account was $50. One month, and one similar user later, NY Bank shut its ATM offices, canceled the contract with the vendor, and wrote off $30 million in development costs. Shortly after, NY Bank began another project to develop a "second-generation" ATM system in which balances were checked via communications with a centralized database application.

The goals of a software engineer are to produce a high quality product and to enjoy a high quality development process. The product of a software engineering effort is a delivered, working computer system, some examples of which include:

• Accounts receivable processing
• Order processing
• Inventory monitoring and maintenance
• Decision support for overnight funds investment
• Collateralized mortgage obligation cost determination
• Insurance reimbursement processing
• Funds transfer processing
• Early warning system for problems with critical success factors
• Query processing for a customer information database

A quality SE product is

• on time
• within budget
• functional, i.e., does what it is supposed to do
• friendly to users
• error free
• flexible
• adaptable

In addition to a quality product, quality of process is desirable. The software engineering process describes the steps it takes to develop the system. We begin a development project with the notion that there is a problem to be solved via automation. The process is how you get from problem recognition to a working solution. A quality process is desirable because it is more likely to lead to a quality product. The process followed by a project team during the development life cycle of an application should be orderly, goal-oriented, enjoyable, and a learning experience.

That we try to apply engineering discipline to software development does not mean that we have all the answers about how to build applications. On the contrary, we still build systems that are not useful and thus are not used. For example, New York Bank lost millions of dollars (see Example 1-1) because they used the wrong technology. Part of the reason for continuing problems in application development, like those of NY Bank, is that we are constantly trying to hit a moving target. Both the technology and the type of applications needed by businesses are constantly changing and becoming more complex. Our ability to develop and disseminate knowledge about how to successfully build systems for new technologies and new application types seriously lags behind technological and business changes. This book discusses where the field is now, and where it is likely to be in the 21st century. One thing is certain: The way we build systems in 10 years will be vastly different from the way we build systems today. The existing techniques that we expect to be using into the next century are discussed in this text. There will be other techniques yet to be developed, and you will have to learn to use them, too. One purpose of this text is to provide a foundation for learning to learn software engineering.

Another reason for continuing problems in application development is that we aren't always free to apply the techniques we know work best. Why? you might ask. Organizations may know the right things to do, but it is hard to change habits and cultures from the old way of doing things, as well as get users to agree with a new sequence of events or an unfamiliar format for documentation. As Example 1-2 shows, compromise is possible. The example illustrates some problems with revolutionary change and how revolution can be pared down to evolution and made acceptable.

TUV INSURANCE COMPANY

In 1991, TUV Insurance Company began a restructuring project for an annuity premium processing application. The project team consisted of a manager who had been with the company 20 years and two analysts who were new hires in 1991. The two new people, Jacquie and Ted, both wanted to apply information engineering techniques to the work. They discussed the methodology with the project manager and clients who agreed to try a modified form of the new methodology.

During the first phase of development, an entity-relationship diagram was developed with accompanying data dictionary and process decomposition descriptions. The project team and users were pleased with the results.

When the schedule for development was presented to the user, it was estimated that the entire project would take 18 months using information engineering. The client balked. He said, "The history of this company is that any project over one year never gets done. Therefore, I won't approve this. Just design me a file, like we have always done, and then add on the processing to create and maintain the file. When you revise the schedule to use this approach (file design and its processing), make sure it is under a year."

You might ask then, if many organizations don't use good software engineering practices, why should I bother learning them? There are two good answers to this question. First, if you never know the right thing to do, you have no chance of ever using it. Second, organizations will frequently accept evolutionary, small steps of change instead of revolutionary, massive change. You can learn individual techniques that can be applied without complete devotion to one way of developing systems. In this way, software engineers can speed change in their organizations by demonstrating how the tools and techniques enhance the quality of both the product and the process of building a system.

APPLICATIONS

Software engineering is the building of applications. An application is the set of programs² that automate some business task. Businesses are made up of functions such as marketing, accounting, manufacturing, and personnel. Each function can be divided into work processes for which it is responsible. For instance, marketing is responsible for sales, advertising, and new product development. Each process can be separated into its specific tasks. Sales, for instance, requires maintaining customer relations, order processing, and customer service. Applications could support each task individually. Conversely, one marketing application could perform all tasks, integrating the information they have in common.

All applications have some common and some unique features. One problem is that there is no agreed upon way to discuss these similarities and differences. In this book, we present three dimensions of applications to simplify and clarify this discussion. The dimensions of applications are characteristics, responsiveness, and type. Characteristics are common to all applications and include data, processes, constraints, and interfaces. The section on application characteristics is first and should be a review. Responsiveness defines the underlying time orientation of the application as batch, on-line, or real-time. By knowing the time orientation of an application, we can define minimal technology required to support the application. Type defines the business orientation of the application as transactional, query, decision, or intelligent.

Application Characteristics

This section is about shared characteristics of applications: data, processes, constraints, and interfaces (see Figure 1-1). All applications: (1) act on data and require data input, output, storage and retrieval; (2) imbed commands that transform data from one state to another state based on and constrained by business rules; and (3) have some human interfaces and may have one or more computer interfaces. Application types vary in the extent to which these characteristics are known, defined, and understood. Each of the characteristics is discussed below. Since this is review, if you can define the terms in bold print, you might skip to the next section: Application Responsiveness.

2 A program is composed of instructions that perform some well-defined task. Sometimes there are many tasks, composed of millions of instructions in an application. When there are many tasks, they are split into programs. This decomposition into subtasks which relate to programs is one topic in the chapters on application design.

Data

Data are the raw material (numbers and letters) that relate to each other to form fields (attributes), which define entities (see Figure 1-2). An entity is some definable class of people, concrete things, concepts, or events about which an application must maintain data. Examples of each entity type are customers, warehouses, departments, or orders, respectively. Data and entities can be described independently of their processing rules. Examples of data definition aids are entity relationship diagrams (see Figure 1-3) and third normal form linkage diagrams (see Figure 1-4).
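The step from raw characters to attributes of an entity, as in Figure 1-2, can be sketched in a few lines of Python. This is only an illustration: the fixed-width layout, the field widths, and the names `LAYOUT`, `Person`, and `parse_person` are assumptions made for the sketch and do not come from the text.

```python
from dataclasses import dataclass

# Assumed fixed-width layout: (attribute name, width). The widths are
# invented for this sketch; the text does not specify a record layout.
LAYOUT = [
    ("ssn", 9),
    ("name", 20),
    ("address", 20),
    ("city", 10),
    ("state", 2),
    ("zip_code", 5),
    ("phone", 10),
]


@dataclass
class Person:
    """The Person entity: a named set of attributes."""
    ssn: str
    name: str
    address: str
    city: str
    state: str
    zip_code: str
    phone: str


def parse_person(record: str) -> Person:
    """Slice one raw record into the attributes of the Person entity."""
    fields = {}
    start = 0
    for attribute, width in LAYOUT:
        fields[attribute] = record[start:start + width].strip()
        start += width
    return Person(**fields)


# Raw data: one undifferentiated run of characters (padded to the layout).
raw = (
    "123426789"
    + "Sandra Janice Jones".ljust(20)
    + "21 Northfield Road".ljust(20)
    + "Freeport".ljust(10)
    + "GA"
    + "44240"
    + "4042214960"
)

person = parse_person(raw)
print(person.name, "|", person.city, person.state, person.zip_code)
```

The raw string carries the same characters as the parsed instance; only the layout definition makes it meaningful, which is the point of describing data independently of its processing.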

Data requirements in applications include input, output, storage, and retrieval.

INPUT. Data inputs are data that are outside the computer and must be entered using some input device. Devices used for getting data into the computer include, for example, keyboard,³ scanner, and transmission from another computer.

OUTPUT. Output is the opposite of input; that is, outputs are data generated to some media that is outside the computer. Common output devices include printers, video display screens, other computers, and microform equipment (e.g., microfiche, microfilm).

STORAGE AND RETRIEVAL. Data storage describes a physical, machine-readable data format for data, while data retrieval describes the means you use to access the data from its storage format. Storage and retrieval go together both conceptually and in software. Storage format and retrieval access may be defined by your use of purchased software (such as a database management system's method, e.g., Oracle, DB2, or Adabas⁴), or may be defined by an access method provided by a hardware vendor (e.g., IBM's virtual storage access method, VSAM).

Data storage requires two types of data definition: logical and physical. The logical definition of data describes the way a user thinks about data, that is, the logical data model. These definitions might be relational, hierarchic, networked, or object-oriented. Relational logical data models are arranged in tables of rows and columns. Hierarchic logical data models define one-to-many relationships in a tree-shaped model that resembles an organization chart. Network logical data models define many-to-many relationships.

3 Attached to video display or maybe some typewriter-like terminal, touch-tone phone, etc.

4 Oracle is a trademark of the Oracle Corporation. DB2 is a trademark of the IBM Corporation. Adabas is a trademark of Software AG, Inc.

FIGURE 1-1 Application Characteristics [diagram: data input and output using human interfaces; application processes (edit, update, report, query) with constraints built in; data output through a manual interface; data storage through a computer interface to accounting applications]

123426789SandraJaniceJones21NorthfieldRoadFreeportGA442404042214960

is less meaningful than if it is split into related fields of information:

ENTITY: Person

ATTRIBUTES:                 INSTANCE of Person
Social Security Number:     123-42-6789
Name:                       Sandra Janice Jones
Address Line:               21 Northfield Road
City:                       Freeport
State:                      GA
Zip Code:                   44240
(Area Code) Telephone:      (404) 221-4960

FIGURE 1-2 Attribute-Entity Example

Object-oriented logical data models (OOLDMs) combine hierarchic and network logical models to form a lattice-structured hierarchy. OOLDMs are more specific in identifying classes and subclasses of objects in a hierarchy. A class is a set of data entities that share the defining characteristic. For instance, the class customer might have subclasses for cash and credit customers. The lattice network arrangement allows relationships to remain unconstrained by a data management software conceptualization.
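The class and subclass idea can be illustrated with a short Python sketch. The attribute names, the `payment_terms` method, and the credit limit are hypothetical and chosen only to show a general class customer with cash and credit subclasses.

```python
class Customer:
    """General class: attributes shared by every customer."""

    def __init__(self, customer_id: str, name: str):
        self.customer_id = customer_id
        self.name = name

    def payment_terms(self) -> str:
        # Each subclass supplies its own behavior.
        raise NotImplementedError("subclasses define how they pay")


class CashCustomer(Customer):
    """Subclass: inherits the shared attributes, pays in cash."""

    def payment_terms(self) -> str:
        return "cash"


class CreditCustomer(Customer):
    """Subclass: adds an attribute that only credit customers need."""

    def __init__(self, customer_id: str, name: str, credit_limit: float):
        super().__init__(customer_id, name)
        self.credit_limit = credit_limit

    def payment_terms(self) -> str:
        return f"credit up to {self.credit_limit:.2f}"


customers = [
    CashCustomer("C001", "Walk-in buyer"),
    CreditCustomer("C002", "Account holder", credit_limit=1000.0),
]
for customer in customers:
    print(customer.customer_id, customer.payment_terms())
```

Both subclasses can be handled wherever a customer is expected, while each keeps its own data and behavior.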

Figures 1-5, 1-6, 1-7, and 1-8 show logical data structured in each of the four ways for vendor-parts information. Notice that the network and relational diagrams are somewhat similar. The relational model uses logical data connections to reflect relationships, while the network model uses physical address pointers imbedded in the data structure to maintain the relationships. For the hierarchic model, you must make a decision about which information is more important within the data context. If both vendors and parts are equally important, then complete redundancy with two hierarchies is required as shown in the diagram.
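As a rough illustration of the difference between the relational and hierarchic views, the vendor-parts rows of Figure 1-5 can be held in plain Python structures. The function names and dictionary keys below are assumptions for this sketch, and the vendor rows are abbreviated to name and city.

```python
# Relational view: three tables; relationships are carried by matching
# key values (V-No, P-No), not by pointers.
vendors = [
    {"v_no": "01", "name": "ABC Hardware", "city": "Morristown"},
    {"v_no": "03", "name": "XYZ Hardware", "city": "Akron"},
    {"v_no": "02", "name": "QBE Hardware", "city": "New York"},
]
parts = [
    {"p_no": "001", "name": "Screwdrivers", "price": 700, "units": "Each"},
    {"p_no": "002", "name": "Nails, #1", "price": 125, "units": "Gross"},
    {"p_no": "004", "name": "Nails #3", "price": 120, "units": "Gross"},
]
vendor_parts = [
    {"v_no": "01", "p_no": "001", "quantity": 750},
    {"v_no": "01", "p_no": "002", "quantity": 2000},
    {"v_no": "02", "p_no": "004", "quantity": 1200},
    {"v_no": "01", "p_no": "004", "quantity": 1000},
]


def parts_for_vendor(v_no):
    """Join VENDOR-PART to PART on P-No for one vendor."""
    part_by_no = {p["p_no"]: p for p in parts}
    return [part_by_no[vp["p_no"]]["name"]
            for vp in vendor_parts if vp["v_no"] == v_no]


def vendor_hierarchy():
    """Hierarchic view: each vendor owns its own copy of its parts."""
    part_by_no = {p["p_no"]: p for p in parts}
    tree = {v["v_no"]: {"vendor": v["name"], "parts": []} for v in vendors}
    for vp in vendor_parts:
        # dict(...) makes a separate copy: part data is repeated per vendor.
        tree[vp["v_no"]]["parts"].append(dict(part_by_no[vp["p_no"]]))
    return tree


print(parts_for_vendor("01"))
print(vendor_hierarchy()["02"])
```

In the hierarchic view, part 004 is stored once under each vendor that supplies it, which is the redundancy the text describes; answering "which vendors supply part 004" would need a second hierarchy rooted at parts.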

Vendor -- Supplies -- Parts

FIGURE 1-3 Entity-Relationship Example

The physical definition of data, or physical data model, describes its layout for a particular hardware device. Physical layout is constrained by intended data use, access method, logical model, and storage device. External storage devices for data include magnetic disk, magnetic diskette, optical disk, compact disk, laser disk, digitally applied tape, and magnetic tape, to name a few. The major differences in devices are the number of times a device can be written to [e.g., as in write-once-read-many (WORM) technology], the cost, the amount of data that can be stored, the portability of devices, and the type of retrievals that can be done on data (e.g., magnetic tape requires front-to-back sequential processing versus direct accessibility to any data).


VENDOR Relation:        V-No | Vendor-Name | Vendor-Address | City | State | Zip
VENDOR-PART Relation:   V-No | P-No | Quantity
PART Relation:          P-No | Part-Name | Price | Units

FIGURE 1-4 Third Normal Form Example

VENDOR Relation

V-No   Vendor-Name    Vendor-Address   City         State   Zip
01     ABC Hardware   123 Main St.     Morristown   NJ      07950
03     XYZ Hardware   425 Center St.   Akron        OH      44311
02     QBE Hardware   7290 4th St.     New York     NY      10010

VENDOR-PART Relation

V-No   P-No   Quantity
01     001    750
01     002    2000
02     004    1200
01     004    1000

PART Relation

P-No   Part-Name      Price   Units
001    Screwdrivers   700     Each
002    Nails, #1      125     Gross
004    Nails #3       120     Gross
...    ....

FIGURE 1-5 Relational Logical Data Model

Hierarchy 1:  VENDOR Segment (V-No, Vendor-Name, Address, City, State, Zip)
                  |
              PART Segment (P-No, Part-Name, Price, Units, Quantity-on-hand)

Hierarchy 2:  PART Segment (P-No, Part-Name, Price, Units, Quantity-on-hand)
                  |
              VENDOR Segment (V-No, Vendor-Name, Address, City, State, Zip)

Instance, with a physical address pointer linking each segment to the next:

01   ABC Hardware   123 Main St.   Morristown   NJ   07950   pointer 012401
     001   Screwdrivers   700   Each                         pointer 012402
     002   Nails, #1
     004   Nails #3       120   Gross                        pointer FFFFFF (high values: end of chain)

FIGURE 1-6 Hierarchic Logical Data Model

Processes

A process is the sequence of instructions or conjunction of events that operate on data. The results of processing include changes to data in a database, identification of data for display at a terminal or printing on paper, generated commands to equipment, generated program commands, or storage of new facts or rules inferred about a situation or entity.

Constraints

Processing is subject to constraints, which are limitations on the behavior and/or processing of entities.


VENDOR Segment (V-No, Vendor-Name, Address, City, State, Zip)
    |
Linkage Segment (V-No, P-No, Qty)
    |
PART Segment (P-No, Part-Name, Price, Units)

Instance:

VENDOR segment:   01   ABC Hardware   123 Main St.   Morristown   NJ   07950   pointer 012401
Linkage set:      02   004   1200
                  01   004   1000
PART segment:     Nails #3   120   Gross   pointer FFFFFF

FIGURE 1-7 Network Logical Data Model

Vendors
    Office Supply Vendors
    Manufacturing Vendors
Parts

FIGURE 1-8 Object-Oriented Logical Data Model

If accounts receivable balance = zero     }
   and prerequisite classes are taken     }  Prerequisites
   and course section is available        }
then register student
else write appropriate message to student.

FIGURE 1-9 Prerequisite Constraint Example

Constraint types are prerequisite, postrequisite, time, structure, control, or inferential.

PREREQUISITES. Prerequisite constraints are preconditions that must be met for processing to occur. They usually take the form of 'if ... then ... else' logic in a program (see Figure 1-9).

POSTREQUISITES. Postrequisite constraints are conditions that must be met for the process to complete successfully. They also take the form of 'if ... then ... else' logic, but the logic is applied after processing is supposedly complete.
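The two constraint types can be sketched together in Python, using the registration logic of Figure 1-9 for the prerequisites. The dictionary keys, the message wording, and the particular postrequisite check (roster size plus open seats must equal capacity) are assumptions made for illustration only.

```python
def register(student, section):
    """Check prerequisites, register, then verify a postrequisite."""
    # Prerequisite constraints: all must hold before processing occurs.
    if student["ar_balance"] != 0:
        return "Not registered: accounts receivable balance is not zero."
    if not set(section["prerequisites"]) <= set(student["classes_taken"]):
        return "Not registered: prerequisite classes are missing."
    if section["seats_left"] <= 0:
        return "Not registered: course section is full."

    # Processing.
    section["roster"].append(student["id"])
    section["seats_left"] -= 1

    # Postrequisite constraint: applied after processing is supposedly done.
    if len(section["roster"]) + section["seats_left"] != section["capacity"]:
        section["roster"].remove(student["id"])  # undo the update
        section["seats_left"] += 1
        return "Not registered: section counts are inconsistent."
    return "Registered."


student = {"id": "S1", "ar_balance": 0, "classes_taken": ["CS101"]}
section = {"prerequisites": ["CS101"], "capacity": 1,
           "seats_left": 1, "roster": []}
print(register(student, section))
```

Both checks are ordinary if/then/else logic; what distinguishes them is only whether they run before or after the processing step.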

TIME. Time constraints may relate to one or more of the following:


1. Timing of processing, for instance, all money transfers in New York must be processed by 3 P.M. to meet the New York Federal Reserve Bank closing deadline.

2. Time allotted for a process, for instance, time-out of the database when remote site A's expected response is not received within ten seconds.

3. External time requirements, for instance, reports must be delivered to the Controller's office by noon.

4. Synchronous processing, for instance, locations A and B must both have completed their respective actions successfully for location C to perform action X.

5. Response time for external interface processing, for instance, the system must respond to the user terminal within two seconds after the enter key is pressed.
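Two of the time constraints listed above, a processing cutoff and a time allotted for a process, might look like the following in Python. The function names, the polling approach, and the strict "before 3 P.M." reading of the cutoff are assumptions of this sketch, not requirements stated in the text.

```python
import time
from datetime import datetime, time as time_of_day

CUTOFF = time_of_day(15, 0)  # 3 P.M. processing deadline


def transfer_allowed(now: datetime) -> bool:
    """Timing of processing: accept transfers only before the cutoff."""
    return now.time() < CUTOFF


def await_response(poll, timeout_seconds=10.0, interval=0.01):
    """Time allotted for a process: poll until a reply arrives or time out.

    `poll` is a hypothetical callable that returns None until the remote
    site has answered.
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        response = poll()
        if response is not None:
            return response
        time.sleep(interval)
    raise TimeoutError(f"no response within {timeout_seconds} seconds")


print(transfer_allowed(datetime(2024, 1, 2, 14, 59)))
print(await_response(lambda: "ack from site A", timeout_seconds=1.0))
```

A monotonic clock is used for the time-out so that adjustments to the wall clock cannot lengthen or shorten the allotted time.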

STRUCTURE. Structural constraints describe the relationships between data, meta-data (knowledge about data), knowledge and meta-knowledge (system generated knowledge) in applications (see Figure 1-10). Customers, for example, might have different processing if they pay by credit or cash. So, there would be a general class customer and two subclasses, credit-customer and cash-customer. Meta-data about customers includes, for example, the definition of the domain of allowable values for customer identification.

DATA:             CON100

META-DATA:        Field=Customer-ID  Size=6  Type=xxx999  Validation=Occurs once per customer

KNOWLEDGE:        CON001 must pay cash for sales

META-KNOWLEDGE:   If Customer-ID > ???050 and accounts receivable balance > 1000
                     cash sales only
                  else OK credit sales up to 1000.

FIGURE 1-10 Structural Constraint Example


Structural constraints determine what type of inputs and outputs may be allowed, how processing is done, and the relationships of processes to each other.
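One way to read Figure 1-10 is that meta-data drives input validation while meta-knowledge drives a processing rule. The Python sketch below follows that reading. It assumes that Type=xxx999 means three uppercase letters followed by three digits and that "Customer-ID > ???050" compares only the numeric part; both interpretations, and all names used, are assumptions rather than definitions from the text.

```python
import re

# Meta-data for the Customer-ID field, transcribed from Figure 1-10.
# The regular expression is an assumed reading of Type=xxx999.
CUSTOMER_ID_META = {
    "field": "Customer-ID",
    "size": 6,
    "pattern": r"[A-Z]{3}[0-9]{3}",
}


def is_valid_customer_id(value: str) -> bool:
    """Check a value against the field's size and type meta-data."""
    return (len(value) == CUSTOMER_ID_META["size"]
            and re.fullmatch(CUSTOMER_ID_META["pattern"], value) is not None)


def allowed_sale_type(customer_id: str, ar_balance: float) -> str:
    """Meta-knowledge rule, under the assumed numeric comparison."""
    numeric_part = int(customer_id[3:])
    if numeric_part > 50 and ar_balance > 1000:
        return "cash sales only"
    return "credit sales up to 1000"


print(is_valid_customer_id("CON100"))
print(allowed_sale_type("CON100", ar_balance=1500))
```

Keeping the meta-data in a table separate from the checking code means the allowable domain can change without rewriting the validation logic.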

CONTROL. Control constraints relate to automated maintenance of data relationships (e.g., the batch total must equal the sum of the transaction amounts).
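The batch-total example can be written as a very small check. The function name and the sample amounts below are hypothetical.

```python
from decimal import Decimal


def batch_is_balanced(control_total: Decimal, amounts) -> bool:
    """Control constraint: the batch total must equal the sum of amounts."""
    return sum(amounts, Decimal("0")) == control_total


amounts = [Decimal("19.99"), Decimal("5.01"), Decimal("100.00")]
print(batch_is_balanced(Decimal("125.00"), amounts))
print(batch_is_balanced(Decimal("125.10"), amounts))
```

Decimal arithmetic is used so that money amounts compare exactly, which binary floating point does not guarantee.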

INFERENCES. The word infer means to conclude by reasoning, or to derive from evidence. Inferential constraints are limits on the reasoning ability of the application and its ability to generate new facts from previous facts and relationships. These constraints come in several varieties. First, inferential constraints may relate to the application. For example, you might not want a medical expert system to build itself new knowledge based on new user information unless the "user" is an approved expert who understands what he or she is doing.

Second, inferential constraints may relate to the type of knowledge in the system and limits on that knowledge. For example, CASE tools cannot help you decide what information to actually enter into the system (yet). Rather, you as the user must already know what you want to describe and how to describe it when you use a CASE tool. What CASE can do is reason whether the information you entered conforms to its rules for how to represent information.

Third, inferential constraints may relate to the language in which the system is developed. For instance, you might be required to build an expert system in Prolog because that is the only language available. Prolog is a goal-oriented, declarative language with constructs for facts and rules that requires its knowledge (i.e., the data) to be imbedded in the program. Large programs in Prolog are hard to understand and may be ambiguous. Therefore, programmers write smaller, limited reasoning programs. If you have a large, complex knowledge base, you may want to separate the data from the program logic. But the language choice can constrain your ability to do such separation.

Interfaces

There are three types of interfaces: human, manual, and computerized. There are few guidelines in any methodologies for designing any of these interfaces. Each type of interface is discussed briefly in this section, and in more detail later in the text.

HUMAN. Human interfaces are the means by which an application communicates to its human users. Human interfaces are arguably the most important of the three types because they are the hardest to design and the most subject to new technologies and fads.

Most often, a human interface is via a video display which might have options for color, size of screen, windows, and so on. Many application developers are tempted to design elaborate screens with the assumption that more is better: more color, more information, and so forth. But a growing body of research combined with graphic design ideas shows that this is not the case. Figure 1-11 shows the same information on a well designed screen and on a poorly designed screen. A screen should be organized to enhance readability, to facilitate understanding, and to minimize extraneous information. Few colors, standardized design of top and bottom lines, standardized use of programmable function keys, and easy access to help facilities are the keys to good screen design.

MANUAL. Manual interfaces are reports or other human-readable media that show information from the computer. You use manual interfaces whenever you pay electric, telephone, or water bills. Some simple standards for manual interfaces are to mirror screen designs when possible to enhance understanding, to fully identify the interface contents with headers, notes, and footers when needed, and to follow many of the same human interface "rules" for formatting information.

AUTOMATED. An automated interface is data that is maintained on computer-readable media for use by another application. Application interfaces tend to be nonstandardized and are defined by the data-sharing organizations. Guidelines for application file interfaces have evolved over the last fifty years to include, for instance, placement of identifying information first and placement of variable length information last. Other interfaces are governed by numerous formal standards; for instance, local area network interface standards are defined by the Institute of Electrical and Electronics Engineers (IEEE), and the Open Systems Interconnection (OSI) standard for inter-computer communication is governed by the International Organization for Standardization (ISO). Few such standards are currently relevant to an individual business application. Lack of standards, such as for graphical user interfaces (GUIs), slows business acceptance of innovations. Uncertainty over which 'look' will become the standard, in the case of GUIs, leads to business caution in using new technology.

Well-Designed Screen

Program: ABC001              XYZ System              Date: mm/dd/yy

Name:    xxxxxxxxxxxxxxxxxx
Address: xxxxxxxxxxxxxxxxxx
City:    xxxxxxxx    St: xx    Zip: xxxxx-xxxx

Ship via    Tax?    Salesman    Terms
xxxxxxxx    Yes     xxx         xxxxxxxx

Item    Qty    Description    Unit Price    Extension
xx      xx     xxxxxxxxxx        9999.99     99999.99
xx      xx     xxxxxxxxxx        9999.99     99999.99

Poorly-Designed Screen

Name: xxxxxxxxxxxxx Address: xxxxxxxxxxxxx City: xxxxxxxxx St: xx Zip: xxxxx-xxxx Tax? Y Salesman: xxx Terms: xxxxxxxxxx Ship Via:xxxx Item: xx Qty: xx Description: xxxxxxxxxxxx Unit Price: 9999.99 Extension: 99999.99 Item: xx Qty: xx Description: xxxxxxxxxxxx Unit Price: 9999.99 Extension: 99999.99

FIGURE 1-11 Good versus Bad Screen Design

Application Responsiveness

In this book, application responsiveness is how long it takes the system to act on and respond to user actions. Responsiveness of an application reflects the fundamental design approach as batch-oriented, on-line, or real-time. Each of these approaches is defined in this section. Of course, in the real world, any combination or permutation of these approaches is used in building applications. Most applications designed in the 1990s are on-line with some batch processing. In the 21st century, on-line applications will give way to more real-time applications. Figure 1-12 shows the transition from batch to on-line to real-time processing in the last half of this century. Table 1-1 compares application responsiveness on several categories.

FIGURE 1-12 Application Type Transition [chart: share of newly developed applications, 0 to 100%, that are batch and on-line, by decade from 1950 to 2000]

Batch Applications

Batch applications are applications in which transactions are processed in groups. Transactions are gathered over time and stored together. At some predefined time, the batch is closed and processing on the complete batch is done. Transactions are processed in sequence one after the other. A system flow diagram of a typical batch application is shown in Figure 1-13. The batch of transactions is edited and applied to a master file to create a new master file and a printed log of processing. In batch applications the requirements relating to the average age and maximum possible age of the master file data determine the timing of processing.⁵ In addition to processing transactions, other programs in batch applications use the master file as their major input and process in a specific fixed sequence.
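The edit-then-update flow described above can be sketched as two Python functions, one per program in the batch. The record layout (an account key and an amount), the log wording, and the function names are assumptions for illustration; a real batch system would read and write files rather than in-memory structures.

```python
def edit(transactions):
    """Edit program: separate valid transactions from rejects."""
    valid, rejected = [], []
    for t in transactions:
        ok = bool(t.get("account")) and isinstance(t.get("amount"), (int, float))
        (valid if ok else rejected).append(t)
    return valid, rejected


def update_master(old_master, transactions):
    """Update program: apply the batch in key sequence.

    Returns a new master and a processing log; the old master is untouched.
    """
    new_master = dict(old_master)
    log = []
    for t in sorted(transactions, key=lambda t: t["account"]):
        if t["account"] not in new_master:
            log.append(f"REJECT {t['account']}: no such account")
            continue
        new_master[t["account"]] += t["amount"]
        log.append(f"UPDATE {t['account']}: {t['amount']:+}")
    return new_master, log


old_master = {"A1": 100, "B2": 50}
batch = [
    {"account": "B2", "amount": -20},
    {"account": "A1", "amount": 25},
    {"account": "", "amount": 5},     # fails the edit step
    {"account": "Z9", "amount": 1},   # passes edit, fails update
]

valid, rejected = edit(batch)
new_master, log = update_master(old_master, valid)
print(new_master)
print("\n".join(log))
```

The two steps communicate only through their outputs (the edited batch, then the new master), which mirrors the way batch programs pass work through files.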

5 See Davis, G. and Olson, M., Management Information Systems: Conceptual Foundations, Structure, and Development, New York: McGraw-Hill, 1985, for a detailed discussion of batch systems.

TABLE 1-1 Comparison of Application Technologies

Category                              Batch Applications    On-Line Applications          Real-Time Applications
Amount of data                        Large                 Small-Large                   Medium
Visual review of inputs               No                    Yes                           Yes
Ratio of updates to stored data       High                  Low-High                      High
Inquiry                               Batch                 On-line                       On-line
Reports                               Long, formal          Short, informal               Short, informal
Backup/Recover                        Copy files to tape    One or more of the following: One or more of the following:
                                                            copy files to tape,           copy files to tape,
                                                            transaction log,              transaction log,
                                                            preimage log,                 preimage log,
                                                            postimage log,                postimage log,
                                                            mirror image files            mirror image files
Cost to build*                        Low                   Medium                        High
Cost to operate*                      Low                   Medium-High                   High
Efficient use of                      Computer resources    People time                   People time
Difficulty to build*                  Simple                Medium                        Complex
Speed of processing all transactions  Fast                  Slow                          Medium
Speed of processing one transaction   Slow                  Medium                        Fast
Uses DBMS and data communications     May or may not        Probably                      Yes
Function integration                  Low                   Medium                        High

*Relative measure

On-Line Applications

On-line applications provide interactive processing with or without immediate file update. Interactive processing means there is a two-way dialogue between the user and the application that takes place during the processing. This definition of on-line differs somewhat from the use of on-line terminology in other texts which assume that on-line systems are also real-time (see the next section). In this text, on-line processing means that programs may be resident in memory and used sequentially by numerous transactions/events without reloading.

Figure 1-14 shows the difference between an on-line application and a batch application. In an on-line application, small modules perform the function and communicate directly via data passed between them. In the batch application, disjoint programs perform the function and indirectly communicate via permanent changes to file contents created by one program and interpreted by the next program(s). The on-line programs keep a log of transactions to provide recovery in case of error; this prevents re-entry of data.

On-line programs' dialogue with the user is to ensure entry of syntactically correct data. The error correction dialogue replaces the error portion of the update log. The remainder of the update log to document updates becomes optional and, instead, an acknowledgement of successful processing is displayed to the user.
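The on-line pattern described above (immediate error dialogue, a transaction log for recovery, and an acknowledgement instead of a printed update log) can be sketched as follows. The class name, message texts, and the replay-based recovery are assumptions made for this illustration.

```python
class OnLineSession:
    """One resident module serving many transactions without reloading."""

    def __init__(self, master):
        self.master = master
        self.transaction_log = []  # kept so data need not be re-entered

    def submit(self, account, amount_text):
        # Edit dialogue: reject bad input at once so the user can correct it.
        try:
            amount = float(amount_text)
        except ValueError:
            return "ERROR: amount must be a number; please re-enter"
        if account not in self.master:
            return "ERROR: unknown account; please re-enter"

        # Log first, then update, then acknowledge.
        self.transaction_log.append((account, amount))
        self.master[account] += amount
        return f"OK: {account} updated"


def recover(saved_master, transaction_log):
    """Rebuild current data from a saved copy plus the logged transactions."""
    restored = dict(saved_master)
    for account, amount in transaction_log:
        restored[account] += amount
    return restored


backup = {"A1": 100.0}
session = OnLineSession(dict(backup))
print(session.submit("A1", "abc"))
print(session.submit("A1", "25"))
print(recover(backup, session.transaction_log))
```

Only accepted transactions reach the log, so replaying it against the last backup reproduces the current state without asking users to key anything again.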


FIGURE 1-13 Batch Application System Flow Diagram [diagram: manual transaction data entry, then transaction edit program, then master update program]

FIGURE 1-14 On-Line Application System Flow Diagram [diagram: interactive data entry module, then update module, with a completion acknowledgment returned to the user]

Real-Time Applications

Real-time applications process transactions and/or events during the actual time that the related physical (real world) process takes place. The results of the computer operation are then available (in real time) to influence or control the physical process (see Figure 1-15). Changes resulting from a real-time process can be refreshed to users viewing prechange data when the change is completed. Real-time programs can process multiple transactions concurrently. In parallel processes, concurrency literally means that many transactions are being worked on at the same time. In sequential processes, concurrency means many transactions are in process but only one is actively executing at any one moment.

Database processing is more sophisticated in real-time systems. If an update to a data item takes place, all current users of the item may have their screens refreshed with the new data. Examples of real-time applications include automated teller machine (ATM), stock market ticker, and airline reservation processing.
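The refresh behavior described above, where every current user of a data item sees an update once it completes, resembles a simple publish-and-refresh arrangement. The sketch below is one possible shape for it; the class name `SharedItem`, its methods, and the seat-count example are hypothetical.

```python
class SharedItem:
    """A data item that refreshes every open view when it is updated."""

    def __init__(self, value):
        self.value = value
        self._viewers = []

    def open_view(self, refresh):
        """Register a viewer; `refresh` is called with each new value."""
        self._viewers.append(refresh)
        refresh(self.value)  # show the current (pre-change) data

    def update(self, new_value):
        """Update module followed by refresh module."""
        self.value = new_value
        for refresh in self._viewers:
            refresh(new_value)


# Two users viewing the number of seats left on the same flight.
screen_a, screen_b = [], []
seats = SharedItem(12)
seats.open_view(screen_a.append)
seats.open_view(screen_b.append)
seats.update(11)  # one reservation is made
print(screen_a, screen_b)
```

Each screen list records what that user was shown over time, so both end with the post-update value without having asked for it.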

Types of Applications

There are four types of business applications: transaction, data analysis, decision support, and expert applications. Today, all four types are usually on-line although the application may use any (or all) of the responsiveness types, even on a single application. In addition, a fifth type of application, embedded, is defined briefly to distinguish computer science-software engineering from IS-software engineering.

Transaction-Oriented Applications

Transaction-oriented applications, also known as transaction processing systems (TPS), support the day-to-day operation of a business and include order processing, inventory management, budgeting, purchasing, payables, accounting, receivables, payroll, and personnel. They are characterized as applications for which the requirements, the data, and the processing are generally known and well-structured.⁶ By known, we mean that the function is repetitious, familiar and unambiguous. By well-structured, we mean that the problem is able to be defined completely and without ambiguity. The requirements are identifiable by a development team.

6 An informative text on transaction processing systems is On-line Business Computer Applications, 2nd ed., by Alan Eliason. Chicago: Science Research Associates, Inc., 1987.

FIGURE 1-15 Real-Time Application System Flow Diagram [diagram: interactive data entry module passes an edited transaction to the update module; updated data goes to the refresh module, which sends updated data to all current users]

FIGURE 1-16 Order Processing Applications [diagram: maintain customers (customer file), maintain orders, create shipping papers/invoices]

* Maintain here includes add, change, delete, and query processing.

A transaction application example is order processing (see Figure 1-16). Order processing requires an order file, customer file, and inventory file. The contents of the files differ depending on the level of integration of order processing with accounts receivable, manufacturing, purchasing, and inventory processing. Processing of orders requires add, change, delete, and inquiry functions for all files, pricing of items, and creation of shipping papers and invoices. Inquiry functions should allow retrieval of information about orders by date, order number, customer ID, or customer name. The software engineer uses his or her understanding of general order processing to customize the application for a given organization and implementation environment.

EFFECTIVE INSURANCE COMPANY

In the early 1980s, Effective Insurance realized they were generating 22 feet of paper each month in accounting reports that were sent to about 80 different parts of the organization. Yet, for all this paper, the number of legitimate requests for access to data was mushrooming and had reached about 200.

Rather than try to produce reports for each specific user, the company decided to automate the information and allow users to access their own data to generate their own reports. That way, paper could be reduced and each person would have the data they wanted, formatted the way they wanted it.

The company never anticipated the immense savings in time, money, and, more importantly, the increases in productivity and morale, that this move would produce. By 1989, there were over 2,000 users accessing some or all of the accounting information. Each user had his own terminal and the use of a fourth-generation language* to generate customized information interactively. Reports were created by each user as needed.

* A fourth-generation language is one in which a query language, statistical routines, and database are integrated for application development by both IS and non-IS professionals.

Data Analysis Applications

Data analysis applications support problem solving using data that are accessible only in read-only mode. Data analysis applications are also known as query applications. A query is some question asked of the data. SQL, the standard query language for database access, poses questions in such a way that the user asks what is desired but need not know how to get it. The computer software figures out the optimal access and processing methods, and performs the operations it selects. An example of a query asking for the sum of all sales for customers in New York State for the first yearly quarter might look like the following:

SELECT CUST_NAME, CUST_ID, SUM(CUST_SALES)
FROM CUSTOMER
WHERE CUST_STATE = 'NY' AND MONTH IN (1, 2, 3)
GROUP BY CUST_NAME, CUST_ID;

A language such as SQL is a declarative language because you 'declare' what to do, not how to do it. Declarative languages are relatively easy to learn and use, and are designed for use by people who are not information systems professionals.
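The declarative idea can be tried out with a small sketch using Python's built-in sqlite3 module. The table layout, column types, and sample rows below are assumptions made for illustration and do not come from the text. A GROUP BY clause is included so that the sum is reported per customer, which standard SQL requires when customer names are selected alongside an aggregate.

```python
import sqlite3

def first_quarter_ny_sales(rows):
    """Load hypothetical CUSTOMER rows and total first-quarter NY sales per customer."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE CUSTOMER ("
        "CUST_ID INTEGER, CUST_NAME TEXT, CUST_STATE TEXT, "
        "MONTH INTEGER, CUST_SALES REAL)"
    )
    conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?, ?, ?, ?)", rows)
    # The query states what is wanted; SQLite decides how to retrieve it.
    result = conn.execute(
        "SELECT CUST_NAME, CUST_ID, SUM(CUST_SALES) "
        "FROM CUSTOMER "
        "WHERE CUST_STATE = 'NY' AND MONTH IN (1, 2, 3) "
        "GROUP BY CUST_NAME, CUST_ID "
        "ORDER BY CUST_ID"
    ).fetchall()
    conn.close()
    return result

# Hypothetical sample data: (CUST_ID, CUST_NAME, CUST_STATE, MONTH, CUST_SALES)
sample_rows = [
    (1, "Acme", "NY", 1, 100.0),
    (1, "Acme", "NY", 2, 50.0),
    (1, "Acme", "NY", 4, 999.0),   # second quarter, excluded by the WHERE clause
    (2, "Baker", "NJ", 1, 75.0),   # not New York, excluded by the WHERE clause
    (3, "Cole", "NY", 3, 20.0),
]
totals = first_quarter_ny_sales(sample_rows)
print(totals)
```

Nothing in the Python code says how to scan or index the table; the access method is left entirely to the database software.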

Queries are one of three varieties:

1. Interactive, one-of-a-kind. These are assumed to be thrown away after use.

2. Stored and named for future modification and re-execution.

3. Stored and named for frequent unchanging execution.

The third type of query frequently replaces reports in transaction applications (see Example 1-3). The data for all query processing must be known in advance and tend to come from transaction applications. Query outputs may use program language formatting defaults (as in SQL), or may be formatted for formal visual presentation or fed into other software (e.g., graphical software) for summarizing.

Query applications support an evolving concept called the data warehouse, a storage scheme based on the notion that most data should be retained on-line for query access. A warehouse stores past versions of major database entries, transaction logs, and historical records.

Decision Support Applications

Decision support applications (DSS) seek to identify and solve problems. The difference between decision support and query applications is that query applications are used by professionals and managers to select and summarize historical data like the example above, while DSSs are used by professionals and managers to perform what-if analysis, identify trends, or perform mathematical/statistical analysis of data to solve unstructured problems. Data for DSSs usually are generated by transaction applications.

Unstructured problems are ones for which not all information is known, and if it is known, the users may not know all of the relationships between data. An example of a structured problem is to answer the question: "What is the cost of a 5% salary increase?" An example of an unstructured problem is "What product mix should we manufacture next year?" The difference between these two kinds of questions is that the structured problem requires one query of known data to develop an estimate, while the product mix question requires economic, competitive, historical, and product development information to develop an estimate. Because the information may not all be known, DSS development uses an iterative problem-solving approach, applying mathematical and statistical modeling to the decision process. Corrected and/or supplemental data are fed back into the modeling processes to refine the analysis.
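The structured salary question can be made concrete with a short Python sketch. The payroll figures and function names here are hypothetical. The first function answers the single structured question from known data; the second repeats the calculation over several candidate rates, which is a very simple form of what-if analysis.

```python
def salary_increase_cost(salaries, percent):
    """Structured question: extra annual cost of raising every salary by `percent`."""
    return sum(salaries) * percent / 100

def what_if(salaries, percents):
    """Repeat the calculation for several candidate percentages for comparison."""
    return {percent: salary_increase_cost(salaries, percent) for percent in percents}

salaries = [40_000, 52_000, 61_000]   # hypothetical payroll data
cost_of_5_percent = salary_increase_cost(salaries, 5)
scenarios = what_if(salaries, [3, 5, 7])
print(cost_of_5_percent)
print(scenarios)
```

The product mix question has no equivalent one-line calculation, because the inputs and their relationships are not fully known in advance. That is why DSS development iterates, feeding corrected data back into the model.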

Executive information systems (EIS) are a spin-off from DSS. EIS applications support executive decision making and provide automated environmental scanning capabilities. Top executives deal with future-oriented, partial, inaccurate, and ambiguous information. They scan the economy, industry, and organizational environments to identify and monitor key indicators of business activity that affect their organization. EIS integrate information from external information databases and internal applications to provide an automated scanning and modeling capability. The major difference between EIS and DSS, then, is the incompleteness, potential inaccuracy, and ambiguity of the data.

Group decision support systems (GDSS) are a special type of DSS application. GDSS provide a historical memory of the decision process in support of groups of decision makers who might be geographically dispersed. GDSS focus more on the group interaction processes, with little or no data modeling or statistical analysis. Data analysis software in GDSS tends to be less elaborate than DSS software, but may include a spreadsheet and routines to present summaries of participant votes on issues in either numerical or graphical formats. GDSS typically provide such functions as

1. Anonymous recording of ideas

2. Democratic selection of group leaders

3. Progressive rounds of discussion and voting to build group consensus

For all DSS, application development is more formal than for query applications, and less formal than for transaction applications. The development life cycle tends to be iterative with continuous identification of requirements. DSS software environments are sophisticated and typically include software tools for communications support, statistical modeling, knowledge-base maintenance, and decision process support.

Expert Systems

Expert systems applications (ES) are computer applications that automate the knowledge and reasoning capabilities of one or more experts in a specific domain. ESs analyze characteristics of a situation to give advice, recommend actions, or draw conclusions by following automated reasoning processes. The four major components of an ES are the knowledge acquisition subsystem, the knowledge base, the inference engine (or rule base, as it is sometimes called), and the explanation subsystem. Each of these components is briefly explained here.

The knowledge acquisition subsystem is the means by which the knowledge base is built. In general, the more knowledge, the 'smarter' the system can be. The knowledge acquisition subsystem must provide for initial loading of facts and heuristic rules of thumb, and be easy to use in adding knowledge to the knowledge base.

Frequently, we reason without knowing how we arrive at a solution. In fact, reflect on how you yourself think when analyzing a problem to develop an application. How do you decide what the processes are? You follow an elaborate, highly internalized process that is difficult to talk about. You are not alone in having this difficulty. Eliciting the information about reasoning processes from experts is a major difficulty in building effective ES applications.

The knowledge base is the codified automated version of the expert user's knowledge and the rules of thumb (also called heuristics) for applying that knowledge. Designing the knowledge base is as difficult as eliciting the information because no matter how it is designed, it will be limited by the software in which it is implemented. Therefore, special ES programming languages have been designed to allow the most flexibility in defining connections between pieces of information and the way the pieces are used in reasoning.

Just as people reason to develop a most probable outcome to a situation, ESs use reasoning and inference to develop multiple, probable outcomes for a given situation. Several solutions may be generated when there is incomplete information or partial reasoning. Probabilities of accuracy of the solution(s) are frequently developed to assist the human in judging the usefulness of a system-generated outcome. Ethical and moral issues may be more apparent in ESs than in the other application types. Example 1-4 describes an ethical dilemma relating to a medical ES.

MEDICAL ES ETHICAL DILEMMA

A doctor who is not a specialist in rare diseases sees a patient in the emergency room who appears to be in respiratory distress. After a preliminary exam, he consults with an expert system that diagnoses many diseases and recommends a course of treatment. The ES requests all of the symptoms from the doctor, who answers the questions to the best of his ability. The ES diagnoses the problem as advanced Legionnaires' disease with a probability of 80%. The ES suggests no other possible diseases. The doctor prescribes the ES's recommended treatment. The patient dies. On investigation, it turns out that the ES contains errors in its rules and that correct rules, applied to the exact same set of symptoms, would have led to a different diagnosis with a different treatment.

There are ethical issues in every aspect of this problem. Who is responsible for ES accuracy? Is the knowledge engineer who built the ES responsible for ensuring accuracy of information in the system? Or does his or her responsibility only mean translating the reasoning processes correctly? What is the responsibility of the "expert" who supplies the information in ensuring it is correctly entered into an ES to supply correct reasoning? If a medical ES contains information on thousands of diseases, is it even possible to test it completely? How is consistency of diagnoses checked? What happens when symptoms are entered in different sequences? Is the doctor who uses the ES suggestion ethical? There is no consensus on answers to these questions at present. The lack of consensus highlights the need for discussion of ethical issues in IT applications.

The last major component of an ES is the ability to explain its reasoning to the user. The explanation subsystem provides the ability to trace the ES's reasoning. Tracing is important so the user can learn from the experience of using the system, and so he or she may determine his or her degree of confidence in the ES's results.
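How a knowledge base, an inference step, and an explanation trace fit together can be shown with a toy sketch in Python. It is a minimal illustration, assuming a simple forward-chaining strategy and a hypothetical credit-checking rule set; real ES shells use richer rule languages and usually attach probabilities to conclusions.

```python
def infer(facts, rules):
    """Forward-chain over `rules` until no new fact can be derived.

    Each rule is (name, set_of_conditions, conclusion). Returns the final
    set of facts and a trace of the rules that fired, in firing order.
    """
    known = set(facts)
    trace = []
    changed = True
    while changed:
        changed = False
        for name, conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                trace.append((name, sorted(conditions), conclusion))
                changed = True
    return known, trace

def explain(trace):
    """Explanation step: turn the trace into readable lines for the user."""
    return [
        f"{name}: {' and '.join(conditions)} -> {conclusion}"
        for name, conditions, conclusion in trace
    ]

# Hypothetical knowledge base: rules of thumb for a credit check.
rules = [
    ("R1", {"late payments", "order over limit"}, "credit risk"),
    ("R2", {"credit risk"}, "require prepayment"),
]
facts = {"late payments", "order over limit"}
conclusions, trace = infer(facts, rules)
for line in explain(trace):
    print(line)
```

The trace is what lets a user see which rules produced a recommendation and judge how much confidence to place in it.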

These four application types (transaction, query, DSS, and ES) will be referenced throughout the text to tie topics together and to discuss the usefulness of methodologies, languages, and approaches to testing, quality assurance, and maintenance for each.

Embedded Systems

Embedded systems are applications that are part of a larger system. For example, a missile guidance application works in conjunction with sensors, explosives, and other equipment within a single missile unit. The application, by itself, is minor; its complexity derives from its analog interfaces, need for complete accuracy, and real-time properties within the missile's limited life span once it is released. Embedded applications development has been the province of computer-science-educated developers rather than information-systems-educated (IS) developers.

As business deploys ever more complex equipment in the context of computing environments, the need for embedded systems skills will increase. This implies that IS education must also address real-time, embedded system requirements, and that computer scientists will continue to move into business for application development.

Applications in Business

Applications are most successful when they match the organization's needs for information. Most information in organizations is generated to allow the managers to control the activities of the organization to reach the company's goals. Goals may be short-term or long-term. Control of activities implies information evaluation and decision making. There are three levels of organizational decision making: operational, managerial, and strategic. Each level has different information needs and, therefore, different application needs.

At the operational level, the organization requires information about the conduct of its business. Decisions deal with daily operations. For instance, the operational level in a retail organization is concerned with sales of products. The main operational level applications would be order processing, inventory control, and accounts receivable. In a manufacturing business, the operational level is concerned with sales and manufacturing. The main operational level applications would be manufacturing planning, manufacturing control, inventory management, order processing, and shipping.

The information at the operational level is current, accurate, detailed, available as generated, and relates to the business of the organization. Operational information is critical to the organization remaining in business. As a critical resource, the data requires careful management and maintenance. The types of applications that support operational level decisions and information are transaction processing applications (see Figure 1-17). Query applications for current operational data are other applications that support operational level decisions.

The information needs for managerial control are mostly internal information, can be detailed or summary, and should be accurate. Decisions made for managerial control concentrate on improving the existing ways of doing business and on finding and solving problems, and they take a medium-range (e.g., quarter or year) view of the company's business. The types of issues dealt with concern reduction of

• costs by comparing suppliers' prices
• the time to process a single order
• the errors in a process
• the number of manual interactions with an order, and so on

FIGURE 1-17 Application Types and Decision Types

The types of applications that support these data needs are data analysis applications, DSS, and GDSS (see Figure 1-17). Each of these application types serves a different role in supporting managerial control decision needs. Data analysis applications can be used to find and solve problems. DSSs can be used to identify trends, analyze critical relationships, or compare different work processes for possible improvements. GDSSs facilitate meetings of people with different motivations and organizational goals, providing a means to reach consensus with a frank discussion of the issues.

At the strategic level, the types of decisions take a broad view of the business and ask, for instance: What businesses should we be in? What products should we produce? How can we improve market share? These questions require external information from many sources to reach a decision. The information is ambiguous, that is, able to be interpreted in many different ways. Because the information is future-oriented, it is likely to be incomplete and only able to be digested at a summary level.

The types of applications that support incomplete, ambiguous, external information needs best are executive information systems (EIS) (see Figure 1-17). EISs are specifically designed to accommodate incomplete, ambiguous information. GDSSs also might be used at the executive level to facilitate discussion of alternative courses for the organization.
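The pairing of decision levels with application types described in this section can be summarized as a small lookup table. The Python sketch below simply restates the text; the dictionary and function names are hypothetical.

```python
# Application types that support each level of organizational decision making,
# as described in the text for the operational, managerial, and strategic levels.
APPLICATIONS_BY_LEVEL = {
    "operational": ["transaction processing", "query"],
    "managerial": ["data analysis", "DSS", "GDSS"],
    "strategic": ["EIS", "GDSS"],
}

def levels_served_by(app_type):
    """Return the decision levels that a given application type supports."""
    return [
        level
        for level, app_types in APPLICATIONS_BY_LEVEL.items()
        if app_type in app_types
    ]

print(levels_served_by("GDSS"))
print(levels_served_by("EIS"))
```

Looking up GDSS shows that one application type can serve more than one level, here both managerial and strategic decision making.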

PROJECT LIFE CYCLES

There are several different ways to divide the work that takes place in the development of an application. The work breakdown in general comprises the project's life cycle. If you asked five SEs to describe the life cycle of a typical computer application, you would get five overlapping but different answers. Life cycles discussed here are the most common ones: sequential, iterative, and learn-as-you-go.7

7 Future developments in life cycles are discussed in Chapter 18.


Sequential Project Life Cycle

You should remember from systems analysis that a sequential project life cycle (SPLC) starts when a software product is conceived and ends when the product is no longer in use. Phases in an SPLC include

• initiation
• problem definition
• feasibility
• requirements analysis
• conceptual design
• design
• code/unit test
• testing
• installation and checkout
• operations and maintenance
• retirement

These SPLC phases are more appropriate to business than to military/government applications because, in the government, the first four phases (initiation, definition, feasibility, and functional requirements definition) are usually completed by a different organization than that of the implementers. Government projects are subject to congressional review, approval, and budgeting. So, a government project requiring congressional appropriation is usually defined as beginning at the conceptual design phase and ending with deployment of the system with operational status according to Department of Defense standard #2167a [DOD, 1985]. In contrast, business IS are typically initiated by a user department requesting that a system be built by an MIS department. The need for an IS is typically motivated by some business situation: a change in the method of business, in the legal environment, in the staffing/support environment, or in a strategic goal such as improving market competitiveness.

We call these SPLC phases a 'Waterfall' approach to applications because the output of each phase feeds into the next phase, while phases are modified via feedback produced during the verification and validation processes8 (see Figure 1-18).

8 Boehm, Barry W., Software Engineering Economics. Englewood Cliffs, NJ: Prentice-Hall, 1981.


FIGURE 1-18 Sequential Project Life-Cycle Model (legible stages: Initiation, Feasibility Analysis, Design, Unit Test, Test, Implement, Operate and Maintain, Retire)

Phases in the waterfall definition are defined as discrete even though, in practice, the information is obtained in a nonlinear manner and the phase beginnings and endings are difficult to distinguish. To identify discrete beginnings and endings, most companies use the completion of the major product (i.e., program or document) produced during each phase as signaling the phase end. So, completion of a feasibility report, for instance, identifies the end of the feasibility analysis phase. In the following subsections, each phase of the project life cycle (SPLC) is defined,9 with the main activities and documents identified.

9 This definition is adapted from work conducted during The Assessment and Development of Software Engineering Tools project sponsored by the U.S. Army Institute for Research in Management Information, Communications, and Computer Sciences (AIRMICS), contract DAKF11-89-C-0014.
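The convention of using a completed product to mark the end of a phase can be sketched as a simple lookup in Python. The phase names follow the SPLC discussion; apart from the feasibility report, the deliverable names are hypothetical placeholders, and a real organization would substitute its own.

```python
# Ordered (phase, major product) pairs. Only the feasibility report is named
# in the text; the other deliverable names are illustrative assumptions.
SPLC_DELIVERABLES = [
    ("initiation", "project request memo"),
    ("feasibility", "feasibility report"),
    ("analysis", "functional requirements document"),
    ("design", "design document"),
    ("code/unit test", "unit-tested code"),
    ("testing", "test results"),
    ("installation and checkout", "operational application"),
]

def current_phase(completed_products):
    """Return the first phase whose major product is not yet complete."""
    for phase, product in SPLC_DELIVERABLES:
        if product not in completed_products:
            return phase
    return "operations and maintenance"

done = {"project request memo", "feasibility report"}
print(current_phase(done))
```

With the request memo and feasibility report complete, the project is treated as being in analysis, even though in practice some analysis work may already have started.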

SPLC Phases

INITIATION. Project initiation is the period of time during which the need for an application is identified and the problem is sufficiently defined to assemble a team to begin problem evaluation. The people and organizations affected by the application, that is, the stakeholders, are identified. Participants from each stakeholder organization for the development team are solicited. The outcome of initiation is a memo or formal document requesting automation support and defining the problem and participants.

FEASIBILITY. Feasibility is the analysis of risks, costs and benefits relating to economics, technology, and user organizations. The problem to be automated is analyzed in sufficient detail to ensure that all aspects of feasibility are evaluated.

Economic feasibility analysis elaborates costs of special hardware, software, personnel, office space, and so forth for each implementation alternative.

In technical feasibility analysis, alternatives for hardware, software, and general design approach are determined to be available, appropriate, and functional. The benefits and risks of alternatives are identified.

Organizational feasibility is an analysis of both the developing and using organizations' readiness for the application. Particular emphasis is placed on skills and training needed in both groups to ensure successful development and use of the application. The decision whether or not to use consultants and the type of role they would play during development is made during organizational feasibility analysis. Organizational decisions include effectiveness of the organization structure and definition of roles of individual jobs in the organization as they will be with the new application.

The feasibility report summarizes

• the problem
• the economic, technical, and organizational feasibility
• risks and contingency plans related to the application
• preferred concept for the software product and an explanation of its superiority to alternative concepts
• training needs and tentative schedules
• estimates of project staffing by phase and level of expertise

After feasibility is established, the Software Development Life Cycle (SDLC), a subcycle of the SPLC, begins. This subcycle typically includes phases for analysis, conceptual design, design, implementation, testing, and installation and checkout. SDLC end is signaled by delivery of an operational application.

ANALYSIS. The analysis phase has many synonyms: Functional Analysis, Requirements Definition, and Software Requirements Analysis. All of these names represent the time during which the business requirements for a software product are defined and documented. Analysis activities define

1. Functional requirements: "what" the system is supposed to do. The format of the functional requirements definitions depends on the methodology followed during the analysis phase.

2. Performance requirements: terminal, message, or network response time, input/output volumes, process timing requirements (e.g., reports must be available by 10 A.M.).

3. Interface(s) requirements: what data come from and go to other using applications and organizations. The definition includes timing, media, and format of exchanged data.

4. Design requirements: information learned during analysis that may impact design activities. Examples of design requirements are data storage, hardware, testing constraints, conversion requirements, and human-machine interaction requirements (e.g., the application must use pull-down menus).

5. Development standards: the form, format, timing, and general contents of documentation to be produced during the development. Development standards include rules about allowable graphical representations, documentation, tools, techniques, and aids such as computer-aided software engineering (CASE) tools, or project management scheduling software. Format information includes the content of a data dictionary/repository for design objects, project report contents, and other standards to be followed by the project team when reporting project accomplishments, problems, status, and design.

6. The plan for application development is refined.

Analysis documentation summarizes the current method of work, details the proposed system, and explains how the proposed system meets the needs of the required functions. Requirements from the work activities are described in graphics, text, tables, structured English, or some other representation form prescribed by the methodology being used.

CONCEPTUAL DESIGN. Once the proposed logical system is understood and agreed to by the user, conceptual design begins. Other names for conceptual design activity include preliminary design, logical design, external design, or software requirements specifications. The major activity of conceptual design is the detailed functional definition of all external elements of the application, including screens, reports, data entry messages, and/or forms. Both contents and layout are included at this level. In addition, the logical data model is transformed into a logical database schema and user views. If distribution or decentralization of the database is anticipated, the analysis and decision are made during conceptual design. The outputs of conceptual design include the detailed definition of the external items described above, plus the normalized and optimized logical database schema.

Not all organizations treat conceptual design separately. Outputs of conceptual design may be in a conceptual design document or might be part of the functional requirements document developed during analysis. Depending on the project manager's personal taste and experience, the conceptual design might be partially completed during logical design and fully completed during physical design. In this text, the two phases, design and conceptual design, are treated as one.

DESIGN. Design maps "what" the system is supposed to do into "how" the system will do it in a particular hardware/software configuration.10 The other terms used to describe design activities include detailed design, physical design, internal design, and/or product design.

During the design phase, the software engineering team creates, documents, and verifies:

1. Software architecture: identifies and defines programs, modules, functions, rules, objects, and their relationships. The exact nature of the software architecture depends on the methodology followed during the design phase.

2. Software components and modules: defines detailed contents and functions of software components, including, but not limited to, inputs, outputs, screens, reports, data, files, constraints, and processes.

3. Interfaces: details contents, timing, responsibilities, and design of data exchanged with other applications or organizations.

4. Testing: defines the strategy, responsibilities, and timing for each type of testing to be performed.

5. Data: physically maps "what" to "how" for data. In database terms, this is the definition of the physical layout of data on the devices used, and of the requirements, timing, and responsibility for distribution, replication, and/or duplication of data.

SUBSYSTEM/PROGRAM DESIGN. Subsystem and/or program designs are sometimes treated as subphases of the design phase. Whether they are separate phases or not, the software engineering team creates, documents, and verifies the following:

1. Application control structure: defines how each program/module is activated and where it returns upon completion.

2. Data structure and physical implementation scheme: defines physical data layouts with device mapping and data access methods to be used. In a database environment, this activity may include definition of a centralized library of data definitions, calling routines, and buffer definitions for use with a particular DBMS.

3. Sizing: defines any programs and buffers which are expected to be memory-resident for on-line and/or real-time processes.

4. Key algorithms: specifies mathematically correct notation to allow independent verification of formula accuracy.

5. Program component (routine with approximately 100 source procedure instructions): identifies, names, and lists assumptions of program component design and usage. Assumptions include expectations of, for instance, resident routines and/or data, other routines/modules to be called in the course of processing this module, size of queues, buffers, and so on required for processing.

10 Anyone who has designed a system will tell you that you cannot perform the conceptual design without some knowledge and attention to the implementation environment. So, the "what" and "how" distinctions are generally, but not completely, accurate when described as discrete activities.

CODE AND UNIT TEST. During coding, the low-level program elements of the software product are created from design documentation and debugged. Unit testing is the verification that the program does what it is supposed to do and nothing more. In systems using reusable code, the code is customized for the current application, and checked to ensure that it works accurately in the current environment.
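A unit test can be sketched with Python's standard unittest module. The unit under test, a function that prices one order line in whole cents, is hypothetical and is defined here only so the example is self-contained. One test checks that the unit does what it is supposed to do; the other checks that it refuses input it is not supposed to accept.

```python
import unittest

def extended_price(quantity, unit_price_cents):
    """Hypothetical unit under test: price one order line in whole cents."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    return quantity * unit_price_cents

class ExtendedPriceTest(unittest.TestCase):
    def test_prices_a_normal_line(self):
        # The unit does what it is supposed to do.
        self.assertEqual(extended_price(3, 1999), 5997)

    def test_rejects_nonpositive_quantity(self):
        # The unit does nothing more: invalid input is refused, not priced.
        with self.assertRaises(ValueError):
            extended_price(0, 1999)

def run_unit_tests():
    """Load and run the test case, returning the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExtendedPriceTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)

result = run_unit_tests()
```

Unit tests of this kind exercise a single program element in isolation; correctness of the components working together is the concern of the later testing phase.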

TEST. During testing, sometimes called Computer Software Component (CSC) Integration and Testing,11 the components of a software product are evaluated for correctness of integrated processing. Quality assurance testing may be conducted in the testing phase or may be treated as a separate activity. During quality assurance tests, the software product (i.e., software or documentation) is evaluated by a nonmember of the project team to determine whether or not the analysis requirements are satisfied.

11 This is a term used by DOD standard #2167a, 1985.


IMPLEMENTATION. Also called Installation and Checkout, implementation is that period of time during which a software product is integrated into its operational environment and is phased into production use. Implementation includes the completion of data conversion, installation, and training.

At this point in the project life cycle, the software development cycle ends, and the maintenance phase begins. Maintenance and operations continue until the project is retired.

OPERATIONS AND MAINTENANCE. Operations and maintenance is the period in the software life cycle during which a software product is employed in its operational environment, monitored for satisfactory performance, and modified as necessary. Three types of maintenance12 are

1. Perfective: to improve the performance of the application (e.g., make all table indexes binary to minimize translations, change an algorithm to make the software run faster, and so on)

2. Corrective: to remove software defects (i.e., to fix bugs)

3. Adaptive: to incorporate any changes in the business or related laws in the system (e.g., changes for new IRS rules)

Each type of maintenance requires a mini-analysis and mini-design to determine social, technical, and functional aspects of the change. The current operational versions of software and documentation must be managed to allow identification of errors and to ensure that the correct copy of software is run. One aspect of change management specifically addresses configuration management of application programs in support of maintenance activities.

RETIREMENT. Retirement is the period of time in the software life cycle during which support for a software product is terminated. Usually, the functions performed by the product are transferred to a successor system. Another name for this activity is phaseout.

12 A detailed discussion of maintenance topics is presented in Lientz and Swanson, 1980.


UNIVERSAL ACTIVITIES. There are two universal activities which are performed during each life-cycle phase: verification and validation, and configuration management.

An integral part of each life-cycle phase is the verification and validation that the phase products satisfy their objectives. Verification establishes the correctness of correspondence between a software product and its specification. Validation establishes the fitness or quality of a software product for its operational purpose.

For instance, the individual code module specifications from design are verified to ensure that they contain accurate and complete information about the functions they perform. The modules are validated against the analysis phase specification to ensure that all required functions have corresponding designs that accurately reflect the requirements.

Configuration management refers to the management of change after an application is operational. A designated project librarian maintains the official version of each product. The project librarian is able at any time to provide a definitive version (or baseline) of a document or software module. These baselines allow the project manager to control both the software maintenance process and the software products.
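The librarian's role can be pictured as a small registry that always knows the current official version of each product. The sketch below is one hypothetical way to model that idea in Python; the class `ProjectLibrary`, its method names, and the sample product are assumptions made for illustration, not a description of any particular configuration management tool.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectLibrary:
    """Hypothetical librarian's record of official versions (baselines)."""
    _versions: dict = field(default_factory=dict)  # product name -> list of contents

    def check_in(self, product: str, content: str) -> int:
        """Record a new official version and return its version number."""
        history = self._versions.setdefault(product, [])
        history.append(content)
        return len(history)

    def baseline(self, product: str) -> tuple:
        """Return (version number, content) of the current definitive version."""
        history = self._versions[product]
        return len(history), history[-1]


library = ProjectLibrary()
library.check_in("order_entry_spec", "Draft approved at design review")
library.check_in("order_entry_spec", "Revised after change request")
version, content = library.baseline("order_entry_spec")
print(version, content)
```

Because every change passes through `check_in`, earlier versions stay on record and the latest one can be produced on demand, which is the control the paragraph above attributes to baselines.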

History

The sequential life cycle was originally developed and documented in the 1960s to provide defense contractors a life-cycle documentation standard for Department of Defense (DOD) projects. The current DOD Standard #2167a lists all activities and details all documentation required for software development as fulfillment of military contracts. As industry recognized that their own application development projects were out of control, over budget, and unsatisfactory when complete, they modified the standard to eliminate defense/aerospace terminology and replace it with industry specific terminology. Organizations modified the standard to incorporate elements of methodologies, such as structured development, data flow diagrams, and walk-throughs, that were becoming known at the same time. In the late 1960s and early 1970s the waterfall and 2167 documentation standard were used throughout most Fortune 500 companies as cast-in-concrete requirements for building and documenting systems.

Problems

Because the documentation requirements were nonnegotiable, projects frequently produced thousands of pages of documentation that no one except the authors ever read. Information about applications was rarely in any one person's head, and communication overhead became a major problem to completing systems successfully. User/management approval to continue with each phase was based not on their knowledge of what the system would do, but on some other criteria. Published studies showed that the typical written application requirements document contained, on average, one-half to one error per page. The conclusion that paper prose is not a good medium for conveying the complex variety of application requirements led to the development of more graphical representation forms.

Eventually, IS managers realized that the waterfall, when applied too stringently, not only did not solve the problems of bad systems, it contributed to a new generation of overdocumented bad systems. The result has been a scaling back on required documentation. Standards have become 'guidelines' for experienced project managers to consider and to provide new project managers with review lists of activities whose relevance they should consider. Each project team customizes the documentation and development activities in addition to the tools and techniques they use.

Even with relaxation of required documentation, a sequential life cycle does not recognize the iterative, nonlinear nature of application development, and cannot easily accommodate overlap of phases. Many organizations now use a variant of the waterfall by performing the activities in an overlapped manner, sometimes called the 'pipeline' approach. Finally, the waterfall approach does not recognize that the level of detail necessary to adequately document application functions is significantly different with the use of automated tools, use of diagrams (e.g., DFDs) to replace text, and use of high level, fourth-generation languages (e.g., SQL).

Current Use

The sequential life cycle is still used but rarely in full detail, and mostly for transaction applications. The sequential life cycle and its terminology will be around for many decades to come, but two divergent trends will occur. On the one hand, demarcation of phases will be more relaxed. Three trends leading to phase relaxation are:

• increasing maturity of computer-aided software engineering (CASE) tools

• increasing use of high-level languages

• availability of reusable electronically stored application information

On the other hand, further alteration and customization to accommodate the need for more detail in systems using new, more complex, or otherwise novel hardware/software components will also take place. It is from these novel, groundbreaking applications that our industry frequently develops new techniques to better communicate application characteristics.

Iterative Project Life Cycle

Iterative PLC Description

An iterative project life cycle is a cyclic repetition of analysis and design events. Iterative PLC is sometimes called prototyping or a spiral approach to development.

Prototyping is the development of a system or system component in a short period of time without formal written specifications. Originally thought of as helpful for proving the usefulness of new technologies, prototyping caught on in the early 1970s as a way to circumvent the overload of documentation from the sequential life cycle. Frequently, prototyping was wrongfully used and led to bad systems. But, as experience with prototyping has grown, there are three specific uses for which prototyping can be very beneficial:

1. Complete iterative development of an application when requirements are not well understood, e.g., DSS

2. Proof of utility, availability, or appropriateness for technology, software, or hardware

3. Rapid development of part of the system to ease a critical work situation for users, e.g., order entry without edit/validation to ease paper backlog

Some authors describe a completely different life cycle for prototyped applications. The notion that the life cycle is completely different is not entirely correct. The life cycle depends on the nature of the prototype. If a complete application is built, then the model of the life cycle mirrors that of the waterfall with iteration between analysis-design-programming-testing-implementation as requirements become known (see Figure 1-19). The difference is the level of detail to which analysis and design are performed. Requirements of iteratively-developed applications are generally not well known or understood. They might be ambiguous or incomplete for some time. The prototype provides a base from which users and developers together discover the requirements for the application.

One use of prototyping tests proof of utility, availability, or appropriateness of the hardware, software, or design concept. The prototype development process is a subphase of development that may parallel either feasibility, analysis, or design. There is no significant testing of a 'proof' prototype because it is being used to verify that an activity can be automated in a certain way, or that hardware (or software) can be used as planned. An example of a proof-of-concept prototype is shown in Figure 1-20 as taking place at the same time as the feasibility study. By the end of feasibility analysis, the usefulness of the prototype is decided, and the feasibility report recommends that the tested product (or idea) be abandoned or used.

A third type of prototype is a partial application developed as a stopgap measure for a particular problem until the complete system is available. A partial prototype might be built with its complete life cycle paralleling one phase of the development life cycle as shown in Figure 1-21. The phases of the prototype development cycle mirror those of a normal development life cycle; they differ in that only a small portion of the entire application is developed. These prototypes can omit processing details. For instance, an on-line data entry might not fully validate data. Feedback to the design team would detail what is and is not in the prototype so that its design and development are completed during the regular application development.

FIGURE 1-19 Full System Prototype Life Cycle (figure not reproduced; box labels: Analyze and Gather Requirements; Design; Build Prototype; Evaluate and Refine Requirements; Engineer product to ensure complete documentation development)

Problems

There are two major problems with prototyping: misuse to circumvent proper analysis and design, and never completing prototypes as proper applications. Prototyping has been used as one way to circumvent rigidities in the sequential life cycle when it is treated as a set of nonnegotiable activities. In this misuse, some authors refer to 'quick analysis' and 'quick design' as if less work is done during those phases. In fact, if done properly, the activities and work are identical to those done in the life cycle, and the effort normally placed on documentation is diverted to building software.

The other major problem with development of a prototype is that the system might never get formalized. Details of processing, for instance, data validation and audit requirements, might be forgotten in the push to get a working prototype into production. While this problem is easily solved, it requires user and management commitment to a completed project. The problems with ensuring this commitment are political, not technical.

Current Use

Although still misused in the development of undocumented, incomplete applications, prototyping for the above intended purposes is also alive and healthy. All forms of query and DSS applications are candidates for iterative life cycles. Some products (such as Focus, Rbase, Oracle) have easy-to-learn, short, very high-level programming languages that are naturally amenable to prototyping. A database can be defined, populated with data, and queried in under an hour to show capabilities of the languages or discuss requirements of a system. This kind of prototyping builds morale in the IS staff and confidence in the users, and is a great selling tool for in-house application development.

FIGURE 1-20 Proof of Concept Prototype Application Development Activities (figure not reproduced; labels: Feasibility; Analysis; Prototype Activities; Evaluate and Refine Requirements; Unit Test; Implement)
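To give a feel for how little is needed to define, populate, and query a database for a prototype, here is a short sketch. It swaps in Python's built-in `sqlite3` module and SQL in place of the products named in the text (Focus, Rbase, Oracle); the `orders` table and its rows are made up for illustration.

```python
import sqlite3

# In-memory database: nothing is written to disk, which suits a throwaway prototype.
conn = sqlite3.connect(":memory:")

# Define the database.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)

# Populate it with a few made-up rows (illustrative data only).
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("Acme", 120.0), ("Baker", 75.5), ("Acme", 30.0)],
)

# Query it: total order amount per customer.
rows = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(rows)
conn.close()
```

A demonstration like this can be put in front of users to discuss requirements, but it deliberately omits validation, auditing, and documentation, which is exactly what must be added before such a prototype becomes a production application.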

Future Use

Prototyping is appropriate to validate designs, to prove use of new hardware and/or software, or to quickly assist users while building a larger application. For these uses, prototyping is expected to be employed with increasing use of high-level languages to facilitate prototype development. Though there are few automated prototyping tools that also interface to CASE for full application definition, more such integrated tools are becoming available.

Learn-as-You-Go Project Life Cycle

Learn-as-You-Go PLC Description

With all the good news about developments in life cycles, there is a disturbing statistic that about 75% of all companies in the United States do not use any life cycle and/or methodology to guide their development work.13 The title learn-as-you-go could equally well be trial-and-error, or individual problem solving. The life cycle for the no-life-cycle approach is shown in Figure 1-22, which shows a generic life cycle. The problem is defined. The SE develops the application, which enters operation and maintenance. This approach is not suited to group work, so projects are limited to one person developing small applications. There are two different types of development groups that are in this category: developers of truly unique applications, and developers who do not want too much control or structure in their work.

13 Necco, Charles R., Carl L. Gordon, and Nancy W. Tsai, "Systems analysis and design: Current practices," MIS Quarterly, December 1987, pp. 461-476.

FIGURE 1-21 Partial System Prototype (figure not reproduced; labels: Feasibility; Design; Program/Unit Test; Implement Prototype)

The first developer view, that the problem is unique and cannot easily be molded into a formal life cycle because of its nature, is appropriate to applications using emerging technologies and techniques, such as expert systems and artificial intelligence. There is no life cycle that describes building of expert systems, although with a feedback loop between maintenance and definition to indicate iteration, Figure 1-22 is appropriate to these systems. There is no methodology of knowledge engineering; rather, there are several techniques that one might use depending on the nature of the expertise, the personality of the expert, and the complexity of the problem domain. This life-cycle approach is appropriate for such emerging application domains as long as it is a disciplined experimentation loop that includes feedback and documentation.

The second view is that all problems are unique and, if understood, do not require significant modeling, documentation, or sequencing of the analysis and development events. Since each problem is unique, there is no point in trying to repeat the analysis and design experience. Development is viewed as a creative activity that should be unconstrained. There should be no formal analysis, design, programming, or testing, even though each of these activities must be performed during the process. This approach denies the need for professional SEs or a profession of software engineering. In fact, it is frequently a cover for ignorance, or an excuse for laziness. This is a hacker's view of the world that is not appropriate to business organizations.

FIGURE 1-22 Generic View of Life-Cycle Development (figure not reproduced; labels: Define Problem; Develop; Maintain)

Problems

If building small systems (e.g., less than 2,000 lines of code in a 3GL, like Cobol, or less than 400 lines of code in a 4GL), the developers, managers, and users may not have problems. Many financial analysis models and small systems in brokerage firms are developed using no life cycle and no methodology. But anything other than a small application is unlikely to perform exactly as desired, may not be completely tested when placed into production, and cannot be integrated easily into existing applications.

A less obvious problem is that this technique relies on individual problem-solving capabilities and knowledge. Studies by IBM and others show individual programmer differences of as much as 16 times in productivity and more than that in accuracy. If the firm using this technique has only the best, top 5% of programmers on its staff, there is little risk. But how many firms actually have these people?

The view that we do not need a disciplined approach to developing applications implies that just anyone can design and build good applications. Yet daily we hear of users who have built complex spreadsheet DSSs only to leave a company with no documentation and no procedures for the next user. We also hear of users (and, regrettably, people with the title software engineer) who are leaders of projects that are canceled after spending millions of dollars, because the pieces just do not work together. For each type of application, there is a price with this view: DSSs without an architecture cannot be extended; ESs without a plan are unreliable and unmaintainable; TPSs without architectures and plans can only ever support one small piece of business; integration across subject data areas is impossible. Even though ES and AI problem solving both use the learn-as-you-go technique, both require a different kind of discipline and rigor.

Current Use

As related above, about 75% of all companies in the United States do not use any life cycle or methodology to guide their application development work. With this statistic, it is no wonder that most applications do not perform as intended, are delivered late, overrun the budget, and have unsatisfied users.

Future Use

For emerging technologies, techniques, or conceptualizations of applications, this approach is an effective way to nurture development of a field of knowledge. For these uses, it will remain. Unfortunately, it will also remain for companies who believe that discipline and order cost too much, and who will continue to suffer the risks involved with relying solely on one person's skill and integrity.

In summary, life cycles define a global breakdown of activities in the life of an application. No life cycle prescribes how to actually do the work within the phases of a PLC. For that definition, we turn to methodologies.

METHODOLOGIES

Methodologies are procedures, techniques, and processes used to direct the activities of each phase of a software life cycle. There are five classes of methodologies: process, data, object, semantic, and none. Each has its own unique view of an application that relates to its historical context, its own shortcomings, problems, and futures. In this section, a brief overview of the classes of methodologies is given with a general list of documents produced by the analysis phase, problems with the methodology, and short analysis of the methodology's current and future use. Much of this material should be review. If it is not review, don't panic. Use this material to learn the terminology for discussing the methods in detail later.

In addition to the methodologies prescribing how to do an analysis and design, a special class of methods advises how to bring users into the process. That class, sometimes called social methodologies, is the last part of this section.

Process Methodology

History

Process methodologies take a structured, top-down approach to evaluating problem processes and the data flows with which they are connected. Process methods developed during the 1970s in response to increasing complexity of application processing, increased complexity of operating system environments (e.g., the IBM 360 generation of hardware), and the introduction of disk file processing with sequential, indexed, and direct access methods. The documentation produced by the process approach14 includes, for example, context diagrams, data flow diagrams, data store definitions, and structured English process descriptions. In the course of a complete application development, many other types of analysis and design documentation are developed. These additional documents are discussed in the chapters on analysis and design.

Current Use

Individual techniques such as context and data flow diagrams are widely used and also supported in CASE environments. Other techniques have been replaced or augmented by newer methods. For example, paper-based data dictionaries have been replaced by CASE repositories or active data dictionaries, and file design has been augmented by normalization, entity relationship diagramming, and so on.

Future Use

Process methods as attributed to DeMarco and others will fade as a distinguishable methodology, with context diagrams and DFDs melded into a collection of techniques that will be used to support methodology customization.

Data Methodology

History

Data methodologies begin analysis activities by first evaluating data and their relationships to determine the underlying data architecture. When the data architecture is defined, outputs are mapped onto inputs to determine processing requirements. The most used data methodology is information engineering (IE) which was described by Finkelstein and Martin.15 Documentation produced by the data approach discussed in this text is that of information engineering.

14 The architects of process methods were Yourdon and Constantine, 1978; DeMarco, 1979; Gane and Sarson, 1979.

15 See Martin, James, Information Engineering, Book 1: Introduction, Book 2: Planning and Analysis, Book 3: Design and Implementation, Englewood Cliffs, NJ: Prentice-Hall, 1990; and Finkelstein, C., Information Engineering, 1991.

As the use of DBMS software became pervasive during the late 1970s and early 1980s, software engineers recognized a need for improved ways of designing data structures. Many methodologies were developed that concentrated strictly on the data aspects of applications with the processing added as an afterthought [cf. Warnier, 1981]. As an attempt to address the entire application development life cycle, Martin and Finkelstein borrowed techniques, packaged them in a new methodology, and integrated them to provide the first 'womb to tomb' methodology. Information engineering, the resulting methodology, begins with enterprise level analysis and proceeds through identification of applications and individual project life cycles. The methodology was not the work of one person; rather it integrates concepts that were thought of as the best at the time including entity-relationship modeling, normalization and other techniques relating to DB design. The enterprise level techniques are adapted and widely used in organizational reengineering.

Examples of analysis documentation developed using information engineering include entity relationship diagrams (ERD), entity hierarchy diagrams, process dependency diagrams, process hierarchy diagrams, and third normal form logical database definition.

Current Use

Information engineering is gaining acceptance in some of the largest U.S. corporations (e.g., Mobil, Texaco) and is used in Australia (where Finkelstein lives) but is not widely used otherwise. Other 'data' methods enjoy regional popularity.16

Future Use

Some of information engineering's appeal is its position as the only methodology that represents all levels of organizational analysis from enterprise through application. IE cannot easily be altered, at this time, to accommodate object orientation or knowledge engineering. But it will be around for some time with parts of the methodology replaced in a customizing process. Individual techniques such as ERD will gain even more acceptance in the future as data administration increases.

16 Michael Jackson's Jackson Structured Development (JSD) is used in England. Warnier-Orr techniques are used in companies such as AT&T. Chen's entity-relationship approach is used in isolation in many corporations but is also part of information engineering.

Object-Oriented Methodology

History

Object-oriented methodology is an approach to system life-cycle development that takes a top-down view of data objects, their allowable actions, and the underlying communication requirement to define a system architecture. The data and action components are encapsulated, that is, they are combined together, to form abstract data types. Encapsulation means that if I know what data I want, I also know the allowable processes against that data. Data are designed as lattice hierarchies of relationships to ensure that top-down, hierarchic inheritance and sideways relationships are accommodated. Encapsulated objects are constrained only to communicate via messages. At a minimum, messages indicate the receiver and action requested. Messages may be more elaborate, including the sender and data to be acted upon.
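The ideas of encapsulation and message passing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Message` and `Account` classes, the `send` helper, and the banking example are hypothetical, and inheritance is left out to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Message:
    """A message names at least the receiver and the requested action."""
    receiver: str
    action: str
    sender: Optional[str] = None  # optional, for the more elaborate form
    data: Any = None              # optional data to be acted upon


class Account:
    """Hypothetical encapsulated object: data plus its allowable actions."""

    # The only actions this object will accept via messages.
    _ALLOWED = {"deposit", "balance"}

    def __init__(self, name: str, balance: float = 0.0):
        self.name = name
        self._balance = balance  # by convention, not touched from outside

    def deposit(self, amount: float) -> float:
        self._balance += amount
        return self._balance

    def balance(self, _data=None) -> float:
        return self._balance

    def receive(self, message: Message):
        """Carry out the requested action if it is an allowable one."""
        if message.action not in self._ALLOWED:
            raise ValueError(f"{self.name} does not support {message.action!r}")
        return getattr(self, message.action)(message.data)


# A tiny registry so messages can be routed by receiver name.
objects = {"savings": Account("savings")}


def send(message: Message):
    return objects[message.receiver].receive(message)


send(Message("savings", "deposit", sender="teller", data=50.0))
print(send(Message("savings", "balance")))
```

Knowing the object (`savings`) is enough to know which processes are allowed against its data, and a request for anything else is refused by the object itself rather than by the caller.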

Object orientation developed during the 1980s and 1990s as a way of producing desirable software attributes (for instance, minimal coupling) that had been espoused since the 1960s. Object-oriented designs can result in software with desirable properties: modularity, information hiding, functional cohesion, and minimal coupling. As with the other methodologies, bad designs lead to bad applications.

Object orientation appears able to support the abstract concepts needed to automate meta-data and meta-meta-data needed for expert, intelligent, and multimedia applications. Meta-data gives meaning to data and is information about data. For instance, a name or data type is information about the data in the example (see Figure 1-23). Meta-meta-data is information about the meta-data that describes its allowable use to the application. These types of definitions allow you to plug in any hardware device, software, or data to create an application environment.

Data: Cathrine Ratliff
Meta-Data: Name, Alpha, 16 Characters
Meta-Meta-Data: Type = Data Field; Logical Link = Process; Physical Link = Process, DBMS (EMPL DB)

Data: D01
Meta-Data: Drive Address, Alphanumeric, 3 Characters
Meta-Meta-Data: Type = Disk; Logical Link = I/O Driver; Physical Link = SCSI Channel 0

Data: SC01
Meta-Data: Screen ID, 80x20 Alphanumeric Characters
Meta-Meta-Data: Type = 3270 Black/White Terminal; Logical Link = I/O Driver, Process; Physical Link = SCSI Channel 0

FIGURE 1-23 Object-Oriented Example
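One way to picture the three levels in Figure 1-23 is as nested descriptions: the data value, the meta-data that describes it, and the meta-meta-data that describes how the meta-data may be used. The sketch below models the first entry of the figure in Python. The class names, field names, and the rule that "Alpha" means letters and spaces are assumptions made for illustration, and the physical link value follows one reading of the scanned figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaMetaData:
    """Information about the meta-data: how the application may use it."""
    kind: str
    logical_link: str
    physical_link: str


@dataclass(frozen=True)
class MetaData:
    """Information about the data: its name, type, and length."""
    name: str
    data_type: str
    length: int
    usage: MetaMetaData

    def accepts(self, value: str) -> bool:
        """Check a value against this description (Alpha = letters and spaces)."""
        if len(value) > self.length:
            return False
        if self.data_type == "Alpha":
            return all(ch.isalpha() or ch.isspace() for ch in value)
        return True


name_field = MetaData(
    name="Name",
    data_type="Alpha",
    length=16,
    usage=MetaMetaData("Data Field", "Process", "Process, DBMS (EMPL DB)"),
)

# The data itself is just a value; the meta-data gives it meaning.
print(name_field.accepts("Cathrine Ratliff"))
```

Because the description travels with the definition rather than being buried in program logic, a different device or field could in principle be described by supplying different `MetaData` and `MetaMetaData` values instead of new code.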

Object orientation is still an immature discipline, undergoing almost daily evolution and change. As such, the details presented for object orientation in this text may be considerably different in five years.

The documentation produced by one object approach for analysis/design includes, for example, a succinct paragraph describing the system, an object list, an object attribute list, an action list, an action attribute list, a message list, and several optional diagrams.

Current Use

Object orientation is the usual approach to developing applications in aerospace and defense organizations, and experiments with its use are occurring in most large companies. Object design appears to be the best suited method for real-time applications, and is useful for on-line applications. It is one of the IS buzzwords of the 1990s and appears often in every trade periodical, research journal, and booklist.

Future Use

Keeping in mind that it is neither a complete nor a mature methodology, the current high level of activity implies a future full of object-oriented applications, databases, and CASE tools. When done properly, object orientation appears capable of supporting many complex environments, including intelligent applications, multimedia applications, and reusable code and reusable design objects. Look for object orientation to be around for a long time.

If you only learn one new methodology, this will be a profitable one to learn for the future.

Semantic Methodologies

History

Semantic methodologies are used in the automation of artificial intelligence (AI) applications. AI, like object orientation, is in its infancy. By definition, AI methodologies are also in their infancy.

AI applications cover a broad range of intellectual difficulty, ranging from recognizing to reasoning to learning (see Figure 1-24). Most AI applications in business are on the lower end of the AI spectrum, and provide limited reasoning in applications. Businesses are experimenting with more complex uses of AI.

This discussion is about AI applications that reason through problems to achieve expert level competence in a specific area of expertise. These applications are usually called knowledge-based systems (KBS) or expert systems (ES) applications. Most ES contain the reasoning processes of one or more human experts.

Semantic approaches to system life-cycle development automate the meaning of objects in the application. For example, a knowledge object might be composed of objects describing a 'legal' hardware configuration. The reasoning process in the ES first asks characteristics of hardware objects that are required for a system (e.g., speed of disk drive, size of disk drive). Then, using the required characteristics as constraints, the ES determines 'legal' configurations that meet the constraints.
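The two-step reasoning just described (collect the required characteristics, then use them as constraints to find 'legal' configurations) can be approximated with a very small rule filter. The sketch below is a plain Python filter, not an inference engine or a logic-based language; the drive catalog, rule names, and requirement values are all made up for illustration.

```python
# Hypothetical catalog of disk drives; models and figures are made up.
DRIVES = [
    {"model": "D-100", "size_mb": 100, "access_ms": 28},
    {"model": "D-300", "size_mb": 300, "access_ms": 18},
    {"model": "D-600", "size_mb": 600, "access_ms": 12},
]

# Rules are named predicates over (drive, requirements).
RULES = {
    "large enough": lambda drive, req: drive["size_mb"] >= req["min_size_mb"],
    "fast enough": lambda drive, req: drive["access_ms"] <= req["max_access_ms"],
}


def legal_configurations(requirements):
    """Return the models that satisfy every rule for the stated requirements."""
    return [
        drive["model"]
        for drive in DRIVES
        if all(rule(drive, requirements) for rule in RULES.values())
    ]


# Step 1: the system "asks" for required characteristics (hard-coded here).
required = {"min_size_mb": 250, "max_access_ms": 20}

# Step 2: the requirements act as constraints on the candidate objects.
legal = legal_configurations(required)
print(legal)
```

Even in this toy form, the rules and the data they evaluate sit side by side in the application, which mirrors the text's observation that semantic applications do not separate the two the way traditional applications do.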

At present, data and rules for evaluating data in semantic applications are defined together within the application and not separated as in traditional applications. There is no separation of analysis and design activities per se for semantic applications either. Rather, the task of knowledge engineering encompasses three general tasks: eliciting knowledge from an expert, analyzing it to define the heuristics and data, and automating the information in some logic-based language, such as Prolog.

Current Use

Knowledge-based systems are a growing segment of the applications portfolio in organizations today. This is another class of methodology, along with object orientation, that is in its infancy. Semantic methods are somewhat more well-defined for business use than object methods. But, the extent of special training and expertise required to implement intelligent applications make the knowledge inaccessible to most practicing SEs.

Future Use

There is a significant amount and diversity of research that will result in mature semantic methodologies over the next decade. One major activity in the future will be the addition of expert intelligence to current transaction, query, data analysis, and DSS systems. Semantic method use will continue to be a growth area in IS for the foreseeable future.

FIGURE 1-24 Range of Artificial Intelligence Applications (figure not reproduced; vertical axis: Percent of Companies Using AI, 0% to 50%; horizontal axis: 1970 to 2010; curves: Recognizing, Reasoning, Learning)


STOCK MARKET SELLERS, INC.

Stock Market Sellers, Inc. (SMSI) is a brokerage firm that had a reputation for slow, steady growth and low aggression relative to its industry. In 1988, SMSI embarked on a new, more aggressive position and began introducing new products practically overnight to keep up with its competition.

Automated support for SMSI's new products was the responsibility of Alec Ranier, a young Brit who was a whiz-kid programmer. Alec was promoted several times until, in 1991, he managed a staff of twenty programmers who developed applications to support new products.

When asked about his use of life cycles and methodologies, Alec said, "No, we don't use any of those methodologies or CASE technologies. We don't have time. A broker wants a new product or a new analysis the day after they ask for it, basically."

"Don't programmers have to talk to each other to coordinate their work?" I asked.

He replied, "Not usually. That's how we get away with being so informal."

"What happens when you do need to have programmers talk to each other?"

Alec answered, "It is a mess! (laughing) I'll grant that. We redesign, rewrite, and do a lot of code. Another side effect is we reinvent the wheel a lot. We probably have twenty programs that calculate collateralized mortgage obligations and their returns."

I was astonished. "How do you verify their accuracy?"

"Well, we don't because we can't. That is a problem. We're actually trying to design a few key modules to be reusable, but it's a problem because the potential-using programs are all going to need to be rewritten."

I asked, "Do you know any methodologies to help you do that design?"

Alec was honest. "Not really. I'm a good programmer who got promoted. Some day I might learn one but now I just want to 'get product out the door.'"

No Methodology

History

When you develop an application using no methodology, you rely on your own experience and problem-solving ability to automate a solution to a problem. The use of no methodology is implied by the discussion of the learn-as-you-go life cycle. There are no general activities because what is done and how it is done are left strictly to the individual.

Current Use

Most organizations in the United States currently use no methodology. Example 1-5 illustrates the box companies get themselves into when they do not use a methodology. As in the example, companies generally do not recognize any problems. On probing, they realize they have problems but have no idea for getting out of the situation short of rewriting all applications ... a solution they consider too costly.

Future Use

There are two major reasons why use of no methodology will begin to disappear as a strategy for designing applications. First, trial-and-error is not a productive problem-solving strategy when the requirements for an application can be identified. Rather, a lack of methodology indicates laziness, shoddy work practices, and lack of rigor, usually where it is most needed. Hopefully in the future, more organizations will recognize the need for rigor in developing applications ... their company's future might well depend on that recognition. Second, in order to use CASE tools and gain any of their productivity improvements, some methodology is required.

User Involvement in Application Development

Each of the previous methodology discussions approaches the problem of application development as if it were done only by technically oriented personnel. Where in this picture is the user of the application? Ultimately, users must supply information about the business functions and accompanying data that are being automated. In this section, we discuss user involvement in application development so you do not think SEs work only with each other. Although early applications were frequently built without discussions with users, isolation of SEs from users resulted in systems that might work technically, but often did not meet user needs, and frequently disrupted the workplace.

In the early 1960s, Scandinavians began to voice concerns over the social side effects of applications. Early systems frequently deskilled workers. Socially oriented methodologies of application development were created in response to the concerns about the effects of computerization. Social methodologies describe an approach to SDLC that attends to social and job-related needs of individuals who supply, receive, or use data from the application being built. Social methodologies are not really methodologies; rather, they are user involvement techniques. These techniques ignore technology completely and assume that some other approach to the technical aspects of application development is used along with user involvement.

The three main user-involvement techniques are joint application design (JAD), socio-technical systems (STS), and Ethics. The most practical and popular method is joint application design (JAD), which requires an off-site meeting of all involved users and systems people, who meet for five to ten days to develop a detailed functional description of application requirements. Daytime meetings are used for new analysis; nighttime meetings document daytime results for review and further refinement the next day.

There are many benefits from user involvement in application development. First, it builds commitment by users who automatically assume ownership of the system. Second, users, who are the real experts at the jobs being automated, are fully represented throughout development. Third, many tasks are performed by users, including design of screens, forms, and reports, development of user documentation, and development and conduct of acceptance tests.

We assume that user involvement is not only desirable, but mandatory for a truly effective application development product and process. This does not imply that such design will result, only that it can. Using a social approach assumes that job enlargement is a desirable by-product of automation.

The most important aspect of user involvement is that it must be meaningful. The users must be decision makers and staff who fully understand the impact of their decisions, and who are interested in participating in the development process. Using low-level staff, or assigning 'expendable' managers is not the way to have users participate in developing applications. Neither is co-optation of users desired. Co-opting means that you get people to agree with the outcome because they 'participated' in the decision process even though the alternatives are all defined by the application developers.

The goal of user participation is for IS and non-IS people to work together as business partners rather than as adversaries. When users participate, they make all nontechnical decisions. The SEs explain and shepherd users to make semitechnical decisions, for instance, design of screens. The SEs explain both the impact and reasoning of major technical decisions. If this discussion implies that users call the shots, that is what is meant. User involvement means that users run the project, making the majority of decisions and having final say on all major decisions. The SEs and other Management Information Systems (MIS) staff act as service-oriented technicians, as they are.

In many organizations, the social aspects of work are specifically felt not to be within the scope of responsibility of software developers. If the development staff are only technical in their orientation, this is probably true. Then it is the responsibility of the project manager to educate user and IS management about the need to design the organization and jobs as well as the system.

In the United States, high levels of user involvement are still unlikely and usually at the discretion of the project manager. In many cases of 'user involvement', the reality is that users are not involved. Even in companies that have user project managers, IS staff can ignore user desires and build the systems they want to build.

SEs and users who have participated in user-involved application development tend to be fully committed to user involvement as a requirement in application development. Hopefully, the days of application development by technicians who never consult with users are gone, or soon will be. Future generations of computer-literate users will demand a say in how their systems are developed. The prognosis, then, is for user involvement to continue its slow growth in the United States.

OVERVIEW OF THE BOOK

In this chapter so far, we have prefaced and introduced the major topics of the book. Here we identify how those topics are used later in the book and briefly outline the many additional topics you will learn.

Applications

Applications are the underlying topic of all we discuss in this text. You should already have a fairly good understanding of what an application is. We will not discuss that topic further.

What we will discuss throughout the text is how application types relate to each of the topics. You will get answers to questions such as: Which life cycles and methods are most appropriate to which application types? When do application characteristics and technologies affect the choice of life cycle and/or methodology?

Project Life Cycles

Project life cycles should also have been mostly review. PLCs, per se, are not mentioned again. Rather, the phases of feasibility, analysis, design, language selection, and testing each have their own chapters. One difference between this text and most other texts is that multiple methodologies and deviations from the standard PLC are discussed in the context of each phase.

Part I: Preparation for Software Engineering

Part I prepares you for the tasks of developing and implementing an application. The chapters in this section introduce you to

• research on learning and software engineering to highlight an effective means of studying and practicing this work
• the ABC Video case used throughout the text
• the roles of project manager and software engineers
• methods of gathering information about the task to be automated
• proper behavior during application development

Part II: Project Initiation

After you know how to elicit information, we begin talking about project development. Part II first discusses organizational level re-engineering, a method for developing application plans. Then, feasibility analysis is detailed in the next chapter. These discussions are separated from those about the methodologies because these tasks are assumed by most methodologies. For each chapter, the theories underlying the concepts are introduced, a method of performing the tasks is described, and examples are provided from ABC to help make the information concrete.

Part III: Analysis and Design

Part III is devoted to analysis and design activities that each take about 20% of application development time. During analysis, the SE concentrates on defining what the application will do. During design, the requirements are translated to define how the application will operate in its specific hardware and software environment. One representative methodology from each broad class of methodologies is discussed in detail in Chapters 7 through 12. Chapters 7 and 8 discuss analysis and design, respectively, for process methodologies. Chapters 9 and 10 relate to data-oriented methodologies. Chapters 11 and 12 present object-oriented methodologies. Based on ABC's rental processing application, we will discuss what each methodology can and cannot do for you during logical definition of application requirements. For each methodology, the theories underlying its development are described and representative CASE tools available to support application development are provided.

At the conclusion of the methodology discussion, Chapter 13 recaps the graphical representations and thinking processes used in each methodology. The methodologies are compared and contrasted on several sets of criteria. In addition, future developments in technology and applications and their impact on methodologies are developed.

Some tasks are performed during analysis and design, but are not addressed by most methodologies. These forgotten activities are included in this section and discussed in Chapter 14.

Part IV: Implementation and Operations

Many tasks remain to complete an application development, including programming, testing, maintenance, and change management. Each of these topics is related to application and methodology types in Chapters 15 through 18. For every chapter, applicable automated support tools are identified. Chapter 15 discusses the selection of a target language for an application. Code for applications will be increasingly generated by the CASE tool. As CASE use increases, the need to code, then, is replaced with a need to choose an appropriate language.

Similarly, many applications now use purchased software rather than customized code. Chapter 16 discusses the selection and purchasing of hardware, software, or consulting services for application development.

Testing is required of all applications developers at present whether a machine generates the code or not. Chapter 17 discusses different types of testing, testing techniques, and the development of test plans for an application.

Change is a way of life in application development. Chapter 18 deals with the management of change for documents and software. The section on software maintenance describes re-engineering as it applies to deciding whether or not to replace or maintain code. Several replacement options are presented.

Finally, the last chapter discusses careers in software engineering. Keeping current in a profession that constantly changes is a daunting task. In Chapter 19, you will receive tips on the type of reading you should do and the types of professional organizations you might join to enhance your ability to stay current. In addition, you will learn the types of jobs available to you as a novice software engineer and an approach for deciding on a starting job.

SUMMARY

This chapter prefaces and summarizes the contents of the text. Software engineering was defined as a systematic approach to the development, operation, maintenance, and retirement of software. A software engineer is a person who has a broad knowledge of methodologies, life cycles, languages, and all aspects of software development, and who applies that knowledge to the systematic development of application systems. The two main goals of software engineering are to build a quality product through a quality process.


Next we defined application characteristics, responsiveness, and types. An application is a set of related programs that perform some business function. The characteristics that all applications have in common are data, processes, and constraints. Application responsiveness reflects whether the application is batch, on-line, or real-time. Finally, application types include transaction processing, query, DSS, and expert systems.

Project life cycle is the breakdown of work for initiation, development, maintenance, and retirement of an application. Alternative project life cycles include sequential, iterative, and the learn-as-you-go. The sequential life cycle includes a series of phases for initiation, feasibility, analysis, conceptual design, design, programming/unit testing, testing, implementation and checkout, maintenance, and retirement.

Methodologies are policies, techniques, and tools that guide the activities of each phase of a software project life cycle. The five classes of methodologies in this text are process, data, object, social, and semantic. Process and data methodologies are fairly mature guidelines for developing applications. Object and semantic are emerging methodologies that help us build systems using artificial intelligence and new technologies. Social methods are really techniques for involving users and assume the use of one of the other four methodology classes as well.

REFERENCES

Boehm, Barry W., Software Engineering Economics. Englewood Cliffs, NJ: Prentice-Hall, 1981.

Booch, Grady, Software Engineering with Ada, 2nd ed. Menlo Park, CA: Benjamin-Cummings, 1987.

Booch, Grady, Object Oriented Design with Applications. Redwood City, CA: Benjamin-Cummings, 1991.

Bostrom, Robert P., and J. Stephen Heinen, "MIS problems and failures: A socio-technical perspective," Part I, MIS Quarterly, September 1977, pp. 17-28.

Chen, P. P-S., "The entity-relationship model-Toward a unified view of data," ACM Transactions on Database Systems, Vol. 1, March 1976, pp. 9-36.

Davis, Gordon, and Margrethe Olson, Management Information Systems: Conceptual Foundations, Structure and Development, 2nd ed. New York: McGraw-Hill, 1985.

Department of Defense, Standard for Application Development, Guideline #2167a. Washington, DC: US Government Printing Office, 1985.

DeMarco, Tom, Structured Analysis. New York: Yourdon Press, 1979.

Eliason, Alan L., Online Business Computer Applications, 2nd ed. Chicago, IL: Science Research Associates, 1987.

Feigenbaum, E., P. McCorduck, and H. P. Nii, The Rise of the Expert Company. New York: Vintage Books, 1989.

Gane, C., and T. Sarson, Structured Systems Analysis: Tools and Techniques. Englewood Cliffs, NJ: Prentice-Hall, 1979.

Gane, Chris, Computer-Aided Software Engineering: The Methodology, The Products and the Future. Englewood Cliffs, NJ: Prentice-Hall, 1990.

IEEE, IEEE Software Engineering Dictionary. Piscataway, NJ: IEEE Press, 1983.

Lientz, B. P., and E. B. Swanson, Software Maintenance Management: A Study of Maintenance of Computer Application Software in 487 Data Processing Organizations. Reading, MA: Addison-Wesley, 1980.

McClure, Carma, CASE is Software Automation. Englewood Cliffs, NJ: Prentice-Hall, 1990.

Martin, James, Information Engineering, Book 1: Introduction, Book 2: Planning and Analysis, Book 3: Design and Implementation. Englewood Cliffs, NJ: Prentice-Hall, 1990.

Necco, Charles R., Carl L. Gordon, and Nancy W. Tsai, "Systems analysis and design: current practices," MIS Quarterly, December 1987, pp. 461-476.

Parnas, D. L., "On the criteria to be used in decomposing systems into modules," Communications of the ACM, Vol. 15, #12, 1972, pp. 1053-1058.

Sprague, Ralph H., Jr., and Hugh J. Watson, Decision Support Systems: Putting Theory into Practice. Englewood Cliffs, NJ: Prentice-Hall, 1986.

Swanson, E. B., Information System Implementation: Bridging the Gap between Design and Utilization. Homewood, IL: R. D. Irwin, 1988.

Turban, Efraim, Decision Support and Expert Systems: Management Support Systems. New York: Macmillan Publishing Company, 1990.

Yourdon, Edward, and Larry L. Constantine, Structured Design. New York: Yourdon Press, 1978.

KEY TERMS

adaptive maintenance
analysis
application characteristics
application responsiveness
application type
automated interface
batch applications
class
coding
computer-aided software engineering (CASE)
conceptual design
configuration management
constraint
control constraint
corrective maintenance
data
data analysis applications
data methodology
data warehouse
decision support applications
declarative language
design
development
economic feasibility
embedded system
engineering
executive information system (EIS)
expert systems (ES)
feasibility
goals of SE
group decision support systems (GDSS)
hierarchic logical data model
human interface
implementation
inferential constraint
initiation
input
interactive processing
iterative project life cycle
joint application design (JAD)
knowledge acquisition subsystem
knowledge base
learn-as-you-go project life cycle
logical data model
maintenance
manual interface
meta-data
meta-meta-data
methodology
network logical data model
object-oriented logical data model
object-orientation
on-line application
operations
organizational feasibility
output
perfective maintenance
physical data model
postrequisite constraint
prerequisite constraint
process
process methodology
product
program design
prototyping
quality assurance
query application
real-time application
relational logical data model
retirement
retrieval
SE process
SE product
semantic methodology
sequential project life cycle
social methodology
software
Software Development Life Cycle (SDLC)
software engineer
software engineering
spiral application development
storage
structural constraint
structured problem
subsystem design
technical feasibility
testing
time constraint
transaction-oriented application
Transaction Processing System (TPS)
unit testing
unstructured problem
validation
verification

EXERCISES

1. Develop a table of application characteristics down the rows in the first column, and the application responsiveness levels across the columns. How does each application characteristic differ for each level of responsiveness?

2. Develop a table of application characteristics down the rows in the first column, and the methodology classes across the columns. Begin to develop a comparative table of the way each methodology prescribes documenting the requirements for each application characteristic. You will not be able to complete the table at this point.

STUDY QUESTIONS

1. Define the following terms:

application characteristics
batch application
constraint
data methodology
meta-data
object
on-line application
project life cycle
prototyping
real-time application
semantic methodology
time constraint
unstructured problem
validation

2. Define how each methodology's history is affected by technology.

3. What are the four application types and how do they differ?

4. What are the subtypes of decision support systems? How do they differ?

5. What is computer-aided software engineering?


6. What is an application?

7. How do real-time and on-line applications differ?

8. What is the range of artificial intelligence applications? What area do most expert systems cover today?

9. What is the starting point for analysis in a process methodology? for a data methodology?

10. Why is it important to know the orientation of a methodology?

11. If most companies do not use methodologies, why should you learn how to use them?

12. Is some methodology better than none? Is some life cycle better than none? Discuss the pros and cons of using and not using methodologies and life cycles.

13. What are the components of a feasibility study? What type of analysis is performed for each?

14. What are the phases of a sequential development life cycle? How do they vary when you use prototyping?

15. What are the five types of constraints? Give an example of each.

16. What are the four application types? Give an example of each.

17. How do on-line and real-time applications differ?

18. Draw a diagram showing the operation of a typical batch application. Then draw a diagram showing the operation of a typical on-line application. Discuss how they are similar and how they are different.

19. What is the difference between a semantic methodology and an object-oriented methodology?

20. What is quality assurance and when is it performed?

21. What is meaningful user involvement?

22. List the three uses of prototyping.

23. What are the dangers in using prototyping?

24. What is wrong with a learn-as-you-go life cycle?

25. What is dangerous about using no methodology and no life cycle?

EXTRA-CREDIT QUESTIONS

1. Develop the pros and cons of the ethical issues described in Example 1-5. What is your opinion? How can the open questions be resolved?

2. What can be done to further the involvement of users in applications development? Should this be done? How can it be done in an ethical way?

3. Are methodologies as you know them at this point culture free? How can culture get in the way of their use in a multinational organization?

4. Think beyond this text to the development of applications in a multinational organization. What are cultural and ethical issues in building applications that will be used in many countries of unequal computer resources?

PREPARATION FOR SOFTWARE ENGINEERING

The four chapters in this section prepare you for the actual work of software engineering. Chapter 2 serves two purposes: First, research on learning and software engineering is summarized to give you some ideas about how to organize the text's material. Good mental maps of the information ease your learning and help you keep the different methodologies distinct. Second, a case describing an application to be built is introduced: ABC Video rental processing. The application is developed in each of the methodologies we will discuss.

Project managers and software engineers perform different duties and are usually different individuals on a project team. In Chapter 3 you will learn the roles of project managers and software engineers and how they complement each other. The kinds of questions we will answer are: What does a project manager do? How does that role differ from a software engineer's? Why is knowledge of management important to a software engineer?

Last, in preparation for developing systems, Chapter 4 defines techniques for gathering the information you need to analyze and design a system. Then, we will discuss how you should act and how to evaluate what you are told during information gathering. Sample dialogues between ABC managers and the software engineering team illustrate the information presented in Chapter 2.

CHAPTER 2

LEARNING APPLICATION DEVELOPMENT

INTRODUCTION

There is rarely one 'right' solution for an application in software engineering, just as, in Chapter 1, we said there is rarely one 'right' way of getting to a solution. Despite this ambiguity of the software engineering process and product, there are approaches to problem solving in software engineering that are more successful than others. Helping you gain the experience to recognize those approaches is one goal of this text. To assist you, this chapter discusses how we learn, how we evolve from novice to expert, and how you can apply this knowledge to mastering the material in this book. In the second section, the case study we follow throughout the text is introduced. The case is related to learning approaches suggested in the first section, and to the review in Chapter 1. First, let us turn to learning and the development of expertise.

HOW WE DEVELOP KNOWLEDGE AND EXPERTISE

Learning

There are two basic stages of skill development in learning that we call the declarative and procedural knowledge development stages. In the declarative, or what stage, we learn basic skills, rules, and activity sequences. We learn declarative knowledge before process knowledge. During the process, or how stage, we imbed the what knowledge into a process. We learn how to perform the activity sequences, and how to integrate the different rules. In the last part of how learning, we internalize both the declarative and process knowledge so they become part of our automatic memory.1

The internalization of declarative and process knowledge occurs through

• experiencing real life
• doing classroom exercises
• reading cases and solutions
• developing practice problems with feedback
• studying both good and bad examples

Cognitive psychology and artificial intelligence research describe human thinking as case-based reasoning. A case is a predetermined representation of event sequences in a particular setting.2 During learning, we recognize patterns of alternatives, expected actions, and decisions that work. After reaching a detailed level of understanding of the patterns, we internalize a case, imbedding the patterns, actions, and decisions into our knowledge structure.

1 For a complete discussion of declarative and process knowledge, see Chi, Glaser, & Rees, 1982.

2 Kintsch & Mannes, 1987, discuss case-based reasoning. Schank & Abelson, 1977, also writing about artificial intelligence, call case-based reasoning "script" based reasoning. The two terms, case and script, are essentially the same.

FIGURE 2-1 Interaction of Knowledge Types in Systems Analysis (adapted from Vessey & Conger, 1993)
[Figure not reproduced. Labeled elements: Problem Domain Knowledge, Problem Statement, Mental Model of Problem, Analyst Knowledge, CASE Knowledge, Methodology Knowledge (Declarative Method Knowledge, Procedural Method Knowledge), Mental Model of Methodology, Mental Model of Problem Solution, Solution Representation.]

In systems analysis, two different types of cases might be appropriate: analysis task and problem task. Figure 2-1 illustrates the types of knowledge used in analysis and how they interact. The analysis domain case is the declarative and process knowledge of actions needed to do the analysis task. We can divide analysis tasks further into subjective and objective activities. Subjective analysis activities are subproblems in application development that accompany all methodologies. Some representative analyst knowledge includes knowing

• what life cycle is appropriate
• what data-gathering technique is likely to be most effective
• when data gathering is complete enough
• when we should iterate through earlier stages of the process

During objective analysis activities, we describe the functioning and design of a proposed application. We may further subdivide objective activities into techniques used, such as methodology or computer-aided software engineering (CASE) tools. When we do not follow a methodology, we rely on our own problem-solving ability and knowledge.

The second type of knowledge required to develop an application is problem task case knowledge. Problem task knowledge is the declarative and process knowledge of the problem domain being automated. For example, order entry-inventory control processing describes a general problem task domain. If we add that the system is for a retail business, it is less general. If we add that the system is for Sears and Roebuck, for instance, it is less general again. During the automation process, we apply our knowledge of how to do analysis to the problem domain. We use analysis knowledge both to describe the current system and to develop the functions of the new system.

Use of Learned Information

Case-based reasoning relies on our recall of past similar experiences, that is, analogous events. Analogies are similar experiences that we use to

• classify problems
• plan a course of action
• suggest explanations
• suggest means of recovery from failures

When the analogy matches the current situation, we use it to predict what will happen based on the analogous event. When the analogy does not fit, we look for similarities between current and past experiences from which we can generalize to build new analogies.
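The matching step described above can be sketched in code. This is only an illustration, not a model taken from the research cited in this chapter: the `Case` structure, the use of Jaccard overlap as the similarity measure, the 0.5 threshold, and the sample cases are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """A remembered episode: its observable features and what we learned."""
    name: str
    features: frozenset
    outcome: str


def similarity(a: frozenset, b: frozenset) -> float:
    """Jaccard overlap (an assumed measure): shared features / all features."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_analogy(current: frozenset, memory: list, threshold: float = 0.5):
    """Return (case, score) for the closest past case, or (None, score)
    when no remembered case is similar enough to serve as an analogy."""
    scored = [(similarity(current, c.features), c) for c in memory]
    score, case = max(scored, key=lambda pair: pair[0], default=(0.0, None))
    return (case, score) if score >= threshold else (None, score)


# Hypothetical remembered cases and a new problem to classify.
memory = [
    Case("wholesale order entry",
         frozenset({"orders", "inventory", "customers", "invoices"}),
         "batch invoicing was adequate"),
    Case("library lending",
         frozenset({"customers", "rentals", "returns", "late fees"}),
         "track each physical copy separately"),
]
current = frozenset({"customers", "rentals", "returns", "inventory"})

case, score = best_analogy(current, memory)
unfamiliar, _ = best_analogy(frozenset({"sensors", "telemetry"}), memory)
```

Here the new problem shares three of five distinct features with the lending case, so that case is retrieved and its outcome can be used as a prediction. The unfamiliar problem retrieves nothing, which corresponds to the situation in which we must generalize from partial similarities to build a new analogy.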

During the learning process, we build our own examples to help us learn new information. We recognize similarities between different episodes, compile the similar, generalized events, and form a new memory case. This generalization process is learning. Learning calls for failure of an analogous expectation to work for the current case, followed by explanation of the failure which we make sense of and fit into our own memory as a new case.

Why is the use of analogy so important? System analysis is work that requires judgment and adjustment. System analysis has nonoptimal solutions (i.e., relies on satisficing), and takes place within a bounded knowledge base. Analogical reasoning is better for systems analysis than reasoning by understanding because analogical reasoning relies on experience to generate cases while understanding relies on experimental trial and error. When analysts have applicable analogous experience, we try to fit that knowledge to the current situation to serve several purposes: understanding of situational dynamics, generating options, and calculating the chance of success of an application option.

In systems analysis tasks, there are frequently one or more aspects of a problem that are unfamiliar to the analyst. In unfamiliar situations, analysts first rely on aspects of the work with which we are familiar, then enlarge and broaden the applicability of our analogical knowledge. But what happens when we do not have the experience to use analogies or our analogies do not appear applicable? Then, we turn to expert/novice differences in problem solving for general tasks to see what happens.

Expert/Novice Differences in Problem Solving

The differences between experts and novices are difficult to pin down. Expert analysts are considered to have an extensive, internalized knowledge upon which they draw to apply analogous problem domains and problem-solving techniques to a current analysis task. They work quickly, knowing what they know and what they don't know, and are able to determine at least one workable solution quickly, sometimes within minutes. A novice, on the other hand, is slow and unsure, exhibiting some, but not all expert behaviors, and making mistakes throughout the process. Experts and novices differ considerably in their approaches to solving problems. For instance, novices

• develop local mental models of problem parts, that is, work on bits of small problems rather than on integrating the bits into a whole. For example, novices concentrate on adding customers instead of concentrating on customer maintenance, including add, change, delete, and retrieval processing.

• use undirected search in a trial and error manner (for example, to determine the utility of a new technology). The undirected way is to look through several magazines to see if they have articles on the technology, instead of looking through a subject index at a library.

• analyze surface features (for example, think of control statuses and their allowable values instead of the implications for processing that relate to each value)

• simulate design entities in isolation (for example, simulate video rental processing without paying attention to how it works with return processing)

• misconceive actions (for example, never analyze the complete rental/return cycle)

• fail to integrate the chunked local models into a whole global problem solution (for example, fail to integrate history processing into the rental/return cycle)

Novice problem-solving strategies include satisficing and conservatism. Satisficing means to knowingly elect a nonoptimal solution.3 Novices search for any solution; experts search for the best solution. Conservatism is minimal change of a solution; it means the problem solver takes the first solution rather than testing alternatives. Novices search for alternatives only when the existing method fails, but they cannot always tell that the existing method is failing. So, in becoming conservative, novices use their first conceptualization of a problem. In contrast, experts use optimizing and alternative evaluation in analysis and design. Because of conservatism, novices suffer breakdowns, that is, errors in the problem-solving process. Since the process is both constrained and directed by a methodology, the breakdowns relate to the analyst's mental model of the problem and use of a methodology to develop a mental model of the solution.

Conversely, experts do

• categorize problems (for instance, ABC Video Rental processing is a simple form of an order entry problem)

• develop global mental models of the problem; that is, they 'see' or visualize the entire problem solution

3 See Simon [1960] for a more complete discussion of satisficing and decision making.

• use directed searches in problem expansion and identification of similar problems

• analyze deep structures, not just define terms but analyze their meaning, fit, and the political and technical implications

• use goals and plans to determine what steps to take in finding a solution

• perform skilled sequences of actions including mental simulation and top-down expansion of the problem

Experts use knowledge of the application development process to direct actions independently from the problem. For instance, regardless of the problem or methodology, you always begin with a definition of the scope of the activity. This abstract knowledge about structuring of a problem, procedures, and process uses internalized cases and plans, and relies on experience. Problem analysis and design involve decomposition of a problem into subproblems, relying on substrategies of analogy and understanding to guide decomposition in a top-down manner. When the problem domain is new and the problem type is new, expansion progresses breadth-first. But, for problem solving in familiar domains, experts prioritize areas on which to focus, using a depth-first strategy for each new area.
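The difference between breadth-first and depth-first expansion can be pictured with a small sketch. The decomposition tree below is hypothetical (the subproblem names are illustrative assumptions, not a design from the text); the two functions only show the order in which subproblems would be visited under each strategy.

```python
from collections import deque

# Hypothetical decomposition of a rental problem; names are illustrative only.
DECOMPOSITION = {
    "rental processing": ["customer maintenance", "video maintenance", "rental/return"],
    "customer maintenance": ["add customer", "change customer"],
    "video maintenance": ["register video"],
    "rental/return": ["rent tapes", "return tapes"],
}


def breadth_first(root, tree):
    """Visit all subproblems one level at a time (new domain, new problem type)."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(tree.get(node, []))
    return order


def depth_first(root, tree):
    """Expand one subproblem completely before starting the next (familiar domain)."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children in reverse so the first child is expanded first.
        stack.extend(reversed(tree.get(node, [])))
    return order
```

Under these assumptions, breadth_first("rental processing", DECOMPOSITION) lists the three major areas before any of their details, while depth_first finishes customer maintenance before it touches video maintenance.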

With methodology training, practice, and feedback, novice software engineers can display many expert behaviors in a short time, i.e., after analyzing and designing as few as three case problems.4

Methodologies sequence events, and constrain and direct the actual analytical process. Guidelines and heuristics about what to analyze and how to analyze it are supplied by the method with comments supplied by the text and instructor. Relationships are identified to link each deliverable within a method, associating the thought processes used to develop the deliverables. All of these directed activities speed and simplify both the development of expert behavior and the internalization of methodologies.

Research on whether there are differences between methodologies for facilitating expert behaviors is in its infancy. Several laboratory studies by the author and others identify process methods as easier to learn, with no noticeable difference between methodologies in the delivered quality of the resulting proposed logical system. One thing we do know is that not all methodologies work equally well for all problems. This information will be discussed in Chapter 13.

4 See Vessey & Conger, 1993, for an example of this type of study.

How to Ease Your Learning Process

In this text, we assume that you want to go beyond knowing the basics of systems analysis and design, but that you do know the basics. We assume you have a working knowledge of structured systems analysis and design, database, and programming. Most systems analysis and design courses practice developing data flow diagrams. In this text, we will discuss DFDs and compare and contrast them with other methods, building on your current state of knowledge. If you don't feel confident about your ability to draw data flow diagrams, there are exercises at the end of Chapter 6 for practice. For database knowledge you should know and understand the value of normalization, and you should be familiar with SQL and at least one database package. For programming, you should have practice with some procedural language (e.g., Cobol) writing and debugging programs that read sequential files to generate reports. Knowledge of data structures, files, and a structured language, such as Pascal, is helpful but not necessary to using this book successfully.

Application development is essentially a problem-solving exercise which is unique because there is rarely one right or best answer to an automation problem. Practitioners and professors of application development will both tell you that the best way to learn software engineering is to "Do it!" A quote to support this idea comes from Confucius:

I see and I forget, I hear and I remember, I do and I understand.

In doing, you will make mistakes, get confused, and think you are completely wrong. Don't give up. Ask questions. Since we learn declarative knowledge first, try to remember as much of the procedural 'what' knowledge as you can while you read the text.

Try to think like an expert. Try to develop a global picture of the problem, methodology, or other subject in your mind and develop a plan of attack for your work session. Try to categorize problems, both those you are working on and those you are having with the work. Analyze your thought processes to develop a better understanding of your problem-solving approach. See if you can mentally simulate your application design, asking yourself how complete it is and how well it solves the problem. Attempt to analyze the 'deep structures' by asking what each term means and what it implies. Talk about all of these thought processes both with your instructor and with other students.

Practice your reasoning process by reviewing the example in the text, by working through problems at the end of each chapter, and by talking to other students about the reasoning you used to develop your representations. Try different ways of doing the same thing. When you find mistakes, try to learn why what you did was not the best, and how you could have reasoned to develop a better answer. Through these processes, you will learn valuable problem-solving skills that will be useful throughout your career in IS.

APPLICATION DEVELOPMENT CASE

Now, we are going to switch gears, away from the theoretical to the realistic. In this section, we present the case used throughout the text. The setting, a video store, is used for two reasons. First, it is a simple business that should allow you to build an accurate, complete mental model. A complete mental model is crucial to developing an accurate solution in any methodology. Second, most of us rent videos and have analogous knowledge that we can practice using. As you read the cases, try to apply the ideas discussed in the previous section. Ask yourself, What is the 'big' picture? Do I understand this problem? Use analogies from your experiences as a video store customer (or clerk) to the way Vic wants to run his business.

The case, ABC Video Rental Processing, is representative of the class of order processing/inventory control problems. Through its processing, customer, inventory, and order files are maintained. In addition, ABC Video Rental Processing is unique in that the video rental business is different from other businesses, and ABC's video rental processing is distinct from other video rental businesses.

The similarities and differences between ABC Video rental processing and other types of order processing applications highlight the importance of knowing how to learn. The similarities allow you to use analogy to determine the general requirements of the application. For instance, all order entry applications require customer, order, and inventory databases. Conversely, each company does its own detailed processing for order fulfillment. In ABC's case, it is a rental company, not a sales company, and rentals are not handled the same as sales. So even if you already know order processing, only a portion of the knowledge will be applicable to the rental situation. Keep this in mind when you discuss your own video store experiences. Each store has its own 'brand' of processing that might differ from ABC's. You must constantly evaluate the applicability of your past experience to the current situation, trying to use everything possible without forcing inappropriate past knowledge on the new client's application. Next, the context of the industry is described.

History of the Video Rental Business

The video rental industry experienced phenomenal growth during the 1980s. The cost of entry into the industry was low, so every mom-and-pop store, supermarket, and small-time entrepreneur entered the market. There was no stability in the market and competition was fierce. For instance, some businesses required "membership fees," others did not. Some businesses charged one price for all rentals, usually about $2.00 per videotape per day. Some businesses offered promotions, such as "Two-Fer-Tuesdays," for which two tapes were the same price as one.

Soon businesses recognized that 80% of their videos were rented within 20 days of a tape's release into the market. With this recognition, video stores introduced a two-tiered pricing system, charging a new-release price and an old-release price. The market began to destabilize and small store owners, for whom the business was a sideline, were forced to decide if they wanted to devote the floor space to videos which soon became obsolete, or if they would abandon the business. They abandoned the business in droves and the video rental industry went through a period of consolidation.

The business today is stable, but is becoming monopolized by large chains: RKO and Blockbuster, for instance. ABC is an anomaly in this market because it is still a one-person, one-store operation. Vic, ABC's owner, would like to offer unique and useful services with a minimum of 'bureaucracy' in the process, and to eventually franchise his business expertise. With these goals in mind, we turn to his business requirements for defining the video order processing application Vic wants to build.

ABC Video Order Processing Task

ABC Video rents video cassettes to customers. Since this business is becoming more competitive, Vic, the owner, wants to automate rental processing and inventory maintenance, and to add an expert system to speed and simplify the rental process. Vic prepared information for the consulting team to begin work. Vic tried to separate what he wanted from what he needed. So, the application business requirements are listed first. Then, Vic's 'vision' of the application is presented.

General Requirements (Excerpted from a memo from Vic to consultants)

... ABC Video currently owns two PC ATs and can get IBM compatible PCs cheaply. I would like all the machines hooked together somehow to share the information and have some equipment backup in case a PC breaks down. Each PC will have a printer for two-part forms. If the customer wants a copy of an order, he or she takes the top copy and signs the bottom. I need a signed copy to legally charge for unreturned tapes.

I want to minimize typing throughout all the processing. Bar code readers are cheap. Can we use that technology for keeping track of rentals?

There are three to six clerks doing rentals at any one time, sharing machines. Rental/return processing is about 90% of the business. Machines should be allowed to do any processing, but should stay set at rental/return processing once there. I want to be able to know where every tape in the store is: out on rental, on the shelf, or waiting reconditioning.

Business requirements relate to customer, videos, rentals, and history information. Each of these requirements is listed below.

Customer Requirements

Customers are people who desire to rent videos for one or more days.

1. All customers must be 'registered'. This means they must have an easy to remember identification code, plus their phone number, name, address, credit card number, credit card type, and credit card expiration date on record before they may rent videos.

2. All members of a household should be able to share the same identification number.

3. Customers are required to pay rentals in advance and settle late fees before any new rentals are allowed.

4. Customers can return tapes in three ways:

• Drop off through a slot in the door

• Drop off at the desk as they walk in to get new videos

• Drop off as they take out new rentals

5. Customers who fail to return tapes or damage tapes are charged for the video on their credit cards. Their customer record must be marked 'bad credit risk' and they will not be allowed to rent videos.

6. Retrieval of customer information must be allowed by identification, phone, name, address, zip, or credit card number.

7. All fields must be allowed to be changed as required.

8. Reports on number of new customers by month and by year, 'bad credit risk' customers, late returning customers, and expired credit card numbers must all be allowed.

9. Deleting of customers must be allowed by the manager (Vic) only.
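One way to picture these customer requirements is as a record plus a lookup routine. The sketch below is only an illustration: the attribute names, the use of whole cents for money, and the helper find_customers are assumptions, not part of Vic's memo.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Fields follow requirement 1; attribute names are assumptions.
    customer_id: str          # easy-to-remember code, shared by a household
    phone: str
    name: str
    address: str
    zip_code: str
    card_number: str
    card_type: str
    card_expiration: str
    bad_credit_risk: bool = False
    late_fees_cents: int = 0  # money kept in whole cents (an assumption)

    def may_rent(self):
        """Requirements 3 and 5: no unsettled late fees, not a bad credit risk."""
        return not self.bad_credit_risk and self.late_fees_cents == 0


def find_customers(customers, **criteria):
    """Requirement 6: retrieve by any stored field, e.g. phone or zip_code."""
    return [
        c for c in customers
        if all(getattr(c, field) == value for field, value in criteria.items())
    ]
```

For example, find_customers(customers, zip_code="30303") would return every registered customer in that zip code, and may_rent() is the check a rental routine would make before creating an order.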

Video Requirements

Videos are taped movies, sports, or music events that are rented to customers.

1. All videos received in the store must be 'registered' and tracked. Minimum information is identification number of copies, title, vendor, code, and date received. Video registration should use some technology (a bar code reader?) that does not require typing.

2. Individual copies of videos should be identifiable for rental/return processing.

3. All copies of a title must be identifiable to track rental trends.

4. Counts of the number of rentals by copy and by title should be available for reporting.

5. Retrieval of video information for reporting must be allowed on any single or multiple criteria. Common reports needed will be for maintenance (based on how many rentals), number of tapes and rentals by type (e.g., musical, horror, drama, comedy), and for tapes that have not rented in the last x days.

6. I don't know how hard or expensive this is, but I would like some history information, such as

• rentals by copy by title

• days rented by month by year by copy by title

• rentals by customer so I can warn them when they try to re-rent a title

7. Future provisions should allow for

• tracking the number of days of rentals by copy by title or by dates of rentals

• multiple rental products (such as VCRs, camcorders, CDs, video games, Nintendo game sets, and so on)

• automatic debit card or credit card payments

• variable rental charges based on promotions, date of receipt, and so on
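Requirements 2 through 4 imply that each physical tape needs its own identity while still being tied to its title. A minimal sketch of that idea follows; the two-part identifier (title_id plus copy_number) and the function name are assumptions chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoCopy:
    # Hypothetical identifier: title_id is shared by all copies of a title,
    # copy_number singles out one physical tape.
    title_id: str
    copy_number: int


def rental_counts(rentals):
    """Count rentals by copy and by title (video requirement 4)."""
    by_copy = Counter(rentals)
    by_title = Counter(copy.title_id for copy in rentals)
    return by_copy, by_title
```

Given a list with one VideoCopy entry per rental, rental_counts returns two tallies: one keyed by individual copy (useful for maintenance) and one keyed by title (useful for rental trends).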

Rental Processing

1. First, NO BUREAUCRACY! Second, the process MUST BE EASY. The rental process must not require customers to carry a card, must not require clerks to type much, and must be easy to learn. Return processing must also be simple and flexible.

2. To take out tapes, customer ID and video IDs are entered. All other information should be pulled from the computer.

3. The system should compute total charges, include late fees, and compute change for money entered.

4. The computer must be hooked to a cash drawer or cash register that unlocks when the money is entered.

5. A printed copy of orders must be kept and signed by customers. These go to accounting and are reconciled at the end of the day.

6. End of day totals for the cash registers must show a total number of tapes out, cash paid, tapes in, on-time tapes, late tapes, late fees, and a total amount of money in for the day.

Vic's Vision of Rental/Return Processing

Customers choose videos for rental either by taking the empty box from a shelf in the store or by telling the clerk the video name(s). The clerk retrieves the tape(s), which are filed alphabetically by name. The clerk enters customer identification (could this be phone number?) into the system to retrieve the customer's record and to create an order. Any late fees from previous rentals must be settled before a new rental can occur.

The clerk uses a bar code reader (or other scanner) to scan the video identifier and enter videotape identification into the system. For each video bar code entered, the system completes the rental detail line on the screen with today's date, videotape identification, video name, and rental price. When bar code IDs for all videotapes to be rented have been entered, the system computes the total fee, automatically computing and adding in sales tax. Late fees may be added to the total if any are outstanding. The customer is told the total amount and the money is paid.

When the clerk enters the money amount into the system and puts the cash into the cash register, the system reduces the amount paid by the total fee amount to obtain the amount of change due to the customer. The amount due to ABC for the rental is reduced to zero on the order. The customer signs a copy of the order form as it is printed on a printer and takes the video(s) home.
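The arithmetic in this part of the vision can be sketched as follows. The memo does not state the tax rate, the rounding rule, or whether late fees are taxed, so the sketch assumes a whole-number percentage rate, rounding to the nearest cent, and tax on rental charges only. All amounts are whole cents and the function names are hypothetical.

```python
def order_total_cents(rental_prices_cents, tax_rate_percent, late_fees_cents=0):
    """Total fee = rental charges + sales tax on rentals + outstanding late fees."""
    subtotal = sum(rental_prices_cents)
    # Integer arithmetic, rounded to the nearest cent (rounding rule is an assumption).
    tax = (subtotal * tax_rate_percent + 50) // 100
    return subtotal + tax + late_fees_cents


def change_due_cents(amount_paid_cents, total_cents):
    """Change = amount paid minus total fee; the order balance then drops to zero."""
    if amount_paid_cents < total_cents:
        raise ValueError("payment does not cover the total fee")
    return amount_paid_cents - total_cents
```

With two tapes at $2.00 each, an assumed 5 percent tax, and a $1.00 late fee outstanding, order_total_cents([200, 200], 5, late_fees_cents=100) gives 520 cents, and a $10.00 payment leaves 480 cents in change.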

On return of tapes, the clerk scans the bar code IDs of the videos. The system should retrieve and display the order with the return date and any late fees added to the detail line. If either there are no late fees or late fees are settled upon return of the video, the order is deleted from the system and the history of use information for the tape is updated. Late fees, and the order information about tape(s) rented that caused the late fee(s), remain on file until they are paid.
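The return rule above (delete the order unless a late fee is still unpaid) can be sketched in a few lines. This is a simplified illustration: the memo gives no late-fee rate, so a per-day fee is assumed, orders are looked up by a hypothetical order_id rather than by scanned bar code, and the history update is only marked by a comment.

```python
from datetime import date


def late_fee_cents(due_date, return_date, fee_per_day_cents):
    """Late fee = days past due times an assumed per-day fee (never negative)."""
    days_late = max(0, (return_date - due_date).days)
    return days_late * fee_per_day_cents


def process_return(open_orders, order_id, return_date, fee_per_day_cents, fee_paid=False):
    """Record the return; keep the order on file only while a late fee is unpaid."""
    order = open_orders[order_id]
    fee = late_fee_cents(order["due_date"], return_date, fee_per_day_cents)
    order["return_date"] = return_date
    order["late_fee_cents"] = fee
    if fee == 0 or fee_paid:
        # The tape's history-of-use information would be updated at this point.
        del open_orders[order_id]
    return fee
```

An on-time return removes the order immediately, while a late return with fee_paid=False leaves the order and its late fee on file until a later payment settles it.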

Trend analysis should include query capabilities with statistics built in. This should be available on an ad hoc basis without having to anticipate all queries and/or types of analysis in advance. Part of the analysis is used to determine how many tapes of each film to purchase. Trends might be based on sequential nights of rental, number of nights rented within the first 20 days, number of nights rented within the second 20 days, and so on. Each individual tape, even though it might be the nth copy of m copies of the same film, should be identifiable for this analysis. These requirements are not included with the description of required file information above, because you should determine the best way to supply this information.
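As one possible reading of the "first 20 days, second 20 days" trend, the sketch below groups the nights a tape was rented into consecutive 20-day periods. Measuring the periods from the date the tape was received is an assumption (the memo does not name the starting date), and the function name is hypothetical.

```python
from datetime import date


def nights_by_period(received, rental_nights, period_days=20):
    """Count nights rented in each consecutive period after the tape was received.

    Period 0 covers the first `period_days` days, period 1 the next, and so on.
    """
    counts = {}
    for night in rental_nights:
        period = (night - received).days // period_days
        counts[period] = counts.get(period, 0) + 1
    return counts
```

Running this once per copy keeps each individual tape identifiable; summing the per-copy results for all copies of a film gives the title-level trend used to decide how many tapes to purchase.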

Discussion

Let's stop here a moment and think about the ABC Rental Processing case. First, get a global mental model of the problem. The problem is to automate rental/return, customer, and video inventory processing, including totaling of orders, computing change, monitoring of late returns, and creation of historical information. This sounds like a complete statement of problem scope, and it could be used for that purpose. In this case, the problem is small enough to hold most of the functions in mind at once.

Do you know enough to automate the problem? No, you do not, not if you want to do it properly. The processes, in terms of how a customer will interact with ABC personnel, are fairly simple. Rental processing has fairly well-defined data requirements and business requirements about how to do the process steps. The flow of processes for rentals still needs elaboration, but is complete enough for understanding the general problem.

What don't you know? The kinds of questions we will ask will be details of what we already know: How many? How often? What about variations on the process? Questions will also elaborate on constraints and determine if there are interfaces. Some examples of specific questions include: How many videos are there in the store? How many new ones arrive each month, week, day? How many customers are there? How many rentals per day? What kind of security is needed? Does Vic already have software in mind for this application?

There are many more questions we will ask as we move through the text, and the type of questions varies with the methodology. Even with many questions, we do know quite a bit about the overall process and Vic's ideas for how the process should work. We know much less about specific details of the operation that we need to fully understand the problem and devise a workable solution. We will get more details as we progress through the text.

In terms of the Chapter 1 discussion on types of applications, rental processing will be on-line with interactive processing. It is a transaction processing application with some query processing. The rental application transaction portion automates the paperwork of rentals, returns, and payments for rentals. The query and reporting part of the rental application uses predefined data in a read-only manner, and has predefined reporting requirements as well as ad hoc reporting requirements. The rental processing case is used throughout this text to reason through each methodology.

SUMMARY

In this chapter we explained the nature of learning and experience. Declarative knowledge is knowledge about what actions, procedures, or steps are taken to perform some task. Declarative knowledge is a required but incomplete form of learning. Process knowledge is knowledge about how to perform, reason, and integrate the steps we know from declarative learning. While we learn, we form analogies or cases that form patterns of experiences. When we match a pattern from experience with some current problem, we use analogical thinking. When a past experience does not match some current problem, we analyze the differences to develop a new case based on the new situation. The internalization of cases in our memory is learning.

Novices differ from experts in their problem-solving approach. Novices make mistakes because they do not have a global view of a problem, cannot mentally simulate a solution to the problem, and do not see connections and meaning in problem parts. Experts are able to analyze novel problems because they use analogies from their experience to develop a global view of the problem, can take a top-down view of what they know and do not know, can simulate their solutions mentally, and understand connections and meaning in problem parts. Several tips for practicing software engineering were provided to speed and simplify your learning.

The case company, ABC Video, and its role in the video rental business were described, and rental-order processing details were developed.

REFERENCES

Adelson, B., and E. Soloway, "The Role of Domain Experience in Software Design," IEEE Transactions on Software Engineering, SE-11, Vol. 11, 1985, pp. 1351-1360.

Jeffries, R., A. A. Turner, P. G. Polson, and M. E. Atwood, "The Processes Involved in Designing Software," in Cognitive Skills and Their Acquisition (J. R. Anderson, ed.). Hillsdale, NJ: Lawrence Erlbaum Associates, 1987, pp. 255-283.

Kintsch, W., and S. M. Mannes, "Generating Scripts from Memory," in Knowledge Aided Information Processing (E. van der Meer and J. Hoffman, eds.). NY: Elsevier Science Publishing Co., Inc., 1987, pp. 61-80.

Klein, G. A., and R. Calderwood, "How do People Use Analogies to Make Decisions?," in Proceedings of Case Based Reasoning Workshop (J. Kolodner, ed.), DARPA/ISTO, Clearwater Beach, FL, May, 1988, pp. 209-218.

Littman, D. C., J. Pinto, S. Letovsky, and E. Soloway, "Mental Models and Software Maintenance," in Empirical Studies of Programmers: 1st Workshop (E. Soloway and S. Iyengar, eds.). Norwood, NJ: Ablex Publishing Co., July 5-6, 1986, pp. 80-98.

Schank, R. C., and R. P. Abelson, Scripts, Plans, Goals and Understanding. Hillsdale, NJ: Lawrence Erlbaum Associates, 1977.

Schank, R. C., "Explanation: A First Pass," in Experience, Memory and Reasoning, (J. L. Kolodner and C. K. Riesbeck, eds.). Hillsdale, NJ: Lawrence Erlbaum Associates, 1986, pp. 139-166.

Shemer, I., "Systems analysis: A systemic analysis of a conceptual model," Communications of the ACM, Vol. 30, #6, June, 1987, pp. 506-512.

Simon, H., The New Science of Management Decision. NY: Harper and Row, 1960.

Vessey, I., and S. A. Conger, "Requirements specification: Learning object, process, and data methodologies," Communications of the ACM, accepted for publication, 1993.

Wand, Y., and R. Weber, "A unified model of software and data decomposition," in Proceedings of the 12th International Conference on Information Systems (J. I. De Gross, I. Benbasat, F. DeSanctis, and C. M. Beath, eds.). NY: SIGBDP, Association for Computing Machinery, 1991.

KEY TERMS

analogy, analysis domain, breakdown, case, case-based reasoning, categorize problems, conservatism, declarative knowledge, deep structures, directed search, expert, generalization, global mental model, goal, local mental model, novice, plan, problem domain, process knowledge, satisficing, surface features, undirected search

EXERCISES

1. Develop pseudo-code for ABC Video's rental processing system. Identify and discuss what the essential portions of rental processing are. Discuss which procedures could be either included or omitted without changing the essence of the problem. (Note to Instructor: This is a useful exercise to ensure that all students have a good understanding of the problem.)

2. Describe a work situation you have experienced. Discuss the organization: the structure of the organization, its goals, its strategies for meeting its goals, its culture, its managers' style, the social life at work.

A. Describe your job and how your job contributed to the organization's goals. Describe the computer applications, if any, you used in your job. Analyze what you did on your job and recommend computer applications that could have streamlined, enhanced, or broadened your job. Do you have the 'big picture' of the company and your job's role? If not, how would you go about developing a global view?

B. Describe some area of the organization (you may or may not have worked there) that could use an application to speed its work, make its work more accurate, enhance jobs, provide better information to workers, or simplify work life. Describe the application and how it would meet its goals.

STUDY QUESTIONS

1. Define the following terms: analogy, conservatism, declarative knowledge, global mental model, problem domain, satisficing, surface features

2. Which comes first-declarative knowledge or process knowledge? Why? How does learning work?

3. Why and how do we use analogies?

4. Why are analogies better used in systems analysis and design than a trial-and-error method of problem solving?

5. Describe the details of what it means to rent a tape at ABC. How do the manual processes translate into computer processes? Use analogies from your own experience to discuss rentals.

6. Make a list of questions you have about ABC order processing that still need to be answered. Use analogies from your own video rental experience to identify issues that still need to be resolved.

7. Describe the details of what it means to return a tape. How do the manual processes translate into computer processes? Identify subprocedures for which you have choices about when and how they are performed.

8. How do you develop a global mental model of some problem? How do you know if you have a global mental model of some problem? How do you validate your mental model?

9. What does it mean to create historical information? When does history get created? In the ABC case, is history created at video rental time, at video return time, or at some other time? How do you know when you have the correct answer to this type of question?

* EXTRA-CREDIT QUESTIONS

1. Write a one page analysis of some work experience you know about. Describe some function and how it contributed to the organization's goals. Describe the computer applications, if any, used in the function. Analyze the job and recommend computer applications that could streamline, enhance, or broaden the function. Make a list of questions you need answered to gain a complete understanding of the problem areas.

2. Draw a diagram or verbally describe (in pseudo-code or your own words) how ABC Video performs order processing. Make a list of questions you have about ABC order processing that still need to be answered. Describe how your experience as a video store customer helps you understand what ABC is trying to do. Describe, from your experience as a video store customer, how you think a video store should be automated. How does it differ from Vic's desires? What should you do about those differences? What are Vic's goals for the application in addition to processing rental/returns? What features might you consider for the application to meet those goals? List three functions you can put in the system to help meet Vic's goal of "no bureaucracy."

CHAPTER 3

PROJECT MANAGEMENT

INTRODUCTION

The role of the software engineer (SE) differs from that of the project manager in that the SE provides technical expertise, while the project manager provides organizational expertise. Depending on the size of an organization and project team, one person might perform both roles. Small project teams (i.e., fewer than five people) and organizations with limited software development staff (i.e., fewer than 10 people) expect one person to assume both software engineer and project manager roles. The larger the organization, the more likely the functions are split and the more extensive each person's experience is expected to be.

The project manager and software engineer are responsible for tasks that include both complementary and supplementary skills. In general, the software engineer is solely responsible for management of the life cycle, including the following areas detailed in Chapters 4 through 14:

• Management and conduct of development process

• Development of all documentation

• Selection and use of computer-aided software engineering (CASE) tools

• Elicitation of user requirements

• Technical guidance of less skilled staff

• Assurance that representation techniques, such as data flow diagrams, are correct, consistent, and validated

• Oversight of technical decisions

• Assurance that constraints (e.g., two-second response time) are identified and planned as part of the application

Complementary activities are activities that are performed jointly but with different emphasis depending on the role. Complementary activities include planning the project, assigning staff to tasks, and selecting from among different application alternatives.

The project manager (PM) is solely responsible for organization liaison, project staff management, and project monitoring and control. These major responsibilities are discussed in this chapter.

When one person or another is identified as solely responsible for some activity, it does not mean that they alone do the work. The SE and PM are team leaders who work together in all aspects of development. The SE may have project management experience. Sole responsibility means that when a disagreement occurs, responsibility for the final decision rests with the responsible person. Different management styles determine how open a manager is to suggestion and discussion of alternatives.


FIGURE 3-1 Example of Too General a Plan

A short discussion of appropriate behaviors for project managers is also included in this section. These behaviors are the project manager's responsibility toward the project.

First we discuss the joint SE and PM activities. Then we discuss activities for which the project manager is solely responsible. Management styles and a brief discussion of project manager responsibilities to the project team are included in the section on personnel management. The last section lists computer-aided support tools for project management.

COMPLEMENTARY ACTIVITIES

Joint activities of the software engineer and project manager include project planning and control, assigning staff to tasks, and selecting from among different alternatives for the application.

Project Planning

To plan the project, the project manager works with the SE to determine human, computer, and organizational resources required to develop the application. While a detailed discussion of planning is included in Chapter 6, the aspects of special interest to the project manager are in this section.

A project plan is a map of tasks, times, and their interrelationships. It can be very general (see Figure 3-1) or very specific (see Figure 3-2). Neither extreme of plan is very useful, although some plan is better than none. A rule of thumb for level of detail is to define activities for which a weekly review of progress allows the SE and project manager to know whether the schedule is being met. Figure 3-3 shows an example of a well-defined plan.

The general methodology of planning is as follows:

1. List tasks. Include application development tasks, project specific tasks, interface organization tasks, and review and approval tasks.

2. Identify dependencies between tasks.

3. Assign personnel either by name or by skill and experience level.

4. Assign completion times to tasks; compute the most likely time for each.

5. Identify the critical path.

FIGURE 3-2 Example of Too Detailed a Plan

The project manager and SE share responsibility for developing the plan. The SE's responsibility is to know all of the tasks relating to the application being developed; the project manager's responsibility is to ensure that all organizationally related tasks are included in the list. (The application tasks are discussed in Chapter 6.) Organization tasks include the following:

1. Review documents for completeness, content, consistency, and accuracy.

2. Negotiate, agree, and commit to start and end dates for work.

3. Define necessary application interfaces; plan for detailed interface design work.

FIGURE 3-3 Example of Acceptable Level of Detail

All documentation, plans, and design work of the project team are subject to review by at least the user/sponsor. Many other departments or organizations might also be required to review some or all of the work. These organizations might include managers of IS, users, quality assurance, legal, audit, operations, other application groups, government regulators, industry regulators, or others. Each organization applies its specialized knowledge to the application documents to assess their adequacy.

The second task is to obtain agreement and commitments from outside agencies or departments. Frequently, resources and work are provided by other departments. Clerical support, for example, might be from an Administrative Services Department. Operations departments supply support in terms of computer time, memory, disk space, terminals, log-on IDs, access to software environments, access to databases, and so on as necessary to develop and test the application. Auditors frequently want to comment on auditing plans and change the design based on their findings. Quality assurance departments might review documents to find inconsistencies and errors that require correction. Vendors might need to install hardware, software, or related applications that need liaison from the project team and testing once installed. All of these activities need to be scheduled and planned. Since dates for commitments might not be known when the plan is developed, the plan contains the dates at which contact should be initiated and dates by which the commitment must be made in order not to impact the delivery date.

Third, the project manager obtains requirements for application interfaces from other application areas. An interface is data that is sent or received between applications. The interface application areas might be in the same company, but might also be an industry group or a government organization. The plan reflects dates by which contact should be initiated and by which the information is required.

If a make-or-buy decision will be made, the project manager and SE work together to develop the subplan for this decision. Subactivities relating to acquisitions include creating and submitting a request for proposal (RFP), obtaining vendor quotes, evaluating vendor quotes, selecting and obtaining management approval for a vendor, negotiating contract and delivery dates, and planning and testing of the acquired item.

When all of the items are identified, they are related to each other. Tasks that are related are drawn on a task dependency diagram showing the sequences of dependencies. Sequences may be interdependent (see Figure 3-4). When all sequences of tasks are on the diagram, independent tasks are added. Milestones, such as the completion of a feasibility analysis document, are shown and are visually obvious because the preceding sets of tasks all feed into that task. Task sequencing can vary depending on the methodology used. (See Chapter 6 for more on this topic.)

Sequencing tasks is the first step to identifying the critical path of tasks for the application's development. The critical path is the sequence of dependent tasks that together take the most development time. If any of the tasks in the critical path are delayed, the project is also delayed. So, the critical path tasks are the greatest source of risk for project completion.
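As a minimal sketch of the definition above, the critical path can be found as the longest chain of dependent tasks in the task dependency diagram. The task names, durations, and the `critical_path` helper below are hypothetical illustrations, not taken from the text, and the sketch assumes the dependencies contain no cycles.

```python
from functools import lru_cache

# Hypothetical tasks: name -> (duration in days, list of prerequisite tasks).
TASKS = {
    "feasibility": (5, []),
    "requirements": (10, ["feasibility"]),
    "interface_specs": (4, ["feasibility"]),
    "design": (8, ["requirements", "interface_specs"]),
    "user_review": (2, ["design"]),
}


def critical_path(tasks):
    """Return (total duration, task names) for the longest dependency chain."""

    @lru_cache(maxsize=None)
    def longest_to(name):
        duration, prereqs = tasks[name]
        # Find the longest chain that must finish before this task can start.
        best_time, best_chain = 0, ()
        for prereq in prereqs:
            time, chain = longest_to(prereq)
            if time > best_time:
                best_time, best_chain = time, chain
        return best_time + duration, best_chain + (name,)

    total, chain = max(longest_to(name) for name in tasks)
    return total, list(chain)


total_days, path = critical_path(TASKS)
print(total_days, path)
# 25 ['feasibility', 'requirements', 'design', 'user_review']
```

In this made-up plan, a delay to `interface_specs` of up to six days would not move the end date, while any delay to a task on the printed path would.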

The next step is to estimate the amount of work. For this discussion, we assume the project manager and SE assign times to tasks based on their experience (i.e., reasoning by analogy). Other methods are discussed in Chapter 6. Times are assigned to each task based on its complexity and amount of work. Three times should be estimated: an optimistic time, a realistic time, and a pessimistic time. The formula used to compute the most likely time is shown in Figure 3-5. The figure weights the most likely, realistic time by a factor of two in relation to the other estimates.
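A small worked example of the Figure 3-5 formula, (O + 2R + P) / 4, may help. The function name and the sample estimates below are assumptions chosen for illustration only.

```python
def most_likely_time(optimistic, realistic, pessimistic):
    """Weighted estimate from Figure 3-5: (O + 2R + P) / 4."""
    return (optimistic + 2 * realistic + pessimistic) / 4


# Hypothetical task: 4 days at best, 6 days realistically, 12 days at worst.
estimate = most_likely_time(optimistic=4, realistic=6, pessimistic=12)
print(estimate)  # 7.0
```

Because the pessimistic estimate is further from the realistic one than the optimistic estimate is, the result (7 days) lands above the realistic 6 days.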

While times are being assigned, the skill sets and experience levels of the person needed to do each task should be defined. The list of skill sets and experience levels is used to determine how many people and what type of people are required on the project for each phase. Other assumptions will surface, and a list of them should be kept, as shown in Table 3-1. The assumptions become part of the planning document.

When resource requirements and timing are complete, several activities take place. The SE develops a schedule; the project manager develops a budget. They both identify the critical path and discuss it in terms of potential problems and how to minimize their likelihood. Task definitions are made more detailed for critical tasks, to allow more control and monitoring.

FIGURE 3-4 Example of Interdependent Sequences of Tasks

(O + 2R + P) / 4 = Most Likely Time Estimate

Legend:

O = Optimistic Time Estimate
R = Realistic Time Estimate
P = Pessimistic Time Estimate

FIGURE 3-5 Formula for Determining Schedule Time

When complete, the plan, schedule, and budget are submitted to the user and IS managers for comment and approval. Work begins, if it hasn't already, with the plan used to guide project work. The plan is used by the project team to see where their work fits in the whole project, and it is used to monitor progress toward project completion.

The plan should never be cast in concrete. Plans should change when the tasks are wrong, times are underestimated, or there are changes in project scope that alter the activities performed in some way.

TABLE 3-1 Project Assumptions

Type of Assumption: Availability of configuration, component of mainframe, special hardware, programmer support equipment, tools, time
Example:
• Programmers will gain access to IEW by September 10, 1994.

Type of Assumption: User time involvement. This may be expressed in time per day for a number of days, or may be in number of days.
Example:
• A middle manager representative from Accounts Payable will be available in a Joint Application Design session scheduled for June 1-5, 1994.

Type of Assumption: Need for services from audit, law, vendors, quality assurance, or other support groups
Example:
• The Audit Department will be able to review and comment on the adequacy of audit controls within 7 business days of receiving the review document.

Type of Assumption: Software performance
Example:
• The Database Management Software will be able to process 10,000 transactions per day.

Type of Assumption: Test time, terminal time, or test shot availability
Examples:
• Batch programs can be tested simultaneously with on-line programs.
• Batch programs will be able to average three test runs per day with an average turnaround of less than 2.5 hours.
• Batch programs will be less than 160K and will require no more than two tape mounts each.

Type of Assumption: Disk space
Example:
• Operations will make available 100 cylinders of IBM 3390 disk space for the project beginning 9/10/94. An additional 50 cyl. will be added for test databases by 10/30/94. An additional 250 cyl. will be added for production database conversion by 11/30/94.

Type of Assumption: Memory, CPU time, tape mounts, imaging access, or other mainframe resources
Example:
• For testing, 30 CPU minutes per day plus 75 hours of terminal access time will be required beginning 10/30/94.

Type of Assumption: Personnel
Examples:
• Two senior programmer/analysts with 2-3 years of Focus experience and 2-3 years of on-line, multiuser, application development experience are required by 6/30/94.
• Four programmers with 1-2 years of Focus experience and one year of VM/CMS experience are required by 7/15/94.

Type of Assumption: Hardware/software availability
Examples:
• Imaging equipment will be available for application testing no later than 9/10/94.
• 15 PCs or IBM 3279 terminals will be available for access and testing use no later than 9/10/94.

Assigning Staff to Tasks

Task assignment is fairly straightforward. The major tasks are to define the tasks and skills needed, list skills and availability of potential project members, and match people to tasks. The project manager and SE actually begin discussing possible project staff when they are planning the project and tentatively assigning people to tasks. Then the project manager's real work begins.

The hard part of an assignment is the judgment required to match people whose skills are not an exact match for those needed; this is the usual case. For instance, you might want two programmer analysts with the following list of skills:

• design and programming experience on a similar application

• three to five years of experience in the operational environment

• one to two years of experience with the database software

• managerial experience for two to four people

• known for high quality work

• known as an easy-going personality

Suppose your manager gives you a junior programmer right out of a training program, an analyst who does not program and who has no operational environment, database, or managerial experience, and a senior programmer who does no design, is known to be difficult, and sometimes does high quality work.

The good news is that you have three people instead of two. The bad news is no one of them has all of the qualifications you want. What do you do? This is what management is all about.

The project manager should get to know the team members well. This means assessing their position with the company, expectations on the project, specific role desired for the person, possible start and end dates for work, and personality or personal issues that might affect their work. Much of this information can be obtained from previous performance reviews. But nothing substitutes for discussing the information with the person.

The project manager has responsibilities to his or her manager, the client sponsor, and to the rest of the project team to get the best, most qualified people possible. In these capacities, the project manager honestly discusses previous problems with the person, any personal problems that might divert the person's attention from work, and any outside jobs, school, or other commitments that might also hinder their commitment. The person and the project manager both should be given an opportunity to accept or reject the possibility of work. Even when there is no choice, it is also the responsibility of the project manager to make his or her expectations of quality and quantity of work clear. If the person will not report directly to the project manager, the person she or he will report to should also be at the meeting. In this way, everyone knows exactly what was said and what commitments were (or were not) made.

The answer to the task assignment problem above is to assign the tasks to best fit the skills. Assign the senior person responsibility for the work of the junior one, and provide motivation and incentives for quality work (see the following section on motivation). You also alter the schedule, if needed, to more closely mirror the actual skills of the team.

The heuristics, or rules of thumb, for personnel assignment are as follows:

1. Assign the best people to the most complex tasks from the critical path. Assign all critical path tasks. As the experience and skill levels of people decrease, assign less complex and smaller tasks. Do not give new, junior, or unqualified staff any tasks on the critical path. Assignment of senior people to critical tasks minimizes the risk of missing the target date.

2. Define a sequence of work for each person to stay on the project for as long as their skill set is needed. Try to assign tasks that provide each person some skill development.

3. Do not overcommit any person by assigning more tasks than they have time. Make sure each person will be busy, but allow time to finish one task before beginning another.

4. Allow some idle time (2-5%) as a contingency for each person. Do not allow more than eight sequential hours (i.e., one day) of idle time for any person.

5. Do not schedule any overtime. Scheduled overtime places unfair stress on people's professional and personal commitment; overtime is a regular enough occurrence in development that it should not be scheduled at the outset.
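Several of the heuristics above can be checked mechanically once hours are tallied per person. The sketch below is only an illustration: the `Assignment` record, its field names, and the warning messages are assumptions, and it covers rules 1, 3, 4, and 5 in simplified form (it does not check the eight-sequential-hours limit).

```python
from dataclasses import dataclass


@dataclass
class Assignment:
    person: str
    senior: bool             # True for experienced staff
    available_hours: float   # working hours in the planning period (must be > 0)
    assigned_hours: float    # hours of scheduled task work
    on_critical_path: bool   # True if any assigned task is on the critical path


def check_assignment(a, min_idle=0.02, max_idle=0.05):
    """Return a list of warnings for one person's assignment."""
    warnings = []
    if a.on_critical_path and not a.senior:
        warnings.append("junior staff on a critical path task")
    if a.assigned_hours > a.available_hours:
        warnings.append("overcommitted (implies scheduled overtime)")
    else:
        # Fraction of the person's time left unscheduled as contingency.
        idle = (a.available_hours - a.assigned_hours) / a.available_hours
        if idle < min_idle:
            warnings.append("contingency time below minimum")
        elif idle > max_idle:
            warnings.append("idle time above maximum")
    return warnings


print(check_assignment(Assignment("senior analyst", True, 160, 154, True)))
print(check_assignment(Assignment("trainee", False, 160, 170, True)))
```

The first call returns an empty list (3.75% idle time, senior person on the critical path); the second reports both a junior person on the critical path and an overcommitment.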

The project manager is also responsible for coordinating movement from another assignment to the current development project. This coordination is done with the other project manager(s) involved and possibly the personnel department. New hires should be assigned a 'buddy' to help them get familiar with the company, its facilities, the computer environment, policies, and procedures. Senior staff should be assigned to mentor junior staff, encouraging the learning of new skills on the job.

Finally, the project manager must ensure that each person understands the expectations and duties assigned to him or her. All staff should have a copy of their job description. They should know the extent of their user interaction, extent of their intraproject responsibility and communication, and policies about chain-of-command on who to go to with problems, project errors found, or problems with work assignments.

Ideally, the team should be given an overview of the application, a chance to review the schedule, and an opportunity to comment on their ability to meet the deadlines assigned. If they cannot meet the deadlines and have reasonable explanations, the plan, schedule, and budget should be changed. In addition, any training or learning on-the-job that is required should result in a lengthening of the schedule. If the team members agree to the schedule, then they are committed to getting the work done within the time allowed and should be held accountable for that as part of their work assignment.

Selecting from Among Different Alternatives

Applications all have alternatives for implementation strategy, methodology, life cycle, and implementation environment. The project manager and SE together sort out the options, develop pros and cons, and decide the best strategies for the application.

Implementation Strategy

Implementation strategy is some mix of batch, on-line, and real-time programming. The decision is based on timing requirements of users for data accuracy, volume of transactions each day, and number of people working on the application at any one time. All of these numbers are estimates at the planning stage of an application, and are subject to change. The strategy decision might also change. In general, though, a decision can be made at the feasibility stage to provide some direction for data gathering.

As Table 3-2 shows, the timing of data accuracy drives the decision between batch and on-line. Keep in mind that these are rules of thumb and need to be used in an organizational context. If data can be accurate as of some prior period, a batch application might be developed. If data must be accurate as of some time of the business day, either on-line or real-time strategies would be successful.

TABLE 3-2 Decision Table for Implementation Strategy Selection

Timing of Data Currency: < 1 hour; < 4 hours; < 24 hours

Peak Transaction Volume/Number of People Entering Data: < 10; 10-59; > 59

Options: Batch application; On-line application; Real-time application

If the volume of transactions divided by the number of people is very high (over 60 per minute), then a high-performance application, with many concurrent processes, that is, a real-time application, might be warranted.

If the volume of transactions divided by the number of people is low (less than 10 per minute), but the timing requires on-line processing, an on-line application is best.

The gap in transactions per minute from 10 to 60 requires more information, specific to the project, for a decision. Answers to several questions are needed. For instance, how complex is a transaction? How was the number of workers arrived at, and can the number change? Is management willing to fund the difference in cost for a real-time application over an on-line one? Are there other factors (e.g., specific database software to be used) to consider in the decision? These questions are all context specific and the resulting decision would be determined by their answers.
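The rules of thumb in this subsection can be sketched as a small decision function. This is an illustration under stated assumptions, not the content of Table 3-2: the function name and parameters are hypothetical, the data-currency condition is reduced to a single yes/no flag, and the lower bound of 10 follows the "< 10" row of the table and the 10-to-60 gap described above.

```python
def suggest_strategy(same_day_currency, tx_per_person_per_minute, low=10, high=60):
    """Suggest an implementation strategy from the rules of thumb.

    same_day_currency: True if data must be accurate as of some time of the
        business day; False if accuracy as of a prior period is acceptable.
    tx_per_person_per_minute: peak transaction volume divided by the number
        of people entering data.
    """
    if not same_day_currency:
        return "batch"
    if tx_per_person_per_minute > high:
        return "real-time"
    if tx_per_person_per_minute < low:
        return "on-line"
    # Between the two thresholds, the text calls for project-specific answers
    # (transaction complexity, staffing assumptions, funding, software).
    return "undecided: gather project-specific information"


print(suggest_strategy(False, 100))  # batch
print(suggest_strategy(True, 75))    # real-time
print(suggest_strategy(True, 5))     # on-line
print(suggest_strategy(True, 30))    # undecided: gather project-specific information
```

The thresholds are keyword parameters because, as the text stresses, these numbers are estimates at the planning stage and need to be read in an organizational context.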

Implementation Environment

The implementation environment includes the hardware, language, software, and computer-aided support tools to be used in developing and deploying the application. The decision is not final at the feasibility and planning stage; rather, the alternatives and a potential decision are identified. The issues to be resolved for a final decision are then identified.

Frequently there is no choice of implementation environment. The organization has one environment and there are no alternatives; all development uses one mainframe and one language (for instance, COBOL). More often, as personal computers and local area networks become more prevalent, the alternatives are mainframe or network with PCs as the workstation in the chosen environment.

The decision is based frequently on the experience of the project manager, SE, and potential team members. People tend to use what they know and not use what they do not know. Ideally, the implementation environment should be selected to fit the application, not the skills of the developers.

TABLE 3-3 Decision Table for Implementation Environment

CPU Bound

I/O Bound

< 100,000 Trans/Day

> 100,000 Trans/Day

Hardware: Mainframe; LAN; LAN + Mainframe network

For instance, if a real-time application is being built for a Sun workstation environment under the Unix operating system, C++ or Ada are probably the languages of choice. Certainly, COBOL is not a choice.

Guidance in implementation environment selection comes from the user. Do they have equipment they want to use? How is it configured? What other software or applications are on the equipment? How amenable is the user to changing the configuration to fit the new application?

Then, with this information, the decision table in Table 3-3 can be used as a guideline for selecting the implementation environment.

In general, whenever there is a specific requirement, it tends to drive the remaining decisions. Whenever there are general requirements, the decision can remain open for a longer time. Some direction, either toward a mainframe solution or a PC/LAN solution, should be tentatively decided during feasibility and planning. During this process, the project manager should identify the issues for further information needed in making a final decision.

Methodology and Project Life Cycle

The final issue to be tentatively decided is which methodology and how streamlined the life cycle will be. Frequently, there is no choice about these decisions, either. The organization supports one methodology and one life cycle and there is no discussion allowed. Equally frequently, enlightened managers know that not all projects are the same; therefore the development of the projects should also not be the same.

Methodology choices are process, data, object, social, semantic, or some hybrid of them (see Chapter 1). Life cycle choices are the sequential waterfall, iterative prototyping, or learn-as-you-go (see Chapter 1). These decisions are not completely separated from those of implementation environment in the previous section, because any fixed implementation requirements can alter both the methodology and the life cycle choices.

Assuming no special implementation requirements, the application itself should be the basis for deciding the methodology. In a business environment, the rule of thumb is to choose the methodology that addresses the complexity of the application best. If the complexity is procedural, a process method is best. If the complexity is data related, a data methodology is best. If the problem is easily broken into a series of small problems, an object method might work best. If the project is to automate expert behavior or includes reasoning, a semantic methodology is best. A decision table summarizing heuristics on deciding methodology and life cycle is shown as Table 3-4.

TABLE 3-4 Decision Table for Methodology and Life Cycle Selection

Source of Complexity: Process; Data; Knowledge representation; Balanced; Novel problem

Methodology: Process; Data; Object; Semantic

Life cycle choice also requires some decision about what type and how much involvement there is of users. If some intensive, accelerated requirements or analysis technique is used [see joint requirements planning (JRP) and joint application design (JAD), Part II Introduction], either a streamlined sequential life cycle or an iterative approach can be used. Very large, complex applications with known requirements usually follow a sequential waterfall life cycle. If some portion of the application (requirements, software, language) is new and untested, prototyping should be used. Object orientation assumes prototyping and iteration. If the problem is a unique, one-of-a-kind problem that has never been automated before, either a learn-as-you-go prototyping or an iterative life cycle would be appropriate.
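The methodology and life cycle heuristics in the prose above can be sketched as two lookup helpers. This is a hedged illustration only: the function names, the keyword labels such as "decomposable", and the flag names are assumptions made for the example, and the code encodes the prose rules rather than the entries of Table 3-4.

```python
# Source of complexity -> methodology, following the prose rules of thumb.
METHODOLOGY_BY_COMPLEXITY = {
    "process": "process",       # complexity is procedural
    "data": "data",             # complexity is data related
    "decomposable": "object",   # problem breaks easily into small problems
    "knowledge": "semantic",    # expert behavior or reasoning is automated
}


def suggest_methodology(complexity_source):
    """Return the methodology for a complexity source, or None if no rule applies."""
    return METHODOLOGY_BY_COMPLEXITY.get(complexity_source)


def suggest_life_cycles(never_automated=False, untested_component=False,
                        large_and_complex=False, known_requirements=False,
                        accelerated_requirements=False):
    """Return candidate life cycles, in the order the rules are triggered."""
    options = []
    if never_automated:
        options += ["learn-as-you-go prototyping", "iterative"]
    if untested_component:
        options.append("prototyping")
    if large_and_complex and known_requirements:
        options.append("sequential waterfall")
    if accelerated_requirements:  # e.g., JRP or JAD sessions are used
        options += ["streamlined sequential", "iterative"]
    # Remove duplicates while preserving order.
    return list(dict.fromkeys(options))


print(suggest_methodology("data"))
print(suggest_life_cycles(large_and_complex=True, known_requirements=True))
print(suggest_life_cycles(never_automated=True, untested_component=True))
```

Returning a list of candidates rather than a single answer reflects that these choices are tentative at the feasibility and planning stage.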

In the next sections, the activities for which the project manager has sole responsibility are detailed. These activities include liaison, personnel management, and project monitoring and reporting.

LIAISON

The project manager is a buffer between the technical staff and outside organizations. In this liaison role, the project manager communicates and negotiates with agents who are not part of the project team. A liaison is a person who provides communications between two departments. Examples of outside agents include the project sponsor (who may or may not be the user), IS managers, vendors, operations managers, other project managers, and other departments such as quality assurance (for validation and testing), law (for contracts), and administration (for clerical and secretarial support).

For each type of liaison, status reports are an important means of communication (see sample in Figure 3-6). Status reports document progress, identify problems and their resolution, and identify changes of plans to all interested parties. In addition, many other communications of different types are described for each type of liaison. The guidelines here are just that: guidelines. They are developed assuming that open communication between concerned parties is desired, but the guidelines require judgment and knowledge of the situation to separate a good action from a less good action.

Project Sponsor

The sponsor pays for the project and acts as its champion. A champion is one who actively supports and sells the goals of the application to others in the organization. A champion is the 'cheerleader' for the project.

The goals of liaison with the champion are to ensure that he or she knows the status of the project, understands and knows his or her role in dealing with politics relating to the project, and knows the major problems still requiring resolution.

The major duty of the champion is to deal with the political issues surrounding the project that the project manager cannot deal with. Politics are in every organization, and politics relate to organizational power. Power usually is defined as the ability of a person to influence some outcome. One source of power comes from controlling organizational resources, including money, people, information, manufacturing resources, or computer resources.

Political issues of application development do not relate to the project, but to what the project represents. Applications represent change. Changes can be to the organization, reporting structure, work flow, information flow, access to data, and extent of organizational understanding of its user constituency. When changes such as these occur, someone's status changes. When status changes, the people who perceive their status as decreasing will rebel.

The rebellion may be in the form of lies told to analysts, refusal to work with project members, complaints about the competence of the project team, or any number of ways that hinder the change. If the person causing trouble is successful, the project will fail and his or her status will, at worst, be unchanged. Politics, left unattended, will lower the chances of meeting the scheduled delivery date and raise the risk of implementing incorrect requirements. The project manager usually tries to deal with the political issues first, keeping the sponsor informed of the situation. If unsuccessful, the sponsor becomes involved to resolve the problem.

In some organizations, the project manager communicates to the sponsor only through his or her manager. In others, the project manager handles all project communications. In general, treat the sponsor like your boss. Tell him or her anything that will cause a problem, anything they should know, and anything that will cause the project delays.

User

The user is the person(s) responsible for providing the detailed information about procedures, processes, and data that are required during the analysis of the application. They also work with the SE and project manager in performing the feasibility analysis, developing the financial and organizational assessments of user departments for the feasibility study.


ICIA Industries-Interoffice Memo

DATE: October 10, 1994

TO: Ms. S. A. Cameron

FROM: J. B. Berns

SUBJECT: Order Entry and Inventory Control Project Status

Progress

We have resolved the testing problems between batch and on-line by going to a two-shift programming environment. The on-line programmers are working from 6 A.M. to 2 P.M. and the batch programmers are working from 2 P.M. to 10 P.M. This is not an ideal situation, but it is working at the moment.

We are still two weeks behind the schedule for programming progress, and we may not be able to make up the time, but we should not lose any more time.

The on-line screen navigation test began two days ago and is going smoothly. Several minor spelling problems have been found, but no logic problems have been found. George Lucas should complete the user acceptance of the screen navigation and screen designs within three days if no other problems surface.

Problems

The decode table for warehouse location, due 5/12/94 from George Lucas, has still not been delivered. This is going to delay testing of the on-line inventory allocation programs beginning in ten days if we do not have it. Is there another person we can contact to get this information?

Operations found what appears to be a bug in one of the CICS modules. When a screen call is made, two bytes of the information are lost. We are double-checking all modules to ensure that it is not an application problem. Jim Connelly is calling IBM today to see if they have a fix for the problem. At the moment, this is not causing any delays to testing. But it will cause delays beginning next week if the problem is not resolved. The delays will be to all on-line modules calling screens and will amount to the time per module to code a workaround for the unresolved problem. This should be about one hour each for a total of 120 hours. We hope this delay can be avoided; everyone possible is working on the problem, including two experts from our company whom we called in last night as a free service to ICIA.

FIGURE 3-6 Sample Status Memo and Report

Project manager-user communication includes both planned and unplanned status meetings, written communications for status, analysis, interview results, documentation, and walk-throughs of application requirements as specified by the project team. Timing of user communications differs with the type of communication, but is most often daily until the application begins programming and testing. Then, a minimum of weekly personal contact should maintain the relationship.

In general, tell the user everything that might affect them, the project, or the schedule negatively; do not tell them anything else.

IS Management

IS managers, like most managers, want to know progress, problems and their solutions, warnings of lateness, and political issues. They do not want to handle all problems for their managers, nor do they appreciate finding out a project will be late the week before it is due. Tell your manager anything that might get him or her in trouble, that he or she needs to know, or that might impact the project negatively. Always expect to propose solutions, and argue your case if you think your solution is better than theirs. Always accept their solution if it is mandated, unless it is unethical or illegal.

Technical Staff

Technical staff here means the project team. Always be open with them. Keep them current on progress, problems and resolutions, and any information that affects their ability to do their job. Praise quality work. Practice team building using common sense, like having small victory parties at the end of phases, sharing birthdays, or announcing promotions.

Operations

Operations affect the project differently depending on the phase. In early phases, word processing and PCs must be available for documentation. Computer-aided software engineering tool access might be required. Timing, type, and needs of access should be planned and negotiated well in advance. The kinds of problems a team might suffer from lack of access may delay documentation but do not delay the work of analysis. In the worst case, the work can be done manually.

During design, the database administrator must have access and resources allocated for the definition and population of a test database. This must also be negotiated well in advance.

During implementation, old data must be converted to the new format and environment, programs must be placed in production, and users begin using the application. At this time, the operations department assumes responsibility for running the application. This responsibility must also be planned and negotiated in advance.

When programming and testing begin, all project members need access to compilers, test database, editors, and, possibly, testing tools to work on their programs. Absence of resources at this time can severely delay project completion. For each day of person-time lost, there can be one day of project delivery time lost. Timing, type, and volume of access are all negotiated items. Advance negotiation should begin at least one month prior to the need. Most operations managers will tell you they want to know about a demand for their resources as soon as you can identify the demand and the date needed. Most operations managers will also tell you they want all requirements at once. So you should be prepared to discuss analysis, design, and implementation needs before much work takes place.

In general, operations managers need to know what the project needs from them and when. They also should be sent progress reports and told of any problems that affect the use of their resources.

Vendors

A vendor is any company, not your own, from which you obtain hardware, software, services, or information. If the application is installed in an existing environment, probably no vendor contacts are needed. If, however, acquisition of software, hardware, or both is planned, three types of contact with the vendor take place: proposal communication, negotiation, and customer support.


A Request for Proposal (RFP) (see Chapter 16) is a document developed by the PM and SE to solicit bids from potential vendors. Vendors are asked to respond with an estimate of service and price within some number of days (e.g., 30). All bids received by the cut-off date are reviewed. Proposal communications are usually limited to information about the proposal. RFPs are accepted and responded to by vendor marketing staff with some technical assistance. Project manager contact is with the marketer.

Part of the RFP process is the development of a list of required features for the item being bid upon. This list should have priorities and weights assigned to it during the proposal stage for use during the analysis. Bids are rated on the requirements then compared to see which vendor most closely meets the needs of the application.
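The rating and comparison step can be illustrated with a small weighted-scoring sketch. The feature names, weights, 0-to-5 rating scale, and vendor names below are hypothetical and do not come from the text; the weighted average is one common way to combine ratings, not a method the RFP process prescribes.

```python
def weighted_score(weights, ratings):
    """Return the weighted average rating for one bid.

    weights: feature -> importance weight assigned during the proposal stage
    ratings: feature -> how well the bid meets the feature (assumed 0-5 scale)
    A feature the bid does not address is rated 0.
    """
    total_weight = sum(weights.values())
    return sum(weights[f] * ratings.get(f, 0) for f in weights) / total_weight


def rank_bids(weights, bids):
    """Return (vendor, score) pairs sorted from best to worst score."""
    scores = {vendor: weighted_score(weights, r) for vendor, r in bids.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical required features and weights (higher = more important).
weights = {"multi-site license": 5, "on-line help": 2, "24-hour support": 3}

# Hypothetical ratings of two bids against those features.
bids = {
    "Vendor A": {"multi-site license": 4, "on-line help": 5, "24-hour support": 2},
    "Vendor B": {"multi-site license": 5, "on-line help": 2, "24-hour support": 4},
}

for vendor, score in rank_bids(weights, bids):
    print(f"{vendor}: {score:.2f}")
```

Fixing the weights before bids arrive, as the text recommends, keeps the comparison from being tuned toward a favored vendor after the fact.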

When a vendor is selected, a contract must be negotiated. Negotiation may be with the marketer, but might also be with a financial person or with the marketer's manager. Similarly, the project manager might do all or some of the negotiation with assistance from a financial person or his or her manager. Negotiations deal with price, time period of the contract, number of sites, number of users, type of license, guarantees in case the vendor goes out of business, warranties, and so on. There is no one way to negotiate, and most often, all negotiations are turned over to legal staff for completion of contract terms. It is important never to commit to any terms until they are seen and approved by some manager in the organization. Frequently, contracts have far-reaching implications that an individual project manager may not know.

Other Project Teams and Departments

Other IS organizations that might need project communications include a database administration group, other project teams, and a quality assurance group. Other departments might include law or audit. In all cases, the communication is similar. These groups need to know what their relationship to your project is, how soon and what type of support you need, who to contact for questions and information, and project status that might change any of these requirements.

In addition, you also have needs of these teams. If any of the organizations is performing work you need to complete your project, then you need the same things from them that they need from you. You need to know exactly what they will do for you and how it will be transmitted to your project, whom to contact, and task status that might affect your schedule.

To summarize, many other groups and departments in the organization need to have liaison activities with a project. It is the project manager's job to provide that liaison with communications tailored to the needs of the other organization.

PERSONNEL MANAGEMENT

For personnel management, the project manager hires, fires, coaches, motivates, plans, trains, and evaluates project team members.

Hiring

Hiring is usually coordinated through a personnel office that oversees all IS hiring, not just one project. Newspaper advertisements can be more cost-effective, more general, and get a better response when coordinated for all projects. The personnel office receives the responses and filters obviously unqualified applications out from the pool of applicants. Then, working with the project manager, the personnel department screens the applicants and arranges project interviews.

As in most things, timing is important. Ads take from one to two weeks to get approved and placed. Receipt of resumes usually takes the same amount of time. Interviewing is time consuming and can take another one to two weeks for each hire. Then, offers are made and salary negotiations completed. The elapsed time to hire someone might be seven weeks or longer.
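The lead-time arithmetic above can be sketched as a simple sum of stage ranges. The first three ranges come from the paragraph; the one-week allowance for offers and salary negotiation is an assumption added here only to complete the total, since the text gives no figure for it.

```python
# Stage durations in weeks as (low, high).
HIRING_STAGES = {
    "approve and place advertisement": (1, 2),
    "receive resumes": (1, 2),
    "interview": (1, 2),
    "offer and salary negotiation": (1, 1),  # assumed, not stated in the text
}


def elapsed_weeks(stages):
    """Return the (shortest, longest) total elapsed time for sequential stages."""
    low = sum(lo for lo, _ in stages.values())
    high = sum(hi for _, hi in stages.values())
    return low, high


low, high = elapsed_weeks(HIRING_STAGES)
print(f"Begin recruiting {low} to {high} weeks before the person is needed.")
```

Under these assumptions the upper bound matches the seven weeks mentioned in the text; a longer negotiation pushes it further out.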

In addition, scheduling interviews may mean early-morning, evening, or lunch-time work. People searching for a job who already have one may not want to take vacation time for an interview. If the person appears qualified, the project manager is expected to shift his or her schedule to fit the needs of the applicant.

Firing

You may not agree, but keeping a person in a job for which they are unsuited does more damage to the manager, the person, and the project than you might think. Project managers are damaged because they think of little else and agonize over the decision much longer than necessary. People usually know if they are going to be terminated because they did not complete their specified tasks. They should have been told, in writing, before the termination date.

Prolonging a termination is damaging to the person being fired because it gives them a false sense of hope, makes them lose confidence in a manager who does not follow through on stated actions, and also allows them to influence other project members negatively.

Finally, procrastination on firing is damaging to the project because the longer the termination is delayed, the more likely the person being terminated will begin talking of his or her situation to other project members and disrupting work. As more people find out, more time is spent speculating on the situation. Less work gets done and the staff eventually loses confidence in the project manager.

No one gets into trouble overnight. Usually there is a period during which a problem is known, but it might be corrected before any real problems arise. It is at this time that the project manager should sit down with the person and talk about the situation. Legally, everyone in this situation is entitled to at least one warning letter which is also placed in their personnel file. This is followed by a letter of reprimand stating that performance is substandard with reasons for that judgment. The letter also states that the person is on probation and will be terminated by a specified date unless some actions are taken. The actions are then listed. If the person does the assigned work satisfactorily, they are off probation. All of these communications are in writing, monitored and approved by personnel and the IS manager, and are the basis for any future legal action by the employee.

If the work is performed satisfactorily, probation ends. If not, the person is terminated. Termination from a project does not necessarily require termination from a company. If a person is ill-suited to a particular project, she or he might still be a valuable employee. A good project manager will first try to place the person somewhere else in the organization. If the person is terminated from the company, the company can try to help them find another job through an out-placement service or by providing company resources (a desk and phone away from the project) until a job is found. If the person is terminated for antisocial behavior, an addiction, or for some other nontechnical problem, the project manager might help them seek professional help.

Motivating

Motivation has personal and professional aspects. Professional motivation arises from a desire to do a good job. People are motivated to do a good job when they are treated like a professional and given meaningful, interesting work that includes some discretionary decision making and some creative design.

Personal motivation arises from a desire to improve one's position in life. Position in life is defined individually and may mean earning more money, buying a bigger house, becoming an analyst, or becoming a manager, and so on.

Project management style is the determining factor of personal motivation. A project manager who facilitates participation, fosters controlled risk-taking, and allows people to grow as individuals will gain undying loyalty from his or her staff. A project manager who treats the staff as stupid, lazy, and unmotivated might obtain desired behaviors from them, but it will be through intimidation and coercion.

The project manager needs to know the project team members individually in order to tailor reward systems and assignments to help them reach their goals. Project manager commitment to helping team members reach personal goals determines how professionally motivated the team members will be.

There are three aspects to motivation. First, the project work itself can be used to further professional goals that include doing novel work and advancing to new levels of seniority, experience, or responsibility. Second, the project manager must be careful to tailor reward and punishment systems to fit the tasks, being unbiased in terms of importance of individual contributions to the work. Third, the individual professional must make a commitment to doing something extra to gain the reward, either on-the-job or on his or her own time.

Take, for instance, a mainframe Cobol programmer who wants to move to a personal computer LAN environment using C++. The project has relaxed deadlines and the project manager might be able to help the person, but some commitment from the programmer is needed. The project manager recommends that the person find, attend, and pass a C++ course for which the company will pay. Then, the person will be assigned a task in the desired environment. If the task is successful, more tasks will follow. If the task is not successful, the situation will be reassessed.

Professional motivation might also come from fostering development of association ties outside of work. Meetings or user groups of vendors (see note 1), professional associations (see note 2), or other professional groups related to work duties might be paid for by the company to foster professional motivation.

Motivation also has a negative side. The actions that would be taken should the person fail to do their job competently must also be known. There should be company policies about quality and quantity of work that are also included as part of job descriptions. In the absence of company policy, the project manager should adopt rules, with the knowledge and consent of their manager, about punishments for failure to meet work requirements. These should also be made known to everyone on the project.

1 Guide and Share are IBM mainframe user groups with over 10,000 members each. DECUS is the Digital Equipment users group. In these huge groups, there are subgroups with interests in every software package, language, and development environment offered by the vendor.

2 The Association for Computing Machinery (ACM) is one example.

Career Path Planning

Motivating is an immediate activity of the project manager, but all employees and managers should be encouraged to develop longer range aspirations, as well. The project manager should help plan, with each individual, the tasks from this project that can be used to further his or her career.

The project manager should discuss goals and career paths at the beginning of the project and at least annually during performance reviews after that. The discussion should include a frank assessment of current perceptions of the individual's verbal, organizational, and professional skills, as well as helping the person plan courses, assignments, or opportunities to improve his or her performance. There should be direct ties from performance to rewards. Any time an individual does something significant enough to be mentioned on an appraisal, he or she should be told and either praised or counseled to change.

Training

The purpose of project training is to address specific weaknesses of staff in techniques, technology, or tools used on the project. The SE and any project leaders are directly responsible for identifying training needs. The project manager is responsible for obtaining the training for the individual(s) who need it. A senior mentor for the trained skill should be assigned to monitor progress in the development of the skill, once training is complete.

Nonrelated training, as discussed above, may also be authorized by the project manager depending on employee need, rewards, and fit with employee goals.

Evaluating

Evaluations are annual assessments of the person from both professional and organizational perspectives. Evaluations are written and usually are signed by the reviewed person and the reviewer. Quality and quantity of work assignment are the professional assessments and are the most important aspects of junior level work. Junior staff, having no business experience, are monitored most closely for their ability to do their work. Competence for the assigned jobs is determined, and the more competent, the faster the person is promoted.

As people become more senior, quality and quantity of assigned work become assumed, and organizing, motivating, communications, and interpersonal skills become more important. The non-task-specific skills are viewed from an organizational perspective. More emphasis is placed on the ability to persuade, manage, motivate, and communicate with others, thus describing a good manager.

Promotion for most senior people is to the managerial ranks. In some companies, the importance of very senior technical experts is recognized. In those companies, equal emphasis is placed on the professional and organizational assessments. Technical staff can aspire to the senior technical positions without having to sacrifice their technical expertise in the bargain.

The usual performance evaluation contains sections for assignments, communications and interpersonal relations, absences, planning and organization, supervision, delegation, motivation, training, and special considerations. Each of these is described briefly.

The assignments section contains a brief description of four or five major assignments with expectations on quality and quantity of work for each as well as a brief paragraph assessing the extent to which the assignment was met. Quality and quantity of work are intangible and frequently subjective assessments, but there are always expectations of the amount of work a person should do, and of the extent to which reworking is needed. In addition, the individual's job description should give guidance on expectations for work quality and quantity. Finally, the extent to which the person needs to be monitored and assisted is an indicator of the extent to which they can work independently and competently at their job. The discussion of quality and quantity should be presented in terms of job description, manager expectations, and extent to which expectations are met. Specific examples are required to demonstrate very high and very low quality work.

Project managers evaluate communications and human relations. Assessments of both verbal and written communication skills are developed. Communication skills are related to specific project assignments and to other project activities, such as walk-throughs, that are not major assignments. Communication evaluation includes grammar, speed, persuasiveness, clarity, and brevity. The person's ability to develop and deliver a presentation, and actual experiences in doing so, are described.

Another area of assessment is interpersonal relationships with project manager, senior staff members, peers, others in the department, and users. Additional comments might discuss specific incidents that vary from the general assessment and that might highlight a need for improvement, or identify a particular skill. For instance, a person with good negotiating skills might be identified by their arbitration of a disagreement between two other project members.

Work absences are mentioned in terms of total days missed, number of absences, and type of absence. If there are company policies about absences and they are exceeded, a comment about the extent to which absences affected work might be added. The person's ability to meet deadlines and maintain an accurate status of the project, and any need for special communications due to absences, are all described. Extraordinary situations causing a long absence, such as emergency surgery, are included.

For planning and organization, accuracy, detail, independence of work, and cooperation with other affected groups are all assessed. In addition, the person's adherence to their own plans is discussed. Do they use a plan properly as a road map, or is it a rigid rule from which no straying is allowed, or is it ignored and treated as a task done for management?

Delegation is the extent to which the work is shifted from the manager to subordinates. Issues rated are how well work assignments match people's skills, allow monitoring to ensure completion, and provide for personal and career improvement of subordinates.

Managerial style is assessed in terms of group motivation. Does the project manager obtain commitment from staff with enthusiasm, discomfort, unhappiness, or anger? Does the manager ask or command? How successful is the strategy and what must the manager do to change unsuccessful strategies? Are tactics altered to fit the person being managed, or is everyone treated the same way? Are people treated fairly or is favoritism prevalent?

Can the manager motivate others to learn new skills? To what extent does the manager provide needy staff with training, either formal or informal, on techniques, technology, and tools? If formal training is given by the person being rated, summaries of student ratings of quality and quantity of training should be presented. The person's ability to provide mentoring and quality of mentoring might be addressed.

Finally, there is usually a section for the project manager to recommend future assignments, training, or other professional activities for further development of the individual.

MONITOR AND CONTROL

Status Monitoring and Reporting

The rationale of the planned application development is that you monitor the plan to communicate activity status and interim checkpoints to clients. The overall goal, meeting the project installation date, is the end point of a lengthy, complex set of processes. Without the plan, knowing whether or not the installation date will be met is difficult. Status monitoring is the comparison of planned and actual work to identify problems. Project control consists of the decisions and actions taken based on the project's status.

In a planned approach, project team members report time spent on each activity for some period. The sample time sheet (see Figure 3-7) allows breakdowns for several tasks listed across the top of the form and hours worked on the task reported by day of the month. Totals by day of the month and by task over the period are tallied by row and column totals. This type of reporting allows the project manager to easily see, for each person, weekend work, how many hours are spent on each activity over a period, and how many effective work hours there are per day.
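The row and column tallies described above can be sketched in a few lines. The entry format (day of month, task name, hours) and the sample task names are assumptions made for illustration; the figure itself is a paper form.

```python
from collections import defaultdict


def tally(entries):
    """Total a person's time sheet by day and by task.

    entries: iterable of (day_of_month, task, hours) tuples.
    Returns (hours_by_day, hours_by_task, grand_total).
    """
    by_day = defaultdict(float)
    by_task = defaultdict(float)
    for day, task, hours in entries:
        by_day[day] += hours    # row total: effective hours worked that day
        by_task[task] += hours  # column total: hours spent on that activity
    return dict(by_day), dict(by_task), sum(by_day.values())


# Hypothetical entries for one person.
entries = [
    (1, "screen design", 6.0),
    (1, "status meeting", 1.5),
    (2, "screen design", 7.0),
    (3, "unit test", 8.0),
]

by_day, by_task, total = tally(entries)
print(by_day)
print(by_task)
print(total)
```

The row totals and column totals must sum to the same grand total, which gives a quick check that a sheet was filled in consistently.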

In addition, each person should write a short progress report. The report summarizes progress in qualitative terms and identifies problems, issues, errors, or other conflicts that might delay the work. If a task will be later than its scheduled date, the reason for lateness must be explained. The project manager and SE both review the reports and time sheets to decide if problems need further action. A sample progress memo is shown as Figure 3-8.

The SE and project manager map actual progress of each person against the planned times. When progress looks slow, the project manager asks the person specifically if there are problems, if there are enough resources, for example, test shots, and if the person thinks they can meet the deadline. If the task appears to have been underestimated, the schedule is checked to see if changing the time allotted will cause completion delays. Similar tasks are checked to see if they are also underestimated. The cumulative effect of changes is checked to see if completion is in jeopardy. If it is, the project manager discusses the problem with his or her manager and they decide on the proper course of action.
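A minimal sketch of this planned-versus-actual check follows. It assumes each task carries a planned duration, the days spent so far, a current estimate of days remaining, and a slack figure (how far the task can slip before the completion date moves). The field names, the slack concept as used here, and the sample numbers are illustrative assumptions, not part of the text.

```python
from dataclasses import dataclass


@dataclass
class TaskStatus:
    name: str
    planned_days: float
    actual_days: float     # days spent so far, from time sheets
    remaining_days: float  # current estimate of days still needed
    slack_days: float      # assumed: slip allowed before completion is delayed

    @property
    def overrun(self):
        """Days by which the revised total exceeds the plan (never negative)."""
        revised = self.actual_days + self.remaining_days
        return max(0.0, revised - self.planned_days)

    @property
    def delays_completion(self):
        """True when the overrun is larger than the task's slack."""
        return self.overrun > self.slack_days


def tasks_in_jeopardy(tasks):
    """Return names of tasks whose overrun would delay project completion."""
    return [t.name for t in tasks if t.delays_completion]


# Hypothetical status for two tasks.
tasks = [
    TaskStatus("order entry screens", 10, 8, 4, 3),   # 2 days over, 3 days slack
    TaskStatus("inventory allocation", 15, 12, 8, 2), # 5 days over, 2 days slack
]

print(tasks_in_jeopardy(tasks))
```

A task that overruns but stays within its slack still deserves a conversation with the person doing the work; only the second kind forces a discussion with the project manager's own manager.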

The best policy is to address potential problems early, before they become big problems. If a person cannot finish work because of too many assignments, then reassign some of the work to another person. If they do not have enough testing time, arrange for more time. Active management prevents many problems.

Problem follow-up includes determining the severity and impact, planning an alternative course of action, modifying the plan as required, and continuing to monitor the problem until it is resolved or no longer has an impact on the delivery date.

Tell the client about problems that may not be solved so they are prepared for delays if they become inevitable. When changes become needed, tell the client about changes to planned dates even when they do not change the completion date.


Project: ____________    Month: ____________

Name: ____________

Day of Month    Total for Day    Activities (one column per task)
1/16
2/17
3/18
4/19
5/20
6/21
7/22
8/23
9/24
10/25
11/26
12/27
13/28
14/29
15/30
Total

FIGURE 3-7 Time Sheet


ICIA Industries-Interoffice Memo

DATE: October 10, 1994

TO: J. B. Berns

FROM: M. Vogt

SUBJECT: Order Entry and Status

Progress

We completed our screen design and navigation testing 10/7/94 and turned the modules over to George Lucas for user acceptance. He requested changes to several items:

1. The location of the total at the bottom of the screen is moved left five spaces.

2. The PF key assignment for PF3, which we were using to END any process. He would like END to be PF24. We explained that this is not a good design because the operator needs more key strokes (and hence is more likely to err) for PF24. Also, this is a very time-consuming change, about 10 hours, and that he should have mentioned his preference during the reviews. He decided to think about it and talk to some real operators before making a firm decision.

The other testing is progressing well. I am almost done testing the entire order process, except for inventory allocation. I need the warehouse codes from George by next week if I am to continue testing the programs.

Problems

The warehouse codes which were promised some months ago are getting to be on the critical path. If I do not have them by next week, I cannot continue to test the inventory allocation portion of the application. I can assign my own code scheme, then change it to the real one if I have to, but I would like to avoid the double work.

FIGURE 3-8 Sample Progress Report

The kinds of problems that occur and the activities the project manager monitors change over the course of the development. For instance, during the definition of the project scope, the project manager monitors the following:

Is the client cooperative?

Are all the stakeholders identified and involved?

Are the users being interviewed giving accurate, complete information?

Are users participating as expected?

Are there any apparent political issues to be addressed?

Does the scope look right? That is, does the current definition appear to include relevant activities?

By analysis, the project manager knows most users and how they work, should have identified potential political problems and dealt with them, and should be comfortable that the project scope is correct. The activities monitored turn toward the project team, and include the following:

Do all analysts know the scope of activity and work within it?

Is the analysts' work emphasis on what and not how?

Are users participating as expected?

Are all project members pulling their weight?

Is everyone interested and happy in their job?

Is there any friction between team members, or between team members and users?

Does everyone know what they and all others are doing?

Is there constant feedback-correction with users on interview results?

Are team members beginning to understand the users' business and situation?

Are the team members objective and not trying to force their own ideas on the users?

Are walk-throughs finding errors and are they getting resolved?

Are documents created looking complete? Does the user agree?

Is the analysis accurately addressing the problems of the user? Are team members analyzing and describing exactly what is needed without embellishment?

Is typing turnaround, printing of word-processed documents, copying, or other clerical support acceptable?

Does communication between teams and between teams and users appear to be satisfactory?

Is the project on time? What is the status of critical path tasks? Has the critical path changed because of tasks that finished early?

Where are the biggest problems right now? How can we alleviate the problems?

What do we not know that might hurt us in design?

The functional requirements that result from analysis should describe what the application will do. The project manager is constantly vigilant that the requirements are the users'. One problem many projects have is that the user wants a plain functional application but the analysts design a high-priced application with the user functions, but with many unnecessary features, or 'bells and whistles,' as well. This problem, if it occurs, must be dealt with before analysis ends or extraneous functions will be in the resulting application. When over-design problems surface, it is important to try to trace them to specific analysts for retraining in providing their services.

In design, the emphasis shifts to monitoring the rate, type, and scope of changes from the users. If the business is volatile, requirements change may become a constant problem. Change management procedures should be developed and used. At this point, the project manager's worries include the following:

Do the analysts know the application?

Is the translation to operational environment correct and complete?

Are walk-throughs finding errors? Are errors being resolved?

Are users participating as expected? Are users properly involved with screen design, test design, acceptance criteria definition?

Are all project members pulling their weight? Is everyone interested and happy in their job?

Is there any friction between team members, or between team members and users?

Does everyone know what they and all others are doing?

Are all team members aware of their changing responsibilities, and are they comfortable with and able to do design tasks?

Does communication between teams and between teams and users appear to be satisfactory?

Is the project on time? What is the status of critical path tasks? Has the critical path changed because of tasks that finished early?

Where are the biggest problems right now? How can we alleviate the problems?

What do we not know that might hurt us in programming? Is the implementation environment suitable for this application?

Can the database management software accommodate this application?

The number of project team members usually increases for programming to do parallel development as much as possible. The communication overhead necessary to know everyone's status and for them to know the project status increases. The problems in the programming and unit testing stage tend to focus on communications and programmer performance.

Does everyone understand how their work fits into the project?

Does everyone know their critical-path status?

Are all current project members pulling their weight? Does everyone know what they and all others are doing?

Is testing time sufficient? Is terminal access sufficient?

Does everyone know the technologies they are using sufficiently to perform independently?

Are junior staff paired with senior mentors?

Are users requesting further changes?

Are users participating as expected in test design, user documentation development, conversion, and training?

Is there constant feedback-correction with users on suspected errors?

Are prototypes being used as much as possible to demonstrate how the application will work?

Are walk-throughs productive, finding errors? Are errors getting resolved?

While programming and unit testing are proceeding, tests for integration and system level concerns are being developed. The database is being established and checked out. The operational environment is being prepared. Concern shifts from getting the application expressed in code to getting it working correctly. The kinds of questions a project manager might have are the following:

Are all current project members pulling their weight? Does everyone know what they and all others are doing?

Is testing time sufficient? Is terminal access sufficient?

Are users requesting further changes? Are users participating as expected in testing?

Is there constant feedback-correction with users on suspected errors?

Are walk-throughs productive, finding errors? Are errors getting resolved?

Does the system level test really prove that the functions are all accounted for?

Does the integration test verify all interconnections? How can it be leveraged to prove the reliability of the interconnections during the system test?

What do we not know about the operational environment that might hurt the project?

Is the database software working properly? Are back-up and recovery procedures adequate for testing?

How can we use the integration and system tests to develop a regression test package?

Is documentation being finalized? Is everyone working to capacity? Should we start letting programmers go to other projects? If we let a key person go, who can take their place when a problem occurs?

Finally, testing is complete, the application appears ready, and the user is ready to work. There should have been a plan for actually implementing the operational application that eases the user into use without too much trauma. The easing-in period gives the project team some time to fix errors found in production without excessive pressure. The issues now center on getting the application to work in its intended environment for its intended users. The questions include the following:

Is the site prepared adequately? Is air conditioning sufficient? Are lighting and ergonomic design sufficient?

Are users properly trained and ready to do work?

Are work cycles and evaluation of results identified sufficiently to allow implementation and verification of results?

When errors are found, are they getting resolved?

Are users taking charge as expected?

Are all current project members pulling their weight? Does everyone have enough work to do? Can people be freed to other projects?

Does communication between teams and between teams and users appear satisfactory? Are users told whenever major problems occur? Are they participating in the decision making about error resolution?

Many of the questions above are technical in nature and would be referred to the SE to monitor. The project manager is like a mother hen and is supposed to worry about everything. Obviously, if the plan addresses the activities as it should, many of the answers to the above sets of questions are found in weekly progress reports of team members. Compiling the individual progress reports and project progress reports in a project log allows the manager and any of the staff to review decisions, problems and their resolutions, and other issues as they occur during the development.

AUTOMATED SUPPORT TOOLS FOR PROJECT MANAGEMENT

Project management support tools have increased in sophistication and performance since the mid-1980s when the first PC-based tools arrived. The tools in this section support project planning, task assignment and monitoring, estimation, and scheduling (see Table 3-5). Key tool capabilities not considered here include word processing, spreadsheets, calendars, or interfaces to electronic mail (these are considered useful for all organization members). Other tools that are used by a project manager but are discussed in other sections of the text are for configuration management, quality control, and metrics.

TABLE 3-5 Automated Support Tools for Project Management

Product | Company | Technique
CA-products | Computer Associates International, Inc., Islandia, NY | Project planning
DataEasy Project Management | Data Easy Software, Foster City, CA | Task mapping
Demi-Plan | Demi Software, Ridgefield, CT | Critical path project planning and tracking
Foundation | Arthur Andersen & Co., Chicago, IL | Project management
IEW, ADW (PS/2 Version) | Knowledgeware, Atlanta, GA | Project planning
Life Cycle Manager | Nastec, Southfield, MI | Project planning
Life Cycle Project Manager | American Management Systems, Fairfax, VA | Project planning, task assignment, tracking
Maestro | SoftLab, San Francisco, CA | Problem tracking
microGANTT | Earth Data Corp., Richmond, VA | Project planning
Milestone | Digital Marketing Corp., Walnut Creek, CA | Critical path project planning and tracking
Multi-Cam | AGS Mgmt Systems, King of Prussia, PA | Project planning and tracking
PMS II | North America MICA Inc., San Diego, CA | Project planning, task assignment, tracking; critical path PERT
Primavera Project Manager | Primavera Systems Inc., Bala Cynwyd, PA | Project planning, task assignment, tracking
Project | Microsoft, Bellevue, WA | Project planning, task assignment, tracking
Project Workbench, Fast Project | Applied Business Technology, NY, NY | Project planning, task assignment, tracking
System Architect | Popkin Software and Systems, Inc., NY, NY | Project planning
Teamwork | Cadre Technologies Inc., Providence, RI | Planned completion date tracking
vsDesigner | Visual Software, Inc., Santa Clara, CA | Project completion tracking, critical issues monitoring

SUMMARY

The project manager role is frequently separate and distinct from that of the software engineer. The software engineer is generally responsible for technical aspects of project work. Some tasks are joint, complementary activities shared by project managers and software engineers. For these joint activities, the software engineer contributes technical skills, and the project manager contributes organizational skills.

The project manager is solely responsible for most people-related aspects of projects. The three main tasks of the project manager are organizational liaison, employee management, and project monitoring and control. Organizational liaison includes creating working relationships with other organizations and departments, resolving project-related problems regardless of their nature, and reconciling the project design with expectations of others. Employee management includes working with Personnel to hire, fire, and staff the project. Employee management also includes monitoring individual employees to help them evaluate, set, and attain career goals. Project monitoring and control is the other major project management activity. Monitoring means to trace the progress of project work and compare it to budgeted time and resources to maintain progress. Control includes deciding and implementing project changes when progress is not satisfactory. Project changes might include changes of job assignments, introduction of training, or changes to schedules and plans.


KEY TERMS

champion
complementary activities
critical path
evaluations
heuristic
implementation environment
implementation strategy
interface
liaison
personnel management
project control
project plan
request for proposal (RFP)
sponsor
status monitoring
task dependency diagram
user
vendor

EXERCISES

1. List and discuss three advantages and three disadvantages to project team members using time sheets to report work activities. What might some alternatives for reporting task progress and time spent be?

2. Write an honest appraisal of yourself for the work you have done in school toward your current degree. Give specific examples of good and, maybe, poor work. Rate your knowledge and skills gained in terms of a schedule that ends when you graduate.

3. Discuss the following comment: "It is important for a project manager to have been a programmer and an analyst. Otherwise, the manager has no feel for the problems and their severity."

STUDY QUESTIONS

1. Define the following terms: champion, critical path, heuristic, liaison, project plan.

2. When and why are the software engineer and project manager roles split?

3. Describe the project manager's role in planning.

4. Describe a general planning methodology.

5. What kinds of reviews are done on project documentation? Why are they necessary?

6. What are five types of operations resources that might be needed on a project?

7. What is the minimum lead time recommended for resource requests?

8. What is an RFP and when is it used?

9. What is the purpose of a task dependency chart?

10. What is a critical path and why is it important?

11. Should a plan be finalized and cast in concrete?

12. List four types of assumptions made during planning and describe why each is important.

13. Why should project team members submit time sheets?

14. Describe how to assign staff to tasks. Why is the process rarely this simple?

15. Describe the heuristics for assigning staff to projects.

16. Should planned overtime be in a schedule?

17. List five things every person should know about his or her job when working on an application development project.

18. What are the three alternatives for implementation strategy?

19. What are the heuristics for deciding implementation strategy?

20. List two choices for implementation environment.

21. Describe the heuristics for deciding implementation environment.

22. What are the choices for methodology and life cycle?

23. Describe the heuristics for deciding methodology.

24. Describe the heuristics for deciding life cycle.

25. What is a liaison? What project manager duties require liaison work?

26. List the contents of a project status report.

27. What is politics and how does it affect application development work?

28. Why are performance appraisals done?

* EXTRA-CREDIT QUESTIONS

1. List and discuss types of assessment from a performance appraisal. How does a manager ensure the ratings are fair and objective? What should a manager do if he or she does not like the person being reviewed?

2. Develop a project plan for ABC Video based on the information in Chapter 2 only. Use the case and this chapter to decide the tasks. Use your experience, whatever it is, to decide the times for the tasks. Do not look at other information in this or other texts when planning the work. What assumptions do you have? How comfortable are you with your estimates? Keep this assignment and redo it at the end of Chapter 6.

CHAPTER 4

DATA GATHERING FOR APPLICATION DEVELOPMENT

INTRODUCTION

Each phase of application development requires interaction between the developers and users to obtain information of interest at the time. Each phase seeks to answer broad questions about the application. For instance, in feasibility analysis, the questions are broad and general: What is the scope of the problem? What is the best way to automate? Can the company afford (not) to develop this application? Is the company able to support application development?

In analysis we seek 'what' information about the application. For instance, What data are required? What processes should be performed and what are the details of their performance? What screen design should be used?

In design, we develop 'how' information relating to the application. For example, How does the application translate into the specific hardware environment selected? How does the logical data design translate into a physical database design? How do the program modules fit together?

The kind of interaction that elicits answers to questions such as these differs by information type and phase. In this section we describe the alternatives for obtaining information to be used for application development. The alternative data gathering techniques are described, then related to application types. Then, ethical considerations in data collection and user relations are discussed.

DATA TYPES

Data differs on several important dimensions: time orientation, structure, completeness, ambiguity, semantics, and volume. Each of these dimensions is important in defining requirements of applications because they give guidance to the SE about how much and what type of information should be collected. Also, different data types are related to different application types and require different requirements elicitation techniques. Inattention to data dimensions will cause errors in analysis and design that are costly to fix. Error correction cost is an increasing function of the phase of development (see Table 4-1).

In addition to obtaining information, we also use the techniques for validating the information and interpretation in the proposed application. Use of validation techniques during each phase increases the likelihood that logic flaws and misinterpretations will be found early in the development.


TABLE 4-1 Cost of Error Correction by Phase of Development

Phase in Which Errors Are Found | Cost Ratio to Fix the Error
Feasibility/Analysis | 1
Design | 3-6
Code/Unit Test | 10
Development Test | 14-40
Acceptance Test | 30-70
Operation | 40-1000

From Boehm, Barry, Software Engineering Economics. Englewood Cliffs, NJ: Prentice-Hall, 1981.
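
Because the ratios in Table 4-1 are multipliers, the effect of finding an error late can be estimated with simple arithmetic. The following Python sketch is illustrative only. The ratios come from the table; the function name correction_cost and the $100 base cost are assumptions made for the example.

```python
# Cost ratios from Table 4-1, stored as (low, high) multipliers relative
# to fixing the same error during feasibility/analysis.
COST_RATIO = {
    "feasibility/analysis": (1, 1),
    "design": (3, 6),
    "code/unit test": (10, 10),
    "development test": (14, 40),
    "acceptance test": (30, 70),
    "operation": (40, 1000),
}


def correction_cost(base_cost, phase):
    """Return the (low, high) cost of fixing an error first found in `phase`."""
    low, high = COST_RATIO[phase]
    return base_cost * low, base_cost * high


# Hypothetical example: an error that costs $100 to fix during analysis.
for phase in COST_RATIO:
    low, high = correction_cost(100, phase)
    print(f"{phase:22s} ${low:,} to ${high:,}")
```

Under these assumptions, the same $100 error costs $300 to $600 to fix if it is found in design, and $4,000 to $100,000 if it is found in operation.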

Time Orientation

Time orientation of data refers to past, present, or future requirements of a proposed application. Past data, for example, might describe how the job has changed over time, how politics have affected the task, its location in the organization, and the task. Past information is exact, complete (if maintained), and accurate. There is little guessing or uncertainty about historical records.

Current information is information about what is happening now, and its relevance in determining the future. For instance, current application information relates to operations of the company, the number of orders taken in a day, or the amount of goods produced. Current policies, procedures, business industry requirements, legal requirements, or other constraints on the task are also of interest in application development. Current information should be documented in some way that it can be read by the development team to increase their knowledge of the application and problem domains.

Future requirements relate to changes in the industry expected to take place. They are inexact and difficult to verify. Economic forecasts, sales trend projections, and business 'guru' prognostications are examples of future information. Future-oriented information might be used, for example, by managers in an executive information system (EIS).

Structure

Structure of information refers to the extent to which the information can be classified in some way. Structure can refer to function, environment, or form of data or processes. Information varies from unstructured to structured with interpretation and definition of structure left to the individual SE. The information structuring process is one in which the SE is giving a form and definition to data.

Structure is important because the wrong application will be developed without it. For instance, knowing that the user envisions the structure of the system to be one with 'no bureaucracy,' minimal user requirements, and no frills, gives you, the SE, a good sense that only required functions and data should be developed. In the absence of structuring information, technicians have a tendency to develop applications with all 'the bells and whistles' so the users can never complain that they don't have some function.

An example of structuring of data is shown in Figures 4-1 and 4-2. When you begin collecting information about employees for a personnel application, you might get information about the employees themselves, their dependents, skills the employees might have, job history information, company position history, salary history, and performance reviews.

The information comes to you in pieces that may not have an obvious structure, but you know that all of the data relates to an employee so there must be relationships somewhere. In Figure 4-2, we have structured the information to show how all of the information relates to an employee and to each other in a hierarchic manner. Each employee has specific one-time information that applies only to them, for instance, name, address, social security number, employee ID, and so on. In addition, each employee might have zero to any number of the other types of information depending on how many other companies they have worked at, whether they have children, and how long they have worked at the company. The most complex part of the data structure is the relationship between position, salary, and reviews. If salary and performance reviews are disjoint, they would be as shown, related to a given position the person held in the company (see Figure 4-2). The other option is that salary changes are dependent on performance reviews and the hierarchy would be extended another level.

FIGURE 4-1 Unstructured Personnel Data

Name, Dependent's Date of Birth, Age, Job Title at Time of Raise, Social Security Number, Job Salary, Address, Raise Amount, Dependent's Name, Current Job Title, Performance Reviewer, Date of Raise, Date of Performance Review, Past Job Title, Job Title at Time of Review, Performance Rating

FIGURE 4-2 Structured Personnel Data

Personal Information: Social Security Number, Name, Address, Date of Birth
    Dependent Information: Dependent Date of Birth, Dependent Name, Dependent Relationship
    Job Information: Job Title, Job Department, Job Begin Date, Job End Date, Job Salary
        Performance Ratings: Performance Rating Date, Performance Rating, Performance Reviewer
        Raise Information: Raise Date, Raise Amount
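
The hierarchy of Figure 4-2 can be expressed directly as nested record types. The Python sketch below is one possible rendering, not a design taken from the text. The field names follow the figure, while the class names, the data types, and the sample values are assumptions made for illustration. It models the case in which raises and performance ratings are disjoint and both relate to a position.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Dependent:
    name: str
    date_of_birth: str
    relationship: str


@dataclass
class PerformanceRating:
    rating_date: str
    rating: str
    reviewer: str


@dataclass
class Raise:
    raise_date: str
    amount: float


@dataclass
class Job:
    title: str
    department: str
    begin_date: str
    salary: float
    end_date: Optional[str] = None  # None means the job is still held
    # Ratings and raises hang off the position, as drawn in Figure 4-2.
    ratings: List[PerformanceRating] = field(default_factory=list)
    raises: List[Raise] = field(default_factory=list)


@dataclass
class Employee:
    # One-time information that applies only to this employee.
    social_security_number: str
    name: str
    address: str
    date_of_birth: str
    # Zero to any number of each of the other information types.
    dependents: List[Dependent] = field(default_factory=list)
    jobs: List[Job] = field(default_factory=list)


# Hypothetical sample data.
emp = Employee("000-00-0000", "Pat Example", "1 Main St", "1960-01-01")
job = Job("Analyst", "MIS", "1990-01-15", 30000.0)
job.raises.append(Raise("1991-01-15", 1500.0))
emp.jobs.append(job)
```

If salary changes instead depended on performance reviews, the raises list would move from Job into PerformanceRating, which extends the hierarchy by one level.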

Completeness

Information varies in completeness, the extent to which all desired information is present. Each application type has a requisite level of data completeness with which it deals. Transaction processing systems deal with complete and accurate information. GDSS and DSS deal with less complete information. EIS, expert systems, or other AI applications have the highest levels of incompleteness with which they must cope.

In applications dealing with incomplete information, the challenge to you is to decide when the information is complete enough to be useful. Sometimes this decision is made by the user; other times it is made within the application, and there need to be rules defining what 'complete enough' means.
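
When the decision is made within the application, a rule for 'complete enough' has to be written down explicitly. The Python sketch below shows one simple form such a rule could take: the fraction of required fields that are filled in, compared to a threshold. The field names, the sample record, and the threshold values are all hypothetical; a real application would derive its rule from the users.

```python
def completeness(record, required_fields):
    """Return the fraction of required fields that are present and non-empty."""
    present = sum(
        1 for name in required_fields if record.get(name) not in (None, "")
    )
    return present / len(required_fields)


def is_complete_enough(record, required_fields, threshold):
    """Apply the rule: the record is usable if completeness meets the threshold."""
    return completeness(record, required_fields) >= threshold


# Hypothetical customer record with one missing field.
required = ["name", "address", "phone", "credit_limit"]
customer = {"name": "A. Smith", "address": "12 Elm St", "phone": "", "credit_limit": 500}

# A transaction processing rule might demand every field (threshold 1.0),
# while a decision support rule might tolerate gaps (threshold 0.7).
print(is_complete_enough(customer, required, 1.0))
print(is_complete_enough(customer, required, 0.7))
```

The same record is rejected by the stricter rule and accepted by the looser one, which mirrors the difference in requisite completeness between application types.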

Ambiguity

Ambiguity is a property of data such that it is vague in meaning or is subject to multiple meanings. Since ambiguity deals with meaning, it is closely related to semantics. An example of ambiguity is to ask the following query:

PRINT SALES FOR JULY IN NEW YORK

In this query, New York can mean New York State or New York City; both answers would be correct. A person who makes the request in one context (the state) and gets an answer for the other context (the city) has an obvious problem. Contextual cues help SEs to define the one correct interpretation of ambiguous items; further problems arise because of multiple semantic interpretations within a single context. For that reason, semantics is discussed next.
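
One way to keep an application from silently choosing the wrong meaning is to make it refuse an ambiguous term until a contextual cue is supplied. The Python sketch below illustrates the idea for the New York query. The lookup table MEANINGS and the function interpret are hypothetical names invented for this example, and matching the cue against the meaning text is a deliberately crude stand-in for real context handling.

```python
# Hypothetical lookup: each term mapped to every meaning it can carry.
MEANINGS = {
    "NEW YORK": ["New York State", "New York City"],
}


def interpret(term, context=None):
    """Return the single meaning of `term`, or raise if none or several remain."""
    candidates = MEANINGS.get(term.upper(), [])
    if context is not None:
        # Keep only the meanings that match the contextual cue.
        candidates = [m for m in candidates if context.lower() in m.lower()]
    if len(candidates) != 1:
        raise ValueError(f"{term!r} is ambiguous or unknown: {candidates}")
    return candidates[0]


print(interpret("New York", context="state"))

try:
    interpret("New York")  # no contextual cue supplied
except ValueError as err:
    print("Cannot answer:", err)
```

Raising an error is a design choice in this sketch: it forces the question back to the user rather than returning an answer for the wrong context.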

Semantics

Semantics is the study of development and change in the meaning of words. In business applications, semantics is the meaning attached to words. Meaning is a social construction; that is, the people in the organization have a collectively shared definition of how some term, policy, or action is really interpreted.

Semantics is important in applications development and in the applications themselves. If people use the same terms, but have different meanings for the terms, misunderstandings and miscommunications are assured. If embedded in an application, semantically ambiguous data can never be processed by a program without the user being aware of which 'meaning' is in the data. Applications that have semantically mixed data then rely on the training and longevity of employees for proper interpretation of the data. If these key employees leave, the ability to correctly interpret the meaning of the data is lost. Losing the meaning of information can be expensive to the company and can result in lawsuits due to improper handling of information.

An example of semantic problems can be seen in a large insurance company. The company uses the term 'institution' to refer to its major clients for retirement funds. The problem is that 'institution' means different things to different people in the company. In one meeting, specifically convened to define 'institution,' 17 definitions surfaced. The problem with semantic differences is not that 16 of the 17 definitions are wrong. The problem is that all 17 definitions are right, depending on the context of their use. It is the SE's job to unravel the spaghetti of such definitions to get at the real meaning of terms that are not well defined at the corporate level. Unraveling the meaning of 'institution' and getting the user community to reach consensus on a corporate definition took about 20 person-months over a two-year period.

Volume

Volume is the number of business events the system must cope with in some period. The volume of new or changed customers is estimated on a monthly or annual basis whereas the volume of transactions for business operation is usually measured in volume per day or hour, and peak volume. Peak volume is the number of transactions or business events to be processed during the busiest period. The peak period might be annual and last several months, as with tax preparation. The peak might be measured in seconds and minutes, for example, to meet a Federal Reserve Bank closing deadline.

Volume of data is a source of complexity because the amount of time required to process a single transaction can become critical to having adequate response time when processing large volumes. Interactive, on-line applications can be simple or extremely complex simply because of volume. For instance, the ABC rental application will actually process less than 1,000 transactions per day. Contrast this volume with a credit card validation application that might service 50,000 credit check requests per hour. Credit card validation is simple processing; servicing 50,000 transactions per hour is complex.

Applications that mix on-line and batch processing, using software that requires the two types of processes to be distinct, require careful attention to the amount of time necessary to accommodate the volumes for both types of processing. For instance, the personnel application at a large oil company was designed for 20 hours of on-line processing with global access, and four hours of batch reporting. When the system went 'live,' the on-line processing worked like a charm because it had been tested, retested, and overtested. The batch portion, for which individual program tests had been conducted, required about 18 hours because of the volume of processing. After several weeks, the users were fed up because printed reports had been defined as the means of distributing query results, and they had none. The solution required an additional expenditure of over $200,000 to redevelop all reports as pseudo-on-line tasks that could run while the interactive processes were running. Simple attention to the volume of work for batch processing would have identified this problem long before it cost $200,000 to fix.
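
Both examples above reduce to arithmetic that can be done during analysis. The Python sketch below works through them. The function names are hypothetical, and the time budget assumes the simplest possible case, in which requests arrive evenly and each server handles one request at a time; real workloads have peaks, so the true budget is tighter.

```python
def time_budget_ms(transactions_per_hour, parallel_servers=1):
    """Average milliseconds available per transaction if load is spread evenly."""
    milliseconds_per_hour = 3_600_000
    return milliseconds_per_hour * parallel_servers / transactions_per_hour


def batch_fits(required_hours, window_hours):
    """True if the batch work can finish inside the time reserved for it."""
    return required_hours <= window_hours


# Credit card validation: 50,000 requests per hour on one server.
print(time_budget_ms(50_000), "ms per request")

# Personnel application: about 18 hours of batch work, 4-hour batch window.
print("Batch fits in window:", batch_fits(18, 4))
```

At 50,000 requests per hour, a single server has an average of 72 milliseconds per request, which is why simple processing becomes a complex engineering problem at volume. The batch check shows the oil company's shortfall immediately: 18 hours of work cannot fit in a four-hour window.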

DATA COLLECTION TECHNIQUES

There are seven techniques we use for data gathering during application development. They are interviews, group meetings, observation, temporary job assignment, questionnaires, review of internal and outside documents, and review of software. Each has a use for which it is best suited, and each has limitations on the amount and type of information that can be obtained from it. The technique strengths and weaknesses are summarized in Table 4-2, which is referenced throughout this section.

In general, you always want to validate the information received from any source through triangulation. Triangulation is obtaining the same information from multiple sources. You might ask the same question in several interviews, compare questionnaire responses to each item, or check in-house and external documents for similar information. When a discrepancy is found, you reverify it with the original and triangulated sources as much as possible. If the information is critical to the application being correctly developed, put the definitions, explanations, or other information in writing and have it approved by the users separately from the other documentation. Next, we discuss each data collection technique.
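
Triangulation amounts to grouping answers by the item they describe and flagging any item on which the sources disagree. The Python sketch below shows that bookkeeping. The function name triangulate, the source labels, and the sample figures are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def triangulate(answers):
    """Group (item, source, value) answers and return items whose sources disagree."""
    by_item = defaultdict(dict)
    for item, source, value in answers:
        by_item[item][source] = value
    return {
        item: sources
        for item, sources in by_item.items()
        if len(set(sources.values())) > 1
    }


# Hypothetical answers gathered with three different techniques.
answers = [
    ("daily rentals", "interview: clerk", 600),
    ("daily rentals", "questionnaire", 600),
    ("daily rentals", "internal report", 450),
    ("late fee", "interview: clerk", 2.00),
    ("late fee", "internal report", 2.00),
]

for item, sources in triangulate(answers).items():
    print("Reverify:", item, sources)
```

Each flagged item is a discrepancy to take back to the original and triangulated sources, as described above.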

Single Interview

An interview is a gathering of a small number of people for a fixed period and with a specific purpose. Interviews with one or two users at a time are the most popular method of requirements elicitation. In an interview, questions are varied to obtain specific or general answers. You can get at people's feelings, motivations, and attitudes toward other departments, the management, the application, or any other entity of interest (see Table 4-2). Types of interviews are determined by the type of information desired.

Interviews should always be conducted such that both participants feel satisfied with the results. This means that there are steps that lead to good interviews, and that inattention to one or more steps is likely to result in a poor interview. The steps are summarized in Table 4-3. Meeting at the convenience of the interviewee sets a tone of cooperation. Being prepared means both knowing who you are interviewing so you don't make any embarrassing statements and having the first few questions prepared, even if you don't know all the questions.


TABLE 4-2 Summary of Data Collection Techniques

Interviews

Strengths
• Get both qualitative and quantitative information
• Get both detail and summary information
• Good method for surfacing requirements

Weaknesses
• Takes some skill
• May obtain biased results
• Can result in misleading, inaccurate, or irrelevant information
• Requires triangulation to verify results
• Not useful with large numbers of people to be interviewed (e.g., over 50)

Group Meetings

Strengths
• Decisions can be made
• Can get both detail and summary information
• Good for surfacing requirements
• Gets many users involved

Weaknesses
• Decisions with large number of participants can take a long time
• Wastes time
• Interruptions divert attention of participants
• Arguments about turf, politics, etc. can occur
• Wrong participants lead to low results

Observation

Strengths
• Surface unarticulated procedures, decision criteria, reasoning processes
• Not biased by opinion
• Observer gets good problem domain understanding

Weaknesses
• Might not be representative time period
• Behavior might be changed as a result of being observed
• Time consuming

Review Software

Strengths
• Good for learning current work procedures as constrained or guided by software design
• Good for identifying questions to ask users about functions: how they work and whether they should be kept

Weaknesses
• May not be current
• May be inaccurate
• Time consuming

Questionnaire

Strengths
• Anonymity for respondents
• Attitudes and feelings might be more honestly expressed
• Large numbers of people can be surveyed easily
• Best for limited response, closed-ended questions
• Good for multicultural companies to surface biases, or requirements and design features that should be customized to fit local conventions

Weaknesses
• Recall may be imperfect
• Unanswered questions mean you cannot get the information
• Questions might be misinterpreted
• Reliability or validity may be low
• Might not add useful information to what is already known

Temporary Assignment

Strengths
• Good to learn current context, terminology, procedures, problems
• Bases for questions you might not otherwise ask

Weaknesses
• May not include representative work activities or time period
• Time consuming
• May bias future design work

Review Internal Documents

Strengths
• Good for learning history and politics
• Explains current context
• Good for understanding current application
• Saves interview/user time

Weaknesses
• May bias future design work
• Not useful for obtaining attitudes or motivations

Review External Documents

Strengths
• Good for identifying industry trends, surveys, expert opinions, other companies' experiences, and technical information relating to the problem domain

Weaknesses
• May not be relevant
• Information may not be accurate
• May bias future design work


TABLE 4-3 Steps to Conducting a Successful Interview

1. Make an appointment that is at the convenience of the interviewee.
2. Prepare the interview; know the interviewee.
3. Be on time.
4. Have a planned beginning to the interview.
   a. Introduce yourself and your role on the project.
   b. Use open-ended general questions to begin the discussion.
   c. Be interested in all responses, pay attention.
5. Have a planned middle to the interview.
   a. Combine open-ended and closed-ended questions to obtain the information you want.
   b. Follow up comments by probing for more detail.
   c. Provide feedback to the interviewee in the form of comments, such as, "Let me tell you what I think you mean, ..."
   d. Limit your notetaking to avoid distracting the interviewee.
6. Have a planned closing to the interview.
   a. Summarize what you have heard. Ask for corrections as needed.
   b. Request feedback, note validation, or other actions of interviewee.
      • Give him or her a date by which they will receive information for review.
      • Ask him or her for a date by which the review should be complete.
   c. If a follow-up interview is scheduled, confirm the date and time.

A good interview has a beginning, middle, and end. In the beginning, you introduce yourself and put the interviewee at ease. Begin with general questions that are inoffensive and not likely to evoke an emotional response. Pay attention to answers both to get cues for other questions, and to get cues on the honesty and attitude of the interviewee. In the middle, be businesslike and stick to the subject. Get all the information you came for, using the techniques you chose in advance. If some interesting side information emerges, ask if you can talk about it later and then do that. In closing, summarize what you have heard and tell the interviewee what happens next. You may write notes and ask him or her to review them for accuracy. If you take notes, try to get them back to the interviewee for review within 48 hours. Also, have the interviewee commit to the review by a specific date to aid in your time planning. If you say you will follow up with some activity, make sure you do.

Interviews use two types of questions: open-ended and closed-ended. An open-ended question is one that asks for a multisentence response. Open-ended questions are good for eliciting descriptions of current and proposed application functions, and for identifying feelings, opinions, and expectations about a proposed application. They can also be used to obtain any lengthy or explanatory answers. Examples of open-ended question openings are: "Can you tell me about ... " or "What do you think about ... " or "Can you describe how you use ... ".

A closed-ended question is one which asks for a yes/no or specific answer. Closed-ended questions are good for eliciting factual information or forcing people to take a position on a sensitive issue. An example of a closed-ended question is: "Do you use the monthly report?" A 'yes' response might be followed by an open-ended question, "Can you explain how?"

The questions can be ordered in such a way that the interview might be structured or unstructured (see Table 4-4). A structured interview is one in which the interviewer has an agenda of items to cover, specific questions to ask, and specific information desired. A mix of open and closed questions is used to elicit details of interest. For instance, the interview might start with "Describe the current rental process." The respondent would describe the process, most often using general terms. The interviewer might then ask specific questions, such as, "What is the daily volume of rentals?" Each structured interview is basically the same because the same questions are asked in the same sequence. Tallying the responses is fairly easy because of the structure.

An unstructured interview is one in which the interview unfolds and is directed by responses of the interviewee. The questions tend to be mostly open-ended. There is no set agenda, so the interviewer, who knows the information desired, uses the responses from the open-ended questions to develop ever more specific questions about the topics. The same questions used above as examples for the structured interview might also be used in an unstructured interview; the difference is that above, they are determined as a 'script' in advance. In an unstructured situation, the questions flow from the conversation.

Data Collection Techniques 91

TABLE 4-4 Comparison of Structured and Unstructured Interviews

Strengths

Structured
• Uses uniform wording of questions for all respondents
• Easy to administer and evaluate
• More objective evaluation of respondents and answers to questions
• Requires little training
• Results in shorter interviews

Unstructured
• Provides greater flexibility in question wording to suit respondent
• Can be difficult to conduct because interviewer must listen carefully to develop questions about issues that arise spontaneously from answers to questions
• May surface otherwise overlooked information
• Requires practice

Weaknesses

Structured
• Cost of preparation can be high
• Respondents do not always accept high level of structure and its mechanical posing of questions
• High level of structure is not suited to all situations
• Reduces spontaneity and ability of interviewer to follow up on comments of interviewee

Unstructured
• May waste respondent and interviewer time
• Interviewer bias in questions or reporting of results is more likely
• Extraneous information must be culled through
• Analysis and interpretation of results may be lengthy
• Takes more time to collect essential facts

Structured interviews are most useful when you know the information desired in advance of the interview (see Table 4-4). Conversely, unstructured interviews are most useful when you cannot anticipate the topics or specific outcome. A typical series of interviews with a user client begins with unstructured interviews to give you an understanding of the problem domain. The interviews get progressively structured and focused as the information you need to complete the analysis also gets more specific.

User interview results should always be communicated back to the interviewee in a short period of time. The interviewee should be given a deadline for their review. If the person and/or information are critical to the application design being correct, you should ask for comments even after the deadline is missed. If the person is not key in the development, the deadline date signifies a period during which you will accept changes; after that date you continue work, assuming the information is correct.

It is good practice to develop diagram(s) as part of the interview documentation. At the beginning of the next interview session, you discuss the diagram(s) with the user and give him or her any written notes to verify at a later time. You get immediate feedback on the accuracy of the graphic and your understanding of the application. The benefits of this approach are both technical and psychological. From a technical perspective, you are constantly verifying what you have been told. By the time the analysis is complete, both you and the client have confidence that the depicted application processing is correct and complete. From a psychological perspective, you increase user confidence in your analytical ability by demonstrating your problem understanding. Each time you improve the diagram and deepen the analysis, you also increase user confidence that you will build an application that answers his or her need.

Interviews are useful for obtaining both qualitative and quantitative information (see Table 4-2). The types of qualitative information are opinions, beliefs, attitudes, policies, and narrative descriptions. The types of quantitative information include frequencies, numbers, and quantities of items to be tracked or used in the application.

Interviews, and other forms of data collection, can give you misleading, inaccurate, politically motivated, or irrelevant information (see Table 4-2). You need to learn to read the person's body language and behavior to decide on further needs for the same information. Table 4-5 lists respondent behaviors you might see in an interview and the actions you might take in dealing with the behaviors.

For instance, if you suspect the interviewee of lying or 'selectively remembering' information, try to cross-check the answers with other, more reliable sources. If the interview information is found to be false, ask the interviewee to please explain the differences between his or her answers and the other information. The session does not need to be a confrontation; rather, it is a simple request for explanation. Be careful not to accuse or condemn; simply try to get the correct information.

Persistence and triangulation are key to getting complete, accurate information. You are not required to become 'friends' with the application users, but interviews are smoother, yield more information for the time spent, and usually have less 'game-playing' if you are 'friendly' than if you are viewed as distant, overly-objective, or noninterested.

Meetings

Meetings are gatherings of three or more people for a fixed period to discuss a small number of topics and sometimes to reach consensus decisions. Meetings can both complement and replace interviews. They complement interviews by allowing a group verification of individual interview results. They can replace interviews by providing a forum for users to collectively work out the requirements and alternatives for an application. Thus, meetings can be useful for choosing between alternatives, verifying findings, and for soliciting application ideas and requirements.

Meetings can also be a colossal waste of time (see Table 4-2). In general, the larger the meeting, the fewer the decisions and the longer they take. Therefore, before having a meeting, a meeting plan should be developed. The agenda should be defined and circulated in advance to all participants. The number of topics should be kept to between one and five. The meeting should be for a fixed period with specific checkpoints for decisions required. In general, meetings should be no longer than two hours to maintain the attention of the participants. The agenda should be followed and the meeting moved along by the project manager or SE, whoever is running the meeting. Minutes should be generated and circulated to summarize the discussion and decisions. Any follow-up items should identify the responsible person(s) and a date by which the item should be resolved.

Meetings are useful for surfacing requirements, reaching consensus, and obtaining both detailed and summary information (see Table 4-2). If decisions are desired, it is important to ask the decision makers to attend and to tell them in advance of the goals for the meeting. If the wrong people participate, time is wasted and the decisions are not made at the meeting.

TABLE 4-5 Interviewee Behaviors and Interviewer Response

Interviewee Behavior | Interviewer Response
Guesses at answers rather than admit ignorance | After the interview, cross-check answers.
Tries to tell interviewer what she or he wants to hear rather than correct facts | Avoid questions with implied answers. Cross-check answers.
Gives irrelevant information | Be persistent in bringing the discussion to the desired topic.
Stops talking when the interviewer takes notes | Do not take notes at this interview. Write notes as soon as the interview is done. Ask only the most important questions. Have more than one interview to get all information.
Rushes through the interview | Suggest coming back later.
Wants no change because she or he likes the current work environment | Encourage elaboration of present work environment and good aspects. Use the information to define what gets kept from the current method.
Shows resentment; withholds information or answers guardedly | Begin the interview with personal chitchat on a topic of interest to the interviewee. After the person starts talking, work into the interview.
Is not cooperative, refusing to give information | Get the information elsewhere. Ask this person, "Would you mind verifying what someone else tells me about this topic?" If the answer is no, do not use this person as an information source.
Gripes about the job, pay, associates, supervisors, or treatment | Listen for clues. Be noncommittal in your comments. An example might be, "You seem to have lots of problems here; maybe the application proposed might solve some of the problems." Try to move the interview to the desired topic.
Acts like a techno-junkie, advocating state-of-the-art everything | Listen for the information you are looking for. Do not become involved in a campaign for technology that does not fit the needs of the application.

Joint application development (JAD) is a special form of meeting in which users and technicians meet continuously over several days to identify application requirements (see Figure 4-3). Before a JAD session, users are trained in the techniques used to document requirements; in particular, diagrams for data and processes are taught. Then, in preparation for the JAD session, the users document their own jobs using the techniques and collecting copies of all forms, inputs, reports, memos, faxes, and so forth used in performing their job.

A JAD session lasts from 3 to 8 days, and from 7 to 10 hours per day. The purpose of the sessions is to get all the interested parties in one place, to define application requirements, and to accelerate the process of development. Several studies show that JAD can compress an analysis phase from three months into about three weeks, with comparable results. The advantage of such sessions is that users' commitment is concentrated into the short period of time. The disadvantage is that users might allow interruptions to divert their attendance at JAD meetings, thus not meeting the objective. JAD is discussed in more detail in the Introduction to Part II.


FIGURE 4-3 JAD Meeting

Observation

Observation is the manual or automated monitoring of one or more persons' work. In manual observation, a person sits with the individual(s) being observed and takes notes of the activities and steps performed during the work (see Table 4-2). In automated observation, a computer keeps track of software used, e-mail correspondence and partners, and actions performed using a computer. Computer log files are then analyzed to describe the work process based on the software and procedures used.
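As a rough illustration of analyzing log files from automated observation, the sketch below counts how often each person performs each action in each piece of software. The record layout (user, software, action) and the sample entries are hypothetical; real log files will have their own format and would need parsing first.

```python
from collections import Counter, defaultdict

# Hypothetical log records: (user, software, action).
# The layout and values are assumptions made for illustration only.
LOG = [
    ("clerk1", "rental_app", "create_rental"),
    ("clerk1", "rental_app", "create_rental"),
    ("clerk1", "email", "send"),
    ("clerk2", "rental_app", "return_item"),
]


def summarize_actions(records):
    """Count occurrences of each (software, action) pair per user."""
    summary = defaultdict(Counter)
    for user, software, action in records:
        summary[user][(software, action)] += 1
    return summary


if __name__ == "__main__":
    for user, counts in summarize_actions(LOG).items():
        # most_common() lists the most frequent steps first
        for (software, action), n in counts.most_common():
            print(user, software, action, n)
```

A frequency summary like this only suggests where the work concentrates; it would still need to be verified with the people observed.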

Observation is useful for obtaining information from users who cannot articulate what they do or how they do it (see Table 4-2). In particular, for expert systems, taking protocols of work is a useful form of observation. A protocol is a detailed minute-by-minute list of the actions performed by a person. Videotaping is sometimes used for continuous tracking. The notes or tapes are analyzed for events, key verbal statements, or actions that indicate reasoning, work procedure, or other information about the work.

There are three disadvantages to observation (see Table 4-2). First, the time of observation might not be representative of the activities that take place normally, so the SE might get a distorted view of the work. Second, the idea that a person is being observed might lead them to change their behavior. This problem can be lessened somewhat by extensive observation during which time the person being observed loses their sensitivity to being watched. The last disadvantage of observation is that it can be time-consuming and may not yield any greater understanding than could be gained through less time-consuming methods of data collection.

Advantages of observation are several. Little opinion is injected into the SE's view of the work. The SE can gain a good understanding of the current work environment and work procedures through observation. The SE can focus on the issues of importance to him or her, without alienating or disturbing the individual being observed. Some barriers to working with the SEs that are needed for interviews and validation of findings might be overcome through the contact of observation.

Some ground rules for observation are necessary to prepare for the session. You should identify and define what is going to be observed. Be specific about the length of time the observation requires. Obtain both management approval and approval of the individual(s) to be observed before beginning. Explain to the individuals being observed what is being done with the information and why. It is unethical to observe someone without their knowledge or to mislead an individual about what will be done with the information gained during the observation session.

Temporary Job Assignment

There is no substitute for experience. With a temporary job assignment, you get a more complete appreciation for the tasks involved and the complexity of each than you ever could by simply talking about them. Also, you learn firsthand the terminology and the context of its use (see Table 4-2). The purpose, then, of temporary job assignment is to make the assignee more knowledgeable about the problem domain. Temporary assignments usually last two weeks to one month, long enough for you to become comfortable that most normal and exceptional situations have occurred, but not long enough to become truly expert at the job.

Temporary assignment gives you a basis for formulating questions about which functions of the current method of work should be kept and which should be discarded or modified.

The disadvantages of temporary job assignment are that it is time-consuming and that the period chosen may not be representative (see Table 4-2). The choice of period can minimize this problem. The other disadvantage is that the SE taking the temporary assignment might become biased about the work process, content, or people in a way that affects future design work.


Questionnaire

A questionnaire is a paper-based or computer-based form of interview. Questionnaires are used to obtain information from a large number of people. The major advantage of a questionnaire is anonymity, which leads to more honest answers than might be obtained through interviews. Also, standardized questions provide reliable data upon which decisions can be based.

Questionnaire items, like interview questions, can be either open-ended or closed-ended. Recall that open-ended questions have no specific response intended. Open-ended questions are less reliable for obtaining complete factual information and are subject to recall difficulties, selective perception, and distortion by the person answering the question. Since the interviewer neither knows the specific respondent nor has contact with the respondent, open-ended questions that lead to other questions might go unanswered. An example of an open-ended question is: "List all new functions which you think the new application should do."

A closed-ended question is one which asks for a yes/no or graded specific answer. For example, "Do you agree with the need for a history file?" would obtain either a yes or no response.

Questionnaire construction is a learned skill that requires consideration of the reliability and validity of the instrument. Reliability is the extent to which a questionnaire is free of measurement errors. This means that if a reliable questionnaire were given to the same group several times, the same answers would be obtained. If a questionnaire is unreliable, repeated measurement would result in different answers every time. Questionnaires that try to measure mood, satisfaction, and other emotional characteristics of the respondent tend to be unreliable because they are influenced by how the person feels that day. You improve reliability by testing the questionnaire. When the responses are tallied, statistical techniques are used to verify the reliability of related sets of questions.
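The text does not name a particular statistical technique. One commonly used measure of how consistently a set of related questions behaves is Cronbach's alpha, shown here as a minimal sketch under that assumption. The function name and the sample scores are hypothetical; values closer to 1 suggest the related questions are answered consistently.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a set of related questions.

    item_scores is a list with one inner list per question; each inner
    list holds that question's numeric score for every respondent, in
    the same respondent order.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two related questions")
    # Total score per respondent across the related questions
    totals = [sum(scores) for scores in zip(*item_scores)]
    sum_item_variances = sum(variance(scores) for scores in item_scores)
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))


if __name__ == "__main__":
    # Hypothetical answers from five respondents to three related
    # questions, each on a 1-5 scale
    q1 = [4, 5, 3, 2, 4]
    q2 = [4, 4, 3, 2, 5]
    q3 = [5, 5, 2, 2, 4]
    print(round(cronbach_alpha([q1, q2, q3]), 3))
```

This measures internal consistency at a single sitting. The repeated-administration idea described above would instead compare the same group's answers across two sittings.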

Validity is the extent to which the questionnaire measures what you think you are measuring. For instance, assume you want to know the extent to which a CASE tool is being used in both frequency of use and number of functions used. Asking the question, "How well do you use the CASE tool?" might obtain a subjective assessment based on the individual's self-perception. If they perceive themselves as skilled, they might answer that they are extensive users. If they perceive themselves as novices, they might answer that they do not use the tool extensively. A better set of questions would be "How often do you use the CASE tool?" and "How many functions of the tool do you use? Please list the functions you use." These questions specifically ask for numbers which are objective and not tied to an individual's self-perception. The list of functions verifies the numbers and provides the most specific answer possible.

Some guidelines for developing questionnaires are summarized in Table 4-6 and discussed here. First, determine the information to be collected, what facts are required, and what feelings, lists of items, or nonfactual information is desired. Group the items by type of information obtained, type of questions to be asked, or by topic area. Choose a grouping that makes sense for the specific project.

For each piece of information, choose the type of question that best obtains the desired response. Select open-ended questions for general information, lists, and nonfactual information. Select closed-ended questions to elicit specific, factual information, or single answers.

Compose a question for each item. For a closed-ended question, develop a response scale. The five-response Likert-like scale is the most frequently used. The low and high ends of the scale indicate the poles of responses, for instance, Totally Disagree and Totally Agree. The middle response is usually neutral, for instance, Neither Agree Nor Disagree. Examine the question and ask yourself if it has any words that might not be interpreted as you mean them. What happens if the respondent does not know the answer to your question? Do you need a response that says, I Don't Know? Is a preferred response hidden in the question? Are the response choices complete and ordered properly? Does the question have the same meaning for every department and possible respondent? If the answers to any of these questions indicate a problem, reword the question to remove the problem.
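A five-response scale with an optional I Don't Know choice can be sketched as follows. The pole and neutral labels come from the text; the two intermediate labels, "Disagree" and "Agree", and the function name are assumptions added for illustration. Rejecting answers that are not on the scale is one simple way to check that the response choices are complete.

```python
from collections import Counter

# Pole and neutral labels follow the text; "Disagree" and "Agree" are
# assumed intermediate labels.
SCALE = [
    "Totally Disagree",
    "Disagree",
    "Neither Agree Nor Disagree",
    "Agree",
    "Totally Agree",
]
DONT_KNOW = "I Don't Know"


def tally(responses):
    """Tally answers to one closed-ended question, in scale order."""
    allowed = set(SCALE) | {DONT_KNOW}
    unexpected = [r for r in responses if r not in allowed]
    if unexpected:
        raise ValueError(f"responses not on the scale: {unexpected}")
    counts = Counter(responses)
    # Report every choice, including those nobody selected
    return {label: counts[label] for label in SCALE + [DONT_KNOW]}


if __name__ == "__main__":
    answers = ["Agree", "Totally Agree", "Agree", "I Don't Know"]
    for label, n in tally(answers).items():
        print(f"{label}: {n}")
```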

TABLE 4-6 Guidelines for Questionnaire Development

1. Determine what facts are desired and which people are best qualified to provide them.
2. For each fact, select either an open-ended or closed-ended question. Write several questions and choose the one or two that most clearly ask for the information.
3. Group questions by topic area, type of question, or some context-specific criteria.
4. Examine the questionnaire for problems:
   • More than two questions asking the same information
   • Ambiguous questions
   • Questions for which respondents might not have the answer
   • Questions that bias the response
   • Questions that are open to interpretation by job function, level of organization, etc.
   • Responses that are not comprehensive of all possible answers
   • Confusing ordering of questions or responses
5. Fix any problems identified above.
6. Test the questionnaire on a small group of people (e.g., 5-10). Ask for both comments on the questions and answers to the questions.
7. Analyze the comments and fix wording ambiguities, biases, word problems, etc. as identified by the comments.
8. Analyze the responses to ensure that they are the type desired.
9. If the information is different than you expected, the questions might not be direct enough and need rewording. If you don't get useful information that you don't already know, reexamine the need for the questionnaire.
10. Make final edits, print in easy-to-read type. Prepare a cover letter.
11. Distribute the questionnaire, addressing the cover letter to the person by name. Include specific instructions about returning the questionnaire. Provide a self-addressed, stamped envelope if mailing is needed.

If you have several questions that ask similar information, examine the possibility of eliminating one or more items. If you are doing statistical analysis of the answers, you might want similar questions to see if the responses are also similar (i.e., are correlated). If you are simply tallying the responses and acting on the information, try to use one question for each piece of information needed. The minimalist approach keeps the questionnaire shorter and easier to tally.

Pretest the questionnaire on a small group of representative respondents. Ask them to give you feedback on all of the items that they don't understand, that they think are ambiguous, badly worded, or have responses that do not fit the item. Also ask them to complete the questionnaire. The answers of this group should highlight any unexpected responses that, whether the group identified a problem or not, mean that the question was not interpreted as intended. If the pretest responses do not provide you with new information needed to develop the project, the questionnaire might not be needed or might not ask the right questions. Reexamine the need for a questionnaire and revise it as needed. Finally, change the questionnaire based on the feedback from the test group. The pretest and revision activities increase the validity of the questionnaire.

Provide a cover letter for the questionnaire that briefly describes the purpose and type of information sought. Give the respondent a deadline for completing the questionnaire that is not too distant. For instance, three days is better than two weeks. The more distant the due date, the less likely the questionnaire will be completed. Include information about respondent confidentiality and voluntary questionnaire completion, if they are appropriate. Ideally, the questionnaire is anonymous and voluntary. To the extent possible, address the letter to the individual respondent.

Give the respondent directions about returning the completed questionnaire. If mailing is required, provide a stamped, self-addressed envelope. If interoffice mail is used, provide your mail stop address. If you will pick up responses, tell the person where and when to have the questionnaire ready for pickup.

Document Review

New applications rarely spring from nothing. There is almost always a current way of doing work that is guided by policies, procedures, or application systems. Study of the documentation used to teach new employees, to guide daily work, or to use an application can provide valuable insight into what work is done.

The term documents refers to written policy manuals, regulations, and standard operating procedures that organizations provide as a guide for managers and employees. Document types include those that describe organization structure, goals, and work. Examples of each document type follow:

Policies
Procedures
User manuals
Strategy and mission statements
Organization charts
Job descriptions
Performance standards
Delegation of authority
Chart of accounts
Budgets
Schedules
Forecasts
Any long- or short-range plans
Memos
Meeting minutes
Employee training documents
Employee manuals
Transaction files, e.g., time sheets, expense records
Legal documents, e.g., copyrights, patents, trademarks, etc.
Historical reports
Financial statements
Reference files, e.g., customers, employees, products, vendors

Documents are not always internal to a company. External documents that might be useful include technical publications, research reports, public surveys, and regulatory information. Examples of external documents follow:

Research reports on industry trends, technology trends, technological advances, etc.
Professional publications with salary surveys, marketing surveys, or product development information
IRS or American Institute of CPA reports on taxes, workmen's compensation, affirmative action, financial reporting, etc.
Economic trends by industry, region, country, etc.
Government stability analyses for developing countries in which the application might be placed
Any publications that might influence the goals, objectives, policies, or work procedures relating to the application

Documentation is particularly useful for SEs to learn about an area with which they have no previous experience. It can be useful for identifying issues or questions about work processes or work products for which users need a history. Documents provide objective information that usually does not discuss user perceptions, feelings, or motivations for work actions.

Documents are less useful for identifying attitudes or motivations. These topics might be important issues, but documents may not contain the desired information.

Software Review

Frequently, applications are replacing older software that supports the work of user departments. Study of the existing software provides you with information about the current work procedures and the extent to which they are constrained by the software design. This, in turn, gives you information about questions to raise with the users, for instance, how much do they want work constrained by the application? If they could remove the constraints, how would they do the work?

The weaknesses of getting information from software review are that documentation might not be accurate or current, code might not be readable, and the time might be wasted if the application is being discarded.

To summarize, the methods of collecting information relating to applications include interviews, group meetings, observation, questionnaires, temporary job assignment, document review, and software review. For obtaining information relating to requirements for applications, interviews and JAD meetings are the most common.

DATA COLLECTION AND APPLICATION TYPE

In this section, we identify the data gathering techniques most useful for each application type. Like most aspects of application development, the techniques can be used for all application types, but because of their strengths and weaknesses, they do not always result in the type of information that is needed most. In this section, we first match data collection techniques to the data types discussed in the first section. Then, the data types are matched to application types (from Chapter 1). Next, we match the data collection techniques to application types based on the data types they have in common.

Data Collection Technique and Data Type

Table 4-7 summarizes the discussion of the above sections. By matching the data collection technique to the data type, we are more likely to identify information of interest than we would with a poorly matched technique. As the table shows, interviews and meetings are useful for eliciting all types of information. This is the reason they are most frequently used in application work.

Observation provides only crude numerical estimates of volumes, and is restricted to current time, varying ambiguity, and possibly variable semantics (see Table 4-7). Because the information from an observation is unstructured, some skill is required of the SE to impose a structure on it that fits the situation. Also, the information may be incomplete.

Questionnaires can ask structured questions about any time frame but only obtain complete answers for questions asked (see Table 4-7). If the questions are open-ended, the completeness might be quite low. Ambiguity in questionnaires should be low, but the question semantics might be misinterpreted by the respondents. Questions about volume at a department or organization level are usually inappropriate. Questions about the volume of transactions or time for transaction processing for individual workers would yield meaningful information.

Data Collection and Application Type 99

TABLE 4-7 Data Collection Techniques and Data Type

Technique | Time | Structure | Completeness | Ambiguity | Semantics | Volume
Interview | All | All | All | All | Varies | All
Meeting | All | All | All | All | Varies | All
Observation | Current | Unstruct. | Incomplete | May vary | Varies | Crude measure
Questionnaire | All | Structured | Complete for questions asked | Low | Fixed but might be subject to interpretation | Individual volumes only
Temporary job assignment | Current | Unstruct. | Incomplete | Low-med. | Varies | For period of observation but may not be representative
Internal documents | Past-current | Unstruct. | Incomplete | Low-med. | Varies | Maybe
External documents | Mostly current-future | Unstruct. | Incomplete | Low-med. | Relatively fixed | N/A
Software review | Past-current | Structured | Complete for software | Low-med. | Fixed | Maybe

Temporary job assignments are similar to observation in having a high degree of uncertainty associated with the information obtained (see Table 4-7). The information tends to be current, unstructured, and incomplete depending on the period of work. Ambiguity varies from low to medium depending on how well-defined and structured the work is. Semantic content might vary depending on the shared definitions in the work group.

Documents provide unstructured, incomplete information from which no relevant volume information is likely. The time orientation differs depending on whether the documents are internal or external to the company (see Table 4-7). Internal documents are mostly oriented to the past or current situation. External documents are mostly oriented to current or future topics. The semantics of external documents on mature technologies or topics tend to be relatively fixed while those of internal documents might vary by department or division.

Software provides past, and possibly current, information that is structured because it is automated. The ambiguity should be low to medium, and semantics should be fixed since the application embeds definitions of data and processes in code. Information on volumes may be present but should be cross-checked using other methods.

Data Type and Application Type

Application types are transaction processing (TPS), query, decision support (DSS), group decision support (GDSS), executive information (EIS), and expert systems (ES). Each of these has one or more predominant data type characteristics that identify it. Table 4-8 shows all applications categorized for all data types. Here we discuss only the data types that differentiate between application types.

TABLE 4-8 Data Type by Application Type

Application | Time | Structure | Completeness | Ambiguity | Semantics | Volume
TPS | Current | Structured | Complete | Low | Fixed | Any
Query | Past, current | Structured | Complete | Low | Fixed | Any
DSS | All | Structured | Varies | Low-med. | Varies | Med.-high
GDSS | Current-future | Unstructured | Incomplete | Med.-high | Varies | Low
EIS | Future | Unstructured | Incomplete | Med.-high | Varies | Low-med.
Expert system | Current based on past | Semistructured | Incomplete | Med.-high | May vary | Low

TPS contain predominantly known, current, structured, complete information (see Table 4-8). Recall that TPS are the operational applications of a company. To control and maintain records of current operations, you must have known, structured, current, and complete information.

Query applications have similar characteristics to TPS, with the difference that they might concentrate on historical information in addition to current information (see Table 4-8). Queries are questions posed of data to find problems and solutions, and to analyze, summarize, and report on data. To perform summaries and reports with confidence, the data must be structured, complete, and interpreted consistently, being both unambiguous and of fixed semantics.

DSS are statistical analysis tools that allow development of information that aids the decision process. The data that identify DSS might represent all time frames, and may be incomplete, ambiguous, of variable semantics, and of medium to high volume (see Table 4-8). DSS might be used, for instance, in analyzing which of two variations on a given product might enjoy the larger market share. To do this analysis, past sales, current sales, and sales trends in the industry might all be analyzed and tied together to develop an answer.

GDSS are meeting facilitation tools for groups of people. GDSS tools operate in a structured manner working on data that is unstructured, current, and future-oriented. GDSS mostly deal with data that is incomplete and contains semantic and other ambiguities (see Table 4-8). The tools themselves are complete, unambiguous, and so forth, but the meeting information they process is not.

EIS are future-oriented applications that allow executives to scan the environment and identify trends, economic changes, or other industry activity that affect their governance of a company. EIS deal mostly with 'messy' data that is unstructured, incomplete, ambiguous, and contains variable semantics (see Table 4-8). Interpretation is always a problem with such data, which is why executives who excel at reading the environment are highly compensated.

Last, expert systems manage and reason through semistructured, incomplete, ambiguous, and variable semantic data (see Table 4-8). Experts and ESs take random, unstructured information and impose a structure on it. They reason through how to interpret the data to remove ambiguity and to fix the semantics. Therefore, even though the data coming into the application might have these fuzzy characteristics, the data processing is actually highly structured.

Data Collection Technique and Application Type

Finally, in discussing different data types, we want to know which data collection techniques are best for each application type. By combining the information in Tables 4-7 and 4-8, we develop Table 4-9 to summarize data collection techniques for each application type. The table entry in boldface shows the principal method of data collection for each technique.

TABLE 4-9 Data Collection Technique and Application Type

Technique | TPS | Query | DSS | GDSS | EIS | ES
Interview | X* | X | X | X | X | X
Meeting | X | X | X | X | X |
Observation | X | X | X | Limited | Limited | X
Questionnaire | X | X | X | | |
Temporary job assignment | X | X | X | | |
Internal documents | X | X | X | Limited | |
External documents | X | X | X | X | X | X
Software review | X | X | X | Limited | Limited | Limited

*Boldface identifies most frequently used method.

TPS and query applications can profit from the use of all techniques. Meetings and interviews predominate because they elicit the broadest range of responses in the shortest time (see Table 4-9). Observation and temporary job assignment are particularly useful in obtaining background information about the current problem domain, but need to be used with caution so as not to prejudice the design of the application. Questionnaires are useful when the number of people to be interviewed is over 50. Also, questionnaires are useful in identifying characteristics of users that determine, for instance, training required of users during organizational feasibility analysis. Also, if the screen requires, for instance, colors or different types of screen arrangements, questionnaires might be useful for presenting a small set of alternatives from which the actual users choose.

DSS also are shown as having a use for all data collection techniques, but not all techniques are practical in all cases (see Table 4-9). DSS are generally developed for use by people in jobs with a significant amount of discretion in what they do and how they do it. Therefore, observing or working with one or two people as representative may result in a biased view of the application requirements for a general purpose DSS. Even for a custom DSS, observation and job assignments might both be impractical if the SE does not know enough about the job being supported to interpret what she or he observes. The same holds true of documents. Documents, such as statistical reports, might be useful for providing samples of the types of analyses desired in a DSS. Other documents, such as policies, procedures, and so on, are not likely to be relevant to the application. For general purpose DSS with a large number of users, questionnaires are a useful way to identify the range of problems and analysis techniques required in the DSS. This information might be followed by interviews or meetings to determine DSS details.


GDSS are usually custom-built suites of software packages that provide different types of support for automated meetings. As such, the SE working on a GDSS environment needs to know the types of issues, number of participants, as well as types of reasoning and group consensus techniques desired. GDSS components are neither common knowledge nor frequently used; you might build one GDSS in a career. Therefore, significant time would be spent finding out about the market, vendors, and GDSS components. External documents on vendor products are useful in developing questions that elicit the required information. After knowledge of the market is obtained, interviews and meetings are useful to determine the specific requirements and to review, with users, what the GDSS can and cannot do. Other methods might have some limited value. For instance, observation of an actual meeting that might be automated would be useful for the SE to gain insight about how a tool might work. Internal documents that provide information about meetings that the GDSS is expected to support would also be useful. Both of these techniques, observation and document review, have a specific limited role in providing the information needed to build a GDSS. Any software review that is done would be a review of other companies' GDSS facilities or of vendor products, rather than a review of in-house software.

EIS are similar to GDSS in the rarity and general lack of knowledge about what an EIS is. EIS are not standard applications with a screen for data entry of some type and reports that are displayed. EIS are information presentation facilities that can be structured with menus and selection tools, but may display document pages, newspaper articles, book abstracts, summary reports, and so on. EIS are usually built for a small number of users, which eliminates the use of questionnaires. EIS are custom and one-of-a-kind environments for which past documents or software will be of limited value. Observation is most likely limited because executives would be uncomfortable being observed. Temporary job assignment is not possible because you cannot just 'be an executive' for a week or two. This leaves external documents, interviews, and meetings as the most likely techniques for data collection (see Table 4-9). As with GDSS, external documents will be used mostly to identify the market, vendors, and products. Interviews are most likely to be used to determine executives' information needs and preferred delivery platforms.

Finally, SEs use interviews, observation, and external documents the most in developing expert systems (see Table 4-9). Experts frequently can talk about external aspects of their jobs, the physical cues they use as inputs, and the result of their reasoning and how it is applied to the business. They are just as frequently unable to discuss their reasoning processes and how they put the cues together to make sense of unstructured situations. Experts, by definition of the term expert, have so internalized their work that they just do it. They don't think consciously about how they are doing what they do. Therefore, observation, in particular, the use of protocol analysis, is useful in getting information the expert might not be able to articulate. Protocol analysis is time-consuming and indefinite because you, the SE, are inferring a reasoning process from actions taken. At best, the protocol analysis gives you questions to ask about the work that assist the experts in discussing aspects of work they ordinarily cannot. Thus, observation is interleaved with interviews to discuss what is observed. As the process continues, structure is imposed on both the data and the problems to begin to develop the ES. The process of obtaining an expert's reasoning processes is called knowledge elicitation. The process of structuring the unstructured data and reasoning information is called knowledge engineering. Knowledge engineering is an activity that is difficult to learn and requires training through an apprenticeship approach in which the trainee works with an expert knowledge engineer.
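The pairing of techniques with application types described in this section can be kept as a simple lookup. The following Python sketch is only an illustration: the names ALL_TECHNIQUES, TECHNIQUES_BY_APPLICATION, and suggest_techniques are hypothetical, and the 'primary' and 'limited' groupings are one reading of the prose above, not a definitive encoding of Table 4-9.

```python
# Hypothetical encoding of the discussion above; all names are illustrative.
ALL_TECHNIQUES = (
    "interview",
    "meeting",
    "observation",
    "questionnaire",
    "temporary job assignment",
    "internal documents",
    "external documents",
    "software review",
)

# "primary": techniques the text describes as generally useful.
# "limited": techniques the text describes as having a limited role.
TECHNIQUES_BY_APPLICATION = {
    "TPS": {"primary": ALL_TECHNIQUES, "limited": ()},
    "Query": {"primary": ALL_TECHNIQUES, "limited": ()},
    "DSS": {"primary": ALL_TECHNIQUES, "limited": ()},
    "GDSS": {
        "primary": ("interview", "meeting", "external documents"),
        "limited": ("observation", "internal documents", "software review"),
    },
    "EIS": {
        "primary": ("interview", "meeting", "external documents"),
        "limited": ("observation", "internal documents", "software review"),
    },
    # The text names only the most-used techniques for expert systems.
    "ES": {
        "primary": ("interview", "observation", "external documents"),
        "limited": (),
    },
}


def suggest_techniques(application_type, include_limited=False):
    """Return candidate data collection techniques for an application type."""
    entry = TECHNIQUES_BY_APPLICATION[application_type]
    techniques = list(entry["primary"])
    if include_limited:
        techniques.extend(entry["limited"])
    return techniques


if __name__ == "__main__":
    print(suggest_techniques("EIS"))
    print(suggest_techniques("GDSS", include_limited=True))
```

A lookup like this is only a starting point. As the DSS discussion notes, a technique that is listed as useful may still be impractical for a particular project.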

PROFESSIONALISM AND ETHICS

A profession is defined as a job requiring advanced training. Computer information systems development and any job dealing with information technologies qualify as professions. Professionalism is acting in accordance with the highest expectations of a professional group. Those expectations are codified in professional codes of ethics for various organizations. The organizations relating most closely to IS professions are the Association for Computing Machinery (ACM) and the Data Processing Management Association (DPMA). Both organizations have ethical conduct codes and the codes are similar. The most widely publicized code, that of the Association for Computing Machinery [1990], follows:

1. The developer shall act with integrity at all times.
   a. The developer shall qualify an opinion outside his or her area of competence.
   b. The developer shall not falsify his or her qualifications.
   c. The developer shall not knowingly issue false statements about the present or expected status of a system.
   d. The developer shall not misuse confidential or proprietary information.
   e. The developer will remain sensitive to and will reveal potential conflicts of interest.

2. The developer should constantly strive to increase his or her competence in the profession.
   a. A developer will diligently attempt to develop systems that perform their intended functions and satisfy the organization's needs.
   b. A developer will help his or her colleagues develop professionally.

3. A developer shall accept only assignments for which there is reasonable expectation of meeting the goals of the system.

4. A developer should use his or her special knowledge to advance the health, privacy, and general welfare of the public and society.
   a. A developer should always consider the individual's right to privacy when working with data.
   b. A developer should refrain from participating in a project in which he or she feels there will be undesirable consequences for individuals, organization, or society as a whole.

If you read the ACM Code of Ethics carefully, you will note that it contains ethical topics and professionalism topics. To separate out what is professional conduct from what is ethical conduct, we first define ethics terms and relate ethics to IS professions. Anything that is unethical is also unprofessional, but the reverse is not true. Professionalism is a broader subject than ethical behavior. In fact, the early name for codes of ethics was 'codes of professional behavior.' Ethics is in the section on data collection because many of the issues are concerned with user relations and are most evident in data collection activities.

So, what is ethics? Ethics is the branch of philosophy that studies moral judgment and reasoning. A dilemma is any situation requiring a choice between two unpleasant alternatives. Therefore, an ethical dilemma is any situation in which a decision results in unpleasant consequences requiring moral reasoning. The addition of information technologies to organizations presents novel, little-understood opportunities for unethical behavior that are rarely discussed in texts.

Ethics is an issue of growing interest as it relates to information technologies. You, as users and developers of ITs, are sometimes in particular circumstances that subject you to dilemmas that need to be reasoned through to reach an ethical decision. One problem with ethics is that it is misunderstood as religious upbringing and the application of religious thought to real life situations. In fact, that is incorrect. Ethical decisions and reasoning are based on philosophies of rights, equity, and utility, that is, the greatest good for the greatest number of people. Ethics requires evaluation of alternatives, requiring only belief in the equality and dignity of man. Next, we discuss ethics as it relates to different aspects of data collection and user interactions in application development. Then, a procedure for reasoning that is likely to lead to ethical decisions is presented for your use.

Ethical Project Behavior

Confidentiality

Always be trustworthy with information told in confidence. In fact, assume that any interview information is in confidence, unless the person being interviewed is specifically told that it is 'on the record.' Besides being unethical, telling 'tales out of school' will eventually return to hurt your career. If you think some information gained in privacy should be shared, ask if the interviewee minds if you discuss it. With permission, the bounds of confidentiality are removed and you are free to discuss the information.

The exception to this rule occurs when a person confides in you about an illegal act. You are legally bound to report any illegal activity to the managers and company authorities and, if no action is taken, to the police. By law, if you do not report illegal acts, you are an accessory to the act and are also liable to legal action.

Privacy

Experts have a right to know when their experience and knowledge are being used in an application. The basic rule is to treat others as you would like to be treated. Would you like it if the company observed your use of computers and built systems based on it? Especially in building expert systems, there are ethical issues about ownership of expertise. There should be no observation, in person or by computer, without permission. No one should be coerced into cooperation. Participation should be voluntary.

Ownership

Computers are now so much a part of corporate life that we tend to get confused about who owns the resources. On an intellectual level, most people recognize that the company that owns the computers also owns the computer time. But, in a given situation, most people feel that if the resource is not used it is wasted, and that computer time is like the ether, a free resource that is there for the taking. Most executives do not feel the same way, whether or not there is a policy about computer resource use.

Find out, in advance, the company policy or owner feelings about personal use of computing resources, then follow their guidance. Actions like running a program for a friend, doing personal finances, keeping track of the baseball team, and so on may or may not be ethical, depending on how the company feels about the use of its resources.

Who owns work and work-related products should be spelled out in detail so that if you feel something is rightfully yours, so does the client/company, and you can feel ethical about taking it. For instance, technical, user, or operational documentation, screen designs, data dictionary, program code, vendor literature, or other products that you develop or gather in the course of development are all subject to ownership confusion. If you work for a consulting company and develop a proprietary application, like ABC's rental system, you have no right to sell the processing to other companies. This right is negotiable and belongs only to the client unless that right is specifically itemized in the contract. Be clear about ownership and you are less likely to be fired or sued over ownership rights.

The expertise that you gain from working on a project is intellectual property. Expertise is yours unless you sign a contract to the contrary. However, it is unethical to use your company-specific knowledge for personal, noncompetitor, or competitor financial gain unless you have an agreement with your employer about such use. Usually employers ask that you not divulge proprietary information, but the definition of proprietary may be open to interpretation. Also, employers can bar you from using information for one to two years if they can prove that it might hurt their business. The best course of action is to get such issues in the open and decided in advance so no conflict occurs.

Politics

Try to never be mixed up in a political battle. This is easier than it sounds, especially if you are the SE or project manager. Politics is the science of management often driven by personal motivation. In organizations, most people have the company's interest in mind when they make decisions; everyone is also assumed to at least consider their personal situation in decisions, as well. Some people put personal improvement ahead of all other considerations, even to the detriment of the company. Extreme selfish motivation without regard to the outcome for others or the company is unethical.

In a political battle, the politician(s) try to manipulate the project results to improve their position in the company. Political maneuvering might take different forms: stalling, lying, artificial requirements, false cooperation, or different public and private statements. You, as the SE or PM, must become sensitive to such actions and learn how to defuse them. The tactics are manifested in the discussion of interviewee actions and interviewer reactions in Table 4-5.

Courtesy

It is not necessary to tell every project problem to the user. You are ethically bound to discuss problems that might impact schedule, budget, or accuracy. When to tell a user about problems requires common sense. You should tell them early enough to warn them that the problem is coming, and late enough not to have been a whistleblower for nothing. Never wait until the last minute when nothing can be done to fix the problem, or all project participants lose credibility. Always solicit user assistance in problem resolution once they are told. The purpose of weekly status meetings is to provide status and identify problems and their anticipated resolution. These problems always foreshadow schedule and budget problems when they remain outstanding for a long period. A problem outstanding several months with no solution in sight will probably impact the schedule and budget. In keeping the user up-to-date on technical problems you indirectly apprise them of potential cost and budget overruns.

Personal Manner and Responsibility

When people work on a project with others, they sometimes lose sight of their contribution as standing on its own for quality review. Somehow the notions of 'on time, within budget, and accurate' have meaning to the project but not to the individual who is coding and testing a module. One role of the PM and SEs is to instill the sense of responsibility in every person. Each person should know their tasks, budget, expected resource use, and due date. Each person should be held accountable for meeting their deadlines and for having no errors in the code. Accountability is easy to displace in project work; who is responsible becomes diffuse. Some people say the project manager is always accountable. Some say the analysts and SEs. Some say no one. The short answer is that everyone is responsible for and should be made accountable for his or her own work and its integration into the project whole.

Do not talk to your manager, client, or your employees about work problems that do not relate to project completion. This is just good business. Managers and clients want answers and solutions, not problems. Therefore, they should be informed of status and problems that might someday affect them, but should otherwise be left alone. A manager doesn't want to know how Suzie in the typing pool or Carl in the copy room butchered your work. You deal with it and forget it. If you have a problem with the quality of someone's work who does not report to you, mention it to that person, and if unresolved, talk to their manager. The less accusatory and more factual you can be, the less like a whiner and complainer you appear. Be sure you can back up any accusations you make.

Do not tell the client or your manager about your personal problems unless you have a personal relationship. Personal problems can always be blamed for everything that goes wrong, but that is neither adult nor ethical. Henry Ford's famous quote, "Never complain, never explain," comes to mind here. Your job at work is to work, so just do it.

Do not get emotionally involved with the user. If there is a budding relationship, it can wait until the project is complete. Emotional involvements are easy to fall into when you are together 10 to 15 hours a day for months at a time. They also are prone to collapse as soon as a new project begins and you and they both work with others 10 to 15 hours a day. Emotional attachments cloud judgment and do not belong in the office.

Never intentionally mislead. Never lie. Never give false impressions, false perceptions, or any information that might cause users to infer a better, bigger, more functional application than you plan to deliver. Users will form their opinions based on what you and their managers tell them. Don't oversell the application and what it can do for their job. Also, if a downsizing is taking place at the same time, don't falsely give people hope that their jobs will be saved when they might not be. You don't raise alarms, but you don't give false hope either.

Ethical Reasoning

When you feel you are confronted with a problem that requires ethical reasoning, you need some way to identify all potential stakeholders, to evaluate the alternative courses of action, and to reason through the alternatives. One such method is presented here as a way to initiate reflection on your own thinking about the way you reason through tough problems. This is certainly not the only method of problem reasoning.

Identify Stakeholders

First, identify who might benefit or suffer from your decision. This action identifies stakeholders, people who have a stake in the outcome of your action. This is a difficult task, especially with computer use when you might not know the stakeholders personally. Stakeholders might be stockholders of a company, the company itself, your boss, you, the user community, the user/client for the application, society, or people subject to direct or indirect connection to the application. For instance, space shuttle astronauts, patients in a hospital, people who live near the plant in which the application will run, e-mail recipients, report users, governments, data entry clerks and their managers, all might be stakeholders.

Identify Actions Stakeholders Would Choose

Then, identify the action each stakeholder would prefer you to take and why. This task defines all possible actions. Begin with yourself. What do you want to do? Why do you think this is the best decision? Answer these questions from the perspective of each stakeholder group. Putting yourself in each stakeholder group's position requires objectivity and distance from the problem.

Eliminate Alternatives

Next, determine if there are any policies, procedures, laws, or other guidelines that make one or more alternatives untenable. Cross them off the list. Once a type of conduct crosses over into governance by laws, it is no longer an ethical issue, but becomes a legal one. Always obey the laws of the country you are in and the country you represent. For instance, bribery is a way of life in many countries, but not in the United States. Therefore, you are legally bound not to use bribery in business when you work for an American firm.

Policies and procedures of companies are similar in codifying conduct, but do not hold the same stringency of penalty for their transgression. Violation of policies is usually a fireable offense, meaning you lose your job when you violate a policy. Procedures are less stringent, but are expected to be followed. You might receive a letter of reprimand for not following a procedure exactly.

Guidelines, such as the professional code of ethics listed above, also provide heuristics about conduct to help you in governing your work behavior. There is no direct penalty in not following a code of ethics. You might be sued or fired, but the punishment is not from the professional organization.

Reason Through Negative Outcome Alternatives

For the possible courses of action remaining on your list, reason through each by asking key negative questions. If the answer to any of these negative outcome questions is yes, remove the alternative action from the list.

Are the rights of any person or group violated by this action? Consider the right to privacy, ownership of information about individuals' buying habits, payment habits, income, tax status, and so on. Consider the rights to company privacy of customer, financial, personnel, medical, and other proprietary information. Ask if the lack of security and access controls, for instance, subjects the database to casual browsing by system users. If such browsing could result in a violation of privacy to customers, it should be prevented.

Does taking this action result in inequitable treatment of a person or group? Equitable treatment requires judgment of equality. In multinational companies, inequity might be seen as a business decision. For instance, many US corporations initially got into international business by dumping their second-rate quality goods in other markets. Was this ethical? The answer is in the manner in which it was done. If the goods were sold as second quality, there is no issue. If the goods were sold as first quality, the companies basically lied and were unethical.

Companies might be subject to inequity because of their internal staff quality, too. Does the company lose money because of the inefficiency of design? A manager, for instance, might insist on using a particular software because he knows it, even though it is not efficient for the task. The manager is making a trade-off of current knowledge versus cost and time for learning a new product, a trade-off that can cross the line into unethical behavior when it costs the company tangible amounts of money. Using mainframes that rent for millions instead of networks that cost thousands could be construed as unethical when networks are not even considered because of a lack of expertise. In other words, making a business decision to stay with a significantly more expensive alternative after considering all alternatives is ethical. Avoiding a comparison of alternatives or making a decision because of technical ignorance is not ethical.

Does taking this action have the potential of placing a person or company in jeopardy financially, physically, legally, or morally? Hospital applications that hook patients to computerized monitors, transportation industry applications that affect safety of planes and cars, power plant applications that deal with monitoring power-generation equipment, and so forth, are all potentially life-threatening. We need such applications, but their design and maintenance must be of the highest possible quality to pose the least risk to human life. If corners are cut on analysis, design, or testing, lives can be lost.

Reason Through Neutral and Positive Outcome Alternatives

For remaining actions, ask key positive outcome questions to select the best alternative. Does taking this action result in the best possible outcome for all stakeholders? What is the result of taking no action?

If only negative outcomes are possible, does taking this action result in the least harm to all stakeholders? If this is the case, who suffers and what type of injury? If the stakeholder is warned in advance, can the problem be averted?

Select a Course of Action

When all the pros and cons of each alternative have been identified, select the alternative that produces the greatest good for the greatest number of people, that does not violate anyone's rights, and that results in the most equitable decision, with all stakeholders' equity considered.
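The reasoning steps above can be sketched as a filter-then-select procedure. The Python sketch below is a hedged illustration, not a substitute for judgment: the Alternative class, its yes/no fields, and the net_benefit score are all hypothetical simplifications, and in practice each answer comes from reasoning about stakeholders rather than from a flag.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Alternative:
    """One possible course of action, with hypothetical yes/no judgments."""
    name: str
    barred_by_law_or_policy: bool = False  # "Eliminate Alternatives" step
    violates_rights: bool = False          # negative outcome question
    inequitable: bool = False              # negative outcome question
    places_in_jeopardy: bool = False       # negative outcome question
    net_benefit: int = 0                   # assumed score: stakeholders helped minus harmed


def select_course_of_action(alternatives: List[Alternative]) -> Optional[Alternative]:
    # Cross off alternatives made untenable by laws, policies, or procedures.
    remaining = [a for a in alternatives if not a.barred_by_law_or_policy]
    # Remove any alternative for which a negative outcome question is answered yes.
    remaining = [
        a for a in remaining
        if not (a.violates_rights or a.inequitable or a.places_in_jeopardy)
    ]
    if not remaining:
        return None  # nothing survives; the alternatives need to be rethought
    # Prefer the greatest good for the greatest number (highest assumed score).
    return max(remaining, key=lambda a: a.net_benefit)


if __name__ == "__main__":
    options = [
        Alternative("hide the delay", places_in_jeopardy=True, net_benefit=5),
        Alternative("report the delay now", net_benefit=3),
        Alternative("bribe the vendor", barred_by_law_or_policy=True, net_benefit=9),
    ]
    choice = select_course_of_action(options)
    print(choice.name if choice else "no acceptable alternative")
```

The single numeric score is the weakest part of the sketch, since the greatest good for the greatest number rarely reduces to one figure. The useful part is the ordering: legal and policy constraints first, negative outcome questions second, and comparison of the survivors last.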

SUMMARY

Data gathering is done during every phase of application development, but serves different purposes in each phase. The types of data collected depend on the type of application and the phase of development.

Data types refer to the characteristics of data for time orientation, structure, completeness, ambiguity, semantics, and volume. Attention to data types in selection of data collection technique is less likely to cause errors and more likely to find errors than inattention to data type. The cost of an error rises dramatically the later in the development process it is found. Time orientation of data refers to past, present, or future data requirements for an application. Data structure refers to the extent to which data can be classified. Data completeness is the extent to which desired information is present. Ambiguous data have unclear or multiple meanings; companies strive for unambiguous definitions for data. Data semantics are the meanings that we, as organization employees, give to data. Volume is the number of each item of interest in an application. Volumes can have widely varying time orientations. SEs must attend to peak as well as average volume.

Several data collection techniques were discussed, including interviews, group meetings, questionnaires, observation, temporary job assignment, review of internal and external documents, and review of software. Interviews are meetings between two or three people for obtaining any type of information. Interviews can be structured or unstructured. Questions asked can be open-ended or closed-ended.

Group meetings include four or more people and can substitute for interviews or can be used to validate interview findings. Joint application development meetings are a special type of meeting specifically convened to develop application requirements. Special training and planning are required for JAD sessions. Both interviews and meetings require attention to an agenda and time period.

Observation is the monitoring of one or more persons' work. Observation is useful for learning a problem domain and is most often used in expert system development. A data analysis technique called protocol analysis is used to infer the reasoning processes of experts from detailed manuscripts of their actions during a period.

Temporary job assignment is an alternative means of gaining problem domain experience for nonmanagerial, nonexecutive jobs. Questionnaires are structured forms of interviews conducted on many people, usually more than 50. Statistical techniques are frequently used in analyzing questionnaire results. Reliability and validity of the questions are issues to be considered in questionnaire development.

Document review is useful in gaining background information about an application area. Documents can be internal or external to the company.

Software review is the analysis of programs and documentation to learn the details of a current application.

In developing the information about data collection technique related to application type, we also related data collection technique to data type and data type to application type. From these analyses, we find that interviews and meetings are most frequently used because they are the only techniques useful regardless of application type. The other techniques have specific purposes for each application type. For instance, software review (for TPS), temporary job assignment, and observation are useful in gaining problem domain experience. Observation is most useful in expert system development. External documents are important in unique GDSS and EIS development. Questionnaires are most useful in DSS for general use in a company, for surveying user preferences for design options, or for obtaining detailed information about the application from a large number of people.

KEY TERMS

closed-ended question
data ambiguity
data completeness
data semantics
data structure
data time-orientation
data volume
dilemma
document
ethical dilemma
ethics
intellectual property
interview
joint application development (JAD)
knowledge elicitation
knowledge engineering
meetings
observation
open-ended question
peak volume
politics
profession
professionalism
protocol
questionnaire
reliability
semantics
stakeholder
structured interview
triangulation
unstructured interview
validity

EXERCISES

1. Ethics is far from a settled issue, especially as it relates to use of information technologies. One issue, for instance, is that development of artificially intelligent applications might be unethical because we do not know how they will turn out. That means we cannot predict if a person or company will get hurt. Debate this issue and develop conclusions for your class. Summarize the debate and send it to a trade magazine such as Communications of the ACM, Computerworld, or Datamation.

2. For ABC Video, play the roles of Vic, Mary, and Sam. Either write or playact an interview to elicit requirements for the proposed rental application. Mix the use of open and closed questions to follow a chain of logic.

3. Develop a questionnaire that might be used with the user community of the Office Information System case in the Appendix.

STUDY QUESTIONS

1. Define the following terms:

ambiguity
ethical dilemma
joint application development
professional
professionalism
reliability
semantics
structure of data
triangulation

2. Why are data types important? What happens when the wrong data collection techniques are used? How does data collection technique relate to costs in applications?

3. How do data types relate to applications?

4. Discuss the cost of fixing errors in applications.

5. How do ambiguity and semantics differ? Why are they both important?

6. When are temporary job assignments not a useful data collection technique?

7. What type of information can be got from temporary job assignments?

8. What is the use of reviewing documents? How do you choose whether to review internal or external documents?

9. Why would you ever review software? What are the pitfalls of software and software documentation review?

10. Compare and contrast individual interviews and meetings, listing two purposes that are the same for both techniques and two that are different.

11. Compare and contrast structured and unstructured interviews.

12. Compare and contrast open-ended questions and closed-ended questions.

13. Describe how an unstructured interview progresses. What types of questions are used as the opening? How does the interviewer know what types of questions to ask? What types of questions are used after the opening?

14. Which kinds of data can you best get from observation?

15. Which kinds of data can you best get from external document review?

16. Which kinds of data are you unlikely to get from a questionnaire?

17. Which data collection technique is most useful for obtaining expert reasoning processes? Why? Describe the use of the technique.

18. Which data collection technique is most useful for obtaining executive needs for an EIS?

19. Why are expert systems and EIS unique?

20. Which question types are used for factual, detailed explanations of work processes?

21. How do you select between structured and unstructured interviews?

22. What is the typical follow-up to an interview? Who does what and when?

23. Why are meetings a useful data collection technique? How do you plan a meeting to avoid wasting time?

24. Describe how to develop a questionnaire.

25. Describe protocol analysis. When is it used? What application type(s) is it most used for?

26. What type of data are most likely in a DSS?

27. Describe the time-orientation of EIS. What type of data is associated with EIS?

28. Describe knowledge engineering. When is it used and why?

29. What is the difference between professionalism and professional ethics?

30. Discuss three of the six areas of ethical conduct by IS professionals.

31. Describe an ethical dilemma you might face in application development work. How should it be dealt with?

32. Describe the reasoning process for developing an ethical solution to some issue.

EXTRA-CREDIT QUESTIONS

1. For ABC Video's rental application, we still do not know accurate counts for volumes of rentals, late returns, on-time returns, late fees, or customers. How would you go about finding this information? Be specific in identifying a data collection technique, the number of people involved, and the amount of time involved. At what stage of the development process should this information be obtained?

2. The ACM's Code of Ethics, number 2, discusses the need for developers to constantly increase competence in their profession and to help others to do likewise. Is this an ethical issue? Who are the stakeholders to the issue? Reason through the issues and develop your own thoughts on the subject. Compare them with those of your classmates, arguing for your position.

3. List and define the data type for all data currently identified for ABC's rental application. Refer to Chapter 2 for the data definitions.

PART II

PROJECT INITIATION

The two chapters in this section address the activities that take place before analysis of a specific project begins. Project initiation can take place in several different ways. First, it can be part of a larger enterprise reengineering effort. Second, a project might be initiated as part of an information systems planning effort. Third, a project might be initiated based on a user request for a specific project. All three methods of project initiation are equally feasible and equally useful in beginning an application development project.

Chapter 5 addresses the first two project initiation efforts. The main discussion is how to do a reengineering design of an organization and plan applications and technologies to support the redesign. Enterprise level planning, such as an information systems plan, is described as a subset of activities that focus on applications only and are an abbreviated reengineering study. Most researchers and industry experts, such as James Martin, recommend at least an information systems plan (ISP) as a worthwhile planning activity in existing organizations. Both reengineering and ISPs result in plans for multiple applications which are prioritized for development.

Enterprise level planning exercises are costly, and some companies cannot afford to spend computer resources on such studies. In these companies, application development projects are initiated via a direct request from a user. Also, companies that do enterprise level plans might desire to reconfirm recommendations that might be two or three years old. For direct initiation and for reconfirmation of recommendations, a user memo to the Information Systems Manager or to an IS Steering Committee can initiate project assessment. Such an assessment is called a feasibility study.

Chapter 6 details the activities involved in a feasibility study. A feasibility study is performed to assess the financial, technological, and organizational readiness of the company for the application. Feasibility is an important analysis that is usually conducted on individual application projects rather than on a whole group of applications, such as might be identified in an ISP or organizational reengineering project. The feasibility analysis determines the extent to which new technologies, skills, or training are required by the user and developer staffs and assesses the ability of the company to pay for the development project.

Part of the technical feasibility is to define a direction for the application development through an evaluation of technical development alternatives. For instance, an application might be on-line or real-time; it might be on a standalone PC, on a PC connected to a local area network, or on terminals attached to a mainframe; it might use a 4GL database software such as Oracle™ or a full-service database such as IMS DB/DC.¹ Likely alternatives are evaluated to determine the extent to which functional requirements would be supported, and to determine any alternative-specific benefits that might be present. A recommendation for technical concepts is made and may (or may not) be accepted at the completion of the feasibility study. Even though the concept need not be cast in concrete at this time, it helps to have a sense of the operational environment for conducting the analysis phase of the project.

¹ Oracle™ is a trademark of the Oracle Corporation. IMS DB/DC is a product of IBM Corporation.

A risk assessment should be performed as part of feasibility analysis. The risk assessment identifies technical, personnel, and financial problems that could hinder the successful completion of the project. For each risk defined, two types of plans are developed. First, a contingency plan to deal with the problem if it should occur is defined. Second, immediate steps to minimize the probability of the risk's occurrence are planned and taken.
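The two plans required for each risk can be pictured as a simple record. The following is a minimal Python sketch, not a format prescribed by the text; the class name `Risk`, its fields, the helper `incomplete_risks`, and the sample risks are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    """One risk identified during feasibility analysis (hypothetical structure)."""
    description: str
    category: str          # "technical", "personnel", or "financial"
    contingency_plan: str  # what to do if the problem occurs
    mitigation_steps: str  # immediate steps to reduce the probability of occurrence


def incomplete_risks(risks):
    """Return the risks that lack either of the two required plans."""
    return [r for r in risks
            if not r.contingency_plan.strip() or not r.mitigation_steps.strip()]


# Hypothetical example data, not taken from the text.
risks = [
    Risk("Key analyst leaves mid-project", "personnel",
         contingency_plan="Bring in a contract analyst",
         mitigation_steps="Cross-train a second analyst now"),
    Risk("New database software is unfamiliar", "technical",
         contingency_plan="",
         mitigation_steps="Schedule vendor training before analysis"),
]

for r in incomplete_risks(risks):
    print("Missing a plan:", r.description)
```

A check of this kind would only flag that a plan is missing; judging whether a plan is adequate remains a matter for the project team.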

CHAPTER 5

ORGANIZATIONAL REENGINEERING AND ENTERPRISE PLANNING

INTRODUCTION

As the economy becomes more global and the business climate more competitive, companies need to reevaluate what they do and how they do it. Reengineering is the evaluation and redesign of business processes. The goal is to streamline the organization to include only the business functions that should be done rather than necessarily improve on what is done today. Reengineering can introduce radical change into organizations with information technologies as key to supporting new organizational forms and providing information delivery to its users.

When radical approaches are not necessary (or wanted), the techniques of reengineering can be scaled down to provide enterprise level plans for information systems. Enterprise level planning techniques originally were developed in response to managers' complaints that IS departments did not respond to their information needs and frequently built applications that the company did not need. Enterprise planning techniques match IS plans to organization plans and are also used within the context of reengineering. Techniques include stakeholder analysis, critical success factors, and information systems planning (ISP). In this chapter, we first develop the conceptual basis and methodology for reengineering. Enterprise techniques are defined for use in reengineering analysis. Then, enterprise level IS planning, without organization design, is described. The last section identifies computer-aided software engineering (CASE) tools that support reengineering and enterprise level analysis techniques.

CONCEPTUAL FOUNDATIONS OF ENTERPRISE REENGINEERING

Organizational reengineering is the evaluation and redesign of business processes, data, and technology (see Figure 5-1). The goals of reengineering are to achieve dramatic improvements in quality, service, speed, use of capital, and reduced costs. The rationale for business reevaluation comes from need. The need may be to turn around a failing company, to increase competitiveness, to improve customer service, to increase product quality, or any combination of these. The philosophy of reengineering is that, when implemented alone, total quality programs, organization redesign, or information technology are inadequate for an organization to realize its potential. The main resources of organizations today are people and information. Both people and data, the raw material of information, have to be optimized to even try to meet the company's potential. Organization redesign optimizes the people resource; the interjection of quality improves both organization and data. Complete reevaluation of technology that provides the information infrastructure optimizes the data and delivery of data to the people who need it. This chapter discusses how to evaluate the organization and its information requirements, how to reengineer both the organization and the technology, and how to plan for the implementation of radical change.

FIGURE 5-1 Reengineering Targets

Reengineering theory comes from management and IS. Management theories about organization design, job design, and reskilling are all used in redesign of work and the organization structure.

First, good management practice dictates that only essential activities be done. To assure this, reengineering assumes an organization level plan for all functions, activities, and processes that accomplish the activities. It also assumes that the plan is actively managed to ensure that all processes directly relate to the organization's mission, goals, and objectives. Nonessential processes, departments, and layers of management are eliminated to streamline, speed, and lower the cost of process performance.

Second, in job redesign, a caseworker approach is preferred to an assembly line approach. Caseworkers¹ have increased control, decision making, authority, and discretion. Redesigned, enlarged jobs improve the quality of work life, thus improving the quality of work.

To satisfy employees and customers, for instance, customer service departments might adopt a caseworker approach to work. In the caseworker approach, employees know the entire process from beginning to end and work independently to service their personal customers. In addition, the caseworker works closely with the marketing and sales force for those same customers. The consequences of caseworkers are great. The customer service agents have reskilled, enlarged jobs that are more interesting. Intrabusiness communications between, for instance, sales and customer service, are improved. External customer relations should also improve because customers have one consistent representative with whom they work.

Enlarged jobs are not a way to squeeze more work out of already overworked people. In the customer service example, initially a clerk does a small number of activities that present a partial view of a large number of customers. In a reengineered job, the clerk does a large number of related activities that present a complete view of a small number of customers. The move is away from an assembly line approach and toward self-sufficient workers or work groups.

¹ Hackman [1990].

The first reengineering improvement for caseworkers comes from job redesign. If the 80-20 rule is applied to most businesses, 80% of the transactions in the business are the norm, and 20% are exceptions. Organizations are typically designed to handle exceptions well. The 80% of their work that is normal tends to take much longer than needed. One goal of reengineering is to increase handling speed and quality of handling for the 80% of normal transactions by an order of magnitude, that is, by at least 10 times. A second goal is to decrease the number of exceptions to as close to zero as possible. For instance, at Ford, one way to prevent errors in the receipt of goods from vendors was to accept only complete, exact shipments. Any item that did not match an order item caused the entire order to be returned. Vendors got the message quickly that Ford would not accept their shoddy work practices any more and were forced to revise their procedures as well.
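The arithmetic behind these two goals can be made concrete. The sketch below is a Python illustration only; the handling times and the reduced exception share are hypothetical figures, not data from Ford or any other company, and the function name `average_handling_time` is an assumption introduced here.

```python
def average_handling_time(normal_share, normal_time, exception_time):
    """Weighted average time per transaction, given the share that is normal."""
    exception_share = 1.0 - normal_share
    return normal_share * normal_time + exception_share * exception_time


# Hypothetical figures: minutes per transaction.
NORMAL_TIME = 10.0
EXCEPTION_TIME = 30.0

# Before: the 80-20 split, with normal work taking longer than needed.
before = average_handling_time(0.80, NORMAL_TIME, EXCEPTION_TIME)

# Goal 1: normal transactions handled 10 times faster.
# Goal 2: exceptions cut from 20% to an assumed 5% of transactions.
after = average_handling_time(0.95, NORMAL_TIME / 10, EXCEPTION_TIME)

print(f"before: {before:.2f} min, after: {after:.2f} min")
```

Under these assumed numbers, most of the time that remains after the speedup is spent on exceptions, which suggests why reducing their number is treated as a goal in its own right.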

Empowerment of the caseworkers comes from job redesign, removal of errors from the process, and from the use of any and all information and technologies that assist them in performing their job. Information technologies enable reengineering. Information technologies (IT) are any technologies that support the storage, retrieval, organization, management, or processing of data. A technology plan and goals should be developed and managed at the organization level.

In addition, data, the raw material for information, requires recognition and organizational commitment as a corporate resource. As a corporate resource, data requires the same careful planning and ongoing management as cash-on-hand, office equipment, or personnel. Data must be managed at the corporate level as a key asset of the organization.

To manage and plan for the organization structure, its data, and its technology, enterprise level (i.e., the entire organization is the enterprise) plans must be devised. These plans, or 'architectures,' provide a snapshot of the current organization. An enterprise architecture is an abstract summary of some organizational component's design. The organizational strategy is the basis for deciding where the organization wants to be in three to five years. When matched to the organizational strategy, the architectures provide the foundation for deciding priorities for implementing the strategy.

The organization process architecture identifies the major functions of the organization, the activities that define the functions, and the processes that accomplish the activities. Examples of each of these levels are shown in Figure 5-2. The process architecture does not detail the procedures for how to do each task.
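The three levels of a process architecture can be sketched as a nested structure. The Python below is a minimal illustration under stated assumptions: the function, activity, and process names are hypothetical and are not taken from Figure 5-2, and `list_processes` is a helper introduced here only to show how the hierarchy can be traversed.

```python
# Hypothetical three-level process architecture:
# function -> activities -> processes. No procedures are recorded.
process_architecture = {
    "Customer Service": {
        "Handle Orders": ["Take Order", "Confirm Order"],
        "Handle Complaints": ["Log Complaint", "Resolve Complaint"],
    },
    "Accounting": {
        "Pay Vendors": ["Match Invoice to Order", "Issue Payment"],
    },
}


def list_processes(architecture):
    """Flatten the hierarchy into (function, activity, process) triples."""
    return [
        (function, activity, process)
        for function, activities in architecture.items()
        for activity, processes in activities.items()
        for process in processes
    ]


for function, activity, process in list_processes(process_architecture):
    print(f"{function} / {activity} / {process}")
```

Keeping the structure free of procedural detail mirrors the point that the architecture names what is done, not how each task is carried out.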

During reengineering analysis, the entire process architecture is reevaluated for its support in achieving organizational goals. For processes that survive the analysis, the organization is redesigned. Theories of interdependence, linking mechanisms, and organization design are applied to structuring work groups in the reengineered organization.² These theories are not new. Rather, theorists and practitioners have talked about them for years with little movement of theory into practice. Over the same years, information technologies matured sufficiently to support the integration and data sharing required of the information organization. In the early 1990s, a ground swell of changing companies became an avalanche, with many companies trying to implement the theories using information technologies to support the revised organization.

² Interdependence theory is Thompson's [1967]. Galbraith [1976] and Galbraith and Nathansen [1979] propose linking mechanisms with some organization design. Other organization design work is listed in the references.

FIGURE 5-2 Sample Process Architecture

The second architecture, data architecture, identifies the enduring, stable data entities (people, places, organization, events, and applications) that are critical to the organization maintaining itself as a going concern. IS theories of information modeling and information systems planning are used in data analysis. In particular, entity-relationship modeling is used for documenting data and its relationships. Entity-process analysis is used to design subject area databases. Entity-application analysis and process-application analysis are used to define automation requirements. These analyses originated in IBM's information systems planning (ISP) methodology and are expanded in reengineering.
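One way to picture entity-process analysis is as a matrix that records what each process does to each entity. The sketch below is an illustration in Python built on several assumptions: the entity and process names are hypothetical, the C/R/U/D (create, read, update, delete) coding is a common convention rather than something this passage specifies, and grouping entities by the process that creates them is only one plausible heuristic for proposing subject area databases.

```python
from collections import defaultdict

# Hypothetical entity-process matrix: matrix[process][entity] = usage codes.
matrix = {
    "Take Order":    {"Customer": "CRU", "Order": "C", "Product": "R"},
    "Ship Order":    {"Order": "RU", "Shipment": "C", "Product": "R"},
    "Bill Customer": {"Customer": "R", "Order": "R", "Invoice": "C"},
}


def group_by_creator(matrix):
    """Group entities under the process that creates them (code 'C')."""
    groups = defaultdict(list)
    for process, usage in matrix.items():
        for entity, codes in usage.items():
            if "C" in codes:
                groups[process].append(entity)
    return dict(groups)


def shared_entities(matrix):
    """Entities used by more than one process: candidates for shared data."""
    users = defaultdict(set)
    for process, usage in matrix.items():
        for entity in usage:
            users[entity].add(process)
    return sorted(e for e, procs in users.items() if len(procs) > 1)


print(group_by_creator(matrix))
print(shared_entities(matrix))
```

Entities that appear under several processes are the ones where sharing data, rather than duplicating it in each application, would matter most.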

The network architecture identifies all locations of work and their communications requirements. It is the basis for deciding telecommunications support.

Finally, the technology architecture contains information about platforms [e.g., mainframe, local area network (LAN), or personal computer (PC)], special-purpose technologies (e.g., multimedia, imaging, e-mail) and the locations of each. By mapping the network and technology architectures, organization level technology changes can be identified. New technologies, such as imaging, can be evaluated and positioned to provide the most leverage to the organization.

Successful reengineering is not assured. Necessary conditions, or absolute prerequisites, for reengineering include:

1. Management commitment, usually from the CEO or top manager of the organization.

2. Formally articulated organizational mission, goals, and objectives.

3. Full commitment of the reengineering team.

4. Training and support for the reengineering team.

5. The desire to change the organization and its culture.

In addition to the necessary conditions, reengineering assumes the following:

1. Nothing escapes review. The reengineering team has as its mission to evaluate the organization, including its structure, jobs, data, processes, and technology. Recommendations in any of the five areas of assessment may be made.

2. Enlarging jobs and empowering job holders as caseworkers rather than as assembly line workers is desirable.

3. Business and IS organizations must become partners in the redesign and technology empowerment.

4. In improving quality of processes, elimination of errors via elimination of functions and superfluous processes is desirable.

5. There are no technology constraints. Recommendations will be made without regard to current budgetary, organizational, or other constraints. Implementation planners, based on recommendations and managers' priorities, will attend to constraints.

6. Data shareability is desired. While normalizing data within an application environment minimizes redundancy in an application, minimizing organizational data redundancy via data administration and across applications is the real goal. Building subject area databases and providing data access based on need rather than on organization structure is the means to achieving organizationally minimized data redundancy.

This assumption of no constraints may not be realistic in that politics and survival of participants can subvert the desired objectivity in a reengineering project. One of the management challenges in reengineering is to keep politics from blocking the needed change.

Industry leaders and successful turnaround companies who now thrive provide the motivation for sweeping change. These companies are organized differently from their competition. Industry leaders today tend to have fewer departments, fewer layers of management, and fewer people doing analogous jobs than their competition. Their success is partly organizational and partly cultural. These successful companies succeed because they define their market in terms of what their customers want and demand, then they exceed those expectations. Because these companies do not have excess structure, they are flexible to continuously reevaluate what they are doing and how well they are doing it.

Ford Motor Company, for instance, turned around their losing company when introducing their 'Quality is number one' program. They compared their organization to others, including Japanese firms, and found they had many more people performing similar functions. In some cases, like the accounting area, the difference was more than 10 to 1 in numbers of people. Ford threw away the book about how accounting should be done, eliminated parochial interests about where decisions should be made, made data sharing from databases universal, and reduced their staff by over 60%. The result of the extensive changes is happier people with more skills used in a given job. Individual jobs are done faster and more cheaply with almost no errors.

The philosophy of reengineering is to define stakeholders' goals and then exceed them. The philosophy is based on the idea that change can be good. Companies must scan the business horizon and actively change the organization as needed to lower costs, and to improve, speed, and increase the quality of service(s) in meeting its mission. They must be equally proactive about discontinuing services, departments, applications, or technologies that no longer relate to organizational goals and objectives. In short, the organization must be proactive rather than reactive about all aspects of its operation.

PLANNING REENGINEERING PROJECTS

Schedules for reengineering projects can be based on several different scenarios. The goal of all scenarios is the same: redesign of organization, jobs, processes, data, and supporting technologies. A secondary goal is that all redesign planning be completed in a short period of time. The short period should be within four months from the time the team is formed until all recommendations are presented to the senior manager sponsor(s).

Reengineering projects can be completed faster or slower depending on several factors. First is the amount of actual time spent by each team member. Ideally, all team members should be relieved of their current duties and assigned full time to the reengineering effort. In reality, the best managers, whom you want on the team, also are the most needed to run the current business. So, part-time or short duration full-time commitments might have to suffice.

In all cases, one to four senior IS staff (i.e., consultants, senior analysts, software engineers, or project managers) are assigned full time to the project. Much of the work performed during the reengineering project is identical to that performed as part of an information systems planning exercise. IS staff who already know ISP only need to learn several types of matrix analysis and organizational design to be fully capable of performing the reengineering work.

The second major factor in determining the amount of time is the size of the organization being analyzed. A 100-person, five-department organization can be analyzed easily within four months. A 10,000-person, 200-department organization with four hierarchic levels can also be analyzed within four months, but requires more people and more discipline within the team. A good rule of thumb is to have one person for every 10-15 departments or every 100 jobs.
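The rule of thumb can be turned into a quick estimate. The Python sketch below is an illustration under assumptions: the text does not say how to round or how to reconcile the department-based and job-based figures, so the two are computed separately, rounded up, and floored at one person; the function names are hypothetical, and the job count of 250 in the last line is an invented input.

```python
import math


def team_size_by_departments(departments):
    """Return (low, high) estimates: one person per 15 or per 10 departments."""
    low = max(1, math.ceil(departments / 15))
    high = max(1, math.ceil(departments / 10))
    return low, high


def team_size_by_jobs(jobs):
    """Return an estimate of one person per 100 jobs."""
    return max(1, math.ceil(jobs / 100))


# The 200-department organization from the text.
print(team_size_by_departments(200))
# The five-department organization from the text.
print(team_size_by_departments(5))
# A hypothetical organization with 250 distinct jobs.
print(team_size_by_jobs(250))
```

Because the number of distinct jobs is not the same as the number of people, the job-based estimate depends on a count the analyst would have to gather first.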

Four months is the time most authors recommend for completion of the entire reengineering project, from inception to development of the implementation plan. The actual pilot testing and implementation of the changes might take several years to complete. There are several good reasons for a short time schedule. First, managers cannot suspend their work indefinitely and run a company, too. If several people are allocated full time, it drains the management resource. Second, with a mentality oriented to quarterly results in the United States, most managers will not wait longer than that to prepare for change. Third, the project is bound to be known throughout the organization soon after it begins. When reorganizations are imminent, work is replaced by gossip and worry. The shorter the time of the reengineering study, the less lost work to the organization.

When the end date is mandated, the team does the amount of work they can accomplish within the time constraint. This approach to work is called 'level of effort.' With a level of effort approach, the team works at capacity up to the deadline, and what does not get done, does not get done. For large projects, then, the level of effort approach assumes an incomplete analysis.

The assumption here is that error-prone and bottleneck processes are the targeted activities. While a high-level description of the entire enterprise is possible, only the problem activities are actually in the level-of-effort study.
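A level-of-effort plan can be sketched as filling a fixed capacity with the targeted problem activities. The Python below is only an illustration: the activity names and effort estimates are hypothetical, and taking problem activities in their listed order until capacity runs out is an assumed rule, not a procedure given in the text.

```python
def plan_level_of_effort(activities, capacity_days):
    """Split the targeted problem activities into done and not done.

    activities: list of (name, effort_days, is_problem) tuples, in priority order.
    Only problem activities (error-prone or bottleneck) are candidates;
    whatever does not fit before the deadline is left undone.
    """
    targeted = [(name, effort) for name, effort, is_problem in activities
                if is_problem]
    done, not_done, remaining = [], [], capacity_days
    for name, effort in targeted:
        if effort <= remaining:
            done.append(name)
            remaining -= effort
        else:
            not_done.append(name)
    return done, not_done


# Hypothetical activities: (name, estimated analysis days, problem activity?).
activities = [
    ("Order Entry", 10, True),
    ("Payroll", 15, False),
    ("Vendor Payment", 20, True),
    ("Returns Handling", 15, True),
]

done, not_done = plan_level_of_effort(activities, capacity_days=30)
print("Analyzed:", done)
print("Left undone:", not_done)
```

The second list makes the cost of a mandated end date visible: it names the problem activities the study knowingly leaves unanalyzed.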

Scenarios for three levels of user manager participation are provided in Figures 5-3 through 5-5. Figure 5-3 shows a short burst of participation, similar to a joint requirements planning (JRP).³ In this scenario, users and analysts are trained and go off-site for an intensive 4-8 days (depending on the size of the organization) of requirements, data, process, and entity-process analysis. An alternative that minimizes the amount of time managers are absent from work is to hold the JRP meetings over one or two weekends. More than 90% of the data gathering can be completed using the JRP approach. In this scenario, most of the analyses are done by the full-time project staff, but are presented for review and decisions to the user-team participants. In no case do the IS staff make the decisions and recommendations alone.

The second scenario assumes constant part-time participation over time (see Figure 5-4). In this scenario, user managers are available for meetings, interviews, and analysis sessions 1-3 hours each day. They must be committed to participating and must not waver from participation, or the project will falter. Notice the dotted lines for all activities. The dotted lines imply a part-time, longer activity. The full-time IS staff actually do most of the legwork, interviews, and preparation for analyses. But, once again, the decisions are made by the user managers, not the IS staff.

The final scenario assumes full-time commitment for the duration of the project (see Figure 5-5). With full-time users and full-time IS staff, the length of the project can be as short as three weeks and, for large organizations (e.g., 1,000 people, 50 departments), as long as 16 weeks. Table 5-1 shows the major tasks and activities with expected percentages of effort for each task.

3 JRP is an innovation of IBM Corporation. It is fully discussed in the introduction to Part II.


[Figure 5-3: Gantt chart of reengineering tasks plotted against weeks. Tasks: Identify Sponsor; Assign Staff; Scope Project; Create Schedule; Identify Mission Statement; Gather Information; Develop Data Architecture; Develop Process Architecture; Develop Network/Technology Architectures; Develop Analysis; Develop Org. Implementation Plan; Develop IT Implementation Plan. Legend: Findings Presented to Users; Recommendations Presented to Sponsor; Implementation Plan Presented to Sponsor; Part-Time Activity; Full-Time Activity.]

FIGURE 5-3 Reengineering with Part-Time Users

REENGINEERING METHODOLOGY

Reengineering is most easily done within the scope of information system planning (ISP) projects. With a greater balance of process and data analysis, and several additional activities, reengineering uses the same information as the ISP. The major steps and their results, type of questions asked, and analyses are listed in Table 5-1. The steps are summarized in Figure 5-6, which shows a significant amount of overlap between steps. The times allocated to the tasks are as individual stand-alone activities and do not reflect the amount of actual time spent on the step. For instance, the architectures are all allocated one week. But they are preceded by activities of four weeks during which they should also be developed.


FIGURE 5-4 Reengineering with Continuous, Part-Time Users

All of those particular steps are iterative and require three to five weeks to complete. A detailed description of each reengineering step follows.

Identify Project Sponsor

The first step of reengineering is to enlist or be enlisted by the project sponsor. The project sponsor is a senior level manager who will pay for and champion the project. A champion is an individual with commitment, enthusiasm, credibility, and influence who can act as a 'cheerleader' for the project and its outcomes. The sponsor is the overall project manager for the reengineering project and must have the authority, fortitude, and desire to change the organization and its work, based on the recommendations from the reengineering analyses.


[Figure 5-5: Gantt chart of reengineering tasks plotted against weeks. Tasks: Identify Sponsor; Assign Staff; Scope Project; Create Schedule; Identify Mission Statement; Gather Information; Develop Data Architecture; Develop Process Architecture; Develop Network/Technology Architectures; Develop Analysis; Develop Org. Implementation Plan; Develop IT Implementation Plan. Legend: Findings Presented to Users; Recommendations Presented to Sponsor; Implementation Plan Presented to Sponsor; Part-Time Activity; Full-Time Activity.]

FIGURE 5-5 Reengineering with Full-Time Users

Assign Staff

Three or four senior or middle managers from the user area should be assigned to the reengineering project for a period not to exceed four months. At least one month of the initial commitment should be full time; the remainder of the work may require only part-time commitment. Two or three senior IS managers, SEs, data administrators, or consultants should be assigned to the project full time for its entire duration.

All team members should attend a reengineering workshop or class together to fully acquaint them with the techniques and goals of the activity. The individuals assigned must have commitment to this work. They must be senior enough and good enough at their own jobs to have instant credibility within their organization. Without both of these requirements, the target of four months for the total effort is doubtful.

TABLE 5-1 Percentage of Reengineering Effort by Task

Activity                                     % Effort   Duration
Obtain sponsor                               N/A
Initiate project                             N/A
Assign staff                                 N/A
Scope project                                2-5%       2 Days (concurrent with next two activities)
Develop schedule                             2-5%       3-5 Days
Identify mission statement                   2-5%       1 Day
Gather data                                  20-25%     3-4 Weeks
Develop process architecture                 6-10%      3 Days-1 Week
Develop data architecture                    6-15%      3 Days-1 Week
Develop and analyze entity/process matrix    20-25%     3-4 Weeks
Develop implementation plan                  20-25%     3-4 Weeks
Develop technology architecture              6-10%      3 Days-1 Week
Total duration                               100%       12-17 Weeks
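The durations in Table 5-1 can be cross-checked against the stated total of 12-17 weeks. The sketch below is only an illustration: it assumes a five-day working week and adds the task durations as if they ran back to back, which ignores the overlap the text describes. The names `task_days` and `total_weeks` are hypothetical.

```python
# Durations from Table 5-1 as (minimum, maximum) working days.
# Assumption (not from the text): 1 week = 5 working days.
DAYS_PER_WEEK = 5

task_days = {
    "Scope project": (2, 2),
    "Develop schedule": (3, 5),
    "Identify mission statement": (1, 1),
    "Gather data": (15, 20),
    "Develop process architecture": (3, 5),
    "Develop data architecture": (3, 5),
    "Develop and analyze entity/process matrix": (15, 20),
    "Develop implementation plan": (15, 20),
    "Develop technology architecture": (3, 5),
}


def total_weeks(tasks):
    """Sum minimum and maximum durations and convert days to weeks."""
    low = sum(lo for lo, _ in tasks.values()) / DAYS_PER_WEEK
    high = sum(hi for _, hi in tasks.values()) / DAYS_PER_WEEK
    return low, high


low, high = total_weeks(task_days)
print(f"{low:.1f} to {high:.1f} weeks")
```

Under these assumptions the range comes out at roughly 12 to just under 17 weeks, which agrees with the table's total.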

Scope the Project

The key criteria for properly scoping a reengineering project are data self-sufficiency and user commitment. Data self-sufficiency means that 70% or more of the data used in performing the business functions originates within the subject organization. The goal of scoping is to identify a group of departments that create their own information and are not dependent on other departments for data to do their work. Control over data creation equals data self-sufficiency.

The second criterion is user commitment. User commitment means that the managers participating in the reengineering project must be committed to changing the organization. This is not as difficult as it might sound. Few people enjoy their job when they know it is inefficient and hampered by ineffective organization or systems designs. When the best managers in an organization that needs change are assigned, they become enthusiastic about the prospect of designing the work groups to fit the work. Because their positions in the company are not at risk, there is little reluctance to participate.

Determining data self-sufficiency requires development of a quick entity-relationship diagram (ERD), process hierarchy, and entity/process matrix. The results should be about 80% complete and address the major entities and processes. The analysis of the matrix is to determine where data are created, nothing else. If data are not created within the organization, the amount of data and the creating (or originating) organization are identified and added to the study. In addition, the amount of data for all entities created within the organization must be identified to determine the percentage of data self-sufficiency. The percentage is derived from the formula shown in Figure 5-7.

The inputs to the formula (I) identify a count of transactions or other work items generated within the target reengineering organization. The outside work (O) represents a count of transactions or other work items coming into the department from elsewhere in the organization. Outside work is not subject to review or error reduction, and the goal is to keep it to a minimum in the study. In Figure 5-7, the target organization generates 75% of its own data and is, therefore, data self-sufficient enough to benefit from reengineering.

Less than 70% data self-sufficiency implies too narrow a scope because of too great a data dependency on outside organizations. Lack of data self-sufficiency artificially constrains (or may mask potential) elimination of errors, organizations, or levels of management that are not needed. If the scope is too narrow, the analysts present the information to the sponsor and request a broadened scope to include the information-creating organization(s).

[Figure 5-6: Gantt chart showing overlapping task bars plotted against weeks. Tasks: Identify Sponsor; Assign Staff; Scope Project; Create Schedule; Identify Mission Statement; Gather Information; Develop Data Architecture; Develop Process Architecture; Develop Network/Technology Architectures; Develop Analyses; Develop Org. Implementation Plan; Develop IT Implementation Plan.]

FIGURE 5-6 Overlap Between Reengineering Tasks

Formula:

I / (I + O) * 100 = % DS

Example:

I = 750,000 records

O = 250,000 records

750,000 / (750,000 + 250,000) * 100 = 75%

Legend:

I = Data generated inside the reengineered departments

O = Data generated outside the reengineered departments that is required for them to do their work

DS = Data self-sufficiency

FIGURE 5-7 Formula for Determining Data Self-Sufficiency
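The formula in Figure 5-7 and the 70% scoping rule can be expressed as a short calculation. This is a minimal sketch; the function name `data_self_sufficiency` and the constant `DS_THRESHOLD` are illustrative names, not part of the methodology.

```python
DS_THRESHOLD = 70.0  # minimum percent of self-created data, per the scoping criterion


def data_self_sufficiency(inside: int, outside: int) -> float:
    """Return % DS = I / (I + O) * 100 for counts of work items."""
    if inside < 0 or outside < 0:
        raise ValueError("counts must be non-negative")
    total = inside + outside
    if total == 0:
        raise ValueError("at least one work item is required")
    return 100 * inside / total


# The example from Figure 5-7: 750,000 inside records, 250,000 outside records.
ds = data_self_sufficiency(750_000, 250_000)
print(f"{ds:.0f}% DS, scope adequate: {ds >= DS_THRESHOLD}")
```

A result below the threshold would suggest asking the sponsor to broaden the scope to include the departments that create the missing data.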

For instance, reengineering might target an accounting function. About 90% of the information in an accounting function originates from other organizations within the company. Without also including those functions in the reengineering study, changes that address, for instance, data accuracy or work location problems, are unlikely to be successful.

The scope might not be complete until the next several tasks are partially complete, due to a lack of information about data and responsibilities. Therefore, the initial scope should be reexamined before completion of the entity/process matrix analysis to reconfirm data self-sufficiency.

Create a Schedule

The team creates a schedule for the entire reengineering project not to exceed four calendar months. Each step has an estimated range of time that should be allotted, shown as a percent of the project total (see Table 5-1). Each task is assigned to a team member who is held accountable for the work.

Identify Mission Statement

Identify the mission statement for the organization with quantified goals for measurement. A mission statement is a short paragraph summarizing the overall purpose of the organization. The details of the document should include goals and objectives, along with determinants of success (i.e., critical success factors) for each, with required data for measuring the extent to which the goals are met (i.e., means-end analysis).

If the organization has no mission statement, or has no quantified goals and objectives, do not attempt to develop these for the organization. Disband the reengineering group and have the managers work on perfecting the mission, goals, and objectives before reconvening the reengineering effort.

Goals should have a three- to five-year horizon and should be specifically measurable (i.e., quantified) (see example in Figure 5-8). There should be at least one goal for each sentence in the mission statement. Goals relate to stakeholders, who are people affected by the outcome. Some stakeholders include customers, vendors, stockholders, owners, and boards of directors.

Increase the number of new customers by 5% each year for five years.

Increase sales to existing customers by 8% per year for five years.

Increase number of rentals per store visit by providing an expert system to assist in selecting movies for rental.

Reduce sales support expenses by 10% in one year.

Reduce overhead expense by 10% each year for two years.

FIGURE 5-8 Example of Organization Goals for ABC Video

Identify critical success factors for determining that goals are met. A critical success factor (CSF) defines some essential process, data, event, or action that must be present for the outcome to be realized. For instance, if the goals in Figure 5-8 are desired, a CSF might be: Ensure that sales staff are fully trained in locating movie information.

The last step of critical success factor analysis is to decide what information is required to measure goal success. In the example, goals relate to sales. The CSF also relates to 'training.' Success measures for sales and for sales staff knowledge of how to find movie information are required. Periodic evaluation with training for ill-informed sales staff is one way. Management needs to know which evaluations have taken place and which misinformed staff have been retrained. If the same person(s) are being retrained repeatedly, management intervention might be warranted.

Intangible goal measurement is just as important as tangible goal measurement. An intangible goal might be increased customer satisfaction. To measure this, an outside polling company can canvass customers and ask different recall or direct questions about their satisfaction with the company's services. Recall-type questions are of the form: Which vendor that you work with has the best customer service? Direct questions are of the form: Rate the customer service of company x.

The next step is to link each CSF, critical information measure, and goal to functions, processes, technology, and data in the organization. If this step cannot be completed yet, defer completion of this task until information gathering is complete. If new entities or processes are defined through CSF analysis, add them to the list for reengineering analysis.
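One way to keep track of these links is a small record structure that ties each goal to its CSFs, measures, and related processes. The sketch below is a hypothetical illustration only; the class and field names are assumptions, and the sample values are taken from the ABC Video example above.

```python
from dataclasses import dataclass, field


@dataclass
class CriticalSuccessFactor:
    description: str
    measures: list[str] = field(default_factory=list)          # information needed to measure success
    linked_processes: list[str] = field(default_factory=list)  # filled in as processes become known


@dataclass
class Goal:
    statement: str
    csfs: list[CriticalSuccessFactor] = field(default_factory=list)


def unlinked_csfs(goals):
    """Return descriptions of CSFs not yet linked to any process."""
    return [c.description for g in goals for c in g.csfs if not c.linked_processes]


training = CriticalSuccessFactor(
    description="Ensure that sales staff are fully trained in locating movie information",
    measures=["Evaluations that have taken place", "Staff who have been retrained"],
)
goal = Goal(
    statement="Increase number of rentals per store visit",
    csfs=[training],
)

# Before information gathering is complete, the link is still open.
print(unlinked_csfs([goal]))
```

A list like the one returned by `unlinked_csfs` corresponds to the work the text says to defer until information gathering is complete.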

Gather Information

Gather information on processes, data, process problems, quality problems, data problems, accessibility to data, timing of work (e.g., lags that cause idle time), time constraints for performance, and problems related to timing. Sample questions include:

What are the major steps to accomplishing each process?

Which processes/procedures are required to accomplish the mission, goals, and objectives?

What data are used as input? Where does it come from? Who enters or creates data? uses or retrieves data? changes or updates data? deletes data?

How is the input transformed by the process to produce the results? That is, what do you do when you do your job?

What data are passed between processes? What is the current storage media for the data (e.g., computer, fax, paper, verbal, memo, etc.)?

Are the different types of data that you need for your job used sequentially or in parallel? Could you describe the procedure?

Where are time lags in your job during which you are waiting for someone else to give you work or information? How do you deal with these lags?

Where are quality problems? How do you deal with errors? What is the source of each type of problem? Where (in which process or outside organization) is each problem detected? Where are quality problems within the procedures you use to do your job? How do you try to guard against these problems?

What would you do differently if you could design your own job? How might computer technologies help you? Suppose you have all the new computer and other technologies available for your job's use. What technology would you use and how?

Information might come from forms, screens, reports, phone messages, fax messages, automated applications, policy and procedure books, and so on. The people actually doing the work provide this information.

Most information is obtained through an interview format. Interviews should be individual or in small groups (groups should have members who share common goals to minimize political conflicts). All middle and senior managers for the organization should be interviewed in addition to representative white-collar, blue-collar, or clerical staff. Treat the sessions as fact-finding, not fault-finding. Address all the topic areas for which information is required.

If you think you are getting incomplete or false information, cross-check, or triangulate, the information by asking the same questions of multiple sources. For instance, Manager A says his major problem is caused by erroneous data received from Manager B's area, and Manager B did not identify the problem in your first discussion. Return to Manager B and reinterview him or her, specifically discussing data quality as a problem identified by the other area.

To validate the complete findings, make a group presentation to all interviewees for final confirmation that the information is accurate and complete.

Summary of the Architectures

In this section, we expand Zachman's⁴ information systems architecture (ISA) framework to describe how to express the reengineering information in terms of architectures. The four architectures of interest in reengineering are data, process, network, and technology. First, we define the framework and information presented at each level. Then, reengineering information is translated into the four architectures.

Conceptual Levels of the Architecture

The information systems architecture (ISA) describes distinct architectures relating business context to application context. The five levels are described in general terms below and are summarized in Figure 5-9. Only the first two levels, scoping and enterprise analysis, are used in reengineering.

Information systems application development and organizational redesign are complex engineering activities that are similar to constructing a building or an airplane. The ISA describes the intellectual levels of detail needed for complex engineering activities. Then, it links them to data, processes, networks, and technologies, which are the components of computer applications and reengineering.

4 John Zachman [1987]. Zachman's architecture discusses data, process, and network. ISA does not yet include a technology architecture. This idea is from reengineering consulting which requires a view of the technology as a basis for technology redesign.

Model                   Level
Scope                   Sponsor
Enterprise              User
Information Systems     SE/Analyst
Hardware/Software       SE/Designer/Builder
Components              Programmer
Functioning System      Computer

Adapted from J. A. Zachman, 1987

FIGURE 5-9 Conceptual Levels of the Architecture

In all three businesses (aerospace, architecture, and systems) we begin with a sponsor's idea of what the item being built should look like. This is the scope of the reengineering project that defines what is in and what is not in the problem. If the item is a house, for instance, users talk about a two-story colonial with four bedrooms, three bathrooms, and a fireplace in the family room. For reengineering, the sponsor targets departments doing order processing and customer assistance. In this case, the item is the order processing department.

The user talks to an expert to describe his or her view of the item, and the expert translates the user's idea into an enterprise level, logical description of the item. A logical description is one that lists what is done without saying how. The item begins to take more shape and be less specific. The description of the item is somewhat more abstract. For the house, we now have a family room of 13.5 feet by 16 feet with a cathedral ceiling that is open to the kitchen with entries to the foyer and living room. For reengineering, we have an order entry process that includes order receipt, order change, order inquiry, inventory allocation, creation of shipping papers, movement of goods, invoice creation, and an interface to accounts receivable. Both of these descriptions are significantly more detailed than the first. Neither description is complete. We still don't know the type of windows in each room, for instance. Nor do we know, for reengineering, whether the work is automated, how an order is processed, or whether any of the steps can be done together. In both cases, the details are unimportant at this level.

At the next level, the expert translates the logical, enterprise view of the item into terms and information that are useful to the analysts of the item. So, the expert (or different experts) translates the enterprise view into a logical information systems design. The logical design still describes what the item will do, but in more detail than before, and in terms understood by application developers. In reengineering, the logical design is very specific about the item, its parts, and how they fit with the other items and their parts. In our order processing example, we would know what data, what fields, what processes and their details, timing of processing, and what applications and technology are currently used to support the work. Designers can review the detailed logical design and see how it can be automated.

In the next step, designers review the logical design and translate it to specific materials, thus creating a technology-based model. In reengineering studies, this translation takes redesigned work, work groups, departments, data, and technology as inputs. The inputs are translated into database schemas, applications' design specifications, network designs, and specific hardware/software platforms for supporting the redesigned work. In the order processing example, the requirements specification would be translated into program specifications for specific hardware, software, and language.

Scope model (Sponsor)
  Process: List of Business Functions (Function = Group of Activities)
  Data: List of Things Important to Business (Entity = Class of Business Thing)
  Network: List of Business Locations (Node = Business Location)
  Technology: List of Business Technologies (Technology = Platform + Special Equipment)

Enterprise model (User)
  Process: Process Hierarchy (Function = Business Process)
  Data: Entity Relationship Diagram (Entity = Business Entity; Relationship = Business Constraint)
  Network: Logistics Network (Node = Business Process; Link = Communication Need)
  Technology: Technology Network (Node = Computer; Link = Network Link)

Adapted from J. A. Zachman, "A Framework for Information Systems Architecture," IBM Systems Journal, 26, #3, 1987, pp. 276-292.

FIGURE 5-10 Reengineering Levels and Architecture Domains

Finally, at the lowest component level, schemas, specifications, and technology plans are implemented and translated into working computer components.

Only the scope and enterprise models are discussed in this chapter; the other levels are too computer-oriented and not appropriate for reengineering.

Domains of the Architecture

The conceptual levels apply to four organizational domains: data, process, network, and technology. A domain is an area of interest. The data domain defines the entities of interest to the target organizations and the interrelationships between them. The process domain describes the functions, activities, and processes of the target organizations, without any identification of how they are accomplished. The network domain describes the organization from a geographic perspective. The technology domain describes the organization of work from a technology platform perspective.

Translating Information into Architecture

There are two levels of architecture we describe in this section for reengineering: the scope and the enterprise model.

Scope

In reengineering, we assume that the mission statement fully expresses the scope of the organization. The mission statement is translated into network, technology, process, and data scopes to initiate the reengineering effort. Example 5-1 shows a mission statement for ABC Video and how it might be translated to identify the scope of the four domains. At the scope level, we should know the major entities of interest to the organization and the business functions and their activities.

ABC VIDEO MISSION STATEMENT

The mission of ABC Video is to develop and maintain quality relationships with customers, vendors, and employees.

For customers, we provide a large selection of current and classic videos for rental at a fair price. We assist them in selecting videos with courtesy, service, and a minimum of bureaucracy.

For vendors, we order videos with reasonable lead times and timely payment of bills.

For employees, we provide a congenial atmosphere with comfortable, clean, and safe working conditions for a fair wage.

Process: Video Selection; Service Request; Video Rental (i.e., process rental); Order Creation; Accts. Payable; Payroll; Personnel

Data: Customer; Vendor; Order; Video (= goods in inventory); Employee

Network: Location = 1 (inferred)

Technology: None

The network and technology domains may or may not be mentioned in the mission statement. The sponsor or user participants define these when they are not in the mission statement. The network scope defines the location of work for each activity. The technology scope defines technology platform by location. Because ABC has only one location and no technology, it is a simple example. Another example here is for a plastics subsidiary of a large international company. Figure 5-11 shows existing hardware platforms listed by location. In Figure 5-12, the activities from the process hierarchy are reused and identified by location.

Hardware Platform: Scope

Location 1
  Mid-Size Computer
  LAN 1: 25 PCs
  LAN 2: 15 PCs
  LAN 3: 42 PCs

Location 2
  LAN 4: 23 PCs

Location 3
  Mid-Size Computer
  5 Stand-alone PCs
  3 CAD/CAM Platforms

Location 4
  Mid-Size Process Control Computer
  LAN: 25 PCs
  1 CAD/CAM Platform

Location 5
  Mainframe

FIGURE 5-11 Plastics Company Hardware Platform Scope

At this point in reengineering, if the mission statement were suspect in its completeness, a stakeholder analysis might be developed to determine if all constituents of the organization are represented. If they are not, the mission statement would be redrafted to include missing constituencies. While this redrafting takes place, the reengineering study ceases operation. A stakeholder is any person who interfaces with, works for, or otherwise is impacted by an organization. Stakeholders include the owner, managers, employees, suppliers, customers, creditors, government, community, and competitors. Ideally, representative stakeholders from each group should review the strategy and offer suggestions for improvement.

When stakeholders are identified, the goals of each stakeholder are defined and related to the organization's functions and strategies. If a goal does not match a current function or strategy, management determines if the goal will, in fact, be met. The goals are translated into strategies which, in turn, are translated into work. The intention of stakeholder analysis is that rational, reasonable goals should have both strategic and organizational functions that relate to the attainment of goals. Even if goals are omitted from the final strategy, at least all stakeholders and their desires are identified and considered.

Activity by Location

Location 1
  Finance: 3 products at this location
  Accounting: All products
  Customer Service: All products
  Product Management: 3 products
  Personnel/Payroll

Location 2
  Finance: 2 products at this location
  Product Management: 2 products
  General Manager

Location 3
  R&D
  Manufacturing Setup

Location 4
  Manufacturing Plant

Location 5
  Corporate Headquarters

FIGURE 5-12 Plastics Company Activity by Location

Enterprise Models

At the enterprise level, the user managers work with information systems (IS) project representatives to define business areas in logical terms. The principal business modeling activities include entity-relationship diagrams (ERD) for data, functional decomposition diagrams for work processes, a network diagram of process communication needs, and a technology network diagram showing technology deployment. The ERD documents major data types and their interrelationships. The functional decomposition identifies business functions and their component activities and work processes. The network architecture shows the location of work and intraorganizational communication requirements. The technology architecture shows the hardware platforms by location and the telecommunication linkages between them. All four architectures are developed piecemeal as information becomes known. (ERD and functional decompositions are discussed in detail in Chapter 9 and are only summarized here.)

Place Order
  Identify Items & Vendors
  Call Vendor to Verify Availability and Price
  Create and Mail Order

Monitor Order
  Identify Late or Problem Orders
  Call Vendor and Inquire or Reconcile
  Verify Receipts Against Orders

FIGURE 5-13 ABC Video Process Hierarchy

PROCESS ARCHITECTURE. Process architecture development is concurrent with data gathering. The time recommended in Table 5-1 is for completion and validation of the information. The decomposition first identifies business functions, then the component activities and their processes. Figure 5-13 shows an example. A business function is a group of on-going activities that accomplish some complete job that is within the mission of the enterprise. Functions are general and fit most organizations. For instance, accounting and personnel are found in most organizations regardless of industry or business type. At the next level of detail, an activity defines one or more related procedures that accomplish some task. For accounting, for instance, activities might be monthly close, maintaining chart of accounts, or daily transaction processing. At the lowest level of detail for this diagram, a business process identifies the details of an activity, fully defining the steps taken to accomplish the activity. Business processes within an accounting monthly close might be gathering information, validating information, performing initial analysis, and so on.

The steps to developing a functional decomposition diagram include:

• Identifying the functions of the target organizations

• Interviewing the representatives from each area to identify the activities performed for each function

• Further identifying the processes for each activity
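The three-level decomposition (function, activity, process) lends itself to a nested structure. The following is a rough sketch using the accounting example from the text; the variable and function names are hypothetical, and the empty lists simply mark activities whose processes have not yet been identified in interviews.

```python
# function -> activity -> list of business processes
decomposition = {
    "Accounting": {
        "Monthly close": [
            "Gather information",
            "Validate information",
            "Perform initial analysis",
        ],
        "Maintain chart of accounts": [],    # processes not yet identified
        "Daily transaction processing": [],  # processes not yet identified
    }
}


def outline(tree):
    """Flatten the hierarchy into indented lines, one per node."""
    lines = []
    for function, activities in tree.items():
        lines.append(function)
        for activity, processes in activities.items():
            lines.append("  " + activity)
            lines.extend("    " + process for process in processes)
    return lines


def undecomposed(tree):
    """List (function, activity) pairs that still need process detail."""
    return [
        (function, activity)
        for function, activities in tree.items()
        for activity, processes in activities.items()
        if not processes
    ]


print("\n".join(outline(decomposition)))
print(undecomposed(decomposition))
```

Tracking the undecomposed activities gives the team a simple list of interviews still to be held.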

During the decomposition process, business problems are identified by the interviewees. The problems are prioritized by the users with the reengineering team in order of their significance to the organization's quality and function. Usually the number of major problems to be identified is fixed and between five and ten. Without a limit, the problem findings could overwhelm the analysts. Also, having the number of major problems fixed requires users to reach consensus about the seriousness and scope of problems.

DATA ARCHITECTURE. This activity is concurrent with data gathering. One week of extra time is recommended to allow completion and validation of information. The data architecture is defined in an entity-relationship diagram. An entity is some person, object, concept, application, or event from the real world about which the organization maintains data. A relationship is a mutual association between entities.

For instance, a customer creates an order. Customer and order are entities; create is their mutual relationship. Figure 5-14 shows a basic ERD that summarizes this relationship. ERDs can be much more elaborate and include the number, or cardinality, of the relationship, and information about whether or not the relationship is required. Cardinality identifies one-to-one, one-to-many, and many-to-many relationships. Each customer can have many orders; therefore, this is a one-to-many relationship. So in this ERD the cardinality is one-to-many. The many side of the relationship is shown with 'crow's feet' on the diagram. Orders don't come from thin air; there must be a customer to have an order. Conversely, customers are not required to always have orders. Therefore, customer is required, and order is optional in the relationship as signified on the diagram by the short bar and small oval, respectively.

[Figure 5-14: ERD in which the Customer entity is linked by the relationship Creates to the Order entity. The Customer end carries a short bar (required); the Order end carries a small oval and crow's feet (optional, many).]

FIGURE 5-14 Sample Entity Relationship Diagram

The steps, then, to developing an ERD are:

• Identify data entities, including new entities required to attain and name organization goals

• Link entities to show their interrelationships

• Define relationship cardinality and the required/optional nature of relationships
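The three steps above can be sketched in code. The following is a minimal illustration, assuming a hypothetical `Relationship` record (its field names are not from the text) that captures the Customer/Order example: the entities, the link between them, and the cardinality and required/optional nature of each side.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Relationship:
    """Hypothetical record for one ERD relationship (field names are assumptions)."""
    name: str
    parent: str            # entity on the "one" side
    child: str             # entity on the "many" side
    cardinality: str       # "1:1", "1:N", or "M:N"
    parent_required: bool  # must every child have a parent?
    child_required: bool   # must every parent have at least one child?

# Step 1: identify entities. Step 2: link them. Step 3: cardinality and optionality.
entities = {"Customer", "Order"}
creates = Relationship("creates", "Customer", "Order", "1:N", True, False)

def describe(rel):
    """Render the relationship rules as a sentence."""
    parent_rule = "must" if rel.parent_required else "may"
    child_rule = "must" if rel.child_required else "may"
    return (f"Each {rel.child} {parent_rule} have a {rel.parent}; "
            f"each {rel.parent} {child_rule} have {rel.child}s ({rel.cardinality}).")

print(describe(creates))
```

The required flag on the Customer side corresponds to the short bar in the diagram, and the optional flag on the Order side corresponds to the small oval.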

NETWORK ARCHITECTURE. The enterprise level of network architecture defines organization activities from the functional decomposition performed at each location and communications requirements between them. The architecture described in this section is of the current organization. During reengineering, if the changes recommended affect the locations of work or the activities of work, then the network architecture is redrawn to mirror the recommended organization. When the changes are presented to the sponsor for approval, the old network and recommended network architectures should be contrasted to highlight the changes.

FIGURE 5-15 Plastics Company Network Architecture (diagram of activities by location). Legend: solid line = interdependent activities, constant contact required; dashed line = coordination and information sharing activities, intermittent contact required; no connection = independent activities, no regular contact required.

The process hierarchy defined functions, activities, and processes. The network architecture could define any of these levels. For ABC Video, we would choose the function level because there is only one work location. For the plastics company example (see Figure 5-15), the activity level is chosen because functions located in more than one place may not include the same activities at all locations. Using the activity level gives a further level of detail, and accuracy, to the work. If the company were very decentralized and diverse, the analysis could be at the process level.

For the architecture, each activity is placed in a circle within a square identifying a location (see Figure 5-15). The circles are connected when the activities require communication to complete their work.

FIGURE 5-16 Plastics Company Technology Architecture (diagram of the technology platforms at Locations 1 through 5, including LANs, PCs, and CAD/CAM). Legend: solid line = permanent link; dashed line = dial-up link.

TECHNOLOGY ARCHITECTURE. The technology architecture creates a network diagram of existing technology at each location using a network technique similar to the network architecture (see Figure 5-16). Then, the technology platforms are connected with lines to show telecommunications linkages between them. Dotted lines are used to show dial-up linkage. Solid lines are used to show permanent connections. At this level, other special hardware, such as imaging or CD-ROM, and technologies such as ISDN are connected to the platform to which they are attached.

Like the network architecture, this is a snapshot of the current technology deployed throughout the organization. If the recommendations for the redesigned organization eliminate or change locations, a second technology architecture is created to depict the new view of the organization.

At this point, the team has completed its data gathering. The team conducts a group meeting with all previously interviewed individuals to summarize their findings and present the diagrams. The purpose and sole focus of the meeting is to verify the accuracy of the information presented. No further analysis, and no suggestions on the analysis, should be discussed at this meeting.

Architecture Analysis and Redesign

The analysis uses a series of matrices matching the architectures to redesign the organization, its data, applications, and technology infrastructures. The process and data architectures are the basis for the organization and data design. The current applications are mapped to the redesigned organization and data to recommend changes to the application environment. The technology and network architectures are analyzed to recommend telecommunications and technology infrastructure changes that best meet the enterprise's goals. These analyses are discussed here.

Organization and Data

A process called affinity analysis is used to analyze the data and processes. Think of this as normalizing data across the organization. Affinity means 'attraction' or 'closeness.' Affinity analysis clusters processes by the closeness of their functions on data entities they share in common. Because the average data/process matrix has about 400 entries, affinity analysis is best accomplished through an automated tool, such as ADW™.⁵

FIGURE 5-17 ABC Video Data/Process Matrix

Entities (columns): Purchase Order, PO Item, Vendor Item, Inventory Item, Vendor

Processes (rows): Identify Items & Vendors; Call Vendor to Verify Avail/Price; Create & Mail Order; File Order Copy by Vendor; Identify Late & Problem Orders; Call Vendor & Inquire on Order; Verify Receipts against Order; Send Invoices to Accountant (RD, RD)

A matrix of processes from the process hierarchy diagram and data entities from the entity-relationship diagram is created. The processes are written in rows down the left side and data entities across the top (see Figure 5-17). Use the lowest level processes, such that all elemental processes for the organization and application area are present. When writing the process name, append a prefix to identify the activity and function from the hierarchy diagram.

In each cell, identify the functions each process is allowed to perform on data. Possible functions are create (C), retrieve (R), update (U), and delete (D). One or more of the letters, as defined by the current organizational responsibilities, are entered for each entity. This matrix gets its nickname from those functions; it is a CRUD Matrix.

⁵ ADW is a trademark of Knowledgeware, Inc., Atlanta, Ga.
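A CRUD matrix maps naturally onto a nested dictionary keyed by process and then by entity. The sketch below is one possible representation, not a prescribed format. It uses three rows of the ABC Video matrix whose five cells are all filled in; the helper names `validate` and `processes_with` are illustrative assumptions.

```python
# Entity columns of the ABC Video data/process matrix.
ENTITIES = ["Purchase Order", "PO Item", "Vendor Item", "Inventory Item", "Vendor"]

# Three rows of the matrix in which every cell has an entry.
crud_matrix = {
    "Create & Mail Order": dict(zip(ENTITIES, ["CRUD", "CRUD", "CRU", "R", "R"])),
    "Call Vendor & Inquire on Order": dict(zip(ENTITIES, ["RU", "RU", "RU", "R", "R"])),
    "Identify Late & Problem Orders": dict(zip(ENTITIES, ["R", "R", "R", "R", "RU"])),
}

def validate(matrix):
    """Raise ValueError if any cell holds a letter other than C, R, U, or D."""
    for process, row in matrix.items():
        for entity, cell in row.items():
            if not set(cell) <= set("CRUD"):
                raise ValueError(f"bad cell {cell!r} for {process} / {entity}")

def processes_with(matrix, entity, function):
    """Return the processes allowed to perform `function` on `entity`."""
    return sorted(p for p, row in matrix.items() if function in row.get(entity, ""))

validate(crud_matrix)
print(processes_with(crud_matrix, "Vendor", "U"))
```

A blank cell is represented simply by leaving the entity out of a process's row, which is why `processes_with` uses `row.get(entity, "")`.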

Affinity analysis relates processes by their responsibility in creating shared entity information. Create responsibility for 80 percent or more of the entities shared between processes shows high affinity. An affinity matrix is iteratively refined by affinity groups or processes with entity creation responsibility. In a typical 20 x 20 matrix with 400 cells, five to seven affinity clusters will emerge. Affinity clusters may contain processes from several current organizations; organizational location of responsibility is not of interest in this exercise.

Several clusters may overlap. This is normal and not a cause for worry. If only one cluster emerges, clustering continues with analysis of update responsibility, and, if necessary, delete and retrieval responsibility. When a reasonable number of clusters emerges, the next step begins. A reasonable number may be one to five clusters for a small organization, such as ABC, or six to nine for a large organization.

Figure 5-18 shows affinity clusters for ABC. A first analysis of create responsibility would place Create & Mail Order in a group and Identify Items & Vendors in a group without classifying the other entities. The final clusters shown in the figure emerge after also analyzing update, delete, and retrieval responsibility. The lines highlight the clusters and simplify diagram interpretation; they do not necessarily include all actions in the clusters. Notice that the Call Vendor to Verify ... process overlaps both clusters. It is placed in the second cluster because it also updates Vendor information.
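The first pass of that analysis can be sketched as a simple grouping by create responsibility. This is only an illustration of the first pass, not a full affinity algorithm. The function name `group_by_function` is an assumption, and the column placement of the three entries for Identify Items & Vendors is also assumed, since that row has blank cells.

```python
ENTITIES = ["Purchase Order", "PO Item", "Vendor Item", "Inventory Item", "Vendor"]

matrix = {
    "Create & Mail Order": dict(zip(ENTITIES, ["CRUD", "CRUD", "CRU", "R", "R"])),
    "Call Vendor & Inquire on Order": dict(zip(ENTITIES, ["RU", "RU", "RU", "R", "R"])),
    "Identify Late & Problem Orders": dict(zip(ENTITIES, ["R", "R", "R", "R", "RU"])),
    # Assumed column placement for this row's three entries (R, R, CRU).
    "Identify Items & Vendors": {"Vendor Item": "R", "Inventory Item": "R", "Vendor": "CRU"},
}

def group_by_function(matrix, function="C"):
    """Seed one group per process that performs `function` on any entity.

    Returns (groups, unclassified): groups maps a process to the entities
    it performs the function on; unclassified lists the remaining processes.
    """
    groups, unclassified = {}, []
    for process, row in matrix.items():
        touched = sorted(e for e, cell in row.items() if function in cell)
        if touched:
            groups[process] = touched
        else:
            unclassified.append(process)
    return groups, unclassified

groups, unclassified = group_by_function(matrix, "C")
print(groups)
print(unclassified)
```

Calling the same function with "U", "D", or "R" on the unclassified processes gives the raw material for the later passes; deciding which seed group each process joins remains an analyst's judgment.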

The next step is to analyze organizational adequacy. Each process is individually analyzed first to ensure process-goal correspondence. If the process is specifically tied to the organization goals, objectives, and mission, mark it for retention. If the process is not tied to the organization goals, objectives, and mission, either link it to goals or objectives, or mark it for elimination.


Next, for processes that are candidates for elimination, determine if they also create, update, or delete data. What is the relationship of this process to 'close' processes? Is it in a sequence with other processes? If so, can those processes take on its data responsibilities (thus enlarging the scope of the process)? If the eliminated process also stands alone, where else is the data used? If the answer is nowhere, mark the data for elimination. [Plan to return to the individual(s) who identified either the process or the data to confirm that you have not missed some information linking the process or data to the mission.] If data is created by the process marked for elimination, but updated and deleted elsewhere, can the other processes assimilate data creation? What other information will those processes now need in order to be able to create the entity? Ask similar questions for updating and deleting the data.

FIGURE 5-18 Affinity Clusters in ABC Data Process Matrix

Entities (columns): Purchase Order, PO Item, Vendor Item, Inventory Item, Vendor

Processes (rows), with cell entries listed left to right; rows with fewer than five entries have blank cells:

Create & Mail Order: CRUD, CRUD, CRU, R, R

Call Vendor & Inquire on Order: RU, RU, RU, R, R

Verify Receipts against Order: RU, RU, RU, R

Send Invoices to Accountant: RD, RD

File Order Copy by Vendor: R, R

Identify Late & Problem Orders: R, R, R, R, RU

Identify Items & Vendors: R, R, CRU

Call Vendor to Verify Avail/Price: RU, RU

Next, analyze the current organization design. First, is each data entity created only in one process? If not, is there some business reason why two processes are creating the same data? Or is there historically introduced redundancy? If the former, continue the analysis. If the latter, combine the processes and eventually redo the affinity analysis. Second, are the processes that cluster together in the same department? If so, the organization need not change. If not, then realign the organization boundaries to have all processes that create the same data reporting to the same manager. Expand the scope of the processes to include as much of the create-update-delete processing as possible. Needs for retrieval or access affect future plans rather than this decision process.
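The first question, whether each entity is created in only one process, is mechanical enough to check in code. The sketch below assumes a CRUD matrix held as a nested dictionary; the process "Set Up New Vendor" is hypothetical and included only to produce a duplicate. Whether a duplicate reflects a business reason or historical redundancy still has to be answered by the analysts.

```python
from collections import defaultdict

# Illustrative matrix; "Set Up New Vendor" is a hypothetical process.
matrix = {
    "Create & Mail Order": {"Purchase Order": "CRUD", "Vendor": "R"},
    "Identify Items & Vendors": {"Vendor": "CRU"},
    "Set Up New Vendor": {"Vendor": "C"},
}

def creators_by_entity(matrix):
    """Map each entity to the list of processes that create it."""
    creators = defaultdict(list)
    for process, row in matrix.items():
        for entity, cell in row.items():
            if "C" in cell:
                creators[entity].append(process)
    return dict(creators)

def multiply_created(matrix):
    """Return only the entities created by more than one process."""
    return {e: ps for e, ps in creators_by_entity(matrix).items() if len(ps) > 1}

print(multiply_created(matrix))
```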

When the process analysis is complete, the remaining processes are all critical to the organization mission. The next task is to tentatively redefine jobs within the context of the remaining processes. The goals of job redesign are to enlarge and enrich the jobs, and to eliminate interprocess dependencies through job design. Interprocess dependency is eliminated or reduced by the caseworker approach to job design and by expanding data access to all who use it.

To define a job, begin with the processes in a function. Add processes to the job until either the skill mix or activity served changes. Then, define another job until either the skill mix or activity changes. Continue to define jobs until all processes are assigned. There may be jobs that span activities but they should be exceptional.
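The job-definition rule can be read as a single pass over an ordered list of processes that starts a new job whenever the activity or the skill mix changes. The sketch below assumes each process has already been labeled with an activity and a skill; the labels "Ordering", "Receiving", "buying", and "clerical" are hypothetical, as is the function name `define_jobs`.

```python
def define_jobs(processes):
    """Group consecutive processes into jobs.

    `processes` is an ordered list of (name, activity, skill) tuples.
    A new job starts whenever the (activity, skill) pair changes.
    """
    jobs = []
    current_key = None
    for name, activity, skill in processes:
        key = (activity, skill)
        if key != current_key:
            jobs.append([])
            current_key = key
        jobs[-1].append(name)
    return jobs

# Hypothetical activity and skill labels for five ABC Video processes.
processes = [
    ("Identify Items & Vendors", "Ordering", "buying"),
    ("Call Vendor to Verify Avail/Price", "Ordering", "buying"),
    ("Create & Mail Order", "Ordering", "buying"),
    ("Verify Receipts against Order", "Receiving", "clerical"),
    ("Send Invoices to Accountant", "Receiving", "clerical"),
]

for job in define_jobs(processes):
    print(job)
```

In practice the skill mix is a judgment rather than a single label, so a sketch like this is most useful for a first cut that is then reviewed by hand.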

After jobs are completely defined, map them to functions by their affinity, that is, in terms of their data creation and usage. Do not pay attention to the number or types of jobs reporting to functions at this point. Again, concentrate on eliminating errors, paper, and dependencies. When all jobs are mapped to activities, the first phase of organization redesign is complete. The next phase takes place during the implementation planning.

The second analysis and redesign that results from process/data analysis is for subject area databases and applications to support them. This is a more subjective analysis than job redesign because there is no theory of application development and how to size applications. The current thought is that applications that support well-defined subject areas will provide the best organizational support. The reason for this is that subject areas, data entities, and attributes are all fairly static. With well designed data, the processing can change without affecting the database.

First, use entity clusters to define subject area databases. Check that each entity is also linked to at least one goal or objective. If an entity is not linked, either establish the correspondence, or mark it for elimination. Conversely, analyze the processes which use the entity. If this is the only data used by the process, but the process is tied to some goal, determine the presence of data to measure progress toward the goal and, if needed, add a new entity to the list; otherwise, if the related process also stands alone, mark both the entity and the process for elimination.

The subject area databases defined by affinity analysis should be mapped to current, automated applications. If the subject areas are completely automated and the applications are integrated, no changes are needed. Rarely is this the case. Usually several applications process pieces of subject area data and both manual and automated usage of data is required. The only integration is through the experience of users who know where to go for information they need.

Redefine applications to support each subject area of data. Define application changes for process changes that reduce problems. Define ad hoc query facilities for all jobs requiring retrieval access to data. Assume on-line processing for most application work. Identify and recommend technologies that streamline and speed information storage and delivery. Based on the problems and solutions identified, determine the potential impact of applications for meeting goals. Prioritize applications for development to achieve the greatest impacts first.

Network/Technology Design

Before either the network or the technology designs are done, the receptiveness of the sponsor and managers to the changes in jobs and applications should be verified. If they support the work to date, the network and technology analyses can continue. If they do not support the job redesign or are reluctant about application suggestions, those aspects of the reengineering must be defined acceptably before this analysis.

There is no theory of network or technology design at the enterprise level. Rather, we have rules of thumb that must be evaluated in each business context. First, if the job redesign and process analysis substantially change the activities being performed in the organization, the enterprise network model should be recast in terms of the revisions. Next, if locations are significantly different, the technology model should be redrawn to reflect revised locations.

When the two network diagrams are acceptable, they are compared and analyzed to recommend new and changed technologies for supporting the new organization.

Using the technologies identified as needed to fully support jobs, develop an overview of the technology for the organization. Classify types of applications on mainframes, local area networks (LANs), and stand-alone personal computers. Classification should identify applications by size, 'corporateness' of data, data sharing requirements, specialized technology required, and number of users.

Across the organization, rationalize the use of technology resources, minimizing the overall cost to the organization. If new technologies are recommended, develop estimates of implementation costs and benefits, including average cost per expected user employee. If possible, identify incremental costs for expanding the user base once the technology is installed. Include training costs in the estimates. Identify and recommend possible uses for technologies to reduce incremental costs of use.
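The cost figures named here reduce to simple arithmetic. The sketch below shows one way to compute them; the function names, the split of costs into implementation, training, and per-seat components, and all dollar amounts are illustrative assumptions rather than figures from the plastics company example.

```python
def average_cost_per_user(implementation_cost, training_cost, expected_users):
    """Average cost per expected user employee, training included."""
    if expected_users <= 0:
        raise ValueError("expected_users must be positive")
    return (implementation_cost + training_cost) / expected_users

def incremental_cost(per_seat_cost, training_cost_per_user, added_users):
    """Cost of expanding the user base once the technology is installed."""
    return (per_seat_cost + training_cost_per_user) * added_users

# Hypothetical figures: $120,000 to implement, $30,000 to train, 50 users.
print(average_cost_per_user(120_000, 30_000, 50))
# Hypothetical figures: $800 per added seat, $400 training each, 10 new users.
print(incremental_cost(800, 400, 10))
```

Comparing the two results shows how much cheaper an added user is than an initial one, which is the comparison the incremental estimate is meant to support.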

This activity is one in which the IS representatives have the most value added during reengineering. Being technology literate, IS representatives can work with their technology planners to determine possible technologies for consideration that have not been identified before. The IS people should take the lead in the rationalization of technologies. Deciding the type of applications that belong on various platforms for the organization requires the knowledge and guidance of the IS steering committee or the IS director (i.e., Chief Information Officer, MIS Manager, or some similar title). Explanations of the applications mapping to technology platforms should be in business terms but based on sound understanding of the technology involved.

An example of network/technology redesign for the plastics company example is provided. The plastics company architectures in Figures 5-15 and 5-16 are used to create the revised network in Figure 5-19. One obvious problem is that organizations that need to communicate for work are not electronically connected. This suggests a network change to interconnect all interdependent activities. This change means that the LANs that are only connected through a star configuration in Location 1 might be connected via a backbone to the midsize computer. Backbones in each location with multiple LANs can be connected to provide intra-location communications, freeing the larger machines for inter-location connection and data processing. With this type of network design, everyone in the company can communicate with everyone else.

After this cursory analysis, we next look at the technologies used for subject databases and applications. First, the subject databases are added to the technology map. If pieces of databases are scattered, integrate them or determine distribution requirements. This type of recommendation should be coordinated with the applications recommendations, which are probably similar. Recommendations about centralization, decentralization, federation, or distribution of both data and processes should be considered. Changes in all infrastructure software such as telecommunications monitors, database management software, terminal interfaces, and so forth should be considered for each activity at each location. Advantages and disadvantages of all technologies, current and proposed, should be developed, along with an estimated cost-benefit analysis.

In the plastics company example, software and applications are added to the network/technology analysis shown in Figure 5-20. Order information is only available at Location 1, even though all sales and product management organizations (Locations 1 and 2) require access. These differences between the data that currently exist and the data that are required show the type of findings in network/technology analysis. To determine the best course of action, more information might be requested of the locations. For instance, do they need up-to-the-minute information? Why or why not? The answer to this question determines the need to redevelop the applications as on-line rather than batch. If the locations need up-to-the-minute information, on-line applications are required. Let's say that the sales and product management information users need orders only as of the previous close of business and that customer service agents in Location 1 would like up-to-the-minute information because most changes are made the same day. This information about needs gives the reengineering team the details they need to make intelligent recommendations about application changes. In this case, either on-line order entry with retrieval, or the entire application as on-line might be acceptable alternatives.

FIGURE 5-19 Plastics Company Network and Technology Analysis (current applications and required applications at Locations 1 through 5). Legend: solid line = permanent link; dashed line = dial-up link.

FIGURE 5-20 Plastics Company Technology and Software Details (operating and database software by location: IBM MVS with IBM/DB2, IBM VM/CMS with IBM/DB2, and Novell/Sybase). Legend: solid line = permanent link; dashed line = dial-up link.

Next, consider new technologies to manage paper and work flow. For instance, do using groups need facsimiles of the paper forms? In some industries, such as insurance, the answer would be yes. In plastics manufacturing, the answer is no. So, imaging or other micro-forms management hardware and software are not considered.

Specific operating environments should be considered next. If the networks are used to pass electronic mail and data files back and forth, the operating environments do not necessarily have to be the same. If, however, on-line query and file sharing across environments is desired, the network operating systems and database management software probably should be the same to simplify user access. This type of decision is aided by development of a cost-benefit analysis for data access using consistent software. What is the cost of change? What is the risk and cost of not changing? How much added time is required, per request, to formulate and obtain information with no change, and with change? The answers to these questions are used to determine the redesigned operating environment.

In the plastics example, the current environment down-loads information nightly from Location 1 to Locations 2 and 4. The managers at those locations would like access to interim data if the applications are moved to an on-line environment. In other words, they want the access if the data are more current. Customer service needs current information. We decide to move to the on-line environment and provide networkwide access to data and services on the net. If the network operating systems (NOS) and databases are incompatible with this idea, they would need to be replaced and made compatible.

To summarize, the network and technology architectures are superimposed and compared to decide company changes. Then, technology requests and application and software recommendations are superimposed on the revised technology diagram. Evaluation of requests, suggested changes from IS, and recommendations from the organization design team takes place by analyzing each change. Change evaluation includes cost-benefit analysis, development of advantages and disadvantages of change, and issue analysis with information supplied by potential users.

Implementation Planning

Once the analysis and recommendations are complete and tentatively approved, a plan to prioritize and sequence the changes is developed. A reengineering study is of limited use if there is no road map for how to attain the recommendations based on where the organization is today. Implementation planning designs the map. The steps of this phase are:

1. Develop job descriptions.

2. Define the organization.

3. Plan information technology.

4. Plan training.

5. Plan implementation.

Define Job Descriptions

This is a first cut at describing the new positions. The jobs still require human resources evaluation and refinement during the next stage: implementation. To develop jobs, we reanalyze the tentative job descriptions, attending to data needs for each job. We define jobs as including related job skills for similar, related data. For each job, list the processes, data, and skills required of an incumbent. When the subject area database changes, create a new job, but keep as a goal that each job should do some 'whole thing,' have decision power, access to all needed data, and be self-contained. Keep in mind that constraints on job identification are data self-sufficiency, process self-sufficiency, and minimal coupling to other jobs and processes.

For each job, identify the processes and entities. Identify the technologies that would achieve the job objectives with the utmost speed and accuracy. Use suggestions (and return for more specific information if necessary) from interviewees about technology that might be used. At this point, do not worry about capital expenditures for technology. Keep technology information for the technology/network analysis.

Question all current methods of work and all process dependencies. For instance, do you need paper copies of orders? By law, you need records of orders, not paper orders. Devise schemes that eliminate paper, eliminate creation of paper, and eliminate any handling of paper. Replace paper with technology whenever the information must be retained for legal or governmental compliance.

Concentrate on implementing change to eliminate all identified problems. Relate each process and entity to one or more problems identified; determine how to improve quality of process and eliminate the errors. Finally, concentrate on eliminating dependencies between functions and between processes. Interfunctional dependency is minimized by eliminating physical interactions or replacing them with technology-based interactions. For instance, eliminate shipping papers by providing the shipping department with access to the order database.

For each job within each process, write job descriptions to align job goals with the corporate goals and objectives. The outcome of this exercise is to give every individual the means (management structure, data, and technology) of meeting those goals. Give every individual, at every level, specific measurable responsibilities. Recommend changes to the compensation plans to relate compensation to meeting or exceeding objectives and goals.

For each newly clustered, enlarged job, analyze its relationships with other jobs to minimize interjob linkages. Reanalyze each job to ensure data and process self-sufficiency. Finally, define defect-free work procedures. If errors must be dealt with, describe where they might occur and their proper handling.

Define the Organization

A first-cut organization structure will have three layers: CEO, functional managers, and everyone else. The implication is that self-directed work teams with either a limited hierarchy or a matrix management organization will result. Other organizational forms can result but are not specifically defined in any of the reengineering methodologies. The steps to developing a new organization design are: map jobs to functions; analyze relationships between jobs, placing jobs in clusters or work groups based on data self-sufficiency, process self-sufficiency, and minimal coupling of clusters; and determine the location of work (in large organizations some jobs are centralized, some decentralized, and some centralized with replication in the remote locations). If the first cut does not result in a completely irrational organization design, it might be accepted as it is for trial. If there are too many different clusters (use 5-7 as the rule) or too many different jobs in a cluster (use 5-15 as the rule), additional reevaluation might be required.
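The two rules of thumb can be applied as a quick check on a first-cut design. The sketch below treats the upper ends of the stated ranges (7 clusters, 15 job types per cluster) as thresholds, which is an interpretation of the rule rather than something the text spells out; the function name `review_flags` and the sample clusters are hypothetical.

```python
def review_flags(clusters, max_clusters=7, max_jobs_per_cluster=15):
    """Return reasons a first-cut design may need reevaluation.

    `clusters` maps a cluster name to the list of job types in it.
    """
    flags = []
    if len(clusters) > max_clusters:
        flags.append(f"{len(clusters)} clusters exceeds {max_clusters}")
    for name, jobs in clusters.items():
        distinct = len(set(jobs))
        if distinct > max_jobs_per_cluster:
            flags.append(
                f"cluster {name!r} has {distinct} job types, "
                f"exceeds {max_jobs_per_cluster}"
            )
    return flags

# Hypothetical first-cut design with two small clusters.
first_cut = {
    "Purchasing": ["Buyer", "Receiving Clerk"],
    "Customer Service": ["Caseworker"],
}
print(review_flags(first_cut))
```

An empty list means the design passes both rules and might be accepted for trial; any flag is a prompt for review, not an automatic rejection.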

Grouping of jobs is based on their interdependence. There are three types of interdependence in organizations: pooled, sequential, and reciprocal. Pooled interdependence is a relatively independent, low level of interaction between departments or jobs. Sequential interdependence defines a serial relationship between departments or jobs. Reciprocal interdependence defines highly interrelated activities that are worked on jointly by multiple units requiring feedback and constant adjustment. For instance, a bank loan department might be viewed as relatively independent (i.e., pooled) from other parts of a bank in that they need customer information received from the customer for their decision with no other units involved. Purchasing, receiving, and payables are sequentially interdependent in that they all use purchase order data. Yet all these job types have different job skills; that is, they each make different decisions and perform different actions based on their access to the purchase order information. A reciprocally interdependent department is a hospital intensive care ward in which many specialists with different skills and knowledge all work toward the same goal of patient recovery.

To group jobs, three methods of organization design deal with the three types of interdependence. If the jobs relate to each other sequentially, cluster jobs with similar skills together. Affinity groupings of processes and entities are used to decide skill requirements. Clusters may be sequentially dependent with jobs within each cluster providing different skills. Plan to provide shared database access to link clusters; this minimizes paper movement and ensures data access.

For example, look at the bank loan department again. Bank loan department processes are sequentially related after the loan is made. Once the loan commences, records are established and payments are received, posted, and analyzed. In an assembly line approach, these processes are different jobs. In a caseworker approach, all of these processes are within one job. Caseworkers could conceivably monitor loans for any customer, but usually have a case 'load' that is defined by alphabetic groupings of last initial of loan-maker names or some similar scheme.

If the processes have pooled interdependence, then job clusters contain one job type. For pooled interdependence, use subject area data as the deciding factor on when to create a new job. Each job, cluster, or group should have its own data self-sufficiency.

If the jobs are reciprocally interdependent and pass work back and forth, or need discussion on details regularly during the performance of work, design work groups in the same way you designed jobs. That is, design work groups to include all skills needed to perform one activity or function. Find all of the jobs that reciprocally share information; then, define the set of different jobs that would comprise a work group. Try to keep groups small with under 12 different jobs represented. For instance, engineers, raw materials purchasing, manufacturing, and quality control may all need access to the same design drawings, specifications, and components lists. They may be able to identify alternatives, make decisions, and improve quality simply by sharing responsibility for finished products. These job types would be clustered in work groups (i.e., quality circles).

Plan Information Technology

The next step is to redefine the IS environment. The rationale for deciding priorities is to correct the major problems first, and/or meet the goals/objectives with the largest impact on net income. The steps to develop an IS redevelopment plan are:

1. Compile all subject area database and application changes, redevelopment, and enhancement requirements.

2. Compile all technology and network infrastructure requirements.

3. Map technology and network needs to database and application needs.

4. Define software reengineering projects.

5. Define new application development projects.

6. Determine priorities for all projects.

7. Develop a plan for two years of development and reengineering work. Develop a tentative 3-5 year plan for the remaining projects.

To develop the technology plan, create three matrices: process/technology, process/entity, and entity/technology. The technologies are all those identified by interviewees and team members as potentially useful in the organization. Complete each matrix. In each cell of the process/technology matrix, enter whether the technology speeds delivery, improves accuracy, improves service, or lowers cost. Enter all improvements that apply. This matrix is used to determine priorities for change.

In the entity/technology matrix identify which data entities are already fully or partially automated and the type of automation. Types of automation include file, application database, or subject area database.

Using the original process/entity matrix, identify the extent and type of automation for each process/data cell. Types of automation for processes include full or partial, and batch, on-line, or real-time. These matrices may not be 100% complete, but are used to guide the implementation planning process by providing a summary of planned changes.
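The process/technology matrix described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the process and technology names are hypothetical, and ranking cells by the number of improvements entered is an assumed stand-in for the priority judgment the planning team would actually make.

```python
# Hypothetical process/technology matrix. Each cell lists the improvements a
# technology is expected to bring to a process (all names are illustrative).
IMPROVEMENTS = {"speeds delivery", "improves accuracy", "improves service", "lowers cost"}

process_technology = {
    ("Post payments", "Imaging"): {"speeds delivery", "improves accuracy"},
    ("Post payments", "LAN"): {"speeds delivery"},
    ("Analyze loans", "Imaging"): {"improves service"},
    ("Analyze loans", "Subject area database"): {
        "improves accuracy", "improves service", "lowers cost"},
}


def rank_cells(matrix):
    """Return (cell, improvements) pairs, most improvements first.

    Counting improvements is an assumed proxy for priority, not a rule
    taken from the text.
    """
    for cell, gains in matrix.items():
        unknown = gains - IMPROVEMENTS
        if unknown:
            raise ValueError(f"unknown improvement(s) in {cell}: {sorted(unknown)}")
    # Ties are broken alphabetically by (process, technology).
    return sorted(matrix.items(), key=lambda item: (-len(item[1]), item[0]))


ranked = rank_cells(process_technology)
for (process, technology), gains in ranked:
    print(f"{process:15} {technology:22} {len(gains)}  {', '.join(sorted(gains))}")
```

A real plan would also weigh each cell against organizational goals and net income impact, as the priority rationale above requires; the count only gives a first ordering.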

Plan Training

Develop a training plan to upgrade skill levels to meet new performance requirements, recommending how current jobs can be mapped onto the new jobs. This should be a skeleton plan defining the sequencing of training and the approaches to be used (outside company, inside company, phased by department, phased over time, and so on). Actual training details cannot be completed until human resources has redefined and formalized job descriptions and levels, and estimates of the number of people to be trained for each job are known. The plan should be sufficiently detailed to allow a pilot test of the training and new work approach before its complete deployment.

Plan Implementation

Develop an implementation plan that reflects some phased approach to changing the organization. The number of people in any one job type might be difficult to determine if the jobs are very different from the present. Moving from the assembly line to the caseworker or group work approaches changes the entire equation; more, rather than fewer, people might actually be needed. Human resources might be able to assist in this type of estimating. If estimates of numbers of people in caseworker jobs are too vague, a pilot study can be conducted to facilitate estimating of total personnel needs.

When the mapping is complete, summarize the recommended changes and determine how they can be implemented. The possible approaches are pilot organization, phased implementation (by function, location, business priority, or application), or total cut-over. Develop timing of changes. If the changes are expected to take more than six months, determine how the organization, processes, data, or technology can be streamlined, changed, added to, or eliminated now to provide immediate improvement and correction of some problem(s).

ENTERPRISE ANALYSIS WITHOUT ORGANIZATION DESIGN

Even without the extensive organization and technology redesign of reengineering, an enterprise analysis helps managers establish applications priorities and develop a plan for introducing new applications and technologies into their organizations.

The same analyses for entities and processes are performed. Current automation of the affinity clusters is summarized on the diagram. Recommended changes are mapped to organization goals and strategies to decide priorities for change. The changes from enterprise analysis are incremental and relate to applications and subject area databases. Sweeping technology and network reassessment is missing from this activity. Likewise, identifying organization problems and finding obsolete functions are not goals of this analysis.

When organization problems are identified, they can be referred to the sponsor for consideration. One example of an organization problem is identified from the entity/process matrix after affinity analysis is performed. Each process should have a prefix identifying its original function and activity relationships. If the function/activity prefix is not the same for every process that creates a given entity, an anomaly is found: multiple managers have responsibility for creating the same data. The idea is that processes which share responsibility for creating some entity should report to the same manager. The same manager can minimize conflicts and maximize coordination and control over data creation.
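The check for shared data-creation responsibility can be sketched as a small Python routine over an entity/process (CRUD) matrix. The matrix contents and the "function.activity" prefix format below are assumptions made for illustration; the text only requires that each process carry some prefix identifying its function and activity.

```python
from collections import defaultdict

# Hypothetical entity/process (CRUD) matrix. Process names carry a
# "function.activity" prefix; this prefix format is an assumption.
crud = {
    "Customer": {"1.1 Take order": "CR", "3.2 Bill customer": "CU"},
    "Order": {"1.1 Take order": "C", "1.1 Change order": "RU", "2.1 Ship order": "R"},
}


def creator_prefixes(matrix):
    """Map each entity to the prefixes of the processes that create it."""
    prefixes = defaultdict(set)
    for entity, cells in matrix.items():
        for process, actions in cells.items():
            if "C" in actions.upper():
                prefixes[entity].add(process.split()[0])
    return prefixes


def shared_creation_anomalies(matrix):
    """Entities created by processes under more than one function/activity."""
    return {entity: sorted(found)
            for entity, found in creator_prefixes(matrix).items()
            if len(found) > 1}


anomalies = shared_creation_anomalies(crud)
print(anomalies)
```

In this made-up matrix, Customer is created under both prefix 1.1 and prefix 3.2, so it would be referred to the sponsor; Order is created only under 1.1 and raises no flag.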

A second type of organization problem is found in the process hierarchy diagram. Because the diagram is built to describe its information without regard to current organization, some overlap or duplication of activities may be found. When this occurs, an effective technique for showing duplication, for example, is to draw shadow boxes behind the process (or activity or function) duplicated. Then, on each box, identify the organization having the responsibility, one box for each organization. This effectively communicates organizational overlap without a need for additional comment, and is less inflammatory than verbal or text descriptions because it is presenting organizational facts.

AUTOMATED SUPPORT TOOLS FOR ORGANIZATIONAL REENGINEERING AND ENTERPRISE ANALYSIS

The tools needed to support organization reengineering are similar to those for project planning, but include process hierarchy diagrams, entity-relationship diagrams, network architectures, and technology architectures. Many tools support one or more of these requirements. Few tools on the market currently support all of these requirements. The automated support tools are summarized in Table 5-2.

SUMMARY

Reengineering of an organization reevaluates data, processes, technologies, and communications needs to ensure that an enterprise meets its goals as stated in its mission statement. The activities of reengineering include the data collection, analysis, and development of recommendations to meet organizational goals through radical redesign of work.

Reengineering is intended to alter the shape and operations of an organization. Frequently, organizations and managers do not want sweeping change. When incremental change is desired, enterprise level analysis uses a subset of the analyses of reengineering to develop applications and subject area database development recommendations.

TABLE 5-2 Automated Support for Organizational Reengineering and Enterprise Analysis

Product                       Company                                              Technique
Analyst/Designer Toolkit      Yourdon, Inc., New York, NY                          Entity-relationship diagram (ERD)
Anatool, Blue/60 MacDesigner  Advanced Logical SW, Beverly Hills, CA               ERD
Bachman                       Bachman Info Systems, Cambridge, MA                  Bachman ERD
CA-products                   Computer Associates International, Inc.              Data modeling, Strategic planning
CorVision                     Cortex Corp., Waltham, MA                            ERD
Deft                          Deft, Ontario, Canada                                ERD
ER-Designer                   Chen & Assoc., Baton Rouge, LA                       ERD
Excelerator                   Index Tech., Cambridge, MA                           ERD, Structure chart
Foundation                    Arthur Andersen & Co., Chicago, IL                   ERD, Project management, Project planning
IEF                           Texas Instruments, Dallas, TX                        ERD, Enterprise analysis and planning, Process hierarchy
IEW, ADW (PS/2 Version)       Knowledgeware, Atlanta, GA                           ERD, Enterprise analysis and planning, Functional decomposition
MacAnalyst, MacDesigner       Excel Software, Marshalltown, IA                     Decision table, Entity class hierarchy, ERD
Maestro                       SoftLab, San Francisco, CA                           ERD
Multi-Cam                     AGS Mgmt Systems, King of Prussia, PA                ERD, Enterprise analysis and planning, Project management
PacBase                       CGI Systems, Inc., Pearl River, NY                   Enterprise analysis and planning, Process decomposition
ProKit Workbench              McDonnell Douglas, St. Louis, MO                     ERD
Silverrun                     Computer Systems Advisers, Inc., Woodcliff Lake, NJ  ERD
SW Thru Pictures              Interactive Dev. Env., San Francisco, CA             ERD
System Architect              Popkin Software and Systems, Inc., New York, NY      ERD
System Engineer               LBMS, Houston, TX                                    ERD
Teamwork                      Cadre Technologies Inc., Providence, RI              ERD
Telon, and other products     Pansophic Systems, Inc., Lisle, IL                   ERD
The Developer                 ASYST Technology, Inc., Naperville, IL               ERD, Structure chart, Organization chart

REFERENCES

Davenport, Thomas H., Process Innovation: Reengineering Work through Information Technology. Boston, MA: Harvard Business School Press, 1993.

Dunckel, Jacqueline, Good Ethics, Good Business: Your Plan for Success. North Vancouver, British Columbia: Self-Counsel Press, 1989.

French, W. L., and C. H. Bell, Jr., Organization Development. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1984.

Galbraith, Jay R., and Daniel A. Nathanson, Strategy Implementation: The Role of Structure and Process. St. Paul, MN: West Publishing Co., 1978.

Galbraith, J. R., Organization Design. Reading, MA: Addison-Wesley Publishing Co., 1977.

Greiner, L. E., and R. O. Metzger, Consulting to Management. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1983.

Hage, J., and M. Aiken, Social Change in Complex Organizations. New York: Random House, 1970.

Hackman, J. R., ed., Groups That Work (and Those That Don't): Creating Conditions for Effective Teamwork. San Francisco, CA: Jossey-Bass, 1990.

Hackman, J. R., and G. R. Oldham, Work Redesign. Reading, MA: Addison-Wesley, 1980.

Hammer, M., "Reengineering work: Don't automate, obliterate," Harvard Business Review, July-August, 1990, pp. 104-112.

Hammer, M., "From cow paths to data paths," Computerworld, December 25, 1989-January 1, 1990, pp. 16-17.

IBM Corporation, Business Systems Planning Information Systems Planning Guide, IBM Document # GE 20-0527-1, Armonk, NY, 1978, pp. 1-92.

King, William R., "Strategy set transformation," MIS Quarterly, March, 1978.

King, W. R., and D. I. Cleland, eds., Strategic Planning and Management Handbook. NY: Van Nostrand Reinhold, 1988.

Huse, E. F., Organization Development. New York: West Publishing Co., 1980.

Kouzes, James M., and Barry Z. Posner, The Leadership Challenge: How to Get Extraordinary Things Done in Organizations. San Francisco, CA: Jossey-Bass, 1990.

Lindenfeld, F., and J. Rothschild-Whitt, eds., Workplace Democracy and Social Change. NY: Porter, Sargent, 1982.

Rockart, John, "Critical success factors," Harvard Business Review, March-April, 1979, pp. 81-91.

Singh, Arvind, Comments from A Business Reengineering Workshop given in NY to TIAA, Performance Development Corporation, Princeton, NJ, January, 1992.

Sowa, J. F., and J. A. Zachman, "Extending and formalizing the framework for information systems architecture," IBM Systems Journal, Vol. 31, #3, 1992, pp. 590-616.

Thompson, J. D., Organizations in Action. New York: McGraw-Hill, 1967.

Zachman, J. A., "A framework for information systems architecture," IBM Systems Journal, Vol. 26, #3, 1987, pp. 276-292.

KEY TERMS

affinity
affinity analysis
architecture
business activity
business function
business process
caseworker
champion
critical success factor (CSF)
CRUD matrix
data
data architecture
data domain
data self-sufficiency
domain
enterprise architecture
information systems architecture (ISA) framework
information technologies
level of effort
logical description
mission statement
network architecture
network domain
network scope
organizational reengineering
pooled interdependence
process architecture
process domain
project sponsor
reciprocal interdependence
reengineering
scope
sequential interdependence
stakeholder
technology architecture
technology domain
technology scope
user commitment

EXERCISES

1. Look at the questions suggested for data gathering on page 125. Think of other possible questions and why they might be good additions to those suggested. Discuss your suggestions with class members.

2. Describe how the information provided for the four architectures can be used in multiple ways as the basis for IS and organization redesign.

3. Discuss the differences in outcomes of an organizational reengineering project if one or more of the assumptions in the list on pages 116-117 are not met.

4. Try to develop process and data architectures for the Abacus Printing Co. case in the Appendix. Try to do an affinity analysis of the information. Develop a list of questions you need answered to do a complete analysis.

STUDY QUESTIONS

1. Define the following terms:

data architecture
enterprise analysis
information technologies
network architecture
organizational reengineering
process architecture
technology architecture

2. What is the motivation for organizational reengineering?

3. What are the steps to organizational reengineering?

4. Why are caseworker or quality circle work groups preferred to the assembly line approach to work?

5. What is the 80-20 rule and how does it apply to reengineering?

6. What is an architecture and why is it important to reengineering?

7. What types of architectures are used in reengineering? What is the purpose of each architecture?

8. What is an entity and how is it used in the data architecture?

9. What is a platform and how is it used in the technology architecture?

10. List three prerequisites of reengineering. Why are they necessary conditions for a successful project?

11. What are four assumptions of reengineering?

12. Why are different scheduling scenarios necessary for the organization of reengineering projects?

13. What is a level-of-effort approach to work? Why is it used with reengineering?

14. Why is there overlap between reengineering tasks? Why is overlap necessary?

15. What is the role of the project sponsor?

16. List the types and roles of people who should be assigned to a reengineering project.

17. Why is data self-sufficiency the major criterion for scoping a reengineering project?

18. Describe a good mission statement. What makes the difference between a good mission statement and a bad one?

19. How are critical success factors used in reengineering?

20. List five information sources and the type of data that the team gets from each one.

21. Discuss the conceptual levels of Zachman's IS architecture. Which two relate to reengineering? Why are the others not used here?

22. What is the purpose of mapping the two levels of architecture into different domains? Why were these domains chosen?

23. Who is a stakeholder? Why is a stakeholder important?

24. Describe a CRUD matrix and its use.

25. Why is affinity analysis important? What are the reengineering results that are based on affinity analysis?

26. List three rules of thumb for developing the network and technology recommendations.

27. Why is implementation planning important to a reengineering effort? When changes are dramatic, what is a good approach to implementing change in the organization?

28. How does enterprise analysis differ from organizational reengineering? Are these differences significant? Why not do enterprise analysis only?

29. Which automated support tools provide all desired functionality for reengineering support?

30. What are the functions desired of an automated support tool for reengineering?

31. What are the key criteria for proper scoping of a reengineering project? Explain.

EXTRA-CREDIT QUESTIONS

1. You have been named to lead an organization reengineering effort for a small, one-location company. The company has functions for accounting, purchasing, inventory management, shipping, and sales. The business of the company is retail sales of furniture. The current computer system supports the billing, shipping, and invoicing process. No one but employees in the accounting department uses or accesses the computer at present. Develop a plan and sample questions you might ask the employees and the owner for an organization reengineering project.

2. What are factors that can cause a reengineering project to complete faster or slower? Explain.

3. Imagine that you work in a company that has all types of computer hardware and networks: mainframes, mid-size, PCs, wide-area mainframe networks, and local area networks. What are the issues in defining what data and applications should be on each type of hardware? Develop and discuss possible guidelines for data and application location selection.

CHAPTER 6

APPLICATION FEASIBILITY ANALYSIS AND PLANNING

INTRODUCTION

Feasibility is the first stage of application development. The purpose of the feasibility study is to ensure that the organization can accommodate the technology, organization changes, and cost of the new application. During feasibility analysis the major tasks are: define the scope and boundaries of the problem, generate technical alternatives, assess costs, benefits, and risks, and recommend an application development strategy. The procedures described in this chapter are used for large, full life-cycle projects; selective and abbreviated forms of the analysis are used for iterative development and for small projects.

DEFINITION OF FEASIBILITY TERMS

The feasibility analysis tasks and the terminology of each are briefly described. The stages of work during feasibility are: gather information, develop alternatives, evaluate alternatives, and plan and document the recommended approach to development.


During the information gathering stage, the goal is to develop a request from a vague, general statement into a specific request with boundaries and scope completely defined. Key business and application leverage points are defined during the scoping activity. A business leverage point is some activity from which a competitive advantage can be gained. An application leverage point is some automated function that might provide a competitive advantage to the using business unit(s). Application leverage points frequently relate to doing work better, faster, or in greater volume. Some business and application leverage points are:

Increase market share
Increase linkage to vendors or customers
Provide desired information that is not currently available

Business and application leverage points are used as the starting point for developing the benefits that would result from a change in the current method of work. Benefits can be tangible or intangible. Both benefit types are important for management to decide whether or not to do the recommended changes. Tangible benefits are measurable improvements to a specific work product or process. For instance, reducing staff by 10 people and the resulting cost savings are tangible benefits. Intangible benefits are not directly measurable. For instance, improved customer service through integrated database access has tangible and intangible benefits:

Tangible Benefits

Decrease operating cost by 10% in first year
Increase market share by 5% per year for three years

Intangible Benefits

Improve company image
Increase customer satisfaction
Improve employee job satisfaction
Provide faster and more accurate information to customer services representatives

Another tangible benefit might be a reduction in response time for inquiry requests from five minutes to 15 seconds. An intangible benefit from the same action might be improved customer satisfaction. More satisfied customers are less likely to go elsewhere for their products, but an improvement in customer satisfaction is difficult to quantify, so the benefit is intangible.

Also in information gathering, the business environment, competitive environment, and current method of performing the work that would be revised are described in sufficient detail to allow determination of appropriate changes. The functions and procedures that are needed in the new application are identified, as are problems with current procedures and new functions that are not part of current procedures.

After the current problem domain is understood, alternative approaches to the problems are developed. Alternative approaches to an application are different configurations of work, hardware, firmware, or software. Alternatives can begin with non-automation alternatives, such as change in work flow, and progress to different platforms, software, and designs. Usually between two and five alternatives are considered. Alternative definitions include the technology, benefits, and risks of each approach. A benefit, as discussed above, is some improvement in the work product or process that results from a specific alternative. Risks are events that would prevent the completion of the alternative in the manner or time desired.

Risk assessment determines possible sources of events that might jeopardize completion of the application. In general, the goal is to develop the project on time, within budget, and without errors. Risk assessment and contingency planning help you meet this goal. Contingency planning is the identification of tasks designed to prevent risky events and tasks to deal with the events if they should occur. The goal is to minimize the possibility of the event occurring, but to also have a plan just in case the worst happens. Having a contingency plan prevents having to force decisions under pressure.

When the alternatives have been defined, they are evaluated. The number of requirements met by the approach is assessed, and the benefits and risks of each are weighed to identify the alternative with the least risk and most benefit. If an alternative exists that meets all required and optional requirements, meets all benefits, and has the least risk, it would be the recommended option. Most often, there is a mix of requirements met and risk incurred that prevents selection of an alternative based on technical merits alone. Rather, several competing alternatives might be further evaluated to differentiate between them. To decide between the alternatives, development plans, costs, and financial analysis are developed.
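The screening rule in the preceding paragraph can be expressed as a short Python sketch. Everything here is hypothetical: the `Alternative` fields, the integer risk scale (higher means riskier), and the sample alternatives are assumptions introduced only to show when a clear technical winner exists and when it does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alternative:
    name: str
    requirements_met: int  # required plus optional requirements satisfied
    benefits_met: int      # identified benefits delivered
    risk_score: int        # hypothetical scale: higher means riskier


def clear_winner(alternatives, total_requirements, total_benefits):
    """Return the single least-risk alternative if it also meets every
    requirement and benefit; otherwise return None."""
    least_risk = min(a.risk_score for a in alternatives)
    lowest = [a for a in alternatives if a.risk_score == least_risk]
    if len(lowest) != 1:
        return None
    candidate = lowest[0]
    complete = (candidate.requirements_met == total_requirements
                and candidate.benefits_met == total_benefits)
    return candidate if complete else None


# The usual case: the most complete alternative is also the riskiest.
typical = [
    Alternative("Revise work flow only", 6, 2, 1),
    Alternative("Purchase a package", 9, 3, 3),
    Alternative("Build a custom application", 10, 4, 5),
]
# The rare case: one alternative meets everything with the least risk.
ideal = [
    Alternative("Revise work flow only", 6, 2, 2),
    Alternative("Build a custom application", 10, 4, 1),
]

print(clear_winner(typical, 10, 4))
print(clear_winner(ideal, 10, 4).name)
```

When the function returns None, as it does for the first list, the decision moves on to the project plans, costs, and financial analyses described next.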

A project plan is a schedule of tasks and estimated completion times for application development. A project plan includes tasks to be completed, tentative task assignments, staffing plans, and computer resources needed for the project. From the staff and resource estimates, costs of development are determined. If there are multiple alternatives, the costs of each are computed. The costs are used in the financial analysis which occurs next.

Several different types of financial analysis might be performed; the two most common ones are cost/benefit and make/buy. Cost/benefit analysis is the computation of net present value for each alternative. Net present value (NPV) equalizes the cost estimates by accounting for the time value of money for multiperiod investments. A make/buy analysis chooses between alternatives for providing an item, such as a software application. The make analysis estimates the cost of building a customized application, while the buy analysis estimates the cost of purchasing a package.
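As a minimal sketch of the net present value computation, the Python function below discounts a series of yearly cash flows at a fixed rate. The cash flow figures, the 10% discount rate, and the convention that flow 0 occurs today are all assumptions chosen for illustration, not values from the text.

```python
def net_present_value(rate, cash_flows):
    """Sum cash_flows[t] discounted by (1 + rate) ** t for t = 0, 1, 2, ...

    Flow 0 is assumed to occur today; negative values are outlays.
    """
    return sum(flow / (1 + rate) ** period
               for period, flow in enumerate(cash_flows))


# Hypothetical make and buy alternatives: an initial outlay followed by
# four years of net benefits.
make_flows = [-100_000, 40_000, 40_000, 40_000, 40_000]
buy_flows = [-60_000, 30_000, 30_000, 30_000, 30_000]

make_npv = net_present_value(0.10, make_flows)
buy_npv = net_present_value(0.10, buy_flows)
print(f"make: {make_npv:,.0f}   buy: {buy_npv:,.0f}")
```

Putting both alternatives on the same discounted basis is what allows a make estimate and a buy estimate with different timing to be compared directly.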

Other financial analyses, such as internal rate of return and payback period, might also be computed. Internal rate of return analysis determines the interest rate which equates cash investment outflow with positive cash flow. Payback period analysis determines the number of years required to recover the cash outlays based on the projected monetary benefits.
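Payback period and internal rate of return can be sketched the same way. The bisection search below is one simple way to locate the rate at which discounted inflows equal the outlay; it is an assumed technique for illustration, and the cash flows are hypothetical.

```python
def npv(rate, cash_flows):
    """Net present value with flow 0 occurring today."""
    return sum(flow / (1 + rate) ** period
               for period, flow in enumerate(cash_flows))


def payback_period(cash_flows):
    """First period in which cumulative cash flow is no longer negative,
    or None if the outlay is never recovered."""
    total = 0.0
    for period, flow in enumerate(cash_flows):
        total += flow
        if total >= 0:
            return period
    return None


def internal_rate_of_return(cash_flows, low=0.0, high=1.0, tolerance=1e-7):
    """Bisection search for the rate where NPV is zero.

    Assumes NPV changes sign exactly once between `low` and `high`.
    """
    if npv(low, cash_flows) * npv(high, cash_flows) > 0:
        raise ValueError("NPV does not change sign on the search interval")
    while high - low > tolerance:
        mid = (low + high) / 2
        if npv(low, cash_flows) * npv(mid, cash_flows) <= 0:
            high = mid
        else:
            low = mid
    return (low + high) / 2


flows = [-100_000, 40_000, 40_000, 40_000, 40_000]  # hypothetical project
years = payback_period(flows)
irr = internal_rate_of_return(flows)
print(f"payback in year {years}, IRR about {irr:.1%}")
```

Payback ignores the time value of money and any benefits after the recovery point, which is one reason it is usually reported alongside NPV or internal rate of return rather than in place of them.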

After all the analyses are performed, a final recommended alternative is defined. Technical and monetary considerations are balanced and a recommendation is based on some mix of them. For instance, a recommendation might be based on the fastest payback coupled with most requirements met. Alternatively, the decision might be based on the lowest NPV and the extent to which leverage can be maximized. When the alternatives are virtually equal in comparison, multiple approaches to the application are presented and the user, IS managers, and project team decide together what approach is best. This is often the case.

Finally, a feasibility document is created to summarize the feasibility analysis and the recommendation. The document is a summary of all of the preceding steps and analyses taken during the feasibility phase. Next, we discuss each feasibility activity in detail.

FEASIBILITY ACTIVITIES

Feasibility analysis is an activity that ranges from several days to several weeks in duration. In general, a feasibility study should be completed in fewer than 12 weeks; after that point, one of two problems exists. Either the problem domain is too large and should be broken into smaller problem areas, or the feasibility team is going into too much detail and should summarize at a higher level. The information at the end of feasibility should be accurate enough to allow managers to decide on the worth of pursuing a project, but high level enough that an analysis phase to clarify details of requirements is needed. The information is incomplete, with about 95% confidence in the accuracy of the information. Similarly, a budget and project plan produced at this high level of abstraction should have about an 80% level of confidence attached to it. This means that the budget and time schedule may be off by as much as ±20%, which implies budget adjustment later in the project. In this section, we detail the actions of feasibility analysis and project planning outlined in the previous section. For each topic, guidelines for completing the work are presented and followed by an example of the activity for ABC Video.
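The ±20% reading of an 80% confidence level translates into a simple range around any estimate. The helper below is a sketch under that interpretation; the function name and the sample budget and schedule figures are hypothetical.

```python
def estimate_range(estimate, confidence_pct):
    """Low and high bounds when an estimate carries the stated confidence,
    read (as in the text) as a margin of plus or minus (100 - confidence)%."""
    margin = estimate * (100 - confidence_pct) / 100
    return estimate - margin, estimate + margin


budget_low, budget_high = estimate_range(500_000, 80)  # hypothetical budget
weeks_low, weeks_high = estimate_range(40, 80)         # hypothetical schedule
print(f"budget: {budget_low:,.0f} to {budget_high:,.0f}")
print(f"schedule: {weeks_low:.0f} to {weeks_high:.0f} weeks")
```

Presenting the range rather than a single figure makes the later budget adjustment an expected event instead of a surprise.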

Gather Information

Guidelines for Gathering Information

The four major tasks during information gathering are:

1. Define the business and work environments

2. Describe current system of work

3. Identify key benefits and leverage points

4. Identify broad system requirements

The activities are done in parallel rather than sequentially. As information is collected, leverage points and requirements emerge from discussions on which old procedures to keep and what new technology, procedures, data, or interfaces are needed.

If an enterprise level plan exists, the data gathering begins with the architectures to obtain an overall view of the current data, processes, and technology of the target business area(s) (see Figure 6-1). The process decomposition is used to identify and match the affected jobs and tasks with those suggested by the requesting application sponsor. The data architecture is used to identify what data are involved and the extent to which the data are already automated. The technology architecture is checked to identify hardware, software, and applications supporting the work functions today, and to identify potential platforms as operational sites for the new application. For each job affected, the technology architecture matches jobs (from the process architecture) with applications capabilities.

The architectures, if present, are the basis for obtaining information from the user departments involved. Recall that the methods of data gathering (from Chapter 4) might include interviewing, document review, observation, talking to other companies, temporary work assignment, and questionnaire surveys. During feasibility, interviews, document review, and other companies are the primary information sources. Although the other methods could be used, they take more time and elicit more detail than required for feasibility analysis.

Architecture  Model                        Enterprise Level Analysis                                        Feasibility Study Use
Process       Process Hierarchy            Function = Business Process                                      Identify and match affected jobs and tasks
Data          Entity Relationship Diagram  Entity = Business Entity; Relationship = Business Constraint     Identify data and extent of data automation
Network       Logistics Network            Node = Business Process; Link = Communication Need               Identify potential operational platforms
Technology    Technology Network           Node = Computer; Link = Network Link                             Identify hardware, software, and applications

Adapted from Zachman 1987

FIGURE 6-1 Enterprise Architectures in Feasibility Study

Assume you are doing the information gathering using interviews. You might work in two-person teams for interviews so that the project has a built-in backup for every person, should someone get sick, called on jury duty, or be reassigned. One person asks the questions while the other person acts as scribe taking notes. This method of interviewing results in fewer misconceptions and errors from forgetting than interviews by one person. At the end of every session, follow-up steps should be identified for both you and the interviewee. For instance, you might document the interview and ask the interviewee to review and correct your documentation. You commit to having the material back by a specific date and request the review within a set time. In this manner, you conclude the meeting with a commitment from the interviewee to do the review by a certain date.

During the writing of interview materials, graphical techniques for both data and processes can be used to synthesize the findings. The most common graphical techniques are entity-relationship diagrams (ERDs) for data, and process decomposition diagrams and process data flow diagrams (PDFDs) for processes. Development of these diagrams is detailed in Chapter 9. An older variant of PDFDs called data flow diagrams (DFDs) is also used; they are detailed in Chapter 7. In general, ERDs capture information about the data entities that are within the scope of the study problem domain. An entity is any person, place, thing, or event about which the organization needs to keep data. The relationships between entities define some business-related association that is within the problem scope. The process decomposition diagram depicts the organization tasks that are being studied. The problem area is compared to the process hierarchy and ERD to ensure correct scoping. PDFDs summarize the processes of the problem and relate them to each other, the outside world, and to data entities.

In addition to diagrams which summarize the procedures and data of the target problem domain, you also create text documents that describe the current process, the aspects of the current process to be retained, and the changes and motivation for changes. In general, text should be minimized because it is easily misinterpreted. Diagrams and graphics are preferred to text. Lists of items are preferred to paragraph form text. Requirements for the new application should be as specific as possible. For instance, a requirement might be stated 'reduce turnaround time from receipt of an order through invoicing from 14 days to 2 days.' During the systems analysis phase, the actual details of functions to implement this requirement are developed.

As we said above, key business and application leverage points are defined during the data collection activity. Leverage points are context specific. What might be a leverage point in one company and industry might be standard procedure in another company and industry.

An example of leverage points is provided by examining imaging technology. Imaging technology automates facsimiles of business forms. Image files are databases of forms with indexes for retrieval and linkage to data databases. Applications can be developed to integrate data and image information for users at terminals. The technology provides both business and application leverage by improving work flow and allowing the management of paper flow through an organization.

The leverage provided by imaging is highest in organizations that are information and paper intensive, for instance, insurance and financial services. These paper intensive industries are required, by law, to provide original document search capabilities. Before imaging technology, these companies either used microforms or paper, both of which have only rudimentary indexing capabilities. Microforms require their own viewing equipment that is neither intelligent, nor capable of integration to an application. Paper, if kept, is so voluminous that whole buildings are dedicated to document storage. Trying to retrieve specific documents and files requires armies of clerks and dedication to accurate refiling. Simply applying imaging technology by itself buys marginal improvement to paper management. The big payoff is in integrating imaging with software to manage work.

Work flow management software is integrated with imaging technology to schedule work for clerks, monitor document locations, and monitor work progress through any number of departments (see Figure 6-2). All of these actions can be done without fear of losing the document because it is an electronic image. Printing of the image is possible if a paper copy is needed by a clerk for some reason.

Imaging and work flow management together can flatten hierarchies, reduce the number of clerks involved in image production, and eliminate the need for clerks to manage files. Staff reduction is a business leverage point and a benefit of the activity. For individual jobs, frustration is reduced because information cannot be 'removed for use' from an image file. Clerks are more productive and their jobs can be upgraded because the emphasis now can be placed on understanding and interpreting the information rather than on simply collecting all the information correctly. Thus, an application leverage point is present in enhancing jobs of the people in the workflow.

Leverage points identify benefits of the proposed application. Other benefits might be present and should be identified, even though they may not have a direct strategic impact. For instance, in keeping with the idea that most proposed applications are to improve work, benefits relating to more data, faster access to data, integrated data, or improved data quality might be defined. Similarly, automation of more tasks, faster report generation, integration of processing, or improved timing of response might all be benefits. Conversely, the new application might be expected to reduce staff, linkages between departments, work errors, and so on. These benefits are all tangible and measurable and should be identified.

Intangible benefits are equally important, but are harder to quantify. Intangible benefits are indirect, unmeasurable benefits with a high degree of uncertainty. For instance, one benefit of personal work stations with access to software has been a rethinking, by many people, of how they do their work. They now type their own documents directly and use secretarial support for changes and formatting. They do their own analyses and perform many different types of analysis that they could not do, and therefore never thought of doing, before they had desktop computer access. This type of change is an indirect benefit that increases the effectiveness of a person's work, while the tangible benefits deal mostly with efficiency improvements. Both types of benefits are important in application decisions.

FIGURE 6-2 Logical View of Work Flow Management Software (components shown: User, DBMS, Imaging Software, Imaging Hardware, Database, Image Base)

The SE works with the users to define the tangible and intangible benefits relating to a project. Benefits identified are listed in the documentation of the proposed application, and a value is attached to each one. Tangible benefits are quantifiable by determining the change expected to result from the new application. Intangible benefits usually are listed with a possible range of benefit. In presenting this information to decision makers, you must be able to justify why intangible benefits exist. Managers will ask and expect the reasoning behind any expected financial gains, whether tangible or intangible.

Now let us turn to ABC Video to discuss how to perform the data collection activities.

ABC Video Information Gathering

Of the methods of data gathering available, several can be eliminated immediately. First, questionnaires for a total of six employees would be impractical. All employees are available for discussions during nonpeak times. Also, studying documentation is not possible because the manual methods are not documented. Observation and temporary work assignment can give some information about the current problems to be solved through automation, but are of limited value in actually designing the new application.

Talking to competitors is not feasible because they do not want to help the competition; however, to define benefits that might accrue for the ABC application, knowledge of competitor clerical assignments and computer systems is valuable. Observation of competitors is a good way to get some insight to the benefits Vic might get from automation. The remaining data collection method, interviews, should be used extensively for Vic and the clerks to determine the work flow, problems, and possibilities for the ABC application. To supplement the interviews, we should observe competitors by using their services for a period of time to get information about their work assignments and applications.

For ABC, we define the current environment, proposed environment, leverage points, and benefits. Through Vic's interviews we find that ABC operates in a highly competitive environment. Large chain video rental businesses are crowding small one-shop businesses, like ABC, out of the market. ABC must remain competitive to stay in business and to grow as Vic expects. Vic sees the future to be in services offered to customers. In terms of video rental processing, service translates into minimal bureaucracy with as many variations on service to customers as possible.

Currently, ABC uses a manual method of video rentals. The customer chooses a video and presents the video cover (or title) to a clerk. The clerk locates the video, locates a rental card for the customer, and writes the current rental on the card. Charges for late fees are computed from the card if any are owed, and the customer pays for the current and any late rentals. The customer signs the rental card which is filed by the clerk. During the peak business period, from 6 P.M. to 10 P.M., the rental cards are placed in a pile for later refiling. Frequently, cards are misplaced and the customer is then not charged late fees. If a tape is never returned and the accompanying card is lost, Vic has no way to trace who has what tape(s). This method is error prone and subject to whims of clerks who have been seen changing return dates for friends who return tapes late. Also, the time involved in locating a given customer's rental card ranges from 30 seconds to several minutes during nonrush time, and can be as high as 10 minutes during the peak rush time because clerks are waiting to access the card file.

Vic's requirement for the new application is to provide a fast, simple method of providing rental processing and accounting without introducing any new bureaucracy into the process. The system must be on-line, accommodate at least five clerks working simultaneously, and provide for growth in video inventory and for expansion of the business to other related sale/rental items. At a summary level, the data entities in ABC rental processing are customers, video inventory, and rentals. Figure 6-3 is an ERD showing the relationships between these entities. Also at a summary level, the major processes of rental processing are customer maintenance, video maintenance, and rent/return processing. These processes are summarized in Figures 6-4 and 6-5.

Figure 6-4 is a hierarchic process decomposition diagram for the business, showing many more functions than just the rental processing. The rental processing area has bold lines to highlight it from the rest of the diagram. This diagram is developed at the enterprise level to ensure that the correct departments and processes are accounted for in an application development effort.

Figure 6-5 is a high level process data flow diagram for the rental activities only. The diagram shows the inputs, processes, and outputs of the rental activity. Inputs are rent/return requests, payments, process requests, new customer information, and new video information. Processes are maintenance, reporting, and rental/return. At the feasibility level, this is an acceptable level of detail for data and procedure knowledge and documentation.

To determine leverage points for ABC's application, we examine what the application does for ABC in the context of its industry and competitive environment. To do this we ask and answer several questions. First, can this application give ABC a competitive position in the industry? The answer to this question must be no. ABC is a one-shop organization that might grow to several branches but is not expected to grow to national prominence. Therefore, the application might give ABC a presence in the local market, but the application's strategic impact on the industry is zero.

Second, does the application give ABC competitive advantage in the local industry? All other things being equal, the application could give some local advantage over other video stores in Dunwoody, Georgia, the town where the company is located. The impact on the local industry, in terms of suburban Atlanta, is close to zero. The other 'things' that must be equal or better for ABC to obtain a local advantage include the number and variety of videos available, desirability of the location, and attitude of clerks to customers. For this discussion, we assume that location, attitude and variety of videos are at least equal.

FIGURE 6-3 ABC Entity-Relationship Diagram (entities shown: Customer, Open Rental, Video Copy, Video; relationship labels: Requests, Refers to, Is Described by)

Observation of the applications of the rival video stores is required to assess the potential impact of the subject application. There is a national chain store down the street, approximately 0.8 miles away. That store is evaluated since it is the closest competition. The chain store sells and rents Nintendo™, Sega Genesis™, and computer software as well as videos; plus, the chain store sells tickets to local rock concerts and events, and sells records, CDs, and audio tapes. Thus, the chain store is a recreational electronics store while ABC is simply a video rental store.

The fact that ABC is specialized and the chain store is general works in ABC's favor because of relative staffing levels. There are usually four clerks working in the chain store. Of the four clerks, two are at cash registers at which lines average three waiting patrons during peak periods. One of the other clerks roams the store assisting customers while the other clerk processes ticket orders. There are frequently lines at the ticket counter, especially when a famous rock group's tickets go on sale. Sometimes there are several hundred people on line. On average, there are 12 customers in the store at all times, with a peak average of 20. The peak times are the same as ABC's, 6 P.M. to 11 P.M. Of the 20 customers during peak time, about 10 people actually rent or purchase something. The average age of a rental customer is about 19.

Contrast this situation with ABC. Five clerks work at ABC during the peak hours of 6 P.M. to 11 P.M. The remainder of the time, three clerks are on hand. The clerks, in general, do not roam the store assisting customers; they are all behind the counter doing payment processing for customer rentals. The lines, if any, form in the peak times and average two people per clerk. If a customer has a question, she or he waits until a clerk is free, then gets assistance and rental payment at the same time. On average, there are five people in the store at all times, with an average of 25 during the peak times. Of the 25 peak customers, 18 rent videos and seven leave empty-handed.

FIGURE 6-4 ABC Hierarchic Process Decomposition Diagram

ABC's rental 'hit rate' of .72 (i.e., 18 of 25) is much higher than the industry average of .50.¹ Their single purpose may work against them for some customers who want full service electronic entertainment, and may work for them for other customers who only rent videos. The average age of an ABC rental customer is 22. Thus, the customer is slightly, but not significantly, older than the chain store's customers.

¹ The industry average is located by doing library research on the industry.
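The hit-rate comparison above is simple arithmetic. The following is a minimal sketch using the peak-period counts quoted in the text; the function name `hit_rate` is an illustrative assumption and not something the text defines.

```python
def hit_rate(renting_customers: int, total_customers: int) -> float:
    """Fraction of customers in the store who actually rent or buy."""
    if total_customers <= 0:
        raise ValueError("total_customers must be positive")
    return renting_customers / total_customers


# Peak-period counts quoted in the text.
abc_rate = hit_rate(18, 25)    # ABC: 18 of 25 peak customers rent
chain_rate = hit_rate(10, 20)  # chain store: about 10 of 20 rent or purchase

print(f"ABC hit rate:   {abc_rate:.2f}")
print(f"Chain hit rate: {chain_rate:.2f}")
```

The chain store's observed rate is computed from the observation counts only; the industry average cited in the text comes from library research, not from this calculation.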

So far, the company contrast neither favors nor disfavors ABC over the chain store. Next, we compare the company's procedures for rental processing. The chain store requires a subscription to their company's services that includes the presentation of a valid driver's license and credit card to establish an account. To use the account, each family member is assigned their own number and given his or her own ID card. The ID card is presented at the time of rental and payment of all current and past charges is required for a rental to take place. The presence of a family member ID allows parents who get stuck paying their children's fees to track the guilty party. If two family members make rentals in the same day, the clerk may or may not mention that a rental already exists to the later person. There is no procedure for clerks to help customers control the number of rentals in one day, nor is there a way for previous rentals to be known.

FIGURE 6-5 ABC Process Data Flow Diagram (labeled elements shown: Video, Customer, Open Rental)

ABC's expected rental processing is detailed in Chapter 2. Vic's vision of ABC's rental application does slightly favor ABC over the chain store. ABC will also assign family members their own IDs, but an ID card is not required of a customer. Rather, Vic envisions using the telephone number as the ID and asking the person for their name at the time of rental. A list will appear on the screen of all authorized renters for a given phone number with a sequential number the clerk selects beside each name. The procedures to accompany rental processing assume that customers want to know if a previous rental that day has occurred. Also, Vic envisions keeping track, electronically, of the previous rentals for a family and giving them the chance to stop a rental transaction on a previously viewed video. Thus, Vic's scenario has less bureaucracy, more service, and more customer-oriented clerical procedures than the chain store. These three improvements are the leverage points for ABC in its local market.
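Vic's phone-number lookup can be illustrated with a small sketch. Everything here is hypothetical: the `Account` layout, the function names, and the sample data are assumptions made for illustration, not the design developed later in the book.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """One customer account keyed by telephone number (hypothetical layout)."""
    phone: str
    renters: list[str] = field(default_factory=list)  # authorized renter names
    viewed: set[str] = field(default_factory=set)     # titles rented before


def renter_menu(accounts: dict[str, Account], phone: str) -> list[tuple[int, str]]:
    """Return (sequence number, name) pairs for the clerk to select from."""
    account = accounts.get(phone)
    if account is None:
        return []
    return list(enumerate(account.renters, start=1))


def previously_viewed(accounts: dict[str, Account], phone: str, title: str) -> bool:
    """True if anyone on this account has rented the title before."""
    account = accounts.get(phone)
    return account is not None and title in account.viewed


# Made-up sample data for illustration only.
accounts = {
    "404-555-0100": Account("404-555-0100", ["Pat", "Lee"], {"Casablanca"}),
}

for number, name in renter_menu(accounts, "404-555-0100"):
    print(number, name)
if previously_viewed(accounts, "404-555-0100", "Casablanca"):
    print("This family has rented Casablanca before; continue?")
```

Keying the account on the phone number is what removes the ID card from the process; the sequential number beside each name keeps the clerk's typing minimal.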

Next, we define other noncompetitive benefits of the application. The application eliminates many of the errors that can happen in a manual system of work. For instance, clerks can no longer decide who pays late fees by changing return dates. Customer cards, which can be lost, are eliminated and replaced by automated file records which can only be deleted by Vic. Both videos and customers must be on an automated file to be eligible for rental processing.

The application will provide for automatic generation of end-of-day reports on receipts and transactions by clerk, by register, or by customer. If a discrepancy is found between receipts and money in the cash register, having a log of transactions that can be printed will assist the accountant in tracing errors. Both of these types of reports provide significant improvement over the current manual methods. Under the current method, receipts are tied to money in each register by sorting the paper copies of transactions and adding the totals. If there is an error, it is almost impossible to trace since no money is actually tied to an individual transaction. At present, the accountant writes off errors.

Developing a list of the benefits for ABC's application is fairly easy because automation so improves a manual operation level task. Take the adjectives faster, better, more and, for each, define all the tasks or data that will be improved in some way relating to the adjective. For instance, processing an individual transaction will be faster because manual card lookup is gone and data entry is minimized to Customer ID and Video ID(s), with the computer retrieving and displaying all other information about each entity. Individual transactions will have improved data integrity by eliminating manual errors, such as writing the wrong amount, entering a wrong amount at the register, writing the wrong tape ID, retrieving the wrong customer card, and so forth. More information will be available for management use. For instance, end-of-day reports provide the accountant more information and Vic might develop ad hoc reports of all automated information. The benefits for ABC's rental application are summarized next.

• Simplify customer IDs - Less bureaucracy than competition
• Provide help to customers in finding tapes - More service than competition
• Give customers information on previous rentals the same day and on videos they have previously rented - More customer-oriented clerical procedures than competition
• Increase data accuracy for customers, videos, rentals
• Allow tracking of late rentals
• Allow accurate computation of late fees
• Increase speed of customer and video information retrieval
• Improve customer service
• Provide accounting record of transactions
• Allow tracking of transaction errors
• Decrease time for individual transactions through minimal typing
• Increase speed and accuracy of fee processing
• Decrease file update time
• Provide more accurate and timely end-of-day reports
• Improve customer satisfaction with overall rental process through the above changes

After general benefits are identified, they are made specific and quantified for evaluation of costs. The benefits listed above are specific enough to quantify directly (see Table 6-1). Quantification, though, requires detailed knowledge of the business and expected benefits. Vic is the business expert and he participates in the quantification activity. For each benefit, he is asked how much revenue (or expense) is related to each item for one occurrence of each benefit. For each, Vic is also asked the degree of certainty for the benefit and his estimate. The numbers provided are multiplied by the total number of each benefit expected. The degree of certainty (ranging from 0.0 to 1.0) is then multiplied by each total amount to provide a range of estimates for each. In the example shown in Figure 6-6, for the benefit of more accurate data entry, Vic figures his current losses at 2% of total revenues of $500,000, or $10,000. He feels the $10,000 estimate is about 80% accurate. Stated another way, by eliminating errors in data entry, Vic will gain $10,000 with 80% certainty. Thus, the benefit to be gained from more accurate data entry is $8,000-$10,000.

Total revenues: $500,000
Losses from inaccurate data: 2% of revenues
Dollar loss from bad data: $10,000
Certainty factor: 80%
Benefit of more accurate data: .8 * $10,000 = $8,000, for a range of $8,000-$10,000

FIGURE 6-6 Example of Benefit Computation

Table 6-1 shows the benefits from the list above with dollar values associated with them. For the benefits resulting in $1,000 increases in revenue, Vic was unsure that there was much tangible outcome, but estimated about $3, or one rental, per day. For the higher dollar estimates, he worked through the estimates in the same way shown above for increased accuracy.

TABLE 6-1 ABC Quantified Benefits

Benefit:
• Simplify customer IDs - Less bureaucracy than competition
• Provide help to customers in finding tapes - More service than competition
• Give customers information on previous rentals the same day and on videos they have previously rented - More customer-oriented clerical procedures than competition
• Increase data accuracy for customers, videos, rentals
• Allow tracking of late rentals; Allow accurate computation of late fees
• Increase speed of customer and video information retrieval
• Improve customer service
• Provide accounting record of transactions; Allow tracking of transaction errors; Provide more accurate and timely end-of-day reports
• Decrease time for individual transactions through minimal typing
• Increase speed and accuracy of fee processing
• Decrease file update time
• Improve customer satisfaction with overall rental process through the above changes

Expected Increase in Revenue:
$1,000
$500
$8,000-10,000
$10,000-15,000
$1,000
$3,000-5,000
$1,000
$1,500
$5,000
$2,500
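The benefit computation in Figure 6-6 can be sketched in a few lines. The function name `benefit_range` and its parameters are illustrative assumptions; the figures are the ones Vic supplies in the text.

```python
def benefit_range(amount: float, certainty: float) -> tuple[float, float]:
    """Return (low, high) dollar estimates for one benefit.

    low  = certainty * amount  (the certainty-weighted estimate)
    high = amount              (the unweighted estimate)
    """
    if not 0.0 <= certainty <= 1.0:
        raise ValueError("certainty must be between 0.0 and 1.0")
    return certainty * amount, amount


total_revenues = 500_000                   # from the text
loss_rate = 0.02                           # losses from inaccurate data
dollar_loss = total_revenues * loss_rate   # Vic's estimate of current losses

low, high = benefit_range(dollar_loss, certainty=0.80)
print(f"Benefit of more accurate data: ${low:,.0f}-${high:,.0f}")
```

The same function can be applied to each row of a benefits table once an amount and a certainty factor have been agreed with the business expert.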

Develop Alternative Solutions

The activities in developing alternatives include definitions of technical alternatives, and benefits and risks of each alternative.

Define Technical Alternatives

There are no specific, theory-based guidelines for developing technical alternatives. Rather, the technical alternatives within a specific business are explored to determine what is possible and practical. First, define the application concept (see Table 6-2). How up-to-date does information maintained by the application need to be? If the answer is four hours or more, a batch application is sufficient. If the answer is between two and four hours, interactive data entry with batch updates throughout the day might be acceptable. If the answer is in the range from seconds old to four hours, an on-line application is also sufficient. If the answer is that the system user must react to all transactions as they occur, a real-time application is needed. On-line is the most frequently selected option.

TABLE 6-2 Steps in Developing the Technical Alternatives

• Define the overall application concept
• Evaluate usefulness of existing hardware/software
• If new equipment or software is needed:
    • Determine data sharing requirements
    • Determine the criticality of data to the company
    • If shared or critical data, select equipment (either LAN or mainframe) and software that allow centralized control over data.
    • If noncritical and nonshared data, select the smallest equipment that allows necessary level of control. In multilocation settings, consider decentralizing or distributing the application by duplicating equipment, application, or data in several locations.
• Define special hardware requirements and ensure that the special hardware works with the selected hardware/software platform(s).
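The data-currency rules of thumb above can be sketched as a small lookup. Because the ranges stated in the text overlap, this sketch returns every concept the rules allow rather than a single answer; the function name and parameter names are illustrative assumptions.

```python
def candidate_concepts(max_data_age_hours: float,
                       react_to_every_transaction: bool = False) -> list[str]:
    """List the application concepts the text's rules of thumb allow.

    max_data_age_hours: how old the application's information may be.
    react_to_every_transaction: True if the user must react to all
        transactions as they occur.
    """
    if react_to_every_transaction:
        return ["real-time"]
    concepts = []
    if max_data_age_hours >= 4:
        concepts.append("batch")
    if 2 <= max_data_age_hours <= 4:
        concepts.append("interactive entry with batch updates")
    if max_data_age_hours <= 4:
        concepts.append("on-line")
    return concepts


print(candidate_concepts(24))
print(candidate_concepts(3))
print(candidate_concepts(0.01))
print(candidate_concepts(0.01, react_to_every_transaction=True))
```

Where more than one concept is returned, cost and design complexity decide among them; the choice is an estimate at the feasibility stage and may change during design.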

Next, for individual processes, determine the concept at the lower level of detail. For instance, for reporting, should answers be developed as a report request is entered or can they be run overnight? Some reports might need to be on-line, others might be run in batch mode. The volume of print, estimated time for processing, and urgency of data all are used to select the concept for individual processes. For instance, an ad hoc report that generates 10,000 lines of print should not be sent to a display screen; rather, it should be printed. Also, a long report might be created at the time of the request, but sent to a print queue for convenience of printing. The decisions made during feasibility are not expected to be permanent at this point; rather, you are estimating the concept to help in the evaluation of complexity of design.

After the concept is developed, hardware and software are evaluated. If there is hardware and software already installed, investigate their use first. Can the application be developed for operation on the existing equipment? Can the existing software accommodate the application? Can the application coexist with other applications currently used? If the answers to these questions are "yes," the platform recommended is the existing equipment and software. If a "no" answer is given, then investigate new hardware or software as needed.

If no hardware or software are currently used, or the current equipment cannot be used to do the application, select the likely hardware platforms. First, determine whether the application users need to share information or not and how up-to-date the information must be. For instance, can copies of the application run in different locations with daily update of files, or must the users share all informa- tion throughout the day? Second, determine the 'corporateness' of the data. How critical to the organization is the application data? If the company depends on the data to stay in business, then a more centralized, controlled environment is required than if the data is not critical to the company.

The need for centralized control over data that is critical to the organization is one factor to consider in recommending a platform and environment for an application. The extent to which the company relies on application operability, the importance of data integrity, audit trails and security, and the ability of the environment to accommodate these needs are all assessed. Although there are no clear differences in application management between a LAN and a mainframe, software does make a difference. The levels of security, number of simultaneous users, size of database, locking of records for simultaneous update, and many other technical considerations differ widely across networks, operating systems, databases, and languages. When distribution is an alternative, the centralization issue becomes even more important to evaluate and resolve. Full discussion of the decision criteria for distributing data and applications is deferred until Chapter 10.

To determine hardware alternatives, identify the smallest size computer possible that can accommodate the task, providing for data sharing and centralized control as needed. The cheapest and smallest platforms that meet the criteria are alternatives. For hardware we then ask if any other special purpose hardware is needed for this application. If other special purpose hardware is needed, enough research on the hardware should be done to determine what is required and whether or not it can be used with the identified alternatives.
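The platform decision path described above and in Table 6-2 can be sketched as a short function. This is only an illustration of the order in which the questions are asked; the function name, parameter names, and returned strings are assumptions, and a real recommendation also weighs cost, reliability, and flexibility.

```python
def recommend_platform(existing_platform_suitable: bool,
                       data_shared: bool,
                       data_critical: bool) -> str:
    """Follow the Table 6-2 decision path to a platform category."""
    # Installed hardware and software are always investigated first.
    if existing_platform_suitable:
        return "existing hardware/software"
    # Shared or critical data calls for centralized control over data.
    if data_shared or data_critical:
        return "LAN or mainframe with centralized control over data"
    # Otherwise pick the smallest equipment giving the needed control.
    return "smallest equipment that allows the necessary level of control"


print(recommend_platform(existing_platform_suitable=False,
                         data_shared=True,
                         data_critical=True))
```

Special purpose hardware, such as a scanner, is checked afterward against whichever platform category comes out of this path.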

From the hardware identification activity, the most likely platforms should be narrowed to two or three. The key factors in narrowing the selected platforms are reliability and flexibility. Portability might also be important, depending on the environment. Reliability is the extent to which the hardware, software, and application will be operational. Flexibility is the extent to which the hardware, software, or application can be modified easily. Hardware flexibility relates to the extent to which upgrades can be made, for example the number of additional boards, the maximum memory upgrade, the type of bus, and the type of disk channel, to name a few. Software flexibility relates to package design and how often the vendor releases updates of new functions. Application flexibility relates to methodology, implementation language, and skill of the developers. Reliability and flexibility are important issues in, for example, selecting a PC workstation, because of the diversity and quantity of alternatives available. If you evaluate five different vendors of IBM PC-compatible equipment, you will have different reliabilities and flexibilities for each. But even more confusing is that five different configurations of a PC from the same vendor might also have five different reliabilities and flexibilities.

Portability is the extent to which the software can be moved to another hardware/operating system environment without change. The fewer changes when moving the application, the more portable it is. Portability is an issue when the application is developed in one environment (e.g., a LAN) and is ported or moved to another environment for operations (e.g., a mainframe). Portability is also important when an application is developed in one location and is implemented in multiple locations which may not have the same configuration. Multiple locations with heterogeneous environments are the norm in distributed applications.

Hardware alone rarely determines the recommended alternative. In addition to picking hardware platforms that can accommodate the needs for multiple, simultaneous users, you also choose the software most likely to be used in each environment. Again, these selections might change as the design progresses, but their purpose during feasibility is to allow assessment of skills, training needs, cost, and application design complexity.

In choosing software, you identify a programming language, database environment, and any special software needed. Each alternative is developed to solve the entire problem, meeting all requirements and as many optional requests as possible. Only the best alternative(s) for a given environment is considered. Two sets of alternatives illustrate this statement.

The first set of alternatives is for a mainframe environment using different operating environments. The first alternative (see Figure 6-7a) identifies an IBM mainframe, running the MVS operating system, and using IBM's DB2 for database and IMS/DC for telecommunications control. The second alternative (see Figure 6-7b) identifies an IBM mainframe, running the conversational VM/CMS operating system, and using a Focus database. Telecommunications control is hidden from the application and is through VM (i.e., using VTAM and SNA). Both of these scenarios might be proposed, with the deciding factors relating to time of development and expertise of staff, rather than to the desirability of one environment over the other.

Figure 6-7a. Alternative 1
Hardware: IBM Mainframe 309x
Operating System: MVS
Database: DB2
Telecomm Control: IMS/DC

Figure 6-7b. Alternative 2
Hardware: IBM Mainframe 309x
Operating System: VM/CMS
Database: Focus
Telecomm Control: SNA through VM

FIGURE 6-7 Two Alternatives Using Different Software

Figure 6-8a. Alternative 1
Hardware: IBM Mainframe 309x
Operating System: VM/CMS
Database: Focus
Telecomm Control: SNA through VM

Figure 6-8b. Alternative 2
Hardware: IBM PC-Compatible
Operating System: MS/DOS, Windows
Database: Focus
Telecomm Control: Novell Ethernet

FIGURE 6-8 Alternatives Using Different Operating Environments

The second set of scenarios is for a mainframe version of an application (see Figure 6-8a) versus a network version (see Figure 6-8b). Both environments would use a database which is already available in-house. In this case, the decision relates to environmental and cost factors since both alternatives use similar database software. Then, reliability, flexibility, and portability are issues.

Estimate Benefits of Recommended Alternatives

Two kinds of benefit estimates are developed. First, the general benefits defined are analyzed to determine that they are (or are not) met by each proposed alternative. Second, new benefits that relate to a specific proposed alternative are defined. Again, benefits are context specific, relating to a given alternative for a given company at a given time.

The first benefits estimate is a tally of the number of general application benefits met and, if it can be determined, the effectiveness of implementation within the proposed alternative. Effectiveness, for our purpose, is the extent to which an alternative will implement the application requirements more, better, and faster. To measure the number of requirements met by each alternative, we simply count which are met in an implementation of each alternative.

To measure effectiveness, we need to determine the extent to which each requirement will be devel- oped. This extent can only be defined in a specific context for a specific application. For instance, two requirements for ABC might be "Provide minimal data entry for customer and video identification" and "Use a scanner for data entry whenever possible" (see Figure 6-9). One alternative might assume the entry of scanned data only. A second alternative assumes the entry of scanned data while providing for keyboard entry in case of scanner hardware fail- ure. A third alternative might assume the keyboard- ing of a minimal number of characters for each type of data. The first two alternatives meet both criteria. The third alternative does not meet the second requirement. Only the second alternative, how- ever, provides both the requirement and a backup. The second alternative would be rated more effec- tive in meeting the requirement than the others, while both the first and second alternatives meet the benefits. On a scale of one to three, the alternatives would be rated two, one, and three, respectively. In a different company with a different context, the same alternatives might be rated one, three, two respectively.
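The tally and rating just described can be sketched in a few lines of Python. This is only an illustration, not a procedure given in the text: the dictionary layout, the `has_backup` attribute, and the tie-breaking rule (more requirements met first, then the presence of a backup entry method) are assumptions chosen so that the result matches the ABC ratings of two, one, and three.

```python
# Which requirements each alternative meets in the ABC example.
# "has_backup" is an assumed extra attribute used only to break ties.
ALTERNATIVES = {
    "Alternative 1: scan only": {
        "minimal_data_entry": True,
        "scanner_when_possible": True,
        "has_backup": False,
    },
    "Alternative 2: scan with keyboard backup": {
        "minimal_data_entry": True,
        "scanner_when_possible": True,
        "has_backup": True,
    },
    "Alternative 3: keyboard only": {
        "minimal_data_entry": True,
        "scanner_when_possible": False,
        "has_backup": False,
    },
}

REQUIREMENTS = ("minimal_data_entry", "scanner_when_possible")


def requirements_met(alternative):
    """Count how many of the listed requirements the alternative meets."""
    return sum(1 for req in REQUIREMENTS if alternative[req])


def rate(alternatives):
    """Return {name: rating}, where 1 is the most effective alternative."""
    ordered = sorted(
        alternatives,
        key=lambda name: (
            requirements_met(alternatives[name]),
            alternatives[name]["has_backup"],
        ),
        reverse=True,
    )
    return {name: position for position, name in enumerate(ordered, start=1)}


ratings = rate(ALTERNATIVES)
for name, attributes in ALTERNATIVES.items():
    met = requirements_met(attributes)
    print(f"{name}: {met} of {len(REQUIREMENTS)} met, rating {ratings[name]}")
```

In a different company the tie-breaking rule would differ, which is why the same alternatives can receive different ratings.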

Alternative 1: Scan Data Entry

Alternative 2: Scan Data Entry or Keyboard Data Entry with Minimal Typing

Alternative 3: Keyboard Data Entry with Minimal Typing

FIGURE 6-9 Sample Evaluation of Alternative Effectiveness

Feasibility Activities 163

Define Risks

The purpose of risk assessment is to determine all the things that can go wrong. If you have heard of Murphy's Laws, you know they apply to application development. The three most common of Murphy's Laws are:

1. If anything can go wrong, it will.
2. Things go wrong at the worst possible time.
3. Everything takes longer than it should.

Table 6-3 is a list of possible sources of risk. For each item on the list, you determine the likelihood of it occurring for this project. For instance, if you are using only existing equipment, you could skip the risks dealing with hardware installation problems. As sources of risk are identified, they should be placed in a separate table and rated for likelihood of occurrence for each alternative. In addition, other possible risks for the project might be added to the list. For instance, if revenue for current year drops 25%, the company might not be able to afford the project.

TABLE 6-3 Sources of Risk

Source of Risk        Risks

Hardware              Not installed when needed
                      Cannot do the job
                      Does not work as advertised
                      Installation not prepared in time
                      Installation requirements (e.g., air-conditioning, room size, or electrical) insufficient
                      Wiring not correct
                      Hardware delivered incorrectly
                      Hardware delivered with damage

Software              Not installed when needed
                      Cannot do the job
                      Does not work as advertised
                      Contains 'undocumented features' that cause compromise on application requirements
                      Vendor support inadequate
                      Resource requirements are over budgeted, allocated amounts

Group                 Key person(s) quit, are promoted elsewhere, go on jury duty, have long-term illness
                      Skill levels inadequate
                      Training not in time to benefit the project

Project management    Schedule not accurate
                      Budget not sufficient
                      Manager change

User                  Quits, transfers, is replaced
                      Not cooperative
                      Not supportive
                      Does not spend as much time as original commitment requested

Computer resources    Test time insufficient
                      Test time not same as commitment
                      Inadequate disk space
                      Insufficient logon IDs
                      Insufficient interactive time

ABC Video Alternatives

First, technical alternatives for developing ABC's rental application are developed. Next, benefits and risks relating to each alternative are estimated.

To develop technical alternatives, the application requirements should be listed as follows:

1. Provide add, change, delete, inquiry functions for customer, video, and rental information

2. Automate processing of rental transactions, including

• Interactive processing and data display for all outstanding video rentals, including fees owing

• The maintenance of customer history of rentals, rental history for each video tape, creation and change of rental transaction records

• Monitoring of outstanding rentals by customer

• Computation of late fees owing from prior transactions

• The ability to create new customers as part of rental processing

• The ability to add new videos to the system as part of rental processing

• Query of any rental-related information

3. Minimize data entry in rental processing by using bar codes or similar technology

4. Provide interactive, on-line updating capabilities for all files

5. Provide transaction logging for database integrity

6. Do daily backup of all files and application programs

7. Provide ad hoc reporting capability for all files and legal combinations of files (e.g., customer with video rentals with customer rental history)

8. Provide end-of-day reports of activity by transaction with summaries by transaction type (i.e., rental, late fees, other fees)

9. Provide for future growth of 15% per year per file

10. Provide for future growth in number of system users to be one every 18 months for five years. A total of nine concurrent users should be supported.

11. Provide SQL compatibility for future growth and compatibility between software applications

12. Provide mean time between failures (MTBF) of 1 year for hardware selection and mean time to repair (MTTR) of 1 hour in hardware maintenance contracts

13. Provide on-line processing for all functions from 8 A.M. to 11 P.M. daily
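Requirements 9, 10, and 12 are numeric, so their implications can be worked out before any platform is chosen. The sketch below is illustrative only: the function names are hypothetical, and the availability formula MTBF / (MTBF + MTTR) is the standard steady-state definition rather than something stated in the requirements list.

```python
def projected_size(current_size, annual_growth, years):
    """Compound growth: size after the given number of years."""
    return current_size * (1 + annual_growth) ** years


def added_users(months_per_new_user, horizon_years):
    """Whole users added when one user joins every N months."""
    return (horizon_years * 12) // months_per_new_user


def availability(mtbf_hours, mttr_hours):
    """Steady-state availability (standard formula, assumed here)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Requirement 9: files grow 15% per year; factor after five years.
growth_factor = projected_size(1.0, 0.15, 5)

# Requirement 10: one new user every 18 months for five years.
new_users = added_users(18, 5)

# Requirement 12: MTBF of 1 year, MTTR of 1 hour.
hardware_availability = availability(365 * 24, 1)

print(f"File size after 5 years: {growth_factor:.2f} times current size")
print(f"Users added over 5 years: {new_users}")
print(f"Implied hardware availability: {hardware_availability:.4%}")
```

Each file roughly doubles in five years, and three users are added over the same period. Three added users is consistent with the nine-concurrent-user target if the starting count is six, which is an assumption here.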

ABC has specific requirements that imply an on-line application, significant ad hoc reporting, and interactive processing with immediate file update throughout the day. Batch processing should be feasible as a background task to on-line processing since the on-line portion of the day is so extensive and there might be a problem trying to staff the batch hours. Beginning with a hardware platform, then continuing to software and applications, the proposed alternatives are defined. Only alternatives that can meet all requirements should be identified; however, if that is not possible, any feasible alternatives are identified and evaluated later. In ABC's case, only alternatives that can meet all requirements are identified.

In a small business, the two most likely hardware platforms are multiuser minicomputers or client-server local area networks. These are considered here. The competing hardware platforms are an IBM AS/400 minicomputer versus a token ring local area network (LAN). Each of these decisions requires a mini-analysis of the alternatives in their respective environments that is beyond the scope of this text. To specify the LAN, for instance, requires comparison of options and costs of probabilistic versus deterministic networks, cabling requirements, network operating systems (NOS), network interface card (NIC), compatible software, and so on. Both hardware platforms can be implemented successfully in ABC's environment, can support the volume of transactions, and can support the expected company and applications growth.

TABLE 6-4 Hardware Platform Estimates¹

Client/Server Alternative

Item                                                  Cost
Workstation (6)                                       $ 4,800
Server                                                $ 2,000
Software                                              $ 3,500
Cable-Shielded Twisted Pair (STP)                     $ 1,900
Network Interface Cards (7)                           $ 1,000
Network Operating System (Ethernet), 6-10 stations    $ 2,500
Total                                                 $15,700

Minicomputer Alternative

Item                                                  Cost
Workstation (6)                                       $ 4,800
Minicomputer                                          $15,000
Software                                              $ 5,000 plus $200/month
Cable-STP                                             $ 1,900
Total                                                 $26,700 plus $200/month

1 Keep in mind that these are estimates for the sake of discussion and not real dollar estimates.

Once the platforms are identified, the hardware cost of implementing the application on the alternative platforms is estimated (see Table 6-4). From these estimates, the most likely (e.g., the cheapest) two to three alternatives are selected. Also, if there is doubt about the economic feasibility of the application, the client/user can determine whether to continue with the analysis or not. As Table 6-4 shows, the client/server LAN is cheaper than the minicomputer hardware alternative. Both alternative definitions exclude software for rental processing which is estimated separately because the option to purchase software versus custom development of software should be evaluated.
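The totals in Table 6-4 can be checked with a short script. The item costs are the book's discussion figures; the dictionary names and the 12-month comparison horizon are assumptions added for illustration.

```python
# Item costs from Table 6-4 (discussion estimates, not real prices).
CLIENT_SERVER = {
    "Workstation (6)": 4_800,
    "Server": 2_000,
    "Software": 3_500,
    "Cable, shielded twisted pair": 1_900,
    "Network interface cards (7)": 1_000,
    "Network operating system": 2_500,
}
MINICOMPUTER = {
    "Workstation (6)": 4_800,
    "Minicomputer": 15_000,
    "Software": 5_000,
    "Cable, shielded twisted pair": 1_900,
}
MINICOMPUTER_MONTHLY_FEE = 200  # recurring software charge in Table 6-4


def one_time_total(items):
    """Sum the one-time purchase costs of a platform."""
    return sum(items.values())


def cost_over(items, monthly_fee, months):
    """One-time costs plus recurring fees over an assumed horizon."""
    return one_time_total(items) + monthly_fee * months


lan_total = one_time_total(CLIENT_SERVER)
mini_total = one_time_total(MINICOMPUTER)
mini_first_year = cost_over(MINICOMPUTER, MINICOMPUTER_MONTHLY_FEE, 12)

print(f"Client/server LAN: ${lan_total:,}")
print(f"Minicomputer: ${mini_total:,} plus ${MINICOMPUTER_MONTHLY_FEE}/month")
print(f"Minicomputer, first 12 months: ${mini_first_year:,}")
```

Including the recurring fee widens the gap between the two platforms, so a comparison horizon should be agreed with the client before the alternatives are ranked.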

The client/server alternative is recommended to Vic, and he approves, although he is concerned about the cost. As a small business, his company nets under $1,000,000 per year and, in ABC's case, closer to $500,000. A rule of thumb in automation expenditures is to spend under 10% of net income. Vic's concern is that the total cost may exceed $50,000, at which point his financial risk becomes a problem.

The remaining estimates use only the client/server solution to develop software application alternatives. The choices are between purchasing a software package and developing a customized package for rental processing. Mary researches available software packages and finds that the cheapest one is VidRent,² which costs $7,500 plus $1,500 maintenance per year (see Table 6-5). VidRent will be compared to building a customized application using either SQL Server³ or Focus. SQL Server is selected as representing software specifically designed to take advantage of client/server environments. Focus is selected as representing software with which Mary's team has extensive experience. The alternatives differ in cost and in the number of users they support. These factors are kept in mind, but the requirements must be analyzed to determine if one software should be favored functionally over the others.

2 VidRent is a fictitious name.

3 SQL Server™ is a trademark of Sybase and Microsoft Corporations.

TABLE 6-5 Alternative Software Packages

Software               Initial Cost   Maintenance   Maximum Number of Users
SQL Server™            $17,500ᵃ       $1,800/year   Up to 20
LAN FOCUS™ with SQL    $12,000        $1,200/year   Up to eight
VidRent                $ 7,500        $1,500/year   Any number of users on one LAN

a Keep in mind that these are estimates for the sake of discussion and not real dollar estimates.

The requirements are reevaluated and rated for each development alternative as shown in Table 6-6. First, consider the capabilities of each package. VidRent provides neither query capabilities nor historical customer or video processing. It also cannot create new customer or video records as part of rental processing. VidRent also does not provide transaction logging. If this package were chosen, these requirements would go unmet. Through discussion with the vendor, Mary determines that query processing can be done by using any software that can access ASCII files. Thus, the addition of dBase™ or Oracle™ or some other single-user package to provide Vic with query capabilities is a cheap alternative that adds about $1,200 to the cost. This alternative is still limited in that querying would be an off-line function, available only when the on-line application was not in operation. This restriction is caused by the record locking scheme in VidRent. Also, the software package could be modified by Mary's group to provide the history processing desired by Vic, without violating the vendor warranty. Thus, VidRent's cost increases, and it is capable of doing most requisite processing (see Table 6-7).

Both Focus and SQL Server are fully capable of supporting the application. Both require complete, custom development of the application, but both provide application generators and have built-in query capability. A quick estimate by Mary based on her experience and without a detailed project plan is that the total development work would take about six person-months. At $150 per day, for a 26-day month, the custom software development will be about $23,400 (i.e., 6 * 150 * 26). Except for cost, there is no advantage or disadvantage to either package based on application requirements. SQL Server's license allows 15 concurrent users which is more than Focus.
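Mary's quick estimate is a product of three factors, which makes it easy to recompute when any one of them changes. The function below is a minimal sketch. The 22-day month in the second call is a hypothetical value used only to show the sensitivity of the result; it does not appear in the text.

```python
def development_cost(person_months, daily_rate, days_per_month):
    """Cost of custom development from effort, rate, and working days."""
    return person_months * daily_rate * days_per_month


# Figures from the text: six person-months at $150 per day, 26-day months.
custom_cost = development_cost(6, 150, 26)

# Hypothetical variation: the same effort costed on a 22-day month.
shorter_month_cost = development_cost(6, 150, 22)

print(f"Estimate from the text: ${custom_cost:,}")
print(f"With an assumed 22-day month: ${shorter_month_cost:,}")
```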

Next, consider the organizational impacts of each package. Mary's team requires training for either VidRent or SQL Server. Training for SQL Server, which is supplied by the vendor, would not be charged to Vic since the knowledge is useful to the team after the rental application is complete. VidRent training, also from the vendor, would be paid by Vic. Training costs must be added to its cost (see Table 6-7).

Next, consider vendor reputation and market stability. Sybase and Microsoft, the vendors of SQL Server, are both relatively young companies, with Microsoft the current leader in software for the PC market. Focus' company, Information Builders, Inc., is over 15 years old and has enjoyed steady growth. Therefore, both vendors are expected to remain viable market forces for the foreseeable future. VidRent's vendor, VidSoft, is 5 years old and still is run from the owner's home. The company has grown steadily by selling to the single video store firms such as ABC, but the owner, Mark Denton, does not publicize his earnings.

In summary, SQL Server and Focus both meet all software requirements of the application; VidRent could be made to provide most requirements. Cost favors the VidRent proposal with a total estimated software cost of $22,000. At this point, Vic must decide how much he wants the custom features of his application and whether the compromises on querying and ease of processing are worth $13,000.

Vic and Mary discuss the alternatives frankly. Mary recommends not going with VidRent because of the company size, lack of features, and need for customizing for any features not already in the package. Vic is staggered by the cost of custom software development and is inclined to purchase VidRent and forget his grand plans. Mary reminds him that if he does not develop his application as envisioned, the competitive advantages might disappear. Vic eventually decides that he does want the application as currently defined and that he is not willing to compromise his vision in any way. Therefore, only SQL Server and Focus alternatives are developed further to determine the benefits and risks of the softwares.

Only general benefits are evaluated for each alternative; there are no apparent benefits of one software over the other. The benefits of the application identified in an earlier step are compared to each proposed software alternative. As you can see from Table 6-8, the benefits are identical for each implementation.

Finally, risks of the alternatives are defined. The list of possible risks is customized for the application and each risk is assessed for probability of occurrence with a specific alternative (see Table 6-9). The table of risks is repeated here with an analysis of the two language environments. Hardware risks apply equally to both alternatives. Software risks vary because of differences in product knowledge by the development team, product functionality, and expected cost, all of which favor Focus.

TABLE 6-6 Rating Software Development Alternatives

Function

Provide add, change, delete, inquiry functions for customer, video, and rental information

Interactive processing and data display for all outstanding video rentals, including fees owing

On-line processing from 8 A.M. to 11 P.M. daily

The maintenance of customer history of rentals, rental history for each video tape, creation, and change of rental transaction records

Monitoring of outstanding rentals by customer

Computation of late fees owing from prior transactions

The ability to create new customers as part of rental processing

The ability to add new videos to the system as part of rental processing

Query of any rental-related information

Minimize data entry in rental processing by using bar codes or similar technology

Provide immediate file update

Provide transaction logging for database integrity

Do daily backup of all files and application programs

Provide ad hoc reporting capability for all files and legal combinations of files

Provide end-of-day reports

Provide for growth of 15% per year per file

Provide for nine concurrent users

Provide SQL compatibility

Total requirements met out of 18

SQL Server Focus VidRent

Yes Yes Yes

On-line On-line Off-line

Yes Yes Yes

Yes Yes No

On-line On-line Off-line

Yes Yes Yes

Yes Yes No

Yes Yes Yes

Yes Yes Only with another package

Yes Yes Yes

15 10 Any number

Yes Yes For ASCII files

18 18 15


TABLE 6-7 Total Estimated Cost of Software Alternatives¹

Software               Initial Cost   Purpose
SQL Server™            $17,500        License fee
                       $23,400        Custom software
                       $37,900        Total

LAN FOCUS™ with SQL    $12,000        License fee
                       $23,400        Custom software
                       $35,000        Total

VidRent                $ 7,500        License fee
                       $ 2,500        Database query software
                       $ 5,000        Training
                       $ 7,000        Customizing
                       $22,000        Total

1 Keep in mind that these are estimates for the sake of discussion and not real dollar estimates.

TABLE 6-8 Benefits of SQL Server and Focus Alternatives

Benefits

Simplify customer IDs

Provide help to customers in finding tapes

Give customers information on previous rentals the same day and on videos they have previously rented

Provide data accuracy for customers, videos, rentals

Track and display late rentals

Compute and display late fees

Increase speed of customer and video information retrieval

Improve customer service

Provide accounting record of transactions

Allow tracking of transaction errors

Provide accurate and timely end-of-day reports

Decrease time for individual transactions through minimal typing

Increase speed and accuracy of fee processing

Decrease file update time

Improve customer satisfaction with overall rental process through the above changes

Total benefits met out of 15

SQL Server Focus

Yes Yes

Procedure Procedure

Yes Yes

15 15


TABLE 6-9 ABC Risks of Software Development Alternatives

Risks

Hardware not installed when needed

Hardware cannot do the job

Hardware does not work as advertised

Hardware installation not prepared in time

Hardware installation requirements (air conditioning or electrical) insufficient

Wiring not correct

Hardware delivered incorrectly

Hardware delivered with damage

Software not installed when needed

Software cannot do the job

Software does not work as advertised

Software contains 'undocumented features' that cause compromise on application requirements

Software vendor support inadequate

Software resource requirements are over budgeted, allocated amounts

Key person(s) quit, are promoted elsewhere, go on jury duty, have long-term illness

Group skill levels inadequate

Training not in time to benefit the project

Schedule not accurate

Budget not sufficient

Manager change

Vic quits, transfers, is replaced

Vic/clerks not cooperative

Vic/clerks not supportive

Vic does not spend as much time as original commitment requested

Test time insufficient

Test time not same as commitment

Inadequate disk space

Insufficient logon IDs

Insufficient interactive time

SQL Server

Low

Low-Medium

Low

Low-Medium

Medium

Low-Medium

Low

Low-Medium

Low

N/A

Focus

Low

Low-Medium

Low

N/A

Low

N/A

Low

N/A


Once the benefits, risks, and alternatives are defined, they are evaluated to narrow the field to one (or two) proposed alternative(s).

Evaluate Alternative Solutions

The recommended alternatives are evaluated for technical adequacy, organizational feasibility, extent to which benefits are met, and severity of associated risks. In general, we select the alternative that meets the most requirements, yields the greatest benefit, and has the lowest associated risk. When these characteristics do not relate to the same technical alternative, one or two are selected for further analysis and the remaining alternatives are eliminated from consideration. In this section, we discuss technical, organizational, benefit, and risk evaluations for narrowing the decision to one or two alternatives.

Evaluate Technical Feasibility

Technical feasibility assesses the technology, its maturity in the market, its availability to the company, and the likelihood of successful use. Technical feasibility is most important when using new technologies that are leading edge. You want to be leading the competition, not bleeding, when using new technologies!

The key questions used to evaluate technical feasibility are:

Is the technology in use elsewhere?
Is the technology used elsewhere for similar applications?
How mature is the technology?
How much industry experience is there with this technology?
Are staff with experience using this technology easy to find?
How does each alternative manage the application sources of complexity?
Does the proposed alternative require any compromise of application requirements? What type of compromise and which requirement(s)?

Each question is evaluated for each technical alternative proposed. Any issues about a technology's ability to perform as required for an application should be identified. Objective answers that may not be what managers want to hear are required to adequately assess technical feasibility. Maintaining objectivity is difficult when market pressure to develop an application exists and managers want to develop an application.

To perform technical feasibility analysis, the technical alternatives are listed and compared across alternatives. Then, the application requirements are listed and evaluated for number of requirements met across the alternatives. The alternative meeting the most requirements is favored during this analysis. If there is a difference in the extent to which a requirement would be met, that information is noted in the analysis.

Evaluate Organizational Feasibility

Organizational feasibility is the extent to which the organization is ready to implement the proposed application. First, using the questions below, organization structure is assessed to define organizational changes required.

Does the organization structure need to be changed?
Do all groups that create the same information report to the same manager?
Do user jobs require new procedures?
Do user jobs require new work organization? For instance, do they move from individual assembly line-type arrangement to work groups?
Do users have the required level of computer literacy?
Do users have the required level of typing skills?
Will users require training for the new application?
Can training be done by other users?
Are users involved in screen design, acceptance test design, and/or general application development?
Does the IS staff know the problem domain?
Does the IS staff know the software being used?
Does the IS staff know the operating environment being used?

Organization structure is evaluated to determine whether the people who have creation authority for data all report to the same management and whether all departments and jobs that will be needed in the new application are defined or currently exist. Second, expected users are evaluated to determine the extent to which training is required to implement the proposed application. For instance, some computer literacy and typing skills might be required. If users must know how to turn the machine on and activate an application, but do not currently use computers, you might need to do a short questionnaire or interview users to determine their level of computer literacy. Any needs identified are added to the implementation plan as a task (and cost) of the proposed application. The goal of this first type of organization analysis is to identify user department changes and user requirements for training, both of which must be satisfied before the organization can effectively use the proposed application.

A second type of organizational feasibility assesses the readiness of the IS organization to develop the proposed application. When a custom development is being done by consultants, you evaluate their skills with the technology and similar problems to determine their readiness. The assessment determines staff skill with the hardware, operating environment, programming language, database, and similar environments. As with user organization feasibility, level of expertise and training requirements are determined. Technical staff training requirements defined during this assessment are added to development plans for cost analysis.

The last type of feasibility assessment, financial feasibility, is performed after a plan for the recommended alternative(s) is developed. Financial feasibility is discussed in a following section.

Assess Benefits

Benefits defined for the application in general, and for specific implementation alternatives, are assessed to determine which proposed alternative yields the outcome with the highest reward to the organization.

Benefits are tallied for each alternative. First, a simple count of the benefits for each alternative is done. Then, for benefits assigned monetary values, the amounts of increased revenues or avoided expenses are summed to provide a single dollar-value benefit for each alternative. If there are no alternative-specific benefits, the number and value of benefits are the same for all alternatives. If there are alternative-specific benefits, then one or several alternatives might be preferred. These are identified by this analysis.
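The two-step tally (count first, then sum the monetary values) might be sketched as follows. The `Benefit` class, the dollar amounts, and the alternative-specific benefit are all hypothetical; the text assigns no dollar values to ABC's benefits.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Benefit:
    name: str
    annual_value: Optional[int] = None  # None marks an intangible benefit


def tally(benefits):
    """Return (number of benefits, total dollar value of those valued)."""
    count = len(benefits)
    total_value = sum(
        b.annual_value for b in benefits if b.annual_value is not None
    )
    return count, total_value


# Hypothetical figures for illustration only.
general = [
    Benefit("Increase speed and accuracy of fee processing", 3_000),
    Benefit("Improve customer service"),
]
specific_to_b = [Benefit("Hypothetical alternative-specific benefit", 1_200)]

tally_a = tally(general)
tally_b = tally(general + specific_to_b)

print(f"Alternative A: {tally_a[0]} benefits worth ${tally_a[1]:,}")
print(f"Alternative B: {tally_b[0]} benefits worth ${tally_b[1]:,}")
```

When no alternative has benefits of its own, every tally is identical and the benefits analysis does not discriminate between alternatives.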

Assess Risks

Similar to the benefits analysis, the risks of each proposed alternative are assessed to determine the alternative with the least risk. First, a simple count of the risks for each alternative is done. Then, for alternative-specific risks, the extent to which they are likely to occur is assessed. If there are no alternative-specific risks, the risks are the same for all alternatives. When the risks are not the same, alternatives with fewer, less likely risks are preferred to alternatives whose risks have a high likelihood of occurrence. If a dollar value of exposure is assigned to the risk, it is considered, with lower values of risk preferred to significant potential losses.
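One way to combine likelihood and dollar exposure is to weight each exposure by an assumed probability. The sketch below is illustrative only: the numeric probabilities attached to each likelihood label, the dollar exposures, and the ratings are assumptions, not values from the text.

```python
# Assumed probabilities for each likelihood label (not from the text).
LIKELIHOOD = {
    "N/A": 0.0,
    "Low": 0.1,
    "Low-Medium": 0.25,
    "Medium": 0.5,
    "High": 0.8,
}


def risk_summary(risks):
    """risks: list of (name, likelihood label, dollar exposure or None).

    Returns (number of applicable risks, probability-weighted exposure).
    """
    applicable = [risk for risk in risks if risk[1] != "N/A"]
    weighted_exposure = sum(
        LIKELIHOOD[label] * exposure
        for _, label, exposure in applicable
        if exposure is not None
    )
    return len(applicable), weighted_exposure


# Hypothetical ratings and exposures for two alternatives.
alternative_a = [
    ("Group skill levels inadequate", "Medium", 4_000),
    ("Training not in time to benefit the project", "Low-Medium", 2_000),
    ("Software vendor support inadequate", "Low", None),
]
alternative_b = [
    ("Group skill levels inadequate", "Low", 4_000),
    ("Training not in time to benefit the project", "N/A", 2_000),
    ("Software vendor support inadequate", "Low", None),
]

count_a, exposure_a = risk_summary(alternative_a)
count_b, exposure_b = risk_summary(alternative_b)

print(f"Alternative A: {count_a} risks, weighted exposure ${exposure_a:,.0f}")
print(f"Alternative B: {count_b} risks, weighted exposure ${exposure_b:,.0f}")
```

Under these assumed numbers, alternative B has both fewer applicable risks and a lower weighted exposure, so it would be preferred on risk grounds.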

Propose New Application

Next, the recommended solution(s) are defined in sufficient detail to allow project planning and financial analysis. The development plans include hardware, software, operating environment, development concept, technical feasibility, organization feasibility, benefits, and risks.

The proposal of the new application might document the recommendations formally to begin to develop the feasibility report, or may still be an informal collection of information that supports the remaining analyses. The formality of this gathering of information is decided by the Project Manager and SE, based on their confidence in their decisions. If they are fairly confident that no major changes will take place, they might develop final versions of documentation and begin an informal review of their findings and recommendations with users.

ABC Video Evaluation of Alternatives

The alternatives first are assessed in terms of the technical and organizational feasibility. Then, the benefits and risks of each are assessed. Based on the differences between alternatives, a recommended solution is selected.

Both packages, SQL Server and Focus, appear capable of providing the complete application as envisioned by Vic. The implementation would probably be smoother with Focus given the high skill level of Mary and her staff with the product. SQL Server might have intangible benefits in that, if another store were opened, the software could easily communicate between stores, having been built specifically for distributed processing. This benefit is not immediate, however, and the current technical solution favored is Focus. Focus has a longer history and is thus a more mature product, has a large company backing it, provides all technical requirements for current and future plans, and is cheaper than SQL Server in the example.

From an organization perspective, neither product offers any distinct advantages or disadvantages. The staff at ABC would have to learn whichever product is chosen. Both vendors offer classes in the Atlanta area. The company does not need reorganization to accommodate the application regardless of software chosen. From the perspective of Mary's staff, Focus is preferred since they already have experience using it, but she feels confident that they could also build the application using SQL Server if desired.

The benefits analysis is simple in this case. The benefits do not favor either implementation scenario since they all apply to both. Thus, all benefits are expected to accrue from either implementation.

The risk analysis favors Focus over SQL Server slightly. The main difference in risk exposure is from the lack of usage experience of Mary's group with SQL Server. This lack of knowledge can only be partially removed by training. Experience in using the product is really required to develop knowledge of the 'undocumented features' and unanticipated limitations of the software. In this case, Focus is known to Mary's team and is therefore preferred.

In the example for ABC, both packages could probably be used with success in developing the ABC rental application. Both softwares appear capable of future growth and have apparent company stability. The cost differences favor a Focus solution, while the specific client/server orientation provides an as yet unneeded benefit to SQL Server. Vic decides in favor of the Focus solution, but is clearly unhappy with the overall cost of $50,700. Vic wants to continue with the planning and financial analysis for the application, but is also interested in some way to reduce or defer the development costs of Mary's team services for customized software. In any case, the Focus, LAN solution will be planned and evaluated financially in detail. Before we continue with ABC's problem, we first talk about project planning.
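Vic's unease can be made concrete by setting the overall cost against the 10% rule of thumb mentioned earlier. The sketch below uses the client/server hardware total from Table 6-4 and the Focus total as stated in Table 6-7; treating $500,000 as ABC's net income is an approximation taken from the text's "closer to $500,000".

```python
HARDWARE_COST = 15_700        # client/server total from Table 6-4
FOCUS_SOFTWARE_COST = 35_000  # Focus total as stated in Table 6-7
NET_INCOME = 500_000          # approximate annual net income for ABC
RULE_OF_THUMB_PERCENT = 10    # spend under 10% of net income

total_cost = HARDWARE_COST + FOCUS_SOFTWARE_COST
spending_ceiling = NET_INCOME * RULE_OF_THUMB_PERCENT // 100
overrun = total_cost - spending_ceiling

print(f"Overall cost: ${total_cost:,}")
print(f"Rule-of-thumb ceiling: ${spending_ceiling:,}")
print(f"Amount over ceiling: ${overrun:,}")
```

The overrun is small, which helps explain why Vic chooses to continue while looking for ways to reduce or defer the development costs.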

Plan the Implementation

Estimating Techniques

Users are easy to deal with when they feel you understand their problem, when they think you can improve their situation through automation, you can estimate how long the job will take, and you can estimate their costs. These are not easy items to know or to develop. When users are comfortable that they can afford and use the proposed application within a reasonable amount of time, they become the champions of the project, fighting for its development in the political environment of the business. Research shows that a champion provides a major contribution to application development success. In this section, we discuss the last two important issues to making the user feel comfortable: planning and costing the project.⁴

Accurate estimates are important to

• allow cost-benefit and other financial analyses
• allow hardware/software trade-off analysis
• provide a basis for management evaluation of multiple projects
• act as the basis for schedule, staffing, project management, and structure definition
• avoid problems such as contract renegotiation, overtime, user cost increases, or project cost increases

4 All the methods in this section are based on methods discussed in Barry Boehm's book, Software Engineering Economics. Englewood Cliffs, NJ: Prentice-Hall, 1981.

At the feasibility level, estimates should be accurate to within 20%. This means that the estimates might be overstated or understated by 20%. Planning should be redone at the end of the analysis phase, at which time the estimates should be within 10%. Again, planning at the end of design should refine the estimates to within 5%. The redefinition of costs is one activity that meets with resistance from managers who tend to cast in concrete the first estimate they hear. Part of the Project Manager's role is to educate the managers and users involved to understand that as the degree of uncertainty about project activities decreases, the certainty of time estimates and costs increases. Therefore, the plans should be redone at the end of every major phase of activity.
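As a small illustration of how the accuracy targets narrow from phase to phase, the sketch below turns a single estimate into the band it implies at each planning point. The function name and the 100 person-month figure are hypothetical; only the 20%, 10%, and 5% tolerances come from the text.

```python
# Accuracy targets stated in the text: within 20% at feasibility,
# within 10% at the end of analysis, within 5% at the end of design.
TOLERANCE_BY_PHASE = {"feasibility": 0.20, "analysis": 0.10, "design": 0.05}


def estimate_range(estimate, phase):
    """Return the (low, high) band implied by the tolerance for a phase."""
    tolerance = TOLERANCE_BY_PHASE[phase]
    return estimate * (1 - tolerance), estimate * (1 + tolerance)


# Hypothetical 100 person-month estimate, re-planned at each phase.
for phase in TOLERANCE_BY_PHASE:
    low, high = estimate_range(100.0, phase)
    print(f"{phase:12s} {low:6.1f} to {high:6.1f} person-months")
```

Showing managers the band rather than a single number is one way to make the point that the first estimate is not cast in concrete.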

The planning methods discussed in the next section are ways to generate time estimates for the person-days of project work. These are then converted into costs by allocating an amount of money for each person required. Ultimately, the Project Manager and SE rely on their knowledge of the organization and salaries of individuals. Additional costs are allocated for computer resources, acquisition of hardware, software, or consultants, and other supplies needed to complete the application.

There are many different approaches to planning; they are discussed in the first section below. After that, we take a practical, experience-based approach to developing a critical path plan. The experience-based estimates are then reality checked against two sets of algorithmic planning formulae. The two planning methods used are function points and the CoCoMo model. Both have known flaws. By combining planning methods rather than using only one, you improve the likelihood of more accurate estimates.

Planning methods are usually classified into categories for algorithmic methods, expert judgment, analogy, Parkinson, price-to-win, top-down, bottom-up, or function points. These are defined here, and several methods are discussed in detail because they are the most frequently used. Advantages and disadvantages of each method are summarized in Table 6-10.

ALGORITHMIC METHODS. Algorithmic estimating relies on one or more key formulae to develop an estimate of person-power required for project work. There are five types of algorithmic planning methods. In the sequence in which they were developed and found to be inadequate, they are linear (see Figure 6-10), multiplicative (see Figure 6-11), analytic (see Figure 6-12), tabular (see Figure 6-13), and composite, which combines the others. All but the composite method are rarely used because they offer too simplistic a model of project work. The noncomposite methods do not support adjustment of the model for expertise of staff, tools used to aid development, or other factors that might alter the time and cost of development. All algorithmic methods suffer the same fatal flaw: they rely on some initial estimate that is difficult to guess and on which the accuracy of the entire estimate rests.
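To make the simplest of these models concrete, the sketch below evaluates the linear form, Effort = A0 + A1X1 + ... + AnXn, using the numbers printed in the Figure 6-10 example. The function name is hypothetical, and the pairing of each weight with a cost source assumes the labels in the figure are listed in the same order as the terms.

```python
def linear_effort(constant, terms):
    """Effort = A0 + A1*X1 + ... + An*Xn, with terms given as (weight, value) pairs."""
    return constant + sum(weight * value for weight, value in terms)


# (weight, value) pairs from the Figure 6-10 example.
# The label on each line is an assumed pairing based on label order.
figure_6_10_terms = [
    (9.0, 2),      # high uncertainty of requirements
    (10.7, 2),     # unstable design
    (55.7, 1),     # concurrent hardware development
    (15.0, 1),     # new technology
    (29.55, 1),    # multiple target hardware platforms
    (2.2, 0.6),    # percent I/O
    (0.52, 0.4),   # percent math instructions
]

effort = linear_effort(-3.6, figure_6_10_terms)
print(f"{effort:.2f} person-months")
```

Nothing in this form adjusts for staff expertise or tools, which is the weakness of the noncomposite models noted above.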

There are two key variables in Boehm's Constructive Cost Model (CoCoMo), a composite model: number of delivered source instructions and project mode. Delivered source instructions refers to lines of code used in a production version of an application. Since any sizable project has thousands of instructions, this term is expressed as thousands of delivered source instructions, or KDSI. Delivered instructions are those that actually are in the finished product, which excludes any modules or programs written to support or facilitate the development effort. For instance, in a DBMS application, you frequently write programs to do a formatted print of the file that are not part of the finished application. These modules would be omitted from the estimate. The second important word is source. Source code means uncompiled, unlinked lines of code in whatever language is used. The implication is that some compiled language such as Cobol, Fortran, Pascal, or PL/1 is used. Control language code is omitted from KDSI, while the number of Cobol statements is reduced by a factor of .33 to compensate for the high percentage of nonexecutable code.

The model is based on three critical assumptions. First, it assumes that KDSI can be estimated with some accuracy. Second, it assumes that the waterfall life cycle approach is used. Third, the language of application development (Cobol, PL/1, APL, and so on) is assumed to have no discernible impact on the amount of effort or staffing for a project. The latter two assumptions can be corrected for by the multipliers. The first assumption, that accurate estimates of KDSI are possible, is only true when projects are similar over time, and accurate statistics of past project KDSI are maintained.

TABLE 6-10 Advantages and Disadvantages of Estimating Techniques*

Method: Algorithmic
  Advantages: Objective, repeatable, efficient, analyzable formula; good for sensitivity analysis; objectively calibrated to experience
  Disadvantages: Subjective inputs; does not accommodate exceptional circumstances; assumes history predicts future applications

Method: Expert Judgment
  Advantages: Assessment of representativeness, interactions, and exceptional circumstances can be factored into the judgment
  Disadvantages: No better than participants; biases, incomplete recall; representativeness of experience

Method: Analogy
  Advantages: Based on experience
  Disadvantages: No better than participants; biases, incomplete recall; representativeness of experience

Method: Parkinson
  Advantages: Might relate to experience
  Disadvantages: Reinforces poor practice

Method: Price to Win
  Advantages: Often wins the contract
  Disadvantages: Produces large overruns; unethical misrepresentation of information

Method: Top-Down
  Advantages: System level focus; efficient use of resources
  Disadvantages: Less detailed and stable than other methods; overlooks technical complexity

Method: Bottom-Up
  Advantages: More detailed basis; more stable than top-down; fosters individual commitment when individual estimates own work
  Disadvantages: May overlook system level complexity and costs; requires more effort than most other methods

Method: Function Points
  Advantages: Objective, repeatable, objective inputs; based on history
  Disadvantages: Must be calibrated; focuses on application externals

* Adapted from Boehm, Barry W., Software Engineering Economics. Englewood Cliffs, NJ: Prentice-Hall, 1981, p. 342.

Project mode refers to a combination of size, staff, and technology. The three main project modes are organic, semidetached, and embedded (see Table 6-11). An organic project is developed by in-house staff, is small to medium in size, and uses existing, familiar technology.

A semidetached project is one that is developed by in-house staff and contractors, is intermediate to large in size, and uses technology that is familiar to some of the project team.

An embedded project is one that is developed by contractors, is medium to very large in size, and uses state-of-the-art technology which is new and unfamiliar to all project members.

The five project sizes referenced by CoCoMo are small, intermediate, medium, large, and very large. Each size has an average number of thousands of source instructions to which it relates (see Table 6-12). Tables of the estimates, completed for each of the standard sizes, are provided in Boehm's book. These sizes provide a guide for calibrating nonstandard KDSI estimates.

TABLE 6-11 Three CoCoMo Project Modes

Mode            Developed by                                    Size                  Technology
Organic         In-house                                        Small-medium          Existing, familiar
Semidetached    Partially in-house, partially contractor        Intermediate-large    Existing, familiar
Embedded        Contractor                                      Medium-very large     State-of-the-art, unfamiliar

TABLE 6-12 Five CoCoMo Project Sizes

Size            Thousands of Lines of Source Code
Small           2
Intermediate    8
Medium          32
Large           128
Very Large      512+

From Boehm, Barry W., Software Engineering Economics. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1981, p. 75.

FIGURE 6-10 Linear Estimating Formula and Example

Effort = A0 + A1X1 + ... + AnXn
Where An = Weight
      Xn = Source of cost n (e.g., personnel time)

Example:
Effort = -3.6
         + 9     (2)     High uncertainty of requirements
         + 10.7  (2)     Unstable design
         + 55.7  (1)     Concurrent hardware development
         + 15    (1)     New technology
         + 29.55 (1)     Multiple target hardware platforms
         + 2.2   (.6)    Percent I/O
         + .52   (.4)    Percent math instructions
       = 137.58 person months

FIGURE 6-11 Multiplicative Estimating Formula and Example

Effort = A0 * A1^X1 * ... * An^Xn
Where An = Source of cost n (e.g., personnel time)
      Xn = -1, 0, or 1 depending on presence of cost

Example cost sources: high uncertainty of requirements, unstable design, concurrent hardware development, new technology, multiple target hardware platforms, test code
[Worked example illegible in the scanned source; result shown: 1359 person months]

FIGURE 6-12 Analytic Estimating Formula and Example

Effort = (n1 * N2 * N * log2 n) / (2 * S * n2)

Where: n1 = Number of program operators (e.g., Add)
       n2 = Number of program operands (e.g., data fields)
       n  = n1 + n2
       S  = Approximately 18
       N2 = Total n2 usage, i.e., the number of times the operands are used in instructions
       N  = Total n1 usage + total n2 usage

Example: If n1 = 30
            n2 = 1000
            n  = n1 + n2 = 30 + 1000 = 1030
            S  = Approximately 18
            N2 = Total n2 usage = 2500
            N  = Total n1 usage + total n2 usage = 1000 + 2500 = 3500

then Effort = (n1 * N2 * N * log2 n) / (2 * S * n2)
            = 30 * 2500 * 3500 log2 1030 / (2 * 18 * 1000)
            = 75000 * 4.5 / 36000
            = 9.1 person months

FIGURE 6-13 Tabular Estimating Formula and Example

1. Estimate number of functions by type.
2. Estimate number of LOC for each function.
3. Table lookup of productivity.
4. Sum all time.
5. Distribute according to table formula.

Type                        MM/1000 LOC*
Math                        6 MM
Report                      8 MM
Logic                       12 MM
Signal/Process Control      20 MM
Real-Time Control           40 MM

Example:
5 Math functions                2000 LOC
15 Reports                      8000 LOC
25 Logic functions              5000 LOC
6 Signal control functions      1200 LOC
0 Real-time control             0 LOC

(2*6) + (8*8) + (12*5) + (20*1.2) = 12 + 64 + 60 + 24 = 160 MM

*MM = Person Months; LOC = Lines of Code
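The tabular method of Figure 6-13 is simple enough to check by machine. The sketch below is one way to do so; the dictionary keys and function name are hypothetical, while the productivity rates and the example line counts are those shown in the figure.

```python
# Person-months per 1,000 lines of code, by function type (Figure 6-13).
MM_PER_KLOC = {
    "math": 6,
    "report": 8,
    "logic": 12,
    "signal_control": 20,
    "real_time_control": 40,
}


def tabular_effort(loc_by_type):
    """Look up the rate for each function type and sum the person-months."""
    return sum(MM_PER_KLOC[kind] * loc / 1000 for kind, loc in loc_by_type.items())


# Lines of code per function type from the figure's example.
example_loc = {
    "math": 2000,
    "report": 8000,
    "logic": 5000,
    "signal_control": 1200,
    "real_time_control": 0,
}

print(tabular_effort(example_loc))  # 12 + 64 + 60 + 24 = 160.0
```

As with the other noncomposite models, everything rests on the initial guess at lines of code per function type.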

To use CoCoMo, the mode is defined, KDSI are estimated, and the formula for the matching project mode is computed. Table 6-13 shows the CoCoMo 'basic' formulae for each mode. The appeal of such a simple model is obvious. The model is reusable, objective, and simple to learn and use. The model's major source of uncertainty is in the need for an accurate estimate of KDSI. This difficulty of accurately estimating KDSI should not be minimized.

TABLE 6-13 CoCoMo Basic Formulae

Mode            Effort                    Schedule
Organic         MM = 2.4(KDSI)^1.05       TDEV = 2.5(MM)^0.38
Semidetached    MM = 3.0(KDSI)^1.12       TDEV = 2.5(MM)^0.35
Embedded        MM = 3.6(KDSI)^1.20       TDEV = 2.5(MM)^0.32

MM = Person Months; TDEV = Time of Development

From Boehm, Barry W., Software Engineering Economics. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1981, p. 75.
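The basic formulae can be sketched in a few lines of code. The function and dictionary names below are hypothetical; the coefficients and exponents are those of Table 6-13, and the 32 KDSI input is simply the standard medium project size.

```python
# (effort coefficient, effort exponent, schedule coefficient, schedule exponent)
COCOMO_BASIC = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semidetached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def cocomo_basic(kdsi, mode):
    """Return (person-months, development time in months) for a project mode."""
    a, b, c, d = COCOMO_BASIC[mode]
    person_months = a * kdsi ** b
    development_time = c * person_months ** d
    return person_months, development_time


mm, tdev = cocomo_basic(32, "organic")
print(f"Effort {mm:.1f} person-months, schedule {tdev:.1f} months")
```

Because KDSI is raised to a power greater than one, an error in the size estimate is magnified in the effort estimate, which is why the KDSI guess deserves so much care.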

Next the multipliers are evaluated and used to modify the person-month estimate based on project specific factors (see Table 6-14). Risks, uncertainties, constraints, and staff experience are all evaluated to determine their potential impact on the schedule. The basic person-months estimate is multiplied by each relevant subjective multiplier to adjust for project contingencies.
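A minimal sketch of this adjustment follows, assuming that each relevant multiplier is simply multiplied into the basic estimate as the paragraph above describes. The three multiplier values and the 91.3 person-month base are hypothetical choices, each picked from within the ranges listed in Table 6-14.

```python
import math


def adjust_effort(base_person_months, multipliers):
    """Multiply the basic person-month estimate by each relevant multiplier."""
    return base_person_months * math.prod(multipliers.values())


# Hypothetical ratings for one project, each within its Table 6-14 range.
chosen_multipliers = {
    "software_complexity": 1.15,    # more complex than average
    "analyst_capability": 0.86,     # strong analysts reduce effort
    "use_of_software_tools": 0.91,  # good tool support reduces effort
}

adjusted = adjust_effort(91.3, chosen_multipliers)
print(f"{adjusted:.1f} person-months after adjustment")
```

A multiplier above 1.0 lengthens the estimate and one below 1.0 shortens it, so a factor that does not apply can be left out or entered as 1.0.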

Total months of effort is not very useful for a multiperson project unless there is also some way to know how much elapsed time the project should take and when to phase people onto and off of the project. CoCoMo provides these estimates. The second set of formulae is used to estimate total development time (TDEV), which accounts for multiple people working on the project. (Table 6-13 also shows the algorithms used to compute development effort.) To use these algorithms, you simply plug the person-months value from the first formula into the TDEV formula matching the project mode.

TABLE 6-14 Sample CoCoMo Multipliers

Type         Variance                            Range of Multiplier
Product      Reliability                         .75-1.4
             Data Base Size                      .94-1.16
             Software Complexity                 .70-1.65
Computer     Execution Time                      1.00-1.66
             Memory Constraints                  1.00-1.56
             OS Volatility                       .87-1.3
             Turnaround Time                     .87-1.15
Project      Modern Practice                     .82-1.24
             Use of Software Tools               .83-1.24
             Schedule Constraints                1.10-1.23
Personnel    Analyst Capability                  .71-1.46
             Programmer Capability               .70-1.42
             Application Experience              .82-1.29
             Operating System Experience         .90-1.21
             Programming Language Experience     .95-1.14

Rate each cost driver on a scale of 0 (Not applicable) to 5 (Highly applicable).
Multiply rating times multiplier to obtain final multiplier.
Multiply MM computation by final multiplier.

From Boehm, Barry W., Software Engineering Economics. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1981.

Finally, the CoCoMo model includes a formula to estimate staffing levels over time in the shape of a Rayleigh (pronounced RAY-lee) curve. A Rayleigh curve (Figure 6-14) starts at some point above zero, increases to a high point, and gradually decreases to near zero. The formula for developing the number of people at any time requires an estimate of the time of the highest staffing level for the project (see Figure 6-14). This formula assumes a peak about one-third of the way into the elapsed time (TDEV).
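The shape of the curve is easier to see with numbers. The sketch below assumes the Figure 6-14 formula reads FTEt = MM * ((0.15TDEV + 0.7t) / (0.33(TDEV)^2)) * e^(-(0.15TDEV + 0.7t)^2 / (0.33(TDEV)^2)); the constants are taken from a scan-damaged figure, so treat them as an assumption. The function name and the 91 person-month, 14-month project are hypothetical.

```python
import math


def fte_staff(person_months, tdev, t):
    """Full-time-equivalent staff in period t, under the assumed Figure 6-14 formula."""
    x = 0.15 * tdev + 0.7 * t
    scale = 0.33 * tdev ** 2
    return person_months * (x / scale) * math.exp(-(x ** 2) / scale)


# Hypothetical project: 91 person-months over 14 months.
mm, tdev = 91.0, 14.0
staffing = [fte_staff(mm, tdev, month) for month in range(15)]
peak_month = max(range(len(staffing)), key=staffing.__getitem__)

for month, staff in enumerate(staffing):
    print(f"month {month:2d}: {staff:4.1f} FTE")
print(f"peak staffing in month {peak_month}")
```

With these constants the curve starts above zero and peaks a little more than one-third of the way through the schedule, which agrees with the description above.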

The advantages of any estimating formula are that it is objective and repeatable (see Table 6-10). Further, formulae are easily understood and require little effort to use. The disadvantages are that the formulae all require some initial estimate that is hard to develop and frequently inaccurate. The formula might not fit the project and may be complicated to learn.

EXPERT JUDGMENT. Expert judgment estimating is a technique by which the Project Manager and SE use their experience to guide the development of the time estimates. Each task is defined in terms of the program types likely to result from the task. Then, using their experience, the PM and SE assign times to each program, adding design time and analysis time.

For instance, assume there are 15 report programs. If a batch Cobol report interfacing with a DBMS averages one week to code and unit test, 3-5 days of design, and 2-4 days of analysis, then one week is allocated per program and the 15 reports will average 15 weeks for programming. The other phase estimates are similar. A range of 30-60 days of analysis and of 45-75 days for design is allocated for the 15 reports. Similar estimates are made for batch updates, on-line queries, on-line updates, and so forth.

FIGURE 6-14 Rayleigh Curve of Staffing Estimates*

[Chart: Full Time Equivalent Staff (vertical axis) plotted against # Periods (horizontal axis, 1 to 8)]

FTEt = MM * ((0.15TDEV + 0.7t) / (0.33(TDEV)^2)) * e^(-(0.15TDEV + 0.7t)^2 / (0.33(TDEV)^2))

where MM is man-months
      TDEV is total development time
      t is the period for which the estimate is made
      FTEt is Full Time Equivalent staff in time t

*Adapted from Boehm, 1981.

When all program estimates are complete, the entire group is summarized to develop a project estimate. These are then presented as a range of estimates with the lowest number representing the optimistic schedule, the average number representing the most likely schedule, and the highest number representing a pessimistic schedule.
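The report example can be rolled up into this kind of range as follows. This is only a sketch: the names are hypothetical, one week of coding is assumed to be five working days, and only the 15 report programs are included.

```python
# Per-program ranges in working days (low, high) for a batch Cobol report,
# taken from the example in the text. One week is assumed to be 5 working days.
PER_REPORT_DAYS = {
    "analysis": (2, 4),
    "design": (3, 5),
    "code_and_unit_test": (5, 5),
}


def phase_totals(program_count, per_program_days):
    """Scale each phase's (low, high) range by the number of programs."""
    return {
        phase: (low * program_count, high * program_count)
        for phase, (low, high) in per_program_days.items()
    }


totals = phase_totals(15, PER_REPORT_DAYS)
optimistic = sum(low for low, _ in totals.values())
pessimistic = sum(high for _, high in totals.values())
most_likely = (optimistic + pessimistic) / 2  # the average, as the text describes

print(totals)
print(f"optimistic {optimistic}, most likely {most_likely}, pessimistic {pessimistic} days")
```

The same roll-up would be repeated for batch updates, on-line queries, and the other program types before the project range is presented.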

Costs are similarly assigned. Each program type is used to define the skill level of the desired programmer. For instance, a junior programmer might be assigned to batch reports, a senior programmer assigned to on-line processing, and a mid-level programmer to on-line reports. The times for each program type and programmer type are summed and multiplied by the cost of that level person. Similarly, the level of analyst or programmer-analyst needed for analysis and design of the tasks is estimated.

Finally, all costs are summed to develop a total cost for the project.

The advantages of expert judgment are the ability to factor experience into estimates, to tailor estimates to assigned personnel, and to develop estimates quickly and efficiently (see Table 6-10). The disadvantages are that the estimates are no better than the expertise of the PM and SE, may be biased, are hard to rationalize, and are not objectively repeatable. That is, the experience cannot be taught to others, so two PM/SE teams estimating the same project will develop different estimates for the same problem. Finally, expert judgment is not useful in novel situations using new technology, methodology, or languages.

ANALOGY. Analogical estimating is similar to applying experience. In estimating by analogy, a recently completed similar project is selected to act as a prototype baseline for developing cost estimates for a current proposed project. Costs are determined based on the match or mismatch of tasks and programs to the baseline. In other words, if a task is essentially the same, then the actual time of the task in the baseline project is used to estimate the actual time of the task for the proposed project. Analogy is applied to time, staff skill levels, and, eventually, resource, hardware, software, and other costs.

The advantage of analogy is that it is based on an actual, recent experience which can be studied for specific differences, and only those differences require new cost estimates (see Table 6-10). The disadvantages of analogy are that the analogous project may not be representative of the proposed project, constraints, techniques, or functions. Some of the disadvantages can be reduced by matching project functions. This technique might work in large companies with many similar projects, but is not particularly useful in small companies, unique projects, or projects using new technology, methodology, or languages.

PARKINSON'S LAW. Parkinson's Law⁵ states that "Work expands to fill the available time." Based on this law, any time can be allocated and that is the time the project will take (see Table 6-10). For instance, if there are 6 people available for 6 months, then the project will take 36 person-months. This is a cynical view of estimating that reinforces poor development practices by random assignment of time and people.

There are obvious flaws to Parkinson's Law. The estimates this method generates are likely to be grossly inaccurate (see Table 6-10). If people are allocated because they are available and not because they are needed, their skills are likely to be wasted and the project is more likely to be late. This method is not recommended.

PRICE-TO-WIN. Price-to-win is a consultant strategy that uses a low estimate to obtain a job, with the implication that the time and cost will later be renegotiated. Like Parkinson's Law, this strategy is not recommended. Price-to-win leads to forced user compromise on application requirements to try to meet a cost/time estimate, gives the consulting company bad public relations, always requires staff overtime, and almost always results in overruns of both time and money.

5 Parkinson's Law was first published in 1957.

You might ask, Why would anyone ever use a price-to-win strategy? Unfortunately, historical estimates by IS personnel are not very accurate unless combinations of modern techniques such as CoCoMo and function points are used and few problems occur on the project. Following this logic, people who use a price-to-win strategy usually believe any estimate is good as long as they get the job, since there is little relationship between real and estimated costs anyway. Frequently, in government projects especially, the lowest bid wins the job. This logic of choosing the lowest bid leads to price-to-win estimates. This has led to problems for several government entities.

TOP-DOWN. Used with one or more of the other estimating techniques, top-down estimates use project properties to derive an estimate. Then total cost is split among the components. After a time estimate is derived, the 40-20-40 rule is applied to the estimate. According to the rule, 40% of project time is spent on analysis and design, 20% is spent on coding and unit testing, and 40% is spent on project testing.
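The 40-20-40 rule is a simple proportional split, sketched below. The phase names follow the paragraph above; the function name and the 50 person-month total are hypothetical.

```python
# Shares of total project time under the 40-20-40 rule described in the text.
SPLIT_40_20_40 = {
    "analysis_and_design": 0.40,
    "coding_and_unit_testing": 0.20,
    "project_testing": 0.40,
}


def split_effort(total_person_months):
    """Allocate a top-down total across the three phase groups."""
    return {phase: total_person_months * share for phase, share in SPLIT_40_20_40.items()}


allocation = split_effort(50)  # hypothetical 50 person-month project
for phase, person_months in allocation.items():
    print(f"{phase:25s} {person_months:5.1f} person-months")
```

The split says nothing about which components are hard, which is the weakness of top-down estimating discussed next.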

The advantage of using a top-down approach is that, by focusing on global properties of the application, an estimate can be developed quickly, in a day or two. Using analogy to assess global properties, the proposed project is assumed similar to some other whole project. For instance, ABC's application is an on-line database application with create, change, delete, and query capabilities for all data, and an overall query facility for grouped data; system functions include start-up, shutdown, and monthly file maintenance processing.

The major disadvantage of a top-down approach is that the above description fits most on-line database applications (see Table 6-10). Such a high level focus cannot identify low level technical problems that drive up costs. For instance, in a complex database application, one particular data access need might require a month of design and prototyping time to prove that the concept works. This type of special process would be missed in a top-down estimate. Whole software components might be missed in the global assessment that, when developed, account for a disproportionate amount of time and cost. On balance, top-down estimates are less stable than more specific estimates.

BOTTOM-UP. The bottom-up approach takes the opposite view of an application from the top-down approach. Using a bottom-up estimating approach, each software component is identified and estimated, often by the person who would do the development. All individual component costs are summed to arrive at the estimated cost of the entire software product.

The bottom-up approach is as likely to miss components for development as the top-down approach (see Table 6-10). At the low level, integration work to combine modules and programs may not be estimated or is easily underestimated. Also, the bottom-up approach requires significantly more effort to develop because every module, program, screen, database interaction, and so on must be identified for estimating.

The advantages of the bottom-up approach are that the estimates are based on a more detailed understanding of the project than the other methods, and, when estimated by the person doing the work, the estimates are backed by a professional's commitment.

FUNCTION POINTS. The function point method takes an organizational history approach to estimating. Function points are a measure of complexity based on global application characteristics. A baseline is developed for each type of application by analyzing all previous applications. The actual cost/time of development is divided by the baseline number of function points to get an estimate for one function point per application type (or language, or person-month). New applications are analyzed to determine an estimate of the number of function points in the project. Then, the base time and cost estimates for one function point are multiplied by the number of estimated function points for the proposed application to develop a total time and cost estimate.

Function point analysis rests on the ability of the project team to predict the inputs, outputs, queries, interfaces, and files. Figure 6-15 shows the counts and weights assigned for each type of I/O. Each item is counted and weighted for complexity. The weighted counts are summed.

Application Item                       Count    Simple    Average    Complex    FP = Count * Weight
# Inputs (i.e., Trans Types)                    3         4          6
# Outputs (i.e., Reports, Screens)              4         5          7
# Programmed Inquiries                          3         4          6
# Files / Relations                             7         10         15
# Application Interfaces                        5         7          10

From Pressman, Roger S., Software Engineering: A Practitioner's Approach, third edition. NY: McGraw-Hill, 1992, p. 49.

FIGURE 6-15 Function Point Weighted Count Table

Then a series of 14 questions to determine different types of application complexity are evaluated on a scale of zero to five to measure increasing importance of the item to the application (see Table 6-15). The answers to the 14 questions are also summed. The summed complexity weights and weighted counts are combined in one formula shown below to compute the total function points for a project.

TABLE 6- 15 Function Point Questions and Rating Scale *

Rating Scale from 0 (No influence) to 5 (Essential)

Factor Questions:

1. Is reliable backup and recovery required?

2. Are data communications required?

3. Are any functions distributed?

4. Is performance critical?

5. Is operational environment volume high?

6. Is on-line data entry required?

7. Does on-line data entry require multiple screens or operations?

8. Is on-line files update used?

9. Are queries, screens, reports, or files complex?

10. Is processing complex?

11. Is code designed for reuse?

12. Does implementation include conversion and installation?

13. Are multiple installations and/or multiple organizations involved?

14. Does application design facilitate user changes? How integral is ease of use?

*From Pressman, Roger S., Software Engineering: A Practitioner's Approach, third edition. NY: McGraw-Hill, 1992, p. 50.

FP = Total weighted count * (0.65 + (0.01 * Σ(complexity adjustments)))
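A worked sketch of the calculation follows. It uses the weights of Figure 6-15 and an adjustment factor of 0.65 + 0.01 times the sum of the 14 ratings, which is the form given in Pressman's text; the item counts, the ratings, and all names in the code are hypothetical.

```python
# Weights per item type and complexity level (Figure 6-15).
WEIGHTS = {
    "inputs": {"simple": 3, "average": 4, "complex": 6},
    "outputs": {"simple": 4, "average": 5, "complex": 7},
    "inquiries": {"simple": 3, "average": 4, "complex": 6},
    "files": {"simple": 7, "average": 10, "complex": 15},
    "interfaces": {"simple": 5, "average": 7, "complex": 10},
}


def function_points(counts, complexity_ratings):
    """counts maps (item type, complexity level) to a count; ratings are the 14 answers."""
    if len(complexity_ratings) != 14 or any(not 0 <= r <= 5 for r in complexity_ratings):
        raise ValueError("expected 14 ratings, each from 0 to 5")
    weighted_count = sum(WEIGHTS[item][level] * n for (item, level), n in counts.items())
    return weighted_count * (0.65 + 0.01 * sum(complexity_ratings))


# Hypothetical application.
counts = {
    ("inputs", "average"): 10,
    ("outputs", "average"): 8,
    ("inquiries", "simple"): 5,
    ("files", "average"): 4,
    ("interfaces", "simple"): 2,
}
ratings = [3] * 14  # every question rated 3 for illustration

fp = function_points(counts, ratings)
print(f"{fp:.2f} function points")
```

Here the weighted count is 145 and the adjustment factor is 1.07. Since the ratings can sum to anything from 0 to 70, the factor ranges from 0.65 to 1.35.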

Function points have become popular enough that several companies and software packages are available for developing function point estimates. In addition, tables of function points per number of lines of code are also available. For instance, one function point is equal to 100 lines of Cobol or to 20 lines of Focus. Translating function points into lines of code, then, requires a simple table lookup.

The appeal of function points is similar to that of CoCoMo. Any algorithmic method is likely to be easy to use, understand, and repeat (see Table 6-10). An algorithm gives the appearance of objectivity that other methods do not. Of course, the function point estimate has flaws similar to those of CoCoMo, too. Function points must be calibrated for the organization based on its history of project development. It assumes that history predicts the future. Further, it assumes similar technology and skills across projects. The model assumes that methodology and CASE have no impact on project development time.

To summarize, there are several useful methods of project person-month or lines-of-code estimating. The most popular are expert judgment, analogy, CoCoMo, function points, top-down, and bottom-up. All of these methods have advantages and disadvantages. If a history of projects and function points is kept, function point analysis appears to be the most accurate estimating technique at the moment. If function points are not calibrated to the company's history, no one estimating technique is better than any other. Rather, the methods might be paired or used several at a time to develop estimates that are closer to reality than estimates developed using any one method alone.

Planning Guidelines

In the absence of calibrated function points for ABC, we will discuss the use of several methods in developing a plan for an application. By combining the methods, the schedule and plan developed should be better than those developed using any one method on its own.

Several variations for combining estimating techniques are feasible. The first uses function points as input to CoCoMo:

1. Estimate inputs, outputs, interfaces, queries, and files according to function point directions.
2. Answer 14 questions and estimate project complexity.
3. Compute function points.
4. Look up lines of code per function point (FP) in language table and compute total lines of code (LOC) for the project.
5. Decide the CoCoMo mode.
6. Using FP LOC as input to the CoCoMo model, compute person months of effort.
7. Analyze multipliers and adjust the estimate.
8. Compute total development time and project staffing estimates using the other CoCoMo formula.
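Steps 3 through 8 can be chained together as in the sketch below. It assumes the function point total is already computed, uses the lines-of-code figures quoted earlier for Cobol and Focus and the basic formulae of Table 6-13, and applies any multipliers before the schedule is computed. The function name and the 320 function point example are hypothetical.

```python
# Lines of code per function point, as quoted in the text.
LOC_PER_FP = {"cobol": 100, "focus": 20}

# CoCoMo basic coefficients from Table 6-13:
# (effort coefficient, effort exponent, schedule coefficient, schedule exponent)
COCOMO_BASIC = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semidetached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def estimate_from_function_points(fp, language, mode, multipliers=()):
    """Convert function points to KDSI, then to person-months and schedule months."""
    kdsi = fp * LOC_PER_FP[language] / 1000          # step 4
    a, b, c, d = COCOMO_BASIC[mode]                  # step 5
    person_months = a * kdsi ** b                    # step 6
    for multiplier in multipliers:                   # step 7
        person_months *= multiplier
    development_time = c * person_months ** d        # step 8
    return kdsi, person_months, development_time


kdsi, mm, tdev = estimate_from_function_points(320, "cobol", "organic")
print(f"{kdsi:.0f} KDSI, {mm:.1f} person-months, {tdev:.1f} months")
```

Running the same function point total through the Focus entry gives a much smaller KDSI, which shows how sensitive the result is to the language table.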


If the company uses function point analysis for its baseline, function point planning is the first type performed. Then, the plan can be compared to the CoCoMo model estimates to verify its goodness of fit. Alternatively, the project manager can develop a top-down plan while the SE and any other project staff working on the feasibility develop a bottom-up plan by using the following steps:

1. PM and SE together estimate the development approach and all functions in the application.
2. PM uses top-down analysis to develop a list of activities to be performed and the times for each.
3. From this list, deliverable products and a schedule are developed.
4. The list is analyzed to determine task dependencies, and a first-cut critical path chart is developed.
5. Concurrently with steps two to four, the SE analyzes each function bottom-up to determine the complexity, possible problems, nondeliverable programs, and amount of effort to be assigned to each technical task.
6. Any new tasks identified by either the PM or SE are added to the plan and estimated. The SE and PM compare and adjust their time estimates until they agree.

Another alternative is to combine expert judgment, analogy, top-down, and bottom-up to develop a first set of estimates. Then, these estimates are compared to the standard function point estimate for a reality check. If the expert estimate is more than 15% lower than the function point estimate, then the plan should probably be revised upward. In this section, we use expert judgment and analogy, using a top-down approach to develop the estimate, then do a bottom-up analysis of each piece to ensure they are all present.

The steps to developing a plan are:

1. Decide the Development Life Cycle (DLC), approach, and methodology.
2. For each phase, list the deliverable products that mark completion of the phase.
3. Decide on information gathering technique(s) and use of JAD, prototyping, or other variants to DLC.
4. Decide which products the technical project team members will develop and which the users will develop.
5. Define dependencies and develop CPM chart.
6. Assign times to tasks and compute total project time.
7. Estimate inputs, outputs, interfaces, queries, and files according to function point directions.
8. Answer 14 questions and estimate project complexity.
9. Compute function points.
10. Look up lines of code per function point (FP) in language table and compute total lines of code (LOC) for the project.
11. Estimate productivity in LOC/month.
12. Compare FP number of person months to the estimated total time.
13. Adjust time estimates, as required, and complete the CPM diagram by adding times.

For instance, assume the waterfall is followed and the phases include Feasibility, Analysis, Design, Program Design, Code/Unit Test, System Testing, Acceptance Testing, and Installation. Then, list deliverable products. Phases might have more than one deliverable product. Products usually coincide with the ending of life cycle phases. Products for these phases include a feasibility report, functional requirements specification, design specification, program specifications, plans for testing, conversion, training, and implementation, operational documentation, and user documentation.

From the choices in Chapter 4, decide the approach to information gathering. If you use JAD, for instance, the amount of time allocated to analysis is less than if you use interviews over time. Decide the overall system design approach. Is prototyping needed? How involved will users be in the development process? How extensive will user training be? Will CASE be used? Which tool? (Some tools add analysis and design time, some reduce it.) How extensive are documents expected to be? Is on-line help software going to replace user manuals? Who is responsible for planning and executing the conversion? How much data scrubbing to remove errors from existing data is required? The answers to these questions increase or decrease the time allocated for each task.

Next decide which products the technical project team members will develop and which the users will develop. These tasks are estimated just as the technical team tasks are estimated, but they are also singled out for several reasons. First, the dependencies should clearly show the split of assignments for the technical team and users. Second, users should be allowed to comment on tasks for which they are responsible. The technical team usually takes responsibility for the tasks if the users will not take it.

Develop a list of tasks and define dependencies, developing a critical path chart for the project. Assign times to tasks. Compute function points. Using an estimate of LOC per month per person on the project, compute a total project time, and compare the FP estimate to your estimate. Adjust your estimate as required if it is more than 15% less than the FP estimate. In general, always use a higher estimate rather than a lower one. Project schedules have a way of losing time for meetings, nonproject responsibilities, and other legitimate, but nonproductive uses of time.

Now, let's go through each step of using combined techniques for estimating. To develop a critical path diagram, list the tasks on a sheet of paper. Begin with high level tasks, or tasks of a single phase, adding lower level tasks as they come to mind. Development of the task list requires some experience and is always done more easily by several people rather than one, who is likely to forget some critical task. The task list, in critical path method terms, is called a work breakdown.

Define durations for each task. Durations may be an absolute number or a range of time. The critical path method recommends the identification of optimistic, likely, and pessimistic estimates. Then, the weighted formula ((Optimistic + 4(Likely) + Pessimistic) / 6) is applied to develop one number for use in financial analysis and software planning tools. Use either method for developing the time. Planning software packages allow early, most likely, and latest possible dates to be entered. For some software you enter the project completion date and the software computes the early and late dates for tasks based on their durations.
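The weighted formula can be sketched in a few lines of Python. This is only an illustration; the function name `weighted_duration` is an assumption, and the sample values are the optimistic, likely, and pessimistic figures of two tasks from Table 6-16.

```python
def weighted_duration(optimistic, likely, pessimistic):
    """Weighted estimate (O + 4L + P) / 6, as given in the text."""
    return (optimistic + 4 * likely + pessimistic) / 6


# Two sample tasks (hours): a narrow range and a wide, skewed range.
print(weighted_duration(2, 3, 4))    # symmetric range: equals the likely value
print(weighted_duration(8, 14, 28))  # long pessimistic tail pulls the estimate up
```

A wide pessimistic tail raises the single number above the likely value, which is the reason for weighting rather than simply using the likely estimate.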

Extend the times to develop dates at which each task is expected. A work breakdown shows the earliest start and end dates for each task, plus the latest start and end dates per task. The early dates assume that each preceding task took the minimum estimated number of days. The latest start and end dates assume that each preceding task took the maximum estimated number of days.

Next, create the CPM chart (see Figure 6-16). List all tasks on a piece of paper. Draw lines from later tasks to early tasks on which they are dependent. By dependent tasks, we mean those tasks that cannot be begun until information (or products or approvals) from the previous task is complete. The early task feeds the later one.

When the diagram is complete, compute the time to complete each leg of the diagram. The leg with the longest time is the critical path, that is, the tasks on which meeting the deadline for the project depends. If any one of the critical path tasks is late, the project will be late. When monitoring the project, the critical tasks get priority. When assigning staff to tasks, the critical tasks should be assigned the most experienced and skilled personnel.
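Finding the longest leg can be sketched as a longest-path search over the task dependencies. The task names and durations below are hypothetical, chosen only for illustration, and the sketch assumes the dependencies contain no cycles.

```python
from functools import lru_cache

# Hypothetical tasks: name -> (duration in days, prerequisite tasks).
TASKS = {
    "interviews": (3, []),
    "develop_dfds": (5, ["interviews"]),
    "define_data": (4, ["interviews"]),
    "review": (2, ["develop_dfds", "define_data"]),
}


def critical_path(tasks):
    """Return (total_days, path) for the longest chain of dependent tasks."""

    @lru_cache(maxsize=None)
    def longest_to(name):
        # Longest leg that ends with `name`, including its own duration.
        duration, prereqs = tasks[name]
        best_len, best_path = 0, ()
        for prereq in prereqs:
            length, path = longest_to(prereq)
            if length > best_len:
                best_len, best_path = length, path
        return best_len + duration, best_path + (name,)

    return max((longest_to(task) for task in tasks), key=lambda leg: leg[0])


total_days, path = critical_path(TASKS)
print(total_days, list(path))
```

In this made-up network the leg through `develop_dfds` is longer than the leg through `define_data`, so a delay in `define_data` of up to one day would not delay the project, while any delay on the critical leg would.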

Some sensitivity analysis on critical path and on task dependencies might be done, if using an automated tool for the analysis. Manual analysis is so time-consuming that it may not be worth the effort. The impact of different end dates is analyzed. For instance, if the user were to mandate a date two months earlier than the estimated end date, what is the impact on the project and tasks? Does the critical path change? Can other tasks, not fully analyzed, be made more parallel? Can any dependencies be removed by altering the plan or tasks? If the project suffers penalties (loss of revenue) from not meeting deadlines, the risks for each task might be reassessed to ensure that nothing is missed. The project manager continues this type of analysis until he or she is comfortable with the result.

After the critical path is identified, staff should be assigned to each task to complete project planning.

FIGURE 6-16 Sample Critical Path Method Chart

Assign people to minimize the amount of slack time for which they have no assignments, but allow some slack time in case problems arise. Assign the critical tasks first, allocating them to the best, most experienced people. A general rule of thumb is that, in the absence of artificially short deadlines, people can be assigned to develop a whole leg of the critical path. The purpose of assigning sequential tasks in a leg is to leverage the knowledge gained from early tasks to later tasks, and to provide each individual a sense of contribution to the overall project by allowing them to take responsibility for a large chunk of work.

When the estimates are complete, develop a function point estimate, or have someone else do it in parallel. Weight the FP estimate by the answers to the 14 questions. Look up the lines of code (LOC) per function point (FP) in a table (see Figure 6-17).6 Estimate your productivity in LOC per month; for instance, 1000 LOC/Month for a 4GL is not uncommon. If your company keeps statistics, use its historical numbers for project type and language. Compute total person-months for the project using the formulae in Figure 6-18. Compare the FP estimate to your estimate and adjust as needed. Don't just blindly take the higher number. Rather, a difference means that information was interpreted differently by the two methods of estimating. See if you can find what is different and which estimate is more realistic.

Language    Lines of Code/FP
4GL         25
SQL         25
Cobol       100

FIGURE 6-17 Example of LOC/FP for Different Languages

6 Refer to Capers Jones' 1986 book, Programming Productivity, for extensive tables with this information.

Number of Lines of Code per Function Point * Number of Function Points = Total Lines of Code

Example: 25 LOC/FP (4GL) * 100 FP = 2500 LOC

Total Lines of Code / Lines of Code per Month = Number of Person Months

Example: 2500 LOC / 1000 LOC per Month = 2.5 Person Months

FIGURE 6-18 Function Point Computations for Total Person Months
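The two formulae of Figure 6-18 can be sketched as a small Python function. The function and parameter names are assumptions made for illustration; the numbers are the figure's own example (100 function points, 25 LOC/FP for a 4GL, 1000 LOC per month).

```python
def person_months(function_points, loc_per_fp, loc_per_month):
    """Apply the two Figure 6-18 formulae in sequence."""
    total_loc = function_points * loc_per_fp  # LOC/FP * FP = total LOC
    months = total_loc / loc_per_month        # total LOC / LOC per month
    return total_loc, months


total_loc, months = person_months(function_points=100, loc_per_fp=25, loc_per_month=1000)
print(total_loc, months)
```

Because the result scales linearly with both inputs, an error in either the LOC/FP figure or the productivity figure carries straight through to the person-month estimate, which is why company statistics are preferred when they exist.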

Use the 40-20-40 rule to check whether the effort looks reasonable across the phases. Analysis/design should be about 40% of effort if manual and 55% if using CASE. Code/unit test is about 20% of effort if manual and 5% if a CASE tool generates code. System testing should be 30-40%. Testing estimates are usually low. If testing is the difference, ask if there is some reason to be optimistic, for instance, a skilled programmer. If the difference cannot be found, and the percentages are allocated about right, then changing your estimate is a judgment call.

For manual allocation of staff to a project, a list of tasks in CPM legs should be created and a person's name assigned to each task. This allows easy tracking of assignments and dates at which people rotate on and off the project. If using an automated tool, allocation of staff usually requires entry of the person's name and assignment of tasks by CPM ID. In either case, as people are assigned to tasks, note who they are and when they begin (and end) project work. Make sure you do not change the critical path by the assignment of personnel to overlapping or conflicting duration tasks.

Upon completion of task assignments, a Gantt Chart is developed to summarize the project. A Gantt Chart shows the entire set of project tasks, people assigned, and completion times estimated for the development effort (see Figure 6-19). A list of people and amount of time assigned to the project is created for use in the costing activity.

[Gantt chart: scheduled tasks (Interviews with J. Smith, C. Jones, and M. Mayhew; Develop DFDs; Define Data and ERD; Review and Revise DFD and ERD; Begin Data Dict.; Define Problems w/Current System; Define Business Opportunities) plotted against days 1 to 19. Legend: B = Barbara James, SE; S = Stan Smits, PM/SE. Where initials alternate, both Barbara and Stan participate in the activity.]

FIGURE 6-19 Sample Gantt Chart

ABC Video Implementation Plan

ABC's rental application is a fairly average project with no obvious complexities, no state-of-the-art technologies, and a single, small organization. Mary, the PM, and Sam, the SE, decide to use a combination of analogy, top-down, and bottom-up and to check their estimate with function points based on the estimate of 25 LOC/FP for a 4GL. Before Mary and Sam begin, they first decide their approach and assumptions on which the estimates are based.

The project is expected to be implemented on a Novell ethernet LAN using PCs as workstations and a superserver (50 MHz, 486-based machine). The software environment will be some SQL language with custom application software. There will be four main files, corresponding to the four main entities in the ERD. The main processing centers around rental activity with standard maintenance procedures for the other files. Other files, which will be maintained during rental processing, include history and an end-of-day summary of transactions. The application will accommodate up to ten concurrent users for all processing.

If two people are estimating, as Sam and Mary are, a good approach is to split the two types of estimates between the individuals. Sam would do one and Mary the other. Then they compare and rationalize their work.

First, we develop a function point estimate for the work. The function point estimate (see Figure 6-20) shows that the project is not very complex in any of the key inputs or outputs. The weighting questions identify the on-line, interactive, and multiuser characteristics as contributing the greatest complexity to the application. The total function points are estimated at 151. Carrying the FP analysis through, at 25 LOC per function point, there are about 3775 LOC (i.e., 25 * 151) for the project. At a productivity rate of 2000 per month, the total number of person months for the project is about 1.9 months (i.e., 3775/2000). The estimate of 2,000 LOC/month is a company statistic based on the average productivity of each of the project participants.

Application Item Count                FP = Count * Weight
# Inputs (i.e., Trans Types)          20
# Outputs (i.e., Reports, Screens)    30
# Programmed Inquiries                18
# Files / Relations                   80
# Application Interfaces              0
Total                                 148

Factor Questions (scored for complexity adjustment; total score 37)
Reliable backup and recovery
Data communications
Distributed functions
Critical performance
High volume operations environment
On-line data entry
Multiple data entry screens or operations
On-line file update
Complex queries, screens, reports, or files
Complex processing
Reusable code design
Conversion and installation
Multiple installations and/or multiple organizations
User change; ease of use

FP = Total Weighted Count * (.65 + (.01 * Σ(Complexity Adjustments)))
   = 148 * (.65 + (.01 * 37))
   = 148 * (.65 + .37)
   = 151 Function Points

FIGURE 6-20 ABC Function Point Estimate
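The ABC arithmetic can be checked with a short Python sketch. The function name is an assumption; the inputs (a weighted count of 148, complexity adjustments totaling 37, 25 LOC/FP, and 2,000 LOC per month) are the values stated in Figure 6-20 and the text.

```python
def adjusted_function_points(weighted_count, complexity_total):
    """FP = Total Weighted Count * (.65 + (.01 * sum of complexity adjustments))."""
    return weighted_count * (0.65 + 0.01 * complexity_total)


fp = round(adjusted_function_points(148, 37))  # 148 * 1.02, rounded to whole FPs
loc = fp * 25                                  # 25 LOC per FP for a 4GL
months = loc / 2000                            # company productivity statistic
print(fp, loc, round(months, 1))
```

With 14 questions scored from 0 to 5, the adjustment multiplier can range from .65 to 1.35, so the weighting questions can move the estimate by up to 35% in either direction.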

Mary, in parallel, creates a task list which she converts into a work breakdown. The work breakdown identifies the tasks to complete the project, and the optimistic, likely, and pessimistic times for each task (see Table 6-16). The most likely time for each task is then computed and a total time for the project is estimated.

At this point, the two sets of estimates should be compared. The FP estimate suggests 1.9 person-months, while the work breakdown estimate of 172 hours translates into slightly under one month (25 days). The FP estimate is almost twice as high. Let's see where the differences might lie. At the end of Table 6-16, the total times for each phase are shown with percentages of the total computed for each number. The percentages do not follow the 40-20-40 rule closely. The realistic estimate shows 46% of time for analysis and design, 32% for coding and unit testing, and 22% for system testing. The estimate for system testing is low relative to the rule while the other estimates are somewhat inflated. Mary knows she and Sam are the only two people who are expected to work on the project and she based her estimates on their ability to debug and test quickly. But even she cannot defend this low number to Sam. Sam also points out that, if Vic wants much documentation, her estimates for all the tasks might be low. Mary has assumed that Vic, being a small company owner, will opt for less documentation to save on the expense.

On the other hand, Sam identified several complexities with which Mary takes issue, in particular with the difficulty of on-line update and the difficulty of interactive programming. Both of these were given a '5' rating of complexity. Mary feels that if the application were on a mainframe and using mainframe software and tools, the fives would be justified. Since the application platform is a LAN with which they have extensive experience, she feels that the highest rating should be a four. This would then reduce the FP estimate. Both Mary and Sam discuss their estimates, defending their reasoning processes and subjecting them to criticism by their partner. In the end, they confirm with Vic that he does want only minimal documentation, and they decide to split the difference on their estimates, adding a total of 90 hours to the project. Of that time, 18 hours (20%) is allocated to code/unit test and the remaining 72 hours (80%) to testing of the project. The final estimates would then show code/unit test time of 73 hours (28% of total) and testing time of 110 hours (42% of total). While these percentages are now slightly skewed away from analysis and design, which is now 30% of the total, these percentages are in line with the 4GL need to do less analysis and design. The total estimated project time used in the financial estimates will be 262 hours or 1.5 person-months.
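The reallocation can be verified with a few lines of Python. The dictionary keys are illustrative names; the hours are the weighted phase totals from Table 6-16 (34, 45, 55, and 38) plus the 18 and 72 added hours described above.

```python
# Weighted (O+4L+P)/6 phase totals from Table 6-16, in hours.
weighted_hours = {"analysis": 34, "design": 45, "code_unit_test": 55, "testing": 38}

# The 90 added hours: 18 to code/unit test, 72 to testing.
added_hours = {"code_unit_test": 18, "testing": 72}

final_hours = {
    phase: hours + added_hours.get(phase, 0)
    for phase, hours in weighted_hours.items()
}
total = sum(final_hours.values())

# Share of the total per phase, as whole percentages.
shares = {phase: round(100 * hours / total) for phase, hours in final_hours.items()}
analysis_design_share = round(
    100 * (final_hours["analysis"] + final_hours["design"]) / total
)

print(total, shares, analysis_design_share)
```

Running the check reproduces the 262-hour total and the 28%, 42%, and 30% shares quoted in the text.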

The final work breakdown is converted into a CPM diagram to identify the critical path of work (see Figure 6-21 for the Analysis CPM). Based on the critical path, contingencies are planned to ensure meeting of the schedule. Figure 6-22 is a Gantt chart for analysis showing how Mary and Sam split their responsibilities.

If project planning software is used, the CPM is built first; selecting an option then converts the CPM into the work breakdown. To create either diagram, the tasks and durations must be known. Sophisticated software supports the insertion of a start date for the project and, based on the optimistic and pessimistic task durations and on the dependencies from the CPM, computes all the dates for the project.

Evaluate Financial Feasibility

Financial Feasibility Analysis

Financial feasibility analysis evaluates the firm's ability to pay for a project, and compares recommended alternatives to determine which is more economically attractive. In general, projects are economically feasible when the sum of all IS projects plus the proposed project is less than 10% of firm net income. This uses industry averages as the guideline. To compare alternatives, several methods discussed in this section are used.

Cost-benefit analysis is the comparison of the financial gains and payments that would result from selection of some alternative. The analysis facilitates comparison of alternatives for one project or alternative projects.

TABLE 6-16 ABC Work Breakdown with Durations

Task: Analysis                                Optimistic  Likely  Pessimistic  (O+4L+P)/6
Define Customer Maintenance Processing        2           3       4            3
Define Video Maintenance Processing           2           3       4            3
Define Rental Process                         1           2       3            2
Define Return Process                         1           2       3            2
Define How Intertwined                        2           3       4            3
Define History                                1           2       3            2
Define EODay, Audit, Trans Log                2           3       4            3
Define Cust Create, Video Create in Rental    1           2       3            2
Define Error Msgs, Abort Procedures           1           2       3            2
Define Screen Contents                        2           4       6            4
Define Flow of Processing                     1           2       3            2
Define Start-up/Shutdown                      1           2       3            2
Define File Purge                             .5          1       1.5          1
Define Backup/Recovery                        .5          1       1.5          1
Define Conversion/Training                    1           2       3            2
Analysis Total Time                           19          34      49           34

Task: Design                                  Optimistic  Likely  Pessimistic  (O+4L+P)/6
Cust Maint Process                            2           3       4            3
Video Maint Process                           2           3       4            3
Rent/Return                                   7           11      21           12
Screens                                       10          14      16           15
Start-up/Shutdown                             4           6       12           6
Backup/Recovery                               1           1       1            1
Conversion, Training                          2           5       8            5
Design Total Time                             28          43      66           45

Rent/Return includes: Display, Data entry, Retrieval, Payment, Accounting, File Update, History, EOD, Audit, Controls.

Criteria used in alternative comparisons might be maximizing benefits, ratio of benefits to costs, net benefits, minimizing costs for given level of benefit, or maximizing project internal rate of return. The most popular criterion is maximizing net benefits, which requires analysis of the present value of benefits and costs.

TABLE 6-16 ABC Work Breakdown with Durations (Continued)

Task: Code/Unit Test                          Optimistic  Likely  Pessimistic  (O+4L+P)/6
Cust Maint Process                            2           4       6            4
Video Maint Process                           2           4       6            4
Rent/Return                                   8           14      28           15
Screens                                       5           10      15           10
Start-up/Shutdown                             8           10      12           10
Backup/Recovery                               1           2       3            2
Conversion, Training                          5           10      15           10
Code/Unit Test Total Time                     31          54      85           55

Rent/Return includes: Display, Data entry, Retrieval, Payment, Accounting, File Update, History, EOD, Audit, Controls.

Task: Testing                                 Optimistic  Likely  Pessimistic  (O+4L+P)/6
Scaffolding                                   2           4       5            4
Screen test                                   2           4       6            4
Subsystem Test                                7           14      21           15
System Test                                   7           14      21           15
Testing Total Time                            18          36      53           38

Project Totals by Phase                       Optimistic  Likely      Pessimistic  (O+4L+P)/6
Analysis Total Time                           19 (19%)    34 (20%)    49 (19%)     34 (20%)
Design Total Time                             28 (29%)    43 (26%)    66 (26%)     45 (26%)
Code/Unit Test Total Time                     31 (32%)    54 (32%)    85 (34%)     55 (32%)
Testing Total Time                            18 (19%)    36 (22%)    53 (21%)     38 (22%)
Project Total Time                            96 (100%)   167 (100%)  253 (100%)   172 (100%)

Three types of costs are considered in the development of the cost-benefit analysis: acquisition, development, and operating costs. Several different sources of costs relate to each of these cost types:

Acquisition Costs
    Consulting
    Equipment
    Software
    Site preparation
    Installation
    Capital
    Management staff assigned to acquisition

Development Costs
    Application development
    Education of personnel
    Testing
    Conversion
    Losses relating to changeover, downtime, reruns
    Aggravation cost

Operating Costs
    Personnel allocated for maintenance
    Hardware operating expense (e.g., air conditioning, electricity, etc.)
    Lease/rental costs
    Depreciation on related capital acquisitions
    Operating personnel overhead

In general, any time you spend money, a cost is generated, whether the money is for salaries, personnel benefits, copy machine rental, PC acquisition, operating system acquisition, DBMS acquisition, and so forth. The breakdown of costs into acquisition, development, and operating categories allows managers to do sensitivity analysis on alternatives. For instance, Alternative A might have a high acquisition cost relating to hardware site preparation and expense, whereas Alternative B has none. If the benefits are greater with Alternative A, we might ask if the acquisition of hardware is justified by the extra benefits relating to Alternative A.

[CPM chart of analysis tasks: Define Customer Maintenance; Define Video Maintenance; Define Startup/Shutdown Processing; Define Rental/Return Processing Relationship; Video in Rental Works; Define Screen Contents and Process Flow; Define Backup and Recovery Requirements; Define History Processing; Define Error Messages/Abort; Define Conversion and Training Requirements; Define File Purge/History File Creation Not Done in Rental; milestone: Review Requirements with Vic. The legend marks tasks, milestones, and realistic durations. Bold line indicates the critical path.]

FIGURE 6-21 ABC CPM Chart for Analysis Activities

[Gantt chart: rows for Sam, Mary, and Vic plotted against dates 1/2 through 1/9. Assigned time is shown as white area; slack time is shown as gray area.]

FIGURE 6-22 ABC Gantt Chart for Analysis Phase

All of the costs of each alternative are assembled according to type for the analysis. Depreciation schedules, leasing schedules, and any ancillary information relating to how costs are generated over time are also used in the analysis.

Similarly, all information about benefits expected from the application is assembled for the analysis. Benefits are identified as 'one time' or as continuous improvements. If a stream of revenues is generated over time by the application, these are identified as annual revenues.

The net present value formula is applied to the benefits and costs to develop a net present value for the application (see Figure 6-23). The formula accounts for the time value of money in computing the net benefits over costs. If inflation or fluctuating interest are expected, the interest rates might be changed for each time period to account for such fluctuations. Keep in mind that exactly the same analysis is required for all competing alternatives to ensure consistent NPVs. The example shown in Figure 6-23 shows a project for which the benefits outweigh the costs; such a project would be desirable.

NPV = Σ (B - C) / (1 + d)^t, summed over all time periods

where:
t is the time period, varying from 1 to n
d is the discounted interest rate
B is the value of period benefits
C is the value of period costs

Example: d = .08

Year    Benefits    Costs     (1 + d)^t
1       0           50,000    1.0000
2       10,000      5,000     1.0800
3       30,000      5,000     1.1664
4       50,000      5,000     1.2597

NPV = -(50,000/1) + 5,000/1.08 + 25,000/1.1664 + 45,000/1.2597
    = -50,000 + 4,629 + 21,433 + 35,722
    = $11,784

FIGURE 6-23 Net Present Value Formula and Computation
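The NPV computation can be sketched in Python as follows. The function name is an assumption; the net flows (benefits minus costs per period) and the 8% rate are taken from the NPV line of Figure 6-23. The first flow is left undiscounted, matching the 1.0000 factor in the figure.

```python
def net_present_value(net_flows, rate):
    """Sum each period's net flow (benefits minus costs) discounted by (1 + rate)^t.

    The first flow is treated as period 0, so it is not discounted.
    """
    return sum(flow / (1 + rate) ** t for t, flow in enumerate(net_flows))


# Net flows from Figure 6-23: initial outlay, then three years of net benefits.
net_flows = [-50_000, 5_000, 25_000, 45_000]
npv = net_present_value(net_flows, rate=0.08)
print(round(npv))
```

The result is within a couple of dollars of the figure's $11,784; the small gap arises because the figure truncates each discounted term and rounds the discount factors before summing.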

Problems arise when a project does not generate a favorable NPV but the numbers alone do not express the project's value. Benefits may be insufficient to pay for the project. For instance, in complying with government regulations, there may be no specific benefits to the company. Similarly, when responding to a competitive need, the benefits might not outweigh the costs, but the cost of not doing the project might be the loss of the business. Start-up companies frequently build applications to support anticipated work; the applications might not be profitable until they are several years old. Benefits from such applications are difficult, if not impossible, to quantify because of the uncertainty associated with a new business. Finally, companies wishing to gain significant competitive advantage must frequently undertake a financially unjustifiable project to attain their goals. American Airlines, for instance, in developing their $1 billion airline reservation system was betting that their ability to gain market share would outweigh their expenses. The financial analysis could not justify the project because of the high level of intangible benefits and the difficulty in estimating their worth. The risk paid off, but could just as easily have backfired. That is the nature of risk and why good managers develop skill in knowing when such a risky project is worth attempting.

Make/Buy and Other Types of Analysis

Other types of analysis that might be developed are make/buy, internal rate of return, and payback period. Each of these uses NPV as a starting point for determining the value of a project. Each develops a different analysis. Make versus buy decisions evaluate two types of development alternatives. First, make/buy compares the value of a customized application to the purchase of a software package. This sounds like a simple comparison, when in fact, it is not. Purchasing software for a complex application usually requires customizing and alteration. Packages are rarely used off-the-shelf. Consequently, the analysis concentrates on the extent to which changes to the package are required and the cost of purchase plus changes versus the cost of custom development.

Second, make/buy is also used to compare the competitive value of building a software product internally versus development by a consulting firm. Occasionally companies which charge for in-house IS development services begin to overcharge their users. Users are then justified in obtaining competitive bids from consulting companies and using their services when the cost is less.

Internal rate of return (IRR) is a financial analysis of NPV such that positive cash flows (i.e., benefits) are equated to negative cash flows (i.e., costs). This means finding the discount rate, d, in the NPV formula at which the net present value is zero. This gives the true cost of funds for this particular project. When projects have similar NPVs, an IRR analysis identifies differences in cost of money based on when the cash flows are generated that might differentiate the alternatives.
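One way to find the rate is a simple bisection search on the NPV function, sketched below in Python. This is an illustrative approach, not a method prescribed by the text; the function names and search bounds are assumptions, and the net flows are borrowed from the Figure 6-23 example. The search assumes NPV is positive at the lower bound and negative at the upper bound.

```python
def net_present_value(net_flows, rate):
    """Discounted sum of net flows; the first flow is period 0 (undiscounted)."""
    return sum(flow / (1 + rate) ** t for t, flow in enumerate(net_flows))


def internal_rate_of_return(net_flows, low=0.0, high=1.0, tolerance=1e-6):
    """Bisection search for the rate at which NPV crosses zero.

    Assumes NPV > 0 at `low` and NPV < 0 at `high`.
    """
    while high - low > tolerance:
        mid = (low + high) / 2
        if net_present_value(net_flows, mid) > 0:
            low = mid   # rate too small: benefits still outweigh costs
        else:
            high = mid  # rate too large: discounted benefits fall short
    return (low + high) / 2


irr = internal_rate_of_return([-50_000, 5_000, 25_000, 45_000])
print(round(irr, 3))
```

For these flows the rate falls between 17% and 18%, well above the 8% discount rate used in the example, which is consistent with the positive NPV found there.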

Payback period is the number of years required to recover the investment (acquisition and development) costs from projected benefit cash flows. The payback computation might discount revenues for the time value of money or might use a simple, undiscounted analysis. Payback analysis is popular because it is easily understood. It can discriminate against projects which have a long lead time to realizing benefits, and so should not be the primary criterion for project selection decisions. In the example shown in Figure 6-23, the payback period would be 3 years and 2.4 months. This number is arrived at by identifying $10,000 in year 4 as contributing to the payback along with all benefits in years 2 and 3. 10,000 is 20% of 50,000, the fourth year's projected return. Therefore, 20% of 12 months is 2.4 months. The payback, rounded, is 3 years and 3 months.
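The simple (undiscounted) payback calculation can be sketched in Python. The function name is an assumption; the inputs follow the text's example, with a $50,000 investment and yearly benefits of 0, 10,000, 30,000, and 50,000. The sketch assumes benefits arrive evenly through each year.

```python
def payback_period(investment, yearly_benefits):
    """Return (full_years, months) until cumulative benefits cover the investment.

    Returns None if the benefits never recover the investment.
    """
    remaining = investment
    for full_years, benefit in enumerate(yearly_benefits):
        if benefit > 0 and benefit >= remaining:
            # Fraction of this year needed, expressed in months.
            return full_years, 12 * remaining / benefit
        remaining -= benefit
    return None


years, months = payback_period(50_000, [0, 10_000, 30_000, 50_000])
print(years, months)
```

The first three years recover $40,000, leaving $10,000 to come from the fourth year's $50,000, which gives the 3 years and 2.4 months worked out above.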

Document the Recommendations

The documentation of the feasibility study pulls together all information relevant in developing the final recommendation. The purpose of the summary document is to provide managers a basis for deciding whether or not to continue with the development effort. With this thought in mind, the feasibility document should contain mainly supporting diagrams, lists, and summary analyses. Text should be kept to a minimum to explain the attached diagrams and analyses. An outline of a feasibility document is provided in Table 6-17.

TABLE 6-17 Feasibility Report Outline

1.0 Management Summary

2.0 Current Environment
    2.1 Business Environment
    2.2 Work Procedures
    2.3 Evaluation of Strengths and Weaknesses of Current Procedures

3.0 Proposed Solution
    3.1 Scope of Proposed Solution
    3.2 Functional Requirements Overview

4.0 Technical Alternatives
    4.1 Alternative 1
        4.1.1 Description of Alternative
        4.1.2 Benefits of Alternative
        4.1.3 Risks of Alternative
    4.2 Alternative 2
    ...
    4.n Alternative n

5.0 Recommended Technical Solution
    5.1 Comparison of Alternatives
        5.1.1 Technical Comparison
        5.1.2 Benefits Comparison
        5.1.3 Risk Comparison
        5.1.4 Recommendation and Risk Contingency Plan

6.0 Project Plan
    6.1 Critical Path Chart
    6.2 Staffing Plan

7.0 Costs
    7.1 Cost of Recommended Alternative Hardware/Software
    7.2 Projected Staffing Cost
    7.3 Analysis of Alternatives (if necessary)

The Management Summary section is the most important because it is the only item read by most of the audience. Therefore, it should be brief, less than two pages, and should summarize the remainder of the document. In particular, the cost, NPV, other financial analyses, scope, purpose, technical recommendation, and importance of the project to the organization are highlighted in the summary section. All organizations involved in the development effort and the nature of their involvement should be highlighted.

The remaining sections summarize each of the main activities completed during the feasibility study. The current environment and proposed alternatives are described in sufficient detail to give the reader an understanding of the differences proposed. This section identifies hardware, operating environment, software, items for custom development, and requirements met by the alternative. Benefits and risks associated with each alternative are also listed and discussed to trace the reasoning leading to a selection.

The section on the recommended technical solution is more detailed than the alternatives discussion and discusses different topics. The tasks, key features, and development life cycle, methodology, and concept are highlighted in the proposed application section. In addition, the discussion lists constraints, assumptions, level of security, recovery, and auditability for the recommended solution. A contingency plan for minimizing the probability of, and for dealing with, risks of the recommended alternative is detailed. Potential impediments to successful development, such as decisions or information not currently available, are identified. Ideally, the person responsible for resolving the outstanding issues is named and dates for resolution are identified.

The project plan section summarizes the planning effort. A critical path chart and staffing plan are presented with any attendant assumptions and requirements. Finally, the costs of the recommended alternative(s) and the financial analysis are detailed. Any assumptions, for instance, the discount rate for NPV, are listed. If sensitivity analysis was performed, the extent to which the estimates are sensitive and the source of sensitivity are identified. Sources of sensitivity might include, for instance, interest rates, economic fluctuation, or the presence of a key salesperson.

AUTOMATED SUPPORT TOOLS FOR FEASIBILITY ANALYSIS

There are two classes of tools that support the work performed during feasibility analysis: planning tools and analysis tools. Analysis tools can span any of the three methodologies covered in this text and are discussed in the respective methodology analysis section.

The planning tools might include project estimating products, project scheduling products, risk analysis products, or spreadsheets for financial analysis. Spreadsheets are general purpose and are not discussed here. Estimating products are based on an algorithmic method from those discussed above. Products based on CoCoMo estimating, Rayleigh curve, and function point techniques are included in the list. The tools assume that the underlying input information, for instance, KLOC, is known by some other, unspecified technique.

Planning products assume that a work breakdown with task duration assignment exists. The work breakdown planning tools support the definition of tasks, task interrelationships, assignment of staff, determination of early and late start dates, expected end dates, and cost of resources. From this information, the tool can generate Gantt Charts, critical path networks, cost summaries, and manpower planning guides. There are many good project management software products of both types on the market, several of which are listed in Table 6-18.

Two risk analysis products are included in the summary list. These products walk you through the assignment of risk types, probability of risk occurrence, and cost of the risk to develop a monetary value of risks related to the project. The cost of risk is factored into the financial analysis. More products of this type should be expected to be available as companies become more sophisticated in their assessment of the risk associated with capital projects.

TABLE 6-18 Automated Tools to Support Project Planning

Product | Company | Technique
DEC Plan | Digital Equipment Corp., Maynard, MA | CoCoMo-based estimation tool
ESTIMACS | Computer Associates, Inc., Long Island, New York | Function point estimates extrapolated to include staffing, cost, risk, hardware configuration, and cost estimating
Harvard Project Manager | Harvard Graphics Corp., Boston, MA | Pert, CPM, and Gantt Charts; resource allocation and tracking
MacProject | Apple Computer, Cupertino, CA | Pert, CPM, and Gantt Charts; resource allocation and tracking
ProMap V | LOG/AN, Inc. | Risk analysis
RISNET | J. M. Cockerman Associates | Risk analysis
SLIM | Quantitative Software Management | Costing software based on Rayleigh curve and LOC
SPQR/20 | Software Productivity Research, Inc., Cambridge, MA | Multiple choice approach to function point estimation
Time Line | Symantec Software, Cupertino, CA | Pert, CPM, and Gantt Charts; resource allocation and tracking
WINGS | AGS, Inc., New York, NY | Pert, CPM, and Gantt Charts; resource allocation and tracking
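The monetary value of risk that such products develop can be approximated by weighting each risk's cost by its probability of occurrence and summing the results. The sketch below assumes that simple expected-value model; the `Risk` class, the `risk_exposure` function, and the sample risks and amounts are hypothetical and do not describe how any particular product works.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    probability: float  # chance that the risk occurs, between 0 and 1
    cost: float         # cost to the project if the risk does occur


def risk_exposure(risks):
    """Sum of probability-weighted costs: a simple monetary value of project risk."""
    return sum(r.probability * r.cost for r in risks)


# Hypothetical risks for illustration only.
EXAMPLE_RISKS = [
    Risk("key analyst leaves the project", 0.10, 50_000),
    Risk("vendor delivers hardware late", 0.25, 20_000),
]

print(risk_exposure(EXAMPLE_RISKS))
```

The resulting figure can be carried into the financial analysis as an additional project cost.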

SUMMARY

Feasibility analysis is an important activity that gives a development project a scope and refined definition of application purpose, while providing information that allows the determination of technical, organizational, and financial readiness of the organization. The steps to performing feasibility analysis are: collect data, define scope and functions, define technical alternatives, define benefits and risks of each alternative, analyze organizational and technical feasibility, select technical alternative(s), define project plan, assess financial feasibility, and select final alternative. Data collection most frequently uses interviews or JAD-like sessions to define current work environment, problems, and desires for the new application. From the information collected, the team and user define the scope of the activity, including all departments involved. Then, the functions to be kept from the current work environment and functions to be added to provide the new functionality are defined at a high level.

Technical alternative definition begins with an assessment of the project's criticality to the organization and the need for different departments to share data. Based on that information, existing computer resources are analyzed to determine their usefulness for the proposed application. If existing resources are not adequate, new computer equipment, software, or packages are defined for acquisition. In general, the smallest size computer (or LAN) that can do the work and provide a migration path for growth is selected. Distributed resources might be identified as an option but are not fully analyzed at this time. Several technical alternatives are developed and analyzed to select one or two that meet the most requirements, provide for the greatest benefits, and pose the fewest risks.

The next activity is to define a project plan. There are many different estimating techniques for projecting time to complete a project: algorithmic, top-down, bottom-up, price-to-win, Parkinson's Law, expert judgment, function point analysis, and analogy. Of these, CoCoMo and function point are the most popular when a history of project development is maintained by a company. Function point analysis complements CoCoMo in developing an estimate of LOC. CoCoMo can use the LOC estimate as input to its formulae to develop total person-month, total development time, and project staffing estimates. Parkinson and price-to-win are not recommended. When other techniques are used, they are best used in combination. So, top-down, bottom-up, and expert judgment might be combined to develop best guesses of the time and effort involved in a development project. The project plan is used to develop personnel costs and computer resource usage. These and the other costs are factored into the financial feasibility assessment.
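As a reminder of how a LOC estimate turns into person-month, development time, and staffing figures, the following Python sketch applies the Basic CoCoMo equations. The coefficients are the values commonly quoted from Boehm (1981) for the organic, semidetached, and embedded project modes; they are stated here as an assumption and should be checked against the tables used earlier in the chapter. The function name `basic_cocomo` and the 32 KLOC example are illustrative only.

```python
# Basic CoCoMo coefficients (a, b, c, d) per project mode, as commonly
# quoted from Boehm (1981). Treat them as an assumption for this sketch.
COEFFICIENTS = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semidetached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(kloc, mode="organic"):
    """Return (person_months, development_months, average_staff) for a size in KLOC."""
    a, b, c, d = COEFFICIENTS[mode]
    person_months = a * kloc ** b                  # total effort
    development_months = c * person_months ** d    # elapsed schedule
    average_staff = person_months / development_months
    return person_months, development_months, average_staff


effort, months, staff = basic_cocomo(32, "organic")
print(f"{effort:.0f} person-months, {months:.0f} months, {staff:.1f} average staff")
```

Because every output depends on the KLOC input, an error in the size estimate carries through all three results, which is why function point analysis is useful as an independent source for that input.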

Financial feasibility techniques most commonly used include net present value analysis, which accounts for the time value of money; internal rate of return, which identifies the real interest rate of a project; and payback analysis, which identifies the time at which net revenues equal net costs of the project. Financial analysis also supports the comparison of make versus buy alternatives for a project. Two types of make/buy analysis can be developed. First, custom development of software versus purchase of a package can be evaluated. Second, in-house versus contractor development can be evaluated. Finally, alternative selection is based on financial value of the alternative(s) when more than one technical alternative for a project exists. Also, from the financial analysis, managers can evaluate several different projects using an objective method and can identify the project with the fastest, strongest returns.
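The three financial measures named above can be sketched in a few lines of Python. The function names, the bisection search used for internal rate of return, and the cash flow figures are assumptions made for illustration; year 0 holds the initial outlay as a negative number and later years hold net revenues.

```python
def npv(rate, cash_flows):
    """Net present value: discount each year's cash flow back to year 0."""
    return sum(cf / (1 + rate) ** year for year, cf in enumerate(cash_flows))


def payback_year(cash_flows):
    """First year in which cumulative cash flow is no longer negative, else None."""
    cumulative = 0
    for year, cf in enumerate(cash_flows):
        cumulative += cf
        if cumulative >= 0:
            return year
    return None


def irr(cash_flows, low=0.0, high=1.0, tolerance=1e-7):
    """Internal rate of return: the discount rate at which NPV is zero.

    Found by bisection; assumes NPV changes sign between low and high.
    """
    if npv(low, cash_flows) * npv(high, cash_flows) > 0:
        raise ValueError("NPV does not change sign in the search interval")
    while high - low > tolerance:
        mid = (low + high) / 2
        if npv(low, cash_flows) * npv(mid, cash_flows) <= 0:
            high = mid
        else:
            low = mid
    return (low + high) / 2


# Hypothetical project: 100,000 outlay, then 40,000 net revenue for four years.
EXAMPLE_FLOWS = [-100_000, 40_000, 40_000, 40_000, 40_000]

print(round(npv(0.10, EXAMPLE_FLOWS), 2))
print(payback_year(EXAMPLE_FLOWS))
print(round(irr(EXAMPLE_FLOWS), 4))
```

Running the same functions on the cash flows of each alternative (make versus buy, or in-house versus contractor) gives directly comparable figures.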

REFERENCES

Albrecht, Albert J., and James E. Gaffney, "Software function, source lines of code and development effort prediction: A software science validation," IEEE Transactions on Software Engineering, November, 1983, pp. 639-648.

Boehm, Barry W., Software Engineering Economics. Englewood Cliffs, NJ: Prentice-Hall, 1981.

Charette, Robert N., Software Engineering Risk Analysis and Management. NY: McGraw-Hill, 1989.

Collins, Eliza G. C., and Mary Anne Devanna, eds., The Portable MBA. NY: John Wiley & Sons, 1990.

De Marco, Tom, "An algorithm for sizing software products," Performance Evaluation Review, ACM SIGMetrics Publication, Vol. 12, #2, Spring-Summer, 1984, pp. 13-22.

Gause, Donald C., and Gerald M. Weinberg, Exploring Requirements Quality Before Design. NY: Dorset House Publishing, 1989.

Jones, Capers, "Program Quality and Programmer Productivity: A Survey of the State of the Art," Presentation through Software Productivity Research, Inc., Boston, MA: March 15, 1989.

Jones, Capers, Programming Productivity. NY: McGraw-Hill, 1986.

Kendall, Ken E., and Julie E. Kendall, Systems Analysis and Design, 2nd Ed. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1992.

King, John L., and Edward L. Schrems, "Cost-benefit analysis in information systems development and operation," Computing Surveys, Vol. 10, #1, March, 1978, pp. 20-34.

Rubin, Martin S., Documentation Standards and Procedures for On-line Systems. NY: Van Nostrand Reinhold Company, 1979.

KEY TERMS

algorithmic estimating
alternative approaches
analogical estimating
application leverage point
benefit
bottom up estimating
business leverage point
CoCoMo estimating
competitive environment
contingency planning
cost/benefit analysis
critical success factor
customer environment
delivered source instructions
Delphi method of estimating
discounted cash flow
embedded project
expert judgment estimating
feasibility
financial feasibility
flexibility
function point
function point analysis
Gantt Chart
goal imaging
industry environment
intangible benefits
internal rate of return analysis
KDSI
leverage point
make/buy analysis
net present value (NPV)
objective
organic project
organizational feasibility
Parkinson's Law
payback period analysis
pert chart
platform
portability
price to win strategy
project mode
project plan
quick analysis
reliability
risk
risk assessment
semidetached project
tangible benefits
technical feasibility
top down estimating
vendor environment
work breakdown
work flow management

EXERCISES

1. Using Table 6-6 as a guide, develop a CPM for the design phase of ABC's project. While you do the diagram, reason through the dependencies. Assuming Sam and Mary do the project alone, how should the work be allocated between them to (a) allow Mary to do project management tasks, and (b) leverage the work they did during analysis?

2. Using Table 6-6 as a guide, develop a more detailed task list for some phase or portion of a phase (e.g., all rental/return processes, or conversion/training). Then, develop an estimate of the work based on your expertise and the idea that you would perform the work. How does your estimate differ from Table 6-6? Why? Are the differences completely justifiable? Present your estimates to a group of classmates and provide your reasoning for the changes.

STUDY QUESTIONS

1. Define the following terms:
   Benefits
   Function point
   Net present value
   Leverage point
   Risk
   Technical feasibility

2. Why is feasibility analysis performed?

3. What are the three main types of feasibility and why are they important?

4. List the steps to performing feasibility analysis.

5. What are the main data collection techniques used during feasibility analysis?

6. What is a leverage point?

7. How do business and application leverage points differ? How do they complement each other?

8. List five sources of benefits.

9. Discuss the differences between tangible and intangible benefits.

10. List five sources of risk and give an example of each.

11. Why is risk analysis performed? What do you do with the risks once they are identified?

12. How are technical alternatives generated?

13. Once technical alternatives are complete, how are they assessed? What is the basis for selecting one alternative as the preferred one?

14. Compare the advantages and disadvantages of algorithmic, function point, and combined top-down, bottom-up estimating.

15. What is the major weakness of CoCoMo estimating?

16. What is the major weakness of function point estimating?

17. Why do we have so many estimating techniques? Is one better than another?

18. What is the major financial analysis used to analyze project alternatives? Why is it the preferred method?

19. What is the purpose of make/buy analysis?

20. Describe the two types of make/buy analysis.

* EXTRA-CREDIT QUESTION

1. The Office Information System described in the Appendix is an application that automates the support division of a large company. The units involved include a typing pool, copy center, print shop, and graphic arts department. Other projects are being developed in the IS Department that will cost approximately $2.4 million per year, and an additional $1.5 million in operating expenses.

The proposed budget for the OIS is $200,000 for a Cobol, mainframe application using a DBMS to store the data. Is this a reasonable amount? Develop one to three alternatives that are more financially attractive. One of the alternatives might be on the mainframe but can use different resources; at least one alternative should use different technology. Who should develop the application? Under what circumstances would you recommend to do/not do the application?

PART III

ANALYSIS AND DESIGN

INTRODUCTION

Analysis is the act of defining what an application will do. Design is the act of defining how the requirements defined during analysis will be implemented in a specific hardware/software environment. The next eight chapters define and describe functional analysis and design. Each set of analysis-design chapters uses major representation techniques from the methodology class it presents. In a traditional application development, there are many more analysis and design activities than we address here (see Tables III-1 and III-2). Most of these topics should already be part of your knowledge base from a systems analysis and design course. Many activities we do cover in this text are also in a systems analysis and design course. The difference is that here we develop three methodologies instead of one as in systems analysis. In this text, we concentrate on the activities which differ across the methodologies. Chapter 13 summarizes the similarities, differences, and automated support across the methodologies. It also discusses the future based on current research in methodologies. Chapter 14 discusses the forgotten activities in most methodology-related books and many systems analysis texts. These activities include human interface design, input/output design, conversion design, and user documentation design.

At the end of the next eight chapters, you should be able to do the following:

1. Understand the conceptual foundations of the three classes of methodologies and how they are similar and how they differ.

2. Represent the functional requirements of an application using each of the three methodologies.

3. Be able to translate a functional requirements definition into a SQL-based design for an application using each of the three methodologies.

4. Compare the advantages, values, and disadvantages of methodologies' uses for analysis.

5. Develop a critical understanding of the difficulties of translating what users want into representations that convey meaning.

6. Know some computer-aided and organizational supports for completing analysis and design work.


TABLE III-1 Representative Project Development Analysis Activities

Recurring activities/tasks
  Initiate phase
  Plan next phase
  Prepare report
  Review phase products

Analysis Phase Activities

Initiate hardware/software evaluation (as required)
Initiate prototype development (as required)

Define current system (as required)
  Document and files
  Data elements
  Compile data dictionary
  Processing
  Controls
  Volumes and timing
  Interfaces with other systems
  Responsibilities
  Work distribution
  Operating costs

Assess current system
  Review project objectives and scope
  Compare system in operation with recommended solution
  Identify opportunities for immediate improvements
  Assess organizational design appropriateness for application

Define proposed application's business requirements
  System concept and overview
    Major functions
    Scope
    User organizations involved
    Interface organizations
    Interface application systems
    Context diagram
  System concept: technology (i.e., DBMS, LAN, distribution plan, etc.)
  Major issues, unresolved problems that might hinder application development
  Schedule summary by phase
  Staffing summary by phase

Assess proposed system requirements
  Identify alternatives for system design [e.g., database environment(s), hardware platform, software platform, special technology, packaged software, 4GLs, user software (e.g., Lotus)]
  Discuss and, as necessary, reassess technical, organizational, and economic feasibility as each relates to the alternatives identified

Define processing requirements
  DFD (or analogous graphic for the methodology)
  Steps (i.e., procedures to be followed; should match methodology)
  Required sequences of processing only
  Constraints (e.g., timing, memory, concurrency, other applications, etc.)
  Accuracy (e.g., to x decimal place, or timing as of y minutes)
  Formulae
  Performance criteria (e.g., volume, timing, response time)
  Inputs: name, source, frequency, volume, data elements, media
  Outputs: name, purpose, frequency, screen format, copies, elements, sequence, media
  Database: data requirements as expressed in methodology, relations, user views, organization, required reviews, access, security
  Reports: name, purpose, destination, frequency, form/screen, data elements, sequence
  User acceptance criteria

Define interface requirements
  Identification: name of interface, sending system/organization, receiving system/organization
  Responsibility/approvals
  Interface schedule: testing schedule and responsibilities, conversion schedule and responsibilities, delivery to production
  Requirements
    Inputs: name, purpose, source, frequency, media, form #, components using each input, data elements, data controls, data descriptions, formulae for computation
      Input layout: data direction, terminal devices, comm software, time outs, modem requirements, line use, data characteristics, line characteristics, line protocol
    Output: name, purpose, frequency, format/screen #, copies, elements, sort sequence, media, component generating the output, source of data and name, data description, layout (transmitted output should have same information as input layout above)


    Files: system name, system ID, file name, file ID, type of file (I/O), purpose, source, update cycle, sequence, frequency, volume, growth, media, usage (R, W, R/W), retention characteristics, security, blocking factor, file records types, components using file, file control characteristics
    Record description: record name, file ID, record type (fixed, variable, spanned), record size, update cycle, form # for input, data elements and characteristics (definition, purpose, use in computation, formulae, precision, edit criteria, defaults, required/optional data, etc.)

Define control requirements
  Batch totals, item counts
  Hash totals, record counts
  Operation intervention and inquiry logs
  Exception reporting and responsibilities
  Processing controls: equipment failure
  Document control (e.g., for prenumbered checks)
  Transaction logging and on-line controls

Define security and backup requirements
  Recovery requirements: data criticality, recovery plan in event of emergency
  Password and internal security checks

Define conversion requirements
  Data clean-up
  Clerical effort
  Systems effort: automated and manual files to be converted
  Volume and growth of files as it impacts conversion
  Alternatives for implementation
  Overall conversion timing requirements
  Conversion impact on user areas
  Conversion impact on operations
  Facilities alteration/site preparation
    Changes or additions to desks, tables, work spaces, cabinets, charts, etc.
    Forms, tapes, manuals, etc.
    Construction: walls, floors, ducts, etc.
    Cabling and electrical: outlets, switches, cables, lighting, other wiring, etc.
    Safety: extinguishers, alarms, first aid kits, etc.
    Security: badge-entry, guard service, etc.
    Environmental: air conditioning, humidification, dust, etc.
    Maintenance: cleaning, equipment maintenance, etc.
    Contingency: disaster plans, backup procedures, etc.

Define training
  Type of training, recipients, and details for all training, including but not limited to on-line data entry, remote location data input, native language manuals, general introduction to new system

Define system acceptance criteria
  Test data input by user
  Parallel runs
  Pilot runs
  Phased cutover
  Depending on acceptance criteria, include the following:
    Amount of test data to be entered, and number of clerks involved
    Size of pilot parallel (e.g., number of accounts, cycles, etc.)
    Length of time
    Performance criteria
    Impact on clerical staff
    Impact on operations

Define hardware
  Acceptable limits of downtime
  Average or maximum terminals down at the same time
  Inquiry response time
  Update response time
  Batch turnaround time
  Maximum percent of transmission errors
  Backup 'firedrills' plan and frequency
  Maintenance/reliability
  Peak and average time requirements
  Geographic constraints on terminal location
  Purchased hardware required cost/benefit analysis and RFP selection process
  List of hardware for this system, type, location, 'ownership,' system role, backup, criticality (This list should include terminals, PCs, controllers, modems, transmission lines, mini-computers, workstations, mainframes, peripherals, disks, CDs, tapes, etc.)

Define software/system/misc.
  Volume of each transaction type
  Growth
  Delivery time constraints


  Number of reruns
  Backup 'firedrills' plan and frequency
  Distribution of output messages
  List of software for this system, type, location, 'ownership,' system on which it runs, backup, criticality [This list should include DBMS, operating system, LAN, communications, remote access (e.g., Carbon Copy), on-line help, etc.]

Define initiate request for proposal (RFP)
  Determine criteria for decision
  List requirements for proposal
  Select vendors
  Prepare RFP report

Define data requirements
  Data dictionary should be an appendix to documentation if it is not automated. For automated application documentation, print the information from the dictionary. For manual applications, include the following for each data element:
    Field name, alternative name, description, purpose, use in computation, use in determining conditions (with other fields), code reference, length, decimal positions, type, unit of measure, optional/required, allowable values (range, code structure, meaning of values), default value, external data source

In this section, we introduce the general characteristics of analysis and design that all methodologies have in common.

APPLICATION DEVELOPMENT AS A TRANSLATION ACTIVITY

The process of building applications is a series of translations. Historically, we first examine and translate the current physical system to develop an abstract, logical definition of the current system (see Figure III-1). Then, with the application users, we define the requirements of the new logical system which retains the aspects of the old system while incorporating the new requirements defined by users. The new logical system definition is the basis for translating to a working physical application.

This historical strategy is useful only sometimes. The strategy works when a new application will maintain 50% or more of the old application's functions. For example, we might redevelop an accounting application to move from batch to on-line, but to perform all the same functions. Another use of this strategy is when study of the old application can save time in providing code tables. For instance, state abbreviations, zip codes, and customer name abbreviations all might be retained from an old application.

In many situations, however, the existing application is antiquated, full of obsolete design, or riddled with errors. To study it is to learn erroneous design and procedures that must be unlearned. Why learn it in the first place? Rather, a frequently better approach is to begin analyzing the requirements of the new application. This is called 'essential' system analysis¹ and requires only that you, the analyst, attend to what relates to the new application. The old application or procedures may be studied for specific information, code tables, or crucial steps in the process; but in general, the old application and procedures are ignored.

The essential approach is used in this text. We ignore the details of the manual method of performing rental processing because the computerized method will completely replace the manual method. The major value of studying, for instance, what manual forms are filed and when they get retrieved, is to help get a sense of file processing in the new application. When the old procedures are being replaced, you may want to use the old methods as a way to confirm your thinking after you have developed the application concepts.

1 See McMenamin and Palmer, 1984.

TABLE III-2 Representative Design Phase Activities

Recurring activities/tasks
  Initiate phase
  Plan next phase
  Prepare documentation
  Review phase products

Design Phase Activities

Initiate business system design

Design functional outline
  Review business functions
  Review interface requirements
  Develop alternative functional outlines
  Select best alternative
  Design data structure/database
    Normalize, optimize, then ... denormalize as required
  Design interprogram flows and controls

Design input, output, and data
  Design output screens/documents
  Design input requirements/screens
  Design screen dialogue and system navigation

Design processing
  Design computer processing
  Design non computer processing

Design controls
  Describe business control procedures
  Define security and backup procedures

Design business system test plan
  Identify acceptance criteria
  Prepare tentative user acceptance strategy
  Identify critical resource requirements
  Prepare testing overview
  Develop system test plan

Complete business system design
  Complete data dictionary with elements, processes, messages, objects, modules, files/relations, data flows
  Define proposed organization
  Review conversion requirements
  Prepare operating schedule
  Perform program design as outlined below

Evaluate business system design
  Assure technical, operational, and economic feasibility
  Review risks

User procedures
  Define manual procedures
    Define user manual procedures
    Define computer operations manual procedures
    Prepare manual procedures test plan
  Complete forms, documents, and screens
    Prototype forms, screens, reports
    Complete input documents/screens
    Complete output forms/screens
    Complete screen designs, error codes, screen interaction process

Develop training
  Determine pedagogical training requirements
  Determine training methods
  Prepare training sessions and software
  Prepare training schedule
  Pilot test training

Prepare for installation
  Prepare and test user manual
  Verify readiness of user environment
  Train user personnel
  Test manual, backup, and disaster procedures

Design the physical database
  Define user views
  Define logical DB to DBMS
  Map logical DB to media, deciding specific access method, extra space allocation, algorithms, etc.
  Build and test a sample DB
  Work with test planners to build the test DB environment
  Work with conversion team to implement the production DB environment

Build conversion subsystem
  Work with user to translate and validate current data
  Specify, write, and test conversion programs
  Train conversion personnel
  Execute conversion plan to build permanent DB

Program design
  Develop modular program structure
    Study data structure
    Develop logical program structure
    Complete methodology-related graphics
    Specify subprograms, modules, functions
  Document programs/modules individually and as a collection. Pay special attention to document intermodular relationships and message passing between programs

Develop and unit test physical code
  Implement programs top-down using stubs
  Prototype as needed

Plan program testing
  Prepare program test plan
  Create program test data
  Create test dialog for single user, multiple users, multiple functions
  Similar plans for subsystem, system, stress, multiuser, and acceptance testing are required and planned at this point. (If the application is on a tight deadline, testing and immediate conversion to production can be planned and implemented together.)

Define program development plan
  Determine development method
  Define development sequence
  Revise schedule and budget for programming phase

Create source library members
  Write record descriptions for source library (This is not done if an active dictionary is used or if the dictionary for the DBMS monitors all interactions to the database. Instead, copy books or analogous code are included to describe user views.)
  Write standard program code to source library

Refine operational requirements
  Revise computer run procedures
  Produce tentative production control cards (JCL)

Whichever analysis method you use, translations performed during analysis all have the following five common subactivities (the activities are summarized in Table III-3).

1. Identification: Find the focal things that belong. Identification, for instance, in the definition of the new logical system requires finding requirements. Things to be found include, for instance, entities, objects, relationships, functions, processes, and constraints.

2. Elaboration: Define the details of each thing identified. For instance, a requirement might be to provide consolidated customer account information for ad hoc reporting. During elaboration, you seek to answer questions like:

   What information should be consolidated about a user? Does it currently exist?
   What does ad hoc mean to the user?
   What type of queries do the users ask now?
   What types of questions do the users want to ask that they cannot ask now?
   What kinds of data analysis do users need?
   What form (for example, screen or paper) does output take?
   Where (geographically) are the users asking the questions?
   Where (centralized/distributed/decentralized) is the data, and where should it be?

3. Synthesis: Build a unified view of the application, reconciling any parts that do not fit and representing requirements in graphic form. The representation can be either manual (i.e., on paper) or automated, using computer-based tools.

4. Review: Perform quality control. At the end of the phase (either analysis or design), reanalyze feasibility, schedules, and staffing. Revise them as needed based on the more complete, current definition of the new application.

5. Document: Create useful documents from graphics and supporting text either manually or with computer-based tools.

Each of the three methodologies begins analysis by defining requirements, but each has a different starting (and ending) perspective for its analysis process. Similarly, for each of the other analysis activities, the results of the activity differ because the perspective at the start focuses your attention on different aspects of the application.

[Figure III-1: boxes labeled, in order, Define Current Physical System; Translate from Physical to Logical System; Define Current Logical System; Translate from 'Old' Logical to New Logical System; Define New Logical System; Translate from Logical to Physical System; Define New Physical System]

FIGURE III-1 Application Development Translations

Keep in mind that even though we discuss these methodologies as fairly linear, sequential processes, they are not. You get application requirements in a nonlinear fashion, usually through interviews. Frequently, you get high-, low-, and medium-level information all at the same interview. Your job, as the SE, is to make sense of the information received. The sense-making activity is part of the process of building your mental model of the application domain. Since you receive information at different levels over time, your mental model of the domain gets fleshed out at different levels over time, too. You constantly have to reevaluate the information you currently have against new information to determine if adjustments to the current mental model are necessary.

A second point about the nonlinear aspect is that specification and implementation are never really separated completely in your thinking process. In systems analysis class you usually learn not to think about the language or implementation environment while you are performing analysis. You are told only to think about functional requirements. You must think of the implementation environment periodically in the real world, however, because some desired function might not be able to be done (or done easily) in the planned environment. When expensive or complex functions are requested, you must alert the user/sponsor to be sure they agree with the desired function. An expensive change is one that adds more than 10% to the cost of the application. A complex change is one that convolutes an otherwise simple process (see Example III-1).

Just as analysis is a translation activity, so too, is design. The goal of design is to map the functional requirements from analysis into a specific hardware and software environment. In design, the same five general sub activities are done, but they have differ- ent definitions.

1. Identification: Design is the act of mapping how logical requirements will work in the target computer environment. This means that we identify the system design structure (if not already decided). The system structure is the underlying design approach. Possible approaches include the following:

• Batch, on-line (portions or complete), or real-time

• Which functions are connected and how ... how the application will work in the production environment

• General user interface as menu-driven, windows-icons-menus-pointers (WIMP), or command-driven

• Mode of operation, that is, whether the user is an expert, novice, or somewhere in between

2. Elaboration: Each requirement from the analysis phase is expanded into greater detail and mapped to hardware and software within the system design structure. Questions relate to:

How should the database be designed to provide, for instance, the best possible response time with the greatest efficiency?

TABLE III-3 Summary of Analysis and Design General Activities

Identification
Analysis: Find the focal things that are in the application. This includes, but is not limited to, entities, objects, relationships, functions, constraints, data elements, control, legal requirements, etc.
Design: Refine the system concept and apply it to the functional requirements. Identify any compromises of requirements that might be necessary to work around implementation environment limitations. Define the general standards and rules for the implementation environment to which all remaining work must adhere.

Elaboration
Analysis: Define the functional details of each thing identified. Users provide definitions for all terms and describe all procedures, formulae, and processing. This elaboration is independent of hardware, software, or location.
Design: For each function, map the function to the hardware and software environment. Identify reusable modules. Finalize details of message processing and intermodule communications.

Synthesis
Analysis: Develop a unified view of the application. Develop and document a representation of the application. Graphics, tables, and other techniques are preferred representations.
Design: Develop a unified mapping of the application to the intended hardware and software environment. Determine geographic and package locations for all data and processes. Graphics, tables, and other techniques are preferred representations.

Review
Analysis: Review and walk through the analysis with peers and project members. Walk through the analysis with users. Review and revise schedules and costs as necessary.
Design: Review and walk through design components, test plan, conversion plan, and DB design with peers and project members, program specifications with the programmer and other peers, and screens with users. Review and revise schedules and costs as necessary.

Document
Analysis: Develop 'final' forms of graphics and supporting text for all analysis activities.
Design: Develop 'final' forms of graphics and supporting text for all design activities.


CARTER CORDUROY: YOU SHOULD HAVE ASKED AGAIN

Carter Corduroy, a $100 million company, wanted to install an integrated database application to perform order entry, inventory control, and manufacturing control. During the analysis of the application, George Dare was the user contact who approved all requirements, acted as liaison to the rest of the company, and provided many requirements.

The analysis phase of the project completed on time and all ten project team members felt they had a good understanding of the process required and what the resulting application would do. The two people preparing most of the documentation and all of the program specifications were Maria Martinez, SE/project manager for 10 years who had done two other such integrated order-inventory systems, and Charlie Chou, SE with 12 years of experience who had developed applications using all of the software involved.

During the middle of the analysis phase, the systems manager was replaced with a newly hired person, Robert Blake. Mr. Blake came from a larger fabric manufacturer and wanted to make a name for himself quickly in his new environment. He quickly forged a liaison with Harry Crater, the plants' manager. The application would be installed in his two finishing plants: one in Virginia and one in Arkansas.

Crater and Dare were political enemies. Dare had once worked for Crater and had not gotten along with him. Dare was young and highly proficient at his job and soon surpassed Crater. Crater now reported to Dare for purposes of developing the application, the biggest in the company's history.

These circumstances did not affect the application team until late in design, after programming had begun. Six weeks before the application was supposed to go into production, Dare was on vacation. Crater had a validation meeting for reporting requirements with Martinez and Chou. At the meeting, he said that planned reports could not identify 'reworks,' goods that were defective and reentered into the finishing process a second time. He was adamant that he must have some way of knowing if a lot of goods were a 'first work' or a 'second work.' It was the first mention of anything other than one-time-through manufacturing. Maria said this constituted a major change to the requirements and a nontrivial change to all programs already begun. It was so significant that the end date of the project was in jeopardy. She decided to examine the specific impact, then talk to George about the change.

Mr. Blake heard of the meeting and, that afternoon, began pressuring Maria and Charlie to 'do what Crater wants.' After all, he was the real user.

Maria talked to the team and asked for an assessment of effort to change their programs to allow the same lot order to be processed more than once. She and Charlie then did their own assessment. The team was unanimous. The change would add four to six weeks of programming and testing time, all documentation would have to be modified, and all databases would be changed. In short, the change could add as much as $90,000 to the $225,000 contract, a 40% increase. Maria decided to speak to George before committing to the change.

Mr. Blake coerced the team, as their immediate boss no matter who the user of the application was, to begin work on the change. When George got back, he was immersed in another special project that was taking most of his time. When Maria finally got to him, he said, "Yeah, if Blake approves and Crater insists, we probably need it." Still, Maria had doubts.

She put the changes with cost estimates in a memo to Blake. He never signed off on the change, but verbally agreed again. The application was three weeks late when everyone at Carter exploded. Suddenly, no one remembered that the application would be late. No one remembered being warned that this one, small change would cause so many problems. Maria was to blame for a poor design that could not be made to work. Crater now said that he 'requested' the change but that it was not absolutely 'necessary.' Blake forgot the conversations, memo, and approvals. Dare was furious because his special project was now over budget and late.

When the written memo and other documentation from the meetings held at the time were produced, Dare's comment to Maria was, "You are the expert, you should have asked again whether or not the change was necessary. You were the only one who knew how big it really was!"

In the end, the application was put into production with only one run through the finishing plant per work order. Reworks were assigned a new number and tracked as if it were the first time through the process. The costly change and insufficient whistle-blowing by the project manager led to unhappy clients, overworked project team members, and a less than optimal application. Could they have been avoided? Probably not. The client should have been made to realize the magnitude of the change, however. Maria and Charlie should have been more insistent on a detailed review of the request and sign-offs for this major change.

How should programs be packaged to fulfill processing constraints? Examples might be to provide five-second response time; to provide completion of reporting within a three-hour period daily; or to provide 24-hour access to information that is up-to-the-minute.

Other elaboration activities to be decided include common routines for commonly used processes. For instance, how will screen processing be performed? Will each programmer write his or her own version of screen interface or will there be common modules for screen interactions? The scope and details of system 'utility' programs to be used by all programmers are defined.

The last major elaboration activity is to examine the application constraints. We ensure that each constraint is considered in the design and that processing is within the prescribed limits.

3. Synthesis: Build a unified physical design of the application, reconciling any parts that do not fit and representing requirements in more detail. We may add functions to the application that are environment specific. For instance, in a mainframe IMS database environment, applications require user views, data base definitions (DBDs), data control blocks (DCBs), and data service blocks (DSBs). These control blocks are not required if using dBASE IV on a PC. The representation can be either manual (i.e., on paper) or automated, using computer-based tools.

4. Review: Perform quality control. At the end of the phase, conduct a design walk-through, comparing design to logical requirements to validate completeness and correctness. Reanalyze schedule and staffing for coming stages of implementation, testing, conversion, training, and turnover, revising them as required.

5. Document: Create useful program specifications and an overall design document. The design document describes the database, application structure, constraints, and so on. Graphics and supporting text document the design. The program/module specifications include the details of processing, all interface designs, and any specific information required to develop the program.

As in analysis, these activities vary by methodology because the ending point of analysis, which provides the input to design, is different. However, the intention of all methodologies is to define the application such that programming and implementation can be started after the design is complete. Program/module specifications, in some form, are the desired output of the design phase.

Keep in mind that even though we discuss design as a straightforward mapping of 'what' to 'how,' it is not a one-to-one mapping. You might need to compromise analysis requirements during design. Compromise of requirements means that they may be rescoped, manipulated, dropped, or otherwise changed to fit the environment's limitations.

Prototyping is an important activity in design to minimize the amount of requirements compromise that takes place. Especially when you use a package or language for the first time, prototyping should be used because prototypes frequently find the language's limits. You must verify that the application structure and concept can be implemented using the software as planned. Frequently in a PC environment, you will find you are bumping into language/package limitations that cause you to rethink the design. Vendors call this process 'work around.' You are finding a way to work around the built-in limits of the language. Vendors will usually help find a work around if the application cannot be built in known ways. They also challenge users to find work arounds and broadcast them to others who have similar problems.

The linkage between analysis, design, and program design is looser or tighter depending on the methodology and implementation environment. For instance, the data information required differs if we use dBASE IV [2] or if we use IMS DB/DC [3]. The level of requirements detail differs if we use the Focus [4] language or if we use C-language [5]. Where possible, we point out specific instances of these linkages.

You, as the SE, must constantly check your mental model of functional requirements when building a mental model of how they will be implemented. Do not be afraid to try different ways of thinking. Frequently the old way was not too good. We get trapped in our thought processes and don't even remember to do the out of the box thinking [6] that is necessary for innovative designs.

Before we discuss methodologies, some organization and automated supports that facilitate application development regardless of methodology are discussed.

2 dBASE IV is a trademark product of Ashton-Tate, Inc.

3 IMS DB/DC is a trademarked mainframe product of the IBM Corporation. IMS, Information Management System, is a hierarchic database product. DB stands for database; DC abbreviates data communications.

4 Focus is a trademarked database, query, application generator, expert system product of Information Builders, Inc. Focus is thought of as a 4th-generation language because of its powerful query capabilities.

5 C-language is a trademark product of Bell Labs; C++ is a trademarked product of Borland International; and there are other versions of C-language.

6 Out of the box thinking means to rethink the entire process as if the current methods, procedures, and policies did not exist. Put yourself in the shoes of a caveman (or an intelligent child) who just walked into the company, and redesign the work as they might. Question everything. For instance, who says you need to keep a copy of an order? What is the real, i.e., legal requirement?

ORGANIZATIONAL AND AUTOMATED SUPPORT

Organizational innovations that are useful with all methodologies are joint user-IS application development activities, user-managed application development, structured walk-throughs, and data administration. The goal of these organizational innovations is to speed the development process, foster user participation, and improve the quality of the resulting application. Automated support for structured analysis and design comes from computer-aided software engineering (CASE) tools. Each chapter will identify CASE tools that relate to the phase and activities. In this section we describe the characteristics of CASE tools and the ideal CASE environment.

Joint Application Development

Several techniques have been developed to describe the joint, intensive definition of application requirements: Joint Requirements Planning (JRP), Joint Application Design Development (JAD) [7], and Fast-Track [8]. They are all similar in that the goal is a collaborative, user-IS definition of application requirements. The planning and execution of a joint session are also similar. The differences are the level of participants, subject matter, and level of detail of the discussions. These are more fully described below.

JRP is an executive-level user-IS activity to identify overall requirements at the enterprise level. Fast-Track and JAD both are designed to produce a functional requirements specification. If a JRP report exists, the Fast-Track/JAD uses the JRP report as constraining or defining the business environment within which the application is defined.

JRP, Fast-Track, and JAD activities are

• designed to shorten the application development process

• productivity tools

• structured to improve the quality of the application development deliverables.

These characteristics of the joint development activities can also provide opposite results if the sessions do not adhere to the guidelines defined by their developers. However, these techniques do not substitute for experience, good project management, or knowledge about the application! Even with user involvement in analysis and design, application developers must develop knowledge and shared mental models of both the application and problem domain. One purpose of the joint sessions is to be sure of a common mental model for all participants.

7 JRP and JAD are design techniques of the IBM Corporation.

8 Fast-Track is a design technique of the Boeing Computer Company.

Requirements for a joint session relate to:

• the team

• the session

• joint structured process

• the meeting facility

• documentation tools.

The Team

The team is composed of client representatives, facilitator, systems representatives, and support personnel (see Table III-4). The clients must include decision makers at a high enough level to resolve conflicts and make decisions that affect the scope and content of the application. They must also be at a low enough level to be conversant and able to explain the daily functions and procedures. Finally, clients must represent every functional area affected by the application. You must also keep the number of client participants below 15, and ideally at three or four people. The more people, the longer the process and the more difficult the decisions. Ideally, the whole session team is about seven people.

Systems representatives should include the project manager, an SE, and one to two analysts with technical expertise. The systems representatives must be able to assess feasibility of requested requirements and the expected complexity of implementing the requirements in the target environment. The main role of the system representatives is to learn the problem domain area during the sessions and ensure accurate problem restatement in system terms.

The facilitator is a specially trained individual who runs the session. The facilitator has several roles:

• Elicit information from participants

• Keep the meetings moving

• Keep the discussion from becoming monopolized by one individual

• Identify and resolve conflicts

• Keep the meeting on a business (rather than personal) level.

Frequently in joint sessions, organizational disagreements on goals and objectives arise. Such conflict is to be expected and is normal. The facilitator's job is to identify and ensure resolution of disagreements during the sessions. The conflicts are potentially explosive and can lead to personal conflicts. The facilitator must recognize such situations and defuse them. Occasionally, defusing means asking for a participant to be replaced.

TABLE III-4 Joint IS-User Team and Responsibilities

Role: Facilitator
Job titles: Consultant, IS Manager, Senior SE, Facilitator
Responsibilities: Elicit information. Keep meeting moving. Minimize monopolization by one or few individuals. Identify and resolve conflicts. Maintain professional atmosphere.

Role: User
Job titles: Manager, Professional, Clerk
Responsibilities: Make decisions about compromises, changes, or other aspects of the application requirements that require managerial approval. Participate in and contribute to discussions about requirements. Provide information, requirements ideas, and suggestions on the meeting topic. Maintain open, professional atmosphere. Interpret and explain application problem domain to IS personnel.

Role: IS Representative
Job titles: Project Manager, Project Leader, SE, Systems Analyst
Responsibilities: Learn the application problem domain. Assist in interpreting requirements into graphical representations. Determine technological capabilities and limitations as they relate to the application requirements. Interpret and explain technical IS domain to users.

Role: Support
Job titles: Secretary, Systems Analyst
Responsibilities (Secretary): Take notes as requested. Plan for coffee, meals, etc. Act as liaison with outside world.
Responsibilities (Systems Analyst): Take notes as requested; assist in transcribing and documenting daytime work.

The facilitator is a cheerleader, meeting leader, and ring leader who keeps the session moving. Usually facilitators are senior staff from the information systems organization who already know how to develop application requirements, but who are specifically trained to facilitate joint user-IS sessions.

Finally, support personnel are individuals who take notes during the day and provide liaison with the outside world. The notes include data-related information and process-related information. Data information includes identification, naming and definitions of entities, elements, and entity relationships. Process information includes decision rationales, process identification, procedural details of processes, and policies that constrain processes. The actual results of the data and process discussions are reflected around the room (see the photo in Figure III-2) on flip-charts, blackboards, and other visual aids that are always accessible to the entire group.

A second kind of support is administrative assistance, which includes documenting the information during evening sessions, coordinating coffee and meals, and ferrying messages to and from work for participants.

Preparation

A meeting to prepare session attendees should be held for all participants. The primary purpose is to give participants a list of tasks to complete before they attend the joint session and to train participants in the completion of the tasks.

The meeting includes an orientation, document examples, data requirements, and training in development of graphical techniques being used to document processing. The orientation discusses the expectations of the organization and normal results of such sessions. Then participants are given an overview of the joint structured process: what it is, how it is conducted, proper behavior, and decision-making necessity. The scope and purpose of the application are discussed and agreed upon again by all participants. If there are disagreements or problems with the scope, it is revised at this meeting so everyone has a shared understanding of what work functions and information are in, and what are not in, the application.

If data flow diagrams are the graphical technique being used, for example, the users are trained to develop a context diagram and first-cut data flow diagram of their current job. The list of tasks for data flow diagrams would include the following activities:

• Define the scope and functions of your position

• Document the 'what is' in a data flow diagram

• Try to draw a context diagram of all the departments, groups, and applications with whom you exchange information in your job

• Define all data used in your job

• Collect statistics (how often, how much, when) for all data and processes

• Collect samples of all input and output documents.

FIGURE III-2 Photo of Joint User-IS Session Room

Frequently these sessions are taught by an in-house facilitator, but they may be taught by a consultant who knows the techniques.

The Joint Structured Process

The ideal joint user-IS session is full-time, off-site, lasts three to five days, and has five to nine participants. All of these ideal characteristics can be loosened somewhat and still maintain the momentum that comes from intensive work sessions. The idea is to do the work intensively and quickly because no one has time to spend in months of meetings. Participants become very close and frequently become good friends as a result of working together. At best, the users and IS team realize they are business partners in the application development and that relationship prevails throughout the project's life.

Joint sessions are divided into mainly daytime and nighttime sessions. The word mainly is used because the activities can be done at any time. In general, daytime, when people are most alert, is devoted to creating new information; evening is devoted to documenting the new information.

During the day, activities are the following:

• Confirm business functions

• Identify and analyze specific requirements (processes by function, inputs and outputs for each process). For each process, identify what is done, how frequently, exception and error processing, periodic processing, problems with current procedures, policies that might need to be changed, and any new business requirements relating to the processes.

• Identify general requirements for the application. For data, how accessible and accurate does the information need to be? Can it be accurate as of close of business yesterday or must it be up to the minute? Can answers take one or two hours, or must the answer be within seconds?

Application constraints are a second type of general requirement. Constraints place limits on the application. For instance, upper bounds of cost and time are allocated for development, hardware, software, language, or DBMS. These constraints are general, but they place strict boundaries on how the application will be designed. They also identify, to the technical staff, activities that need to be further elaborated during the detailed design to accommodate the implementation environment. Constraints from the first chapter discussion also apply. They include time, pre- and postrequisites, structural, control, and inferential constraints.

• Identify the likelihood of requirements change over the next three to five years. If requirements are identified as changing within the expected implementation time of the project, then the expected requirements become the current requirements for the application.

For instance, users may currently need data up to the close of business yesterday. They discuss the industry as moving rapidly toward instant access of up-to-the-minute information and expect this requirement within 12-18 months, and the application will be implemented in 12 months. Build the new requirement into the application now to be an early leader and avoid costly redevelopment of the new application.

• Have the support staff record all processes, functions, data, outputs, data elements, terms of processing, names given to items, and so on.
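The rule about expected requirements changes in the list above can be written as a small decision function. This is a minimal sketch, not a prescribed procedure: the function name is hypothetical, and it assumes the forecast is represented by its earliest expected month and that "within the implementation time" is read as an inclusive comparison.

```python
def build_into_current_release(earliest_expected_months: int,
                               implementation_months: int) -> bool:
    """Return True when a forecast requirement should be treated as current.

    Assumption: the forecast is summarized by its earliest expected month,
    and the comparison with the implementation time is inclusive.
    """
    if earliest_expected_months < 0 or implementation_months < 0:
        raise ValueError("month counts must be non-negative")
    return earliest_expected_months <= implementation_months


# Figures from the text: up-to-the-minute access is expected within
# 12-18 months, and the application will be implemented in 12 months.
print(build_into_current_release(12, 12))   # True
# A change expected only after three years falls outside a 12-month project.
print(build_into_current_release(36, 12))   # False
```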

Figure III-3 shows the first-cut data flow diagram developed by an accountant in a major company for a JAD/Fast-Track session. The user, after one training session, developed a DFD that was about 90% correct. Figures III-4 and III-5 show the related Level 0 and Level 1 diagrams, respectively, from the JAD, which had minor changes during IS design. Figure III-6 shows the DFD Level 2 as decomposed by the project team during design. Only one of the processes changed: General Ledger was elaborated to be Accounts Receivable and Accounts Payable. The other changes were to files and external entities.

FIGURE III-3 User-Developed First-Cut DFD

The evening sessions do the following:

• Define all elements and terms

• Document all processes

• Draw formal DFDs

• Document general application requirements and write an executive summary

• Review documentation output of other mini-teams.

The group works together during the day to create information. In the evening, the group splits into mini-teams to perform one of the above activities. Documentation should be done using automated tools, including word processors, CASE tools, or other automated support tools that might be available. The goal is easily modifiable documents that can be formatted and printed.

When the mini-teams complete their work, they jointly review each other's work products. This review fosters the shared common view of the application and ends the participants' day with each having a clear sense of what was accomplished.


The Meeting Facility

The location should be at least 20 miles from the main work site of the participants to minimize interruptions and preclude people being pulled out of the sessions. The facility should provide above-average meeting, sleeping, and eating arrangements in the same building. Phone access must be available but must be removed from the meeting room(s). The facility must provide computer accessibility. The location must be easily accessible for managers, who are not participants, to attend sessions for resolving conflicts. The facility must allow use of walls in the meeting room. The room should be equipped with flip-charts, overhead projector, markers, slide projector, and other meeting equipment as needed.

Documentation Tools

Documentation tools should include some word processing capability, dictionary support, and some graphical form support. All of these should ideally be in a computer-aided software engineering (CASE) tool. The CASE tool should allow customized reports of the information and should provide some intelligence on checking and cross-checking both completeness and accuracy of the information entered.

FIGURE III-4 JAD Team First-Cut and IS Final DFD Summary Level 0

FIGURE III-5 JAD Team First-Cut and IS Final DFD Level 1

FIGURE III-6 IS Level 2 DFD for Updating Standard Costs

At a minimum, word processing should be provided via some tool such as WordPerfect [9], Word Star [10], MS Word [11], and so on that allows graphics to be embedded in text, creates tables easily, and does full text manipulation.

An active data dictionary is desirable for documenting the objects (e.g., entities, files, flows, objects) and object relationships defined during the sessions. An active dictionary is one that allows custom report development, provides intelligent assessment of completeness, and identifies potential duplicates based on name and definition. If a passive dictionary (i.e., has only vendor reports and no intelligence) is an option, you are better off using a word processor to document the information.
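To make the idea of identifying potential duplicates based on name and definition concrete, here is a minimal Python sketch. It is not how any particular CASE tool works: the function names, the sample entries, and the 0.8 similarity thresholds are all assumptions chosen for illustration, and the text comparison uses the standard library's difflib.SequenceMatcher.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def potential_duplicates(entries: dict[str, str],
                         name_threshold: float = 0.8,
                         definition_threshold: float = 0.8) -> list[tuple[str, str]]:
    """Flag pairs of entries whose names or definitions look alike.

    `entries` maps an element name to its definition. The thresholds are
    illustrative assumptions, not values taken from the text.
    """
    flagged = []
    for (name1, def1), (name2, def2) in combinations(sorted(entries.items()), 2):
        if (similarity(name1, name2) >= name_threshold
                or similarity(def1, def2) >= definition_threshold):
            flagged.append((name1, name2))
    return flagged


# Hypothetical dictionary entries recorded during a joint session.
entries = {
    "customer_number": "Unique identifier assigned to a customer",
    "customer_no": "Unique identifier assigned to a customer",
    "order_date": "Date on which the order was received",
}
print(potential_duplicates(entries))
# [('customer_no', 'customer_number')]
```

A flagged pair is only a candidate; session participants still decide whether the two entries really describe the same thing.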

9 WordPerfect is a trademark of WordPerfect, Inc.

10 Word Star is a trademark of Word Star, Inc.

11 MS Word is a trademark of Microsoft, Inc.

A graphical drawing tool is the third type of software needed. The tool should allow the type of drawing you are using with your methodology. An automated graphical tool is preferred to manual drawing because automated drawings are more easily changed and maintained. The joint groups frequently do several iterations of a drawing before they are satisfied with the result.

To summarize, joint user-IS sessions are a way to obtain quick results with a high degree of user participation in the development of requirements plans and application requirements. Joint sessions are intensive and require high commitment from participants. The rewards are a user-centered requirements document that frequently leads to more satisfied users and high user involvement throughout project development.

User-Managed Application Development

Joint sessions are designed to bring users and IS personnel together with the underlying understanding that users will always know more about their jobs than IS people. Joint sessions foster commitment to the IS development effort and give users a sense of participation. The user aspects of application development should not stop there. A user manager should be appointed for the application and should be the person ultimately responsible for the successful completion of both the application software and the organizational changes that accompany a new application.

The need for user-centered design seems obvious. User-managed applications foster a sense of business partnership; IS-managed applications foster a sense of them-and-us. User-managed applications provide a regular, natural communications line between the technicians and users; IS-managed applications provide a way for IS people to talk only among themselves. User-managed applications tend to require less IS involvement in application training, because users do their own training; IS training is notoriously condescending, inappropriate, and ineffective. Users 'own' the application and train their own staffs.

Not all is rosy with user-managed applications. If the IS project manager is not used to working for a user, she or he will have to adjust some aspects of work. For instance, conversations will use business terms rather than technical terms. Variances in time and budget will require explanation and discussion. Rather than running the whole show, the project manager is clearly relegated to a supporting role and only manages the actual software development.

User-managed development can also be subverted by unsupportive IS personnel. For instance, user teams can meet to develop functional requirements, but IS teams may not use them. IS groups have been notorious for ignoring user requirements. The comment heard is, "They can tell me anything, I'll give them what I want." The attitude is that mere users could never define as good a system as an IS person. How someone who does not know the business could make such a statement defies logic, but it is made. IS developers frequently need indoctrination that the business partnership aspect of application development does extend to the users.


Structured Walk-Throughs

Have you ever had a program bug that you spent hours trying to locate? You give up in frustration and turn to a friend for help. The friend takes a sideways glance and says, "Oh yeah, this period is out of place." Just like that, your hours have been a waste. That type of easily seen error is not a fluke. Your friend is not necessarily a genius, just as you are not necessarily stupid for not finding the error. The phenomenon at work is that you are too close to the problem to see the 'big picture.' At some point, we all reach this stage regardless of where on a project we work. Walk-throughs were designed to formalize the 'friendly review' described above.

A walk-through is a semiformal presentation of some work product for the sole purpose of finding errors. Work products might include all or part of the following:

• Functional requirements specification
• Project plan
• Design specification
• Logical or physical database design
• Program specification(s)
• Program code
• Test plan
• Test design.

This list is not complete. Its purpose is to give you an idea of the range of items that can be the subject of a walk-through. Virtually any work product, or piece of a work product, can be reviewed using the walk-through technique.

Ideally, a walk-through should not be scheduled for more than two hours at a time. If more time is needed, then additional walk-throughs are scheduled. Like all rules of thumb, this one is frequently broken. Participants who do not work on the development team sometimes have a difficult time walking through application requirements in bursts. When they focus on the application, they like to see everything at once. So, occasionally you might have a marathon session that runs a whole day.

Walk-throughs are formalized in that there is preparation, a team with members having different responsibilities, and a process. Preparation for the session is as follows: The team is identified and approved by an SE or project manager. The day, place, and time are agreed upon. A memo of meeting details is sent to all participants several days in advance. Attached to the memo is the work product to be reviewed.

All participants are expected to review the work product, annotating questions and potential errors in the margins. They must come to the session already having some understanding of the work product.

Participants in a session include the facilitator, work producer, one or two peers who are on the same project, one or two outsiders, and a scribe. Ideally, the number of participants is between five and seven. The facilitator is much like a JAD facilitator. He or she keeps the meeting moving, makes sure no personal or blaming remarks are allowed, and maintains focus on the work product.

The producer presents his or her work. First, an overview focuses attention on the purpose of the product. Then, the work is reviewed in a page-by-page or line-by-line manner following the logic of the document. The peers and outsiders are there to question the correctness, completeness, efficiency, and effectiveness of the product. Questions, comments, or errors are discussed as the presentation is made. When an issue is raised and appears legitimate, the scribe notes the problem and its location (see Table III-5).

Possible 'outsiders' who might attend a walk-through include representatives from auditing, quality assurance, operations, or other project teams who need to approve or work with the final product.

After the session, the scribe types the notes and presents a memo to the author for resolution. The author then responds to each item (see Figure III-7). If an item is an error, the response details how and when it was fixed. If the item is an efficiency or effectiveness issue, the response describes what research was done and the resolution. Depending on the extent of problems or the importance of the product, another walk-through might be held. Usually, if the products are for analysis or design, two or three walk-throughs are held. If the product relates to program or test design, then the number of walk-throughs is determined by the number of errors. With fewer than 10 errors, only one walk-through would be needed.
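The scribe's log and the author's item-by-item response can be pictured as a small data structure. The following Python sketch is illustrative only: the names `Issue`, `WalkThroughLog`, and `needs_another_session` are assumptions made for this example, and the default threshold of 10 simply encodes the rule of thumb quoted above for program and test-design products.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Issue:
    """One problem noted by the scribe during a walk-through."""
    number: int
    location: str                      # e.g. a page or line reference
    description: str
    resolution: Optional[str] = None   # filled in later by the author


@dataclass
class WalkThroughLog:
    """Hypothetical record of one walk-through session."""
    work_product: str
    issues: List[Issue] = field(default_factory=list)

    def note(self, location: str, description: str) -> Issue:
        """The scribe records an issue and where it was found."""
        issue = Issue(len(self.issues) + 1, location, description)
        self.issues.append(issue)
        return issue

    def unresolved(self) -> List[Issue]:
        """Items the author has not yet responded to."""
        return [i for i in self.issues if i.resolution is None]

    def needs_another_session(self, threshold: int = 10) -> bool:
        """Rule of thumb: fewer than `threshold` errors, one session suffices."""
        return len(self.issues) >= threshold


log = WalkThroughLog("Program specification")
log.note("page 3", "Logic specification incomplete")
log.note("page 7", "Access control for secure data not present")
log.issues[0].resolution = "Missing logic added"
print(len(log.unresolved()), log.needs_another_session())
```

In practice the decision to hold another session also depends on the importance of the product, which a simple count cannot capture.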

Data Administration

Data administration is the management of data to support and foster data sharing across multiple divisions and to facilitate the development of database applications. The principal activity for the organization is the development of a data architecture which depicts the structure and relationships of major data entities, such as customers, vendors, and orders. A data architecture is similar to the frame of a building. Once the frame is constructed, the siding and façade are added. The frame provides the skeleton to which the other substructures, such as electrical wiring and plumbing, are added. In information systems, the data architecture defines automated and nonautomated data and how they are used in the organization. The architecture provides a 'frame' for defining new applications and documents all data uses and responsibilities for existing applications.

The other major organization-level activity is defining, with users, data that is 'mission critical' for the organization. Critical data is defined as that data required to maintain the organization as a going concern. As such, critical data is subject to management and standards through the data administration function. Noncritical data is data that, while useful, is not required to maintain the organization in the event of a disaster. Noncritical data does not require the same degree of management as critical data.

At a more detailed level, data administrators develop, administer, and maintain policies and standards regarding data definition, sharing, acquisition, integrity, and security for the corporation's data resource. Data administration provides guidance to project teams on storage, access, use, disposition, and standardization of data. Data administrators are responsible for maintaining corporate definitions in addition to the creation and maintenance of the data architecture representing the enterprise.

Historically, the motivation for data administration relates to a maturing organization. When DBMS software was installed in most organizations, a database administration (DBA) group was created to


TABLE III-5 Example of Errors Found in Walk-Throughs

Walk-Through Type: Representative Errors Found

Feasibility:
1. One of organization, technical, or financial analyses is missing.
2. Financial analysis has mathematical errors.
3. Typos or poor English render the document (or some part) incomprehensible.
4. Analysis contains incorrect information.

Analysis:
1. Data elements for data store, file, or other structure are incomplete.
2. Data items do not have formal names or names do not conform to standards.
3. Subsystem specification unclear.
4. Obvious 'holes' in the system as specified.
5. Graphical representations contain syntactical errors or confusing, ambiguous terms.
6. Nature of application interfaces not fully specified.

Logical Data Model:
1. Logical data model (LDM) is not in third normal form (3NF).
2. Names do not conform to standards or are ambiguous.

Design:
1. Mapping to implementation environment does not include all functional requirements.
2. Implementation as specified will be difficult to operate, maintain, or implement.
3. Design is incomplete ... one or more screens are missing, screen dialog is incomplete, allowable navigation not provided, etc.

Physical Data Model:
1. Physical mapping does not provide necessary user views and security simultaneously.
2. Numerous user views may be unwieldy in implemented environment.
3. Physical model does not provide growth anticipated.

Program Specification:
1. Program specification does not clearly say what the program is to do.
2. Program specification does not map to design or functional requirements.
3. File requirements not specific ... missing user view, copy lib name, JCL, etc.
4. Logic specification incomplete.
5. Faulty logic.
6. Access control for secure data not present.

Acceptance Test Plan (This could be any test plan):
1. Test plan does not test that all requirements are met.
2. Test case x data cannot perform as specified.
3. Missing/erroneous predicted results for reports, screens, file contents, or messages.
4. Missing on-line test dialog for single user functions.
5. Missing scenario and test dialogs for multiuser test.
6. Results predicted cannot be attained with current test design.
7. Test for breach of security missing.
8. Specific audit control tests missing/faulty.

Code:
1. Logic error: missing, extra, or wrong logic.
2. Nonstructured format will make maintenance difficult and expensive.
3. Comments do not identify module linkages.
4. Comments on user view copy books do not clearly identify the database, user view, or JCL.
5. Access control for secure data not present.
6. Control totals for end of program counts missing.
7. Format error on report.
8. Misspelled word on screen, report, etc.


Consolidated NY Bank
InterOffice Memo

DATE: December 7, 1992
TO: Ms. Sandra Jones, Walk-Through Facilitator
FROM: Mr. John James, Producer

The following table includes all errors found during the Requirements Walk-Through on December 1 (see H. Hines, Scribe memo of 12/2). Each item has either been resolved or found not to be an error as indicated. One item, #5, identified an audit problem for which I am awaiting Audit Dept. resolution. They are supposed to respond by next Friday, December 11. Since we decided not to have another walk-through, I will proceed with finalizing the analysis phase.

Error #1, Page 3
Description: Overview inconsistent in treatment of errors for transactions.
Resolution: Rewritten.

Error #2, Page 4
Description: System access code design not clear.
Resolution: The lack of clarity was deliberate to prevent general access to security procedures. The group felt that the document should contain all of the information. Upon reviewing this request with Mr. Fields, Project Manager, we decided, for security reasons, not to include the information. Mr. Fields has a detailed description of security procedures and the document now refers individuals requesting the security information to him.

Error #3, Page 63
Description: Test of screens is incomplete.
Resolution: Missing information was added.

Error #4, Page 125
Description: Security for accounting data not clear.
Resolution: Same as #2.

Error #5, Page 127
Description: Interface to accounting system has inadequate control counts and security.
Resolution: Referred to the Audit Dept. for recommended action.

FIGURE III-7 Sample Walk-Through Error Resolution Memo

maintain and monitor the DBMS's use. There was no necessity for other data-related organizations because applications, for the most part, were isolated from one another and data sharing across organizational boundaries was low. Most industries followed this pattern of development.

In the normal process of maturation, companies realized that sharing and consolidation of databases across organization boundaries was desirable. The need to share data frequently accompanies the realization that individual divisions and work groups have their own vocabulary, which often overlaps or conflicts with the vocabulary and terms used by other work groups. When divisions automate data, they incorporate local rules, policy, and definitions in their applications. Data, while having the same name, then, may have several different meanings, uses, formats, and connotations across an organization. Conversely, data may have different names but the same definitions. This lack of consensus about terminology and data characterizes organizations that do not yet have data administration.
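Both kinds of conflict described above (one name with several meanings, and several names with one meaning) can be found mechanically once divisional definitions are written down. The sketch below is a minimal illustration, not a description of any real tool: the `dictionaries` contents and the `find_conflicts` function are invented for this example, and it assumes definitions match only when their wording is identical, which real definitions rarely do without human review.

```python
from collections import defaultdict

# Hypothetical divisional data dictionaries: item name -> definition.
dictionaries = {
    "sales": {
        "customer": "party that has placed an order",
        "plan": "discount schedule for an account",
    },
    "billing": {
        "customer": "party that has been invoiced",
        "client": "party that has placed an order",
    },
}


def find_conflicts(dicts):
    """Return (same_name_conflicts, same_meaning_conflicts) across divisions."""
    by_name = defaultdict(set)         # name -> set of definitions
    by_definition = defaultdict(set)   # definition -> set of names
    for items in dicts.values():
        for name, definition in items.items():
            by_name[name].add(definition)
            by_definition[definition].add(name)
    # Same name, different meanings.
    same_name = {n: sorted(d) for n, d in by_name.items() if len(d) > 1}
    # Different names, same meaning.
    same_meaning = {d: sorted(n) for d, n in by_definition.items() if len(n) > 1}
    return same_name, same_meaning


same_name, same_meaning = find_conflicts(dictionaries)
print(same_name)
print(same_meaning)
```

Here "customer" is reported as one name with two meanings, while "client" and "customer" are reported as two names for the same meaning.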

The lack of consensus about data definitions leads to the realization that data standards pertaining to definitions, usage, ownership, security, access, and maintenance are not only desirable, but mandatory, in large-scale development of shared databases. This need for standardization increases with the recognition of data as a shared resource of the organization.

A formal data administration function is needed to define and manage data company-wide. Data administration requires recognition and commitment to the notion that data is a resource of the corporation. As a critical corporate resource, data requires the same careful planning and ongoing management as cash on hand, office equipment, or personnel.

Commitment to DA is sometimes difficult to develop because data are fundamentally different from other resources. Data are abstract and nonphysical, do not decay, and are easily replicated as the need arises. They are also subject to different confidentiality, accuracy, and access requirements. Data are all of these things. In service industries, especially, information is a primary product, and the quality of the data resource directly affects the company's bottom line and how customers perceive the quality of service delivered. Data administration consolidates information across the organization to simplify the development of applications to service customers.

Benefits of data administration outweigh the frustrations and difficulties of establishing the function. Some of the benefits include:

1. Creating and documenting a data architecture leads to formal recognition of, and agreement on, the business rules and relationships which are inherent in the data. This agreement improves communications and understanding of corporate data.

2. By defining and documenting data only once, efficiencies are realized throughout the system development life cycle. All subsequent applications that use previously defined data items identify the data required and obtain access to already automated data. The data design and documentation phases are shortened. Edit routines are reused, just like the data definitions, and ultimately the cost of program code is reduced.

3. Data administration leads to faster response to changing business conditions. The development of applications to support new products, for instance, can be speeded by the fully specified definition of the data required to support a product.

4. Data administration provides a means for deciding what data must be controlled as part of the corporate data resource, and what data can be user-owned and controlled (including data that is off-loaded to PCs and LANs).

5. Data administration maintains definitions of all data in the corporation regardless of hardware platform or criticality. The central repository for this information, then, becomes the focal point of data-related activities.

6. By fostering data sharing, the cost of creating, sorting, updating, and backing up multiple copies of the same data items is reduced, if not eliminated. That is, we only introduce planned data redundancy. Just as DBMSs allow us to minimize intraapplication data redundancy, DA allows us to minimize interapplication data redundancy.

In summary, the creation of data administration is recommended to guarantee minimal redundancy, shared understanding of data item definitions, and a managed approach to providing for future database environments. Data administration should not be confused with DBA data management, which includes physical DB design, disk space allocation, and day-to-day operations support for actual databases.

Data administration has numerous interfaces both within and outside of the IS area. Therefore, data administration interfaces occur at all levels of all divisions specifically to perform user liaison and application liaison.


User Liaison

The data administration function works with business areas to define the data which that area uses to perform its function. All data, whether it is under the control of a current information system or not, is subject to data administration review. Thus, all data on any hardware platform is subject to review. During the review, critical data entities and data items are defined, maintained, and managed by data administration. Applications with critical data will be required to comply with standards on data, access, and security.

The person performing user liaison must be able to understand and converse in business terminology, not technical jargon. He or she should have problem-solving and analytical skills but also should have excellent communication/negotiation skills, user orientation, and understanding of the role and functions of data administration. The individual must be able to translate user data, definitions, and rules into information in the corporate data repository.

Application Liaison

Data administration works with application project teams to define the data requirements of the application. The data administration analyst identifies what data is already automated and works with the project team to define logical descriptions of the data. The DA analyst, DBA, and project analyst together transform the logical database definition into a specific database's logical definition. The DA analyst downloads the data definitions from the corporate central repository for use by the project team and DBA. DBA then works to develop a physical database definition of how best to store the data.

In project-oriented work, the project analyst and DA analyst reconcile all data requirements with existing information in the corporate repository. For instance, if a team needs a "plan" field, but their definition varies from the corporate definition, one of three actions is possible:

1. The corporate definition is changed to accommodate the new information.

2. The application redefines its use to be consistent with the corporate definition and usage.

3. A new data item is defined by the project analyst and DA; the new item is entered into the corporate central repository by the DA.
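The three possible actions can be sketched as a small reconciliation routine. This is only an illustration under stated assumptions: the `Action` enumeration, the `reconcile` function, and the dictionary standing in for the corporate central repository are all hypothetical, and in practice the choice among the three actions is a negotiated human decision rather than a parameter.

```python
from enum import Enum


class Action(Enum):
    CHANGE_CORPORATE = 1     # action 1: the corporate definition is changed
    CONFORM_APPLICATION = 2  # action 2: the application adopts corporate usage
    DEFINE_NEW_ITEM = 3      # action 3: a new data item is defined and stored


def reconcile(repository, name, project_definition, action, new_name=None):
    """Apply the chosen action; return the item name the project should use."""
    if repository.get(name) == project_definition:
        return name  # definitions already agree, nothing to reconcile
    if action is Action.CHANGE_CORPORATE:
        repository[name] = project_definition
        return name
    if action is Action.CONFORM_APPLICATION:
        return name  # repository unchanged; the project changes its usage
    if action is Action.DEFINE_NEW_ITEM:
        if not new_name or new_name in repository:
            raise ValueError("a new, unused item name is required")
        repository[new_name] = project_definition
        return new_name
    raise ValueError(f"unknown action: {action}")


# A dictionary stands in for the corporate central repository.
corporate = {"plan": "annual budget plan"}
used = reconcile(corporate, "plan", "customer payment plan",
                 Action.DEFINE_NEW_ITEM, new_name="payment_plan")
print(used, corporate)
```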

The skills, then, needed to perform application project liaison include analytical, communication, problem-solving, negotiation, data analysis, and modeling skills.

Where in the Organization is Data Administration

Ideally, the recommended organizational location of the DA function is independent of the corporate IS area, reporting to the president of the business entity it supports. DA affects and interacts with all departments and areas of the organization, including all of the application development groups as well as users, regardless of organizational position or hardware platform. The DA group could be part of an internal consulting/technology-related organization whose mission is to provide services across the entire organization. The DA group should be neutral about hardware, software, development, or management of applications as long as the data is not defined as critical.

CASE Tools

Computer-aided software engineering (CASE) is the automation of the software engineering discipline. You will find descriptions of ICASE, Upper CASE, and Lower CASE. These are variations on the theme, with 'I' standing for 'integrated,' 'Upper' standing for conceptual or logical design only, and 'Lower' standing for programming support only. While these differences do exist, this text concentrates on CASE tools that support at least the analysis phase and may support others; they are all called 'CASE' here. We will identify which phases are now supported (of course, this might change by publication time).

The typical CASE environment includes a repository, graphic drawing tools, text definition software, repository interface software, evaluative software, and a human interface (see Figure III-8). A repository is an active data dictionary that supports the definition of different types of objects and the relationships between those objects. Graphic drawing tools support the development of diagram types and evaluate the completeness of the diagram based on predefined rules. Text software allows definition of names, contents, and details of items in the repository. The interface software is the interpreter which determines the form the data should take (either graphic or text). Evaluative software is the intelligence in CASE. Evaluative software analyzes the entries for a diagram or repository entry and determines if they are lexically complete (i.e., conform to the definition of the item type), and if they are compatible with other existing objects in the application. The human interface provides screens and reports for interactive and off-line processing.

FIGURE III-8 CASE Architecture (components shown: Human Interface, Graphic Processing Tool, Text Processing Tool, Repository Manager, Repository, Intelligent Analytical Software)

In this section, we discuss the characteristics of the ideal CASE environment. This is just an ideal and is the author's own invention.12 No commercially available products and no research prototypes are known to embody this ideal.

The ideal CASE should provide complete automated support for the entire project life cycle, beginning with enterprise level analysis and working through to maintenance and retirement. The ideal CASE then becomes the focal point for all work that takes place in software engineering, and the work of the SE concentrates on the logical aspects of design. The ideal CASE tool would provide for the technical, data, and process architectures of the organization, project planning and monitoring, group work on applications, application and manual procedure definition, normalization of data, DB schema generation, generation of bug-free code in a user-selected language, automatic testing of generated code against the application logic, and intelligent assessment of completeness and correction along the way. Really advanced CASE would recognize components already in the repository for reusability of analyses, designs, and code.

12 The ideal CASE in this section is partially the result of research done with Judy Wynekoop, UT San Antonio and Nancy Russo, U of Northern Illinois, published in Wynekoop and Conger [1991], Conger [1989], Conger and Russo [1990]. It also results from 10 years of frustration in using CASE tools and waiting for vendors and researchers to build decent products.

The repository of CASE determines both what is supported and, to some extent, how much support can be provided. The repository is something of a super dictionary that captures and maintains meta-data. Meta-data is information about data (see Chapter 1). For example, a data element in an application is data, and its attributes constitute the meta-data that would be stored in the dictionary. Attributes of an element include, for instance, data type, size, volume, frequency of change, and edit criteria. A CASE repository acts as the DBMS for the engineering effort, provides the capability for expanded meta-data capture, and maintains all components and their interrelationships.
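To make the distinction between data and meta-data concrete, the following sketch stores the attributes named above for one data element and records which components use it. It is a toy model under stated assumptions: the `ElementMetaData` and `Repository` classes, their method names, and the sample values are invented for illustration and do not describe any particular CASE product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ElementMetaData:
    """Meta-data for one data element, using the attributes named in the text."""
    name: str
    data_type: str
    size: int                  # length in characters or digits
    volume: int                # expected number of occurrences
    change_frequency: str      # e.g. "daily", "rarely"
    edit_criteria: List[str] = field(default_factory=list)


class Repository:
    """Toy repository: element meta-data plus element-to-component links."""

    def __init__(self) -> None:
        self._elements: Dict[str, ElementMetaData] = {}
        self._used_by: Dict[str, Set[str]] = {}

    def define(self, element: ElementMetaData) -> None:
        if element.name in self._elements:
            raise ValueError(f"{element.name} is already defined")
        self._elements[element.name] = element
        self._used_by[element.name] = set()

    def relate(self, element_name: str, component: str) -> None:
        """Record that a component (process, screen, report) uses an element."""
        if element_name not in self._elements:
            raise KeyError(f"{element_name} is not in the repository")
        self._used_by[element_name].add(component)

    def where_used(self, element_name: str) -> List[str]:
        return sorted(self._used_by[element_name])


repo = Repository()
repo.define(ElementMetaData("customer_id", "numeric", 9, 50000, "rarely",
                            ["must be unique", "must be positive"]))
repo.relate("customer_id", "Create Order process")
repo.relate("customer_id", "Customer Inquiry screen")
print(repo.where_used("customer_id"))
```

The `where_used` query is one small instance of the interrelationships a repository maintains; a change to `customer_id` would affect both listed components.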

The ideal repository should allow customizing of the methodology supported and enforcement software that can evaluate the correctness of user-defined repository entries. To do this requires some decoupling of the repository from a specific methodology and an abstracting of methodology compliance rules within the repository. These are not trivial tasks! This decoupling would allow organizations to adopt and use the components of methodologies that work for them, and ignore those that don't. The initial sacrifice for this capability will be less intelligence. But, decoupling the intelligence from a specific methodology and type of repository entry will also allow customizing of evaluation software and enforcement of local rules.
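One way to picture this decoupling is to keep compliance rules as data that an organization registers, rather than as logic built into the repository. The sketch below is an assumption-laden illustration, not a description of an existing tool: `RuleSet`, `requires`, and `name_is_upper_case` are hypothetical names, and the naming standard shown is an invented local rule.

```python
from typing import Callable, Dict, List

Entry = Dict[str, object]
Rule = Callable[[Entry], List[str]]   # a rule returns a list of problem messages


def requires(*fields: str) -> Rule:
    """Build a completeness rule: every named field must be present and non-empty."""
    def rule(entry: Entry) -> List[str]:
        return [f"missing {f}" for f in fields if not entry.get(f)]
    return rule


def name_is_upper_case(entry: Entry) -> List[str]:
    """Invented local standard: names are upper case with no spaces."""
    name = str(entry.get("name", ""))
    ok = name != "" and name == name.upper() and " " not in name
    return [] if ok else [f"name {name!r} violates local naming standard"]


class RuleSet:
    """Rules are registered per entry type, independent of any one methodology."""

    def __init__(self) -> None:
        self._rules: Dict[str, List[Rule]] = {}

    def register(self, entry_type: str, rule: Rule) -> None:
        self._rules.setdefault(entry_type, []).append(rule)

    def evaluate(self, entry_type: str, entry: Entry) -> List[str]:
        problems: List[str] = []
        for rule in self._rules.get(entry_type, []):
            problems.extend(rule(entry))
        return problems


rules = RuleSet()
rules.register("data_element", requires("name", "data_type", "size"))
rules.register("data_element", name_is_upper_case)
problems = rules.evaluate("data_element", {"name": "cust id", "data_type": "char"})
print(problems)
```

Because the evaluation software knows nothing about a specific methodology, an entry type with no registered rules is simply accepted, which reflects the initial loss of intelligence noted above.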

Intelligence in CASE comes in two major forms: intelligence of the interface and intelligence of the CASE product itself. The interface should provide both novice and expert modes of operation. It should allow work to be saved and restarted as part of the functionality. The tool should be customizable by individual users. For instance, if I want yellow print on a blue background, and I call a data flow diagram a DFD, I should be allowed to change the defaults to use my terms and formats.

Alternate forms of inputs should be reflected throughout the diagram sets. This means that if a user enters entities and attributes in a repository, when she or he moves to developing a graphical entity-relationship diagram, the information in the repository should be reflected on the diagram.

Intelligence of the CASE product includes analysis within and between both diagram types and repository entries. Ideally, a requirement of application A that conflicts with enterprise goal Z should be flagged for management consideration.

The ideal CASE should allow users to separate and integrate different applications easily. For instance, the company may want to document already operational applications and begin to manage them electronically. Users defining a new application may want to integrate it with an old application. They should be allowed to create an integrated third definition that highlights the overlaps, redundancies, inconsistencies, and other problems that the integrated pair have.

According to the 40-20-40 rule of systems development, 40% of project time is used for analysis and design, 20% is devoted to programming, and a full 40% is devoted to testing.13 The current direction of vendors is to eliminate code, thereby cutting 20% off development time. But, the ideal CASE would cut the 40% devoted to testing as well. The urgency for CASE testing tools is low relative to other current concerns (like getting the products to work bug-free). At some point in the 1990s, vendors will begin to provide testing support in their CASE environments. Ideally, such support will include black-box and white-box tests with human intervention allowed but not required. Black-box testing is for correctness of output based on inputs; white-box testing is for specific logic paths in a program. Intelligent software will analyze the type of process and determine the most appropriate testing strategy. Additional intelligent software will develop test data based on logical requirements, conduct the tests, and maintain test results. Test results will be integrated across test runs, phases of testing, versions of the software, and even hardware platform environments. When bugs are found, backtracking to find their source, possibly across modules, will be provided. Since the software built the bugs, it should be able to fix them; but, if the source is a logical, human specification, notice to the SEs will require correction of the errors.

13 Pressman [1987].
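The arithmetic behind the 40-20-40 rule is easy to work through. The sketch below applies the percentages to a hypothetical 50-week project and shows what remains if programming, and then testing, were fully automated. The project length, the `split_effort` function, and the assumption that an automated phase costs no time at all are idealizations made for this example.

```python
# Percentage of project time per phase under the 40-20-40 rule.
SHARES = {"analysis_and_design": 40, "programming": 20, "testing": 40}


def split_effort(total_weeks, automated=()):
    """Weeks per phase; phases named in `automated` are assumed to cost nothing."""
    return {
        phase: 0.0 if phase in automated else total_weeks * pct / 100
        for phase, pct in SHARES.items()
    }


baseline = split_effort(50)
code_generated = split_effort(50, automated=("programming",))
code_and_tests = split_effort(50, automated=("programming", "testing"))

print(baseline)
print(sum(code_generated.values()))   # weeks left once code is generated
print(sum(code_and_tests.values()))   # weeks left once testing is automated too
```

Under these assumptions the 50-week project splits into 20, 10, and 20 weeks; generating code leaves 40 weeks, and automating testing as well leaves only the 20 weeks of analysis and design.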

Future products will eventually tackle the remaining 40% of project time by providing intelligence to identify reusable components of applications. Reusability of designs will have the most payback but is also the most difficult. Initially, reusable code modules will be enabled, then reusable designs, and finally, reusable logical analyses. Code reusability recognition should be available in CASE tools by the mid-1990s; the others will take until the turn of the century to surface.

This description of ideal CASE characteristics concentrates on what CASE should do rather than on what it currently does. For that, we discuss CASE as it supports each methodology and phase of development in the coming chapters. Although CASE and artificial intelligence (AI) are both in their infancy, the developments described above are currently feasible with current state-of-the-art technologies. The CASE repository will become the hub for all of the work that takes place in IS organizations. The limits to CASE intelligence that can be built are only due to human limitations.

SUMMARY

In this section preview, we identified the major activities of analysis and design. Analysis identifies what the application will do; design describes how the application will work in production. Both analysis and design have the same five generic activities: identification, elaboration, synthesis, review, and documentation. These activities are constrained and guided by a methodology. Each methodology takes a different perspective of an application leading to different phase-end results.

The organizational supports facilitate application development regardless of methodology. Organizational supports described in this chapter included joint requirements definition, joint application design, user-managed application development, data administration, and walk-throughs.

Software support that most facilitates application development is computer-aided software engineering (CASE). The ideal CASE environment has both expert and novice modes, can be customized for hybrid methodology use, and provides many additional intelligent functions beyond analyzing completeness of work. Future environments will identify reusable components of previous work to further reduce application development time.

The next six chapters discuss the analysis and design phases using the following example methodologies:

Process: Structured Analysis (Chapter 7) and Design (Chapter 8)

Data: Information Engineering, with Business Area Analysis (Chapter 9) and Business System Design (Chapter 10)

Object: Object-Oriented Analysis (Chapter 11) and Object-Oriented Design (Chapter 12).

Chapter 13 summarizes and compares the methodologies and their CASE support. Chapter 14 discusses forgotten activities of systems analysis and design.

REFERENCES

Blum, B., Software Engineering: A Holistic View. NY: Oxford University Press, 1992.

Conger, S., "The active dictionary in a CASE environment," Data Base Management, #25-01-20, NY: Auerbach Publishers, 1989, pp. 1-12.

Conger, S., and N. Russo, "A Taxonomy of Applications: A Framework for Selecting and Designing Methodologies," Georgia State University Working Paper #90-0201, 1990.

Couger, J. D., M. A. Colter, and R. W. Knapp, Advanced System Development/Feasibility Techniques. NY: John Wiley & Sons, 1982.

McClure, C., CASE is Software Automation. Englewood Cliffs, NJ: Prentice-Hall, 1989.

McMenamin, S. M., and J. F. Palmer, Essential Systems Analysis. NY: Yourdon, Inc., 1984.

Pressman, R., Software Engineering: A Practitioner's Approach, 2nd ed. NY: McGraw-Hill, 1987.

Olle, T. W., J. Hagelstein, I. G. MacDonald, C. Rolland, H. G. Sol, F. J. M. Van Assche, and A. A. Verrijn-Stuart, Information Systems Methodology: A Framework for Understanding. Wokingham, England: Addison-Wesley, 1988.

Swartout, W., and R. Balzer, "On the inevitable intertwining of specification and implementation," Communications of the ACM, Vol. 25, #7, July, 1982, pp. 438-440.

Wynekoop, J. L., and S. Conger [1991], "A review of computer-aided software engineering research methods," in Information Systems Research: Contemporary Approaches and Emergent Traditions, (H-E. Nissen, H. K. Klein, and R. Hirschheim, eds.). NY: North-Holland, 1991, pp. 301-326.

KEY TERMS

active data dictionary
analysis
CASE repository
compromise of requirements
computer-aided software engineering (CASE)
critical data
data administration (DA)
data architecture
database administration (DBA)
design
document
elaboration
facilitator
Fast-Track
identification
IS-managed application
joint application design (JAD)
joint requirements planning (JRP)
out of the box thinking
repository
review
synthesis
user-managed application
user manager
walk-through
work around

CHAPTER 7

PROCESS-ORIENTED ANALYSIS

INTRODUCTION

In this chapter, we review process-oriented analysis using structured analysis following DeMarco [1979], Yourdon [1989], and McMenamin and Palmer [1985]. Structured analysis was the first well-documented and well-understood method of describing application problems. While the techniques have changed as our understanding and application types have changed, they will remain useful for many years to come. This material should be a review, and for that reason, you might want to skim or skip it altogether. You might rate your knowledge by tracing the development of the ABC Rental Processing case. If you understand and can reproduce the work, skip the chapter.

CONCEPTUAL FOUNDATIONS

Structured analysis (and design) follows the architectural notion that "form ever follows function."¹ Functions of an information system are the processes that transform application data. Therefore, we emphasize processes and the flows of data into and out of those processes in structured analysis.

1 Sullivan, Louis, "The Tall Office Building Artistically Considered," Lippincott's Magazine, March, 1896.

Structured analysis also is based on systems theory, which assumes inputs are fed into processes to produce outputs. To complete the systems model (see Figure 7-1), there must be some sort of feedback to eliminate system entropy, that is, to keep the system from 'running down.'

To conceptually analyze complex systems as we have in IS, pieces of a problem are analyzed in isolation. We might look at inputs, outputs, and processes separately, then integrate them to produce a unified system. As system processing gets more complex, we study pieces of processes separately, then integrate them. The pieces of the processes must themselves be self-contained, small systems. These smaller systems comprise a hierarchy of system components, such that a component at any level is itself a system of components. Each system, regardless of level, has its own inputs, processes, outputs, and feedback. At the lowest level of the hierarchy are the elementary components, which can no longer be subdivided and retain their system characteristics.

FIGURE 7-1 Systems Model (diagram labels: Input, System, Output, Feedback)

Structured development provides heuristics, guidelines, and diagram sets for dividing an information system into a hierarchy of logical component parts.

SUMMARY OF STRUCTURED SYSTEMS ANALYSIS TERMS

Structured analysis begins with two assumptions. First, we assume that we are most interested in what the application is to do. That is, what are its functions or processes? A function or process is some activity that transforms an input data flow into an output data flow. Second, we assume that we will treat the problem in a top-down manner. In top-down analysis, we analyze the external interfaces of the application first, then high level functions, and finally, lower level functions.

At the highest level, we define the scope of project activity. The scope defines the boundaries of the project: what is in the project and what is outside of the project. We document the scope of the project in a context diagram. A context defines a setting or environment. In structured systems analysis, the context diagram defines the interactions of the application with the external world. External world interactions occur between external entities and the application via the data flows that pass between them. An external entity is a person, place, or thing with which the application interacts, such as

• Accounts Receivable Application
• Citibank
• Customer
• Customer Service Department
• Medicaid Processing Application
• Medicaid Administration
• The Federal Reserve Bank
• The Internet (or other public network)
• U.S. Internal Revenue Service

A data flow is data or information that is in transit. A data flow might be a piece of paper, a report, a diskette, or a computer message. Data flows in a diagram are directed arrows that depict data movement from one place to another.

A context diagram depicts the scope of the project, using circles, squares, and arrows. A large circle designates the application (see Figure 7-2). Squares identify external entities with which the application must interact. Directed lines (i.e., with arrows) are the data flows which indicate movement of data between entities and the application.

At the next lower level of analysis, we look inside the circle representing the application to define the major functions and files. Again, the functions are the major transformations triggered by input data flows to create output data flows. Files or data stores are relatively permanent collections of data. Data flows are distinct from data stores in their time orientation. Data flows are temporary and cease to exist once they are acted upon by a process. Data stores are persistent and maintained over time. Data stores may represent one or more data structures.

A data flow diagram (DFD) (see Figure 7-3) is a graphic representation of the application's component parts. Notice in Figure 7-3 that the entities and data flows from the context diagram are all present. Also notice that data flows may connect processes to other processes, data stores, or external entities. Data stores and external entities do not interact directly with each other. If we compare the context diagram to the data flow diagram, we can perform quality assurance for completeness and consistency. Completeness checking ensures that all data flows and entities are included. Consistency checking ensures that only expected data flows and entities are included and that they are in the correct locations in the diagram set.
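The completeness and consistency checks described above can be sketched in a few lines of Python. This is an illustration only, not part of the structured analysis method: the function names, the tuple layout for flows, and the connections assumed between the sample labels from Figures 7-2 and 7-3 are all assumptions made for the example.

```python
# Illustrative sketch: node kinds and (source, name, destination) flow tuples
# are assumptions made for this example, not notation from the text.
PROCESS, STORE, ENTITY = "process", "store", "entity"

def illegal_connections(kinds, flows):
    """Flows with no process on either end (e.g., data store to entity)."""
    return [f for f in flows
            if kinds[f[0]] != PROCESS and kinds[f[2]] != PROCESS]

def external_flows(kinds, flows):
    """Reduce DFD flows to (entity, flow name, direction) as the context sees them."""
    result = set()
    for source, name, dest in flows:
        if kinds[source] == ENTITY and kinds[dest] == PROCESS:
            result.add((source, name, "in"))
        elif kinds[source] == PROCESS and kinds[dest] == ENTITY:
            result.add((dest, name, "out"))
    return result

def compare_with_context(context, kinds, flows):
    """Return (missing, unexpected): completeness and consistency problems."""
    dfd = external_flows(kinds, flows)
    return context - dfd, dfd - context

# Labels borrowed from the sample figures; how they connect is assumed.
context = {
    ("External Entity Name 1", "Incoming Data Flow 1", "in"),
    ("External Entity Name 1", "Outgoing Data Flow", "out"),
    ("External Entity Name 2", "Incoming Data Flow 2", "in"),
}
kinds = {
    "1.0": PROCESS, "2.0": PROCESS, "3.0": PROCESS, "File A": STORE,
    "External Entity Name 1": ENTITY, "External Entity Name 2": ENTITY,
}
flows = [
    ("External Entity Name 1", "Incoming Data Flow 1", "1.0"),
    ("1.0", "Inter-Process Flow 1", "2.0"),
    ("1.0", "File Input/Output Data", "File A"),
    ("External Entity Name 2", "Incoming Data Flow 2", "3.0"),
    ("3.0", "Inter-Process Flow 2", "2.0"),
    ("2.0", "Outgoing Data Flow", "External Entity Name 1"),
]
missing, unexpected = compare_with_context(context, kinds, flows)
```

An empty `missing` set corresponds to a complete diagram and an empty `unexpected` set to a consistent one; `illegal_connections` flags any flow drawn directly between data stores and external entities.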

FIGURE 7-2 Sample Context Diagram (diagram labels: External Entity Name 1, External Entity Name 2, Application Name, Incoming Data Flow 2)

We do several iterations of DFD process analysis. At the highest level of analysis, the DFD is said to describe Level 0 of the application. Each iteration is a deeper level of analysis that looks into the processes from the previous level, analyzing the subprocesses, their constituent data flows, and their data stores. We link DFD levels through the process numbering scheme (see Figure 7-4). For example, process 1.0 from the Level 0 diagram is decomposed into processes 1.1, 1.2, 1.3, and so on to describe the Level 1 DFD. In Figure 7-4, process 1.0 is decomposed into two subprocesses. Notice that a new file and an entity are other details added to the diagram. Level 1 DFDs may be further decomposed. To continue the example, process 1.1 might be decomposed into processes 1.1.1, 1.1.2, 1.1.3, and so on, until we reach the primitive, basic level. The primitive level is the level of each process at which no further decomposition can be done without fracturing the function. In other words, the decompositions at each level fully define the function, but may not define all of the functional details. At the primitive level, all files, flows, entities, and individual functions have been defined. There is no right level of definition; level is usually related to the type of application and target implementation language. You may do only two or three levels of decomposition for a nonprocedural, fourth generation language; you may do six or seven levels of decomposition for assembler or low level procedural languages (e.g., COBOL or Pascal).
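The numbering scheme that links DFD levels can be expressed as a small sketch. The helper names `parent_of` and `level_of` are hypothetical, and the sketch assumes process numbers are written exactly as in the text ("1.0" on Level 0, "1.1" on Level 1, "1.1.1" on Level 2).

```python
def parent_of(number):
    """Return the process number one level up, or None for a Level 0 process."""
    parts = number.split(".")
    if len(parts) == 2 and parts[1] == "0":   # "1.0" sits on the Level 0 DFD
        return None
    if len(parts) == 2:                        # "1.2" decomposes "1.0"
        return parts[0] + ".0"
    return ".".join(parts[:-1])                # "1.1.3" decomposes "1.1"

def level_of(number):
    """Level 0 for "1.0", Level 1 for "1.1", Level 2 for "1.1.1", and so on."""
    if parent_of(number) is None:
        return 0
    return len(number.split(".")) - 1

# Walk from a detailed process back up to its Level 0 ancestor.
chain = []
current = "1.1.2"
while current is not None:
    chain.append(current)
    current = parent_of(current)
```

Walking the chain this way is one means of confirming that every detailed process traces back to exactly one process on the Level 0 diagram.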

The structured decomposition technique is a mechanism for coping with application complexity through the principle of 'divide and conquer.' A large, complex application problem is divided into its parts for individual analysis. Each part is further divided and individually analyzed. Complexity is reduced by allowing us to analyze small parts of the problem in isolation. The difficulties in structured decomposition are in correctly identifying the isolated parts, and keeping the level of abstraction consistent.

FIGURE 7-3 Sample Data Flow Diagram (diagram labels: External Entity Name 1; External Entity Name 2; processes 1.0 Get Input Flow 1, 2.0 Generate Results, and 3.0 Get Input Flow 2; File A; Incoming Data Flow 1; Incoming Data Flow 2; Inter-Process Flow 1; Inter-Process Flow 2; File Input/Output Data; Outgoing Data Flow)

After each analysis, the current level of DFDs is balanced with the previous level. Balancing is the act of checking entities, data flows, and processes across the levels of the diagram set. All entities and data flows from the higher level processes must be in every more detailed diagram. The names of entities and data flows must be consistent across the levels of the diagrams. We also balance processes. Lower level processes 'explain' or provide the details of higher level processes. Lower level processes are checked to be sure that they all relate to one, and only one, of the processes named at the higher level. They are then checked to be sure that they are in the diagram set for their related higher level process. When complete, processing is fully documented in a leveled set of DFDs.
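Balancing lends itself to a mechanical check. The sketch below is a minimal illustration under stated assumptions: the `balance` function, its arguments, and the wording of the problem messages are invented for this example, and flows are compared by name only.

```python
def balance(parent_number, parent_flows, child_processes, child_flows):
    """Return a list of balancing problems; an empty list means balanced.

    parent_flows:    names of flows entering or leaving the parent process
    child_processes: process numbers drawn on the more detailed diagram
    child_flows:     names of all flows drawn on the more detailed diagram
    """
    # "1.0" decomposes into 1.x; "1.1" decomposes into 1.1.x
    stem = parent_number[:-2] if parent_number.endswith(".0") else parent_number
    prefix = stem + "."
    depth = stem.count(".") + 1

    problems = []
    for number in child_processes:
        belongs = (number != parent_number
                   and number.startswith(prefix)
                   and number.count(".") == depth)
        if not belongs:
            problems.append(
                f"process {number} does not belong under {parent_number}")
    # Every higher level flow must reappear, under the same name, lower down.
    for name in sorted(set(parent_flows) - set(child_flows)):
        problems.append(f"flow '{name}' is missing from the detailed diagram")
    return problems

ok = balance("1.0", {"a", "b"}, ["1.1", "1.2"], {"a", "b", "c"})
not_ok = balance("1.0", {"a", "b"}, ["1.1", "2.1"], {"a"})
```

Extra flows on the detailed diagram (flow "c" above) are not reported, because lower levels are expected to add detail such as new files; only flows lost between levels and misplaced processes count as problems here.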

While a set of balanced DFDs is being created, the secondary documentation is also being created. The secondary documentation includes a data dictionary and, for real-time applications, optional graphics called state-transition diagrams. The data dictionary² compiles detailed definitions for each element in a DFD (see Figure 7-5 for contents for each entry type). The dictionary entries for processes contain details of how to accomplish the process. For instance, a process description for order creation might contain requirements for data entry, customer validation, item validation, order printing, and order filing. Since you get information on data piecemeal throughout the analysis (and design), it is easiest to document what you know as you go along. Surfacing assumptions, misconceptions, and data conflicts can be easier with this approach because the dictionary is always up to date with information and its source. If you collect pieces of paper and create the dictionary late in the analysis phase, identifying the source of conflicting information can be difficult.

2 See DeMarco [1979] and Yourdon [1989].

Although not originally part of structured analysis, state-transition diagrams are frequently used to supplement DFDs in structured analysis for on-line (and real-time) applications. A state-transition diagram shows the time ordering of processes and identifies relationships between processes. State-transition diagrams are an integral part of object-oriented analysis and are deferred until that discussion in Chapter 11.

FIGURE 7-4 Example of Decomposed DFD (recoverable diagram labels: External Entity Name 1; processes 1.1, 1.2, and 3.0; File A; Code Validation File; Output Report). Boldface items show new information on the detailed, decomposed diagram that is omitted on the higher level diagram. The dotted lines mirror the net inflows and outflows of the Level 0 diagram. Process 3.0 is only here to show the net outflow connection.

STRUCTURED SYSTEMS ANALYSIS ACTIVITIES

The specific activities in structured systems analysis are:

1. Develop a context diagram

2. Develop a set of balanced data flow diagrams

3. Develop a data dictionary

4. Optionally, develop a state-transition diagram if building an on-line or real-time application.

Data File or Data Base
    File/Database Name
    Aliases
    Primary Key
    Alternate Keys
    Size of Relations/Records
    Growth Percentage per Year
    Security
    Data Structure
    Organization

Data Field or Attribute
    User Name
    System Name
    Aliases
    Definition, if needed
    Creating Process(es)
    Length
    Data Type
    Allowable Values and Meanings
    Validation Method (e.g., cross-reference file, code check, etc.)

Data Flow
    Name
    Aliases
    Timing (e.g., daily, weekly, as occurs, etc.)
    Contents
    Constraints (e.g., requires 5 second response; only occurs for sales orders, etc.)

Process
    Process Name
    Process Number
    Description
    Constraints (e.g., must be complete within 20 seconds or Process x times out.)

External Entity
    Entity Name
    Aliases
    Definition
    Relationship to Application
    Contact (if entity is an organization)

FIGURE 7-5 Data Dictionary Contents by Type of Entry
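As a rough illustration of how the entry types in Figure 7-5 might be held in software, the sketch below models three of them as Python dataclasses. The class names, field names, and the `DataDictionary` container are assumptions made for this example; the figure specifies only the contents of each entry, not any implementation.

```python
from dataclasses import dataclass, field

@dataclass
class ExternalEntityEntry:
    name: str
    relationship_to_application: str
    aliases: list = field(default_factory=list)
    definition: str = ""
    contact: str = ""            # only if the entity is an organization

@dataclass
class DataFlowEntry:
    name: str
    contents: str
    timing: str = "as occurs"    # e.g., daily, weekly, as occurs
    aliases: list = field(default_factory=list)
    constraints: list = field(default_factory=list)

@dataclass
class ProcessEntry:
    name: str
    number: str
    description: str
    constraints: list = field(default_factory=list)

class DataDictionary:
    """Hypothetical container: one entry per name, searchable by alias."""

    def __init__(self):
        self._entries = {}

    def add(self, entry):
        if entry.name in self._entries:
            raise ValueError(f"'{entry.name}' is already defined")
        self._entries[entry.name] = entry

    def lookup(self, name_or_alias):
        """Find an entry by its name or by any of its aliases."""
        for entry in self._entries.values():
            aliases = getattr(entry, "aliases", [])
            if name_or_alias == entry.name or name_or_alias in aliases:
                return entry
        return None

dictionary = DataDictionary()
dictionary.add(ExternalEntityEntry("Video Vendor", "Provides New Videos",
                                   aliases=["Vendor"]))
dictionary.add(ProcessEntry("Get Input Flow 1", "1.0", "To be defined"))
```

Rejecting a second entry with the same name is one simple way to force conflicting definitions to surface as soon as they are entered.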

Structured analysis can be likened to a video camera with a zoom lens. At a distance, with no zoom, the item being examined is abstract and fuzzy. It has shape but no details. We can tell the photo is a building but little else (see Figure 7-6). When we draw a context diagram, we are examining the abstract shape of the item, in our case an application. Next, we zoom in with the camera to identify a greater level of detail about the object. In the photo, colors are distinct and some features of objects stand out. Pieces of the structure, for instance, columns, can be discussed in isolation from other pieces. Internal photos might show the position, size, and type of decor of rooms. There are still details which remain indistinct. When we develop the Level 0 diagram, we zoom in a level to expose more details of the problem. At this level, we describe the major normal processes, data flows, and files, and how they interrelate with external entities from the context.

In the third photo, we see all of the details: loose tiles on a roof, a crack in a foundation. Internal photos at the same level might detail construction materials (e.g., hardwood or concrete floors), and windows and doors to the outside. We can describe the context and surroundings, as well as the photo item, in as much detail as needed to suit our purpose. Similarly, at each additional level of application problem decomposition, we are zooming in to examine ever more detailed layers of the item, until we arrive at the essential processes in the application. At the lowest level of decomposition, we analyze not just the normal processing but all exceptions, errors, and details of reporting that accompany the normal processes. From systems theory, we know we are finished decomposing when we can no longer identify minisystems as the components of subprocesses.

The problem with the photographic zoom analogy is that the activities in structured analysis are not strictly top-down. First, we do not think in a strictly top-down manner. We jump back and forth between levels of detail to 'test' how a higher level decision might look at a lower level, to get details of a new process so we are sure how it 'fits' with the other processes, and so on. When we are developing an application similar to something we have already done, we have a good understanding of familiar parts and little understanding of new parts. We spend time analyzing the new parts of the application to see how they fit with what we already know. We have to change our preconceptions based on the new information, and alter our 'mental model' at all levels of detail to accommodate the new information. We may go into great detail on a new aspect of the application, ignoring the known aspects temporarily. Then, when we understand the new parts, we can go back up to a high level of abstraction to document how the parts fit together.

FIGURE 7-6 Zoom Analogy to Structured Analysis

Second, application analysis is iterative. We have already discussed planned iterations to move to lower levels of detail in documentation. We also reiterate through analysis when we find some unexpected, unknown, or changed requirement to ensure that it fits what we already know. To decide that fit, we must walk through the entire process top to bottom. Recall that a walk-through is a formal review of analysis, design, program code, test design, or some other component of application development work. A walk-through can be used to determine

• where the new requirement fits

• what other processes, flows, stores, or entities are involved in the change

• what the ripple effects of the change are through the set of DFDs.
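One way to support the last question in a walk-through is to trace which diagram elements lie downstream of a changed element. The sketch below is only an aid to that review, not a substitute for it; the `ripple` function and the sample connections (labels borrowed from Figure 7-3, connections assumed) are hypothetical.

```python
from collections import deque

def ripple(flows, changed):
    """Return every process, store, or entity downstream of `changed`.

    flows is a list of (source, flow name, destination) tuples.
    """
    downstream = {}
    for source, _name, dest in flows:
        downstream.setdefault(source, []).append(dest)

    seen = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for nxt in downstream.get(node, []):
            if nxt not in seen and nxt != changed:
                seen.add(nxt)
                queue.append(nxt)
    return seen

sample_flows = [
    ("External Entity Name 1", "Incoming Data Flow 1", "1.0"),
    ("1.0", "Inter-Process Flow 1", "2.0"),
    ("1.0", "File Input/Output Data", "File A"),
    ("External Entity Name 2", "Incoming Data Flow 2", "3.0"),
    ("3.0", "Inter-Process Flow 2", "2.0"),
    ("2.0", "Outgoing Data Flow", "External Entity Name 1"),
]
affected = ripple(sample_flows, "1.0")
```

The result lists candidates to examine during the walk-through; whether each one is truly affected by the change remains an analyst's judgment.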

Another analogy for structured analysis, just as applicable as the photo zoom, is from geology (see Figure 7-7³). If we are trying to drill for oil, we might find a variety of different formations and even have different drilling results, depending on the depth and angle. So, too, in structured analysis our results depend on our approach and the information we obtain from interviews and information gathering. The information differs for each user because their perspective of the problem, their job goals, and their personal aspirations all distort their view. We require multiple approaches, multiple interviews with both the same and different people, and multiple perspectives for analyzing the information. Figure 7-7 shows unfocused probing. The pieces and views do not fit together. We know we are at the end of analysis when all users agree and all the disparate views fit together coherently. Recall that triangulation is a data gathering technique that compares multiple verifying sources of all information. The purpose of triangulation is to ensure that our resulting view of an application accurately depicts the requirements of the work process it supports. So, we analyze top-down, sideways-out, bottom-up, and do them all more than once in the analysis process.

3 This analogy is from Gary Moore, University of Calgary, who originally used it to describe research in information systems. It fits the application development context as well.

FIGURE 7-7 Geologic Analogy to Structured Analysis (diagram labels: Drilling Lines, Earth Surface)

Now, we turn to the discussion of how to actually develop the documentation in structured analysis.

Develop Context Diagram

Rules for Developing Context Diagram

The context diagram summarizes the scope of the project. The rules for developing the context diagram are listed below for easy reference.

1. Define the boundaries (i.e., scope) of the application. Specifically, define what the application will do and what it will not do. Draw the circle identifying the application and write the application name in the center.

2. Using the application boundary as a starting point, identify all external entities with which the application must interact. For each entity, draw one square on the diagram and label the square.

3. For each entity, create a definition in the data dictionary.

4. For each external entity, identify the specific data flows that define the interface.

5. For each data flow, create a definition and list of tentative contents in the data dictionary.
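The five rules above can be mirrored in a small data structure. The sketch below is a hypothetical illustration: the `ContextDiagram` class, its method names, and the sample order processing entities are assumptions made for the example. It stores a definition with every entity and tentative contents with every flow, and it refuses any flow that does not touch the application, since the context diagram shows only interactions between the application and external entities.

```python
class ContextDiagram:
    def __init__(self, application_name):
        self.application = application_name   # rule 1: name the application
        self.entities = {}                    # rules 2 and 3: entity -> definition
        self.flows = {}                       # rules 4 and 5: flow -> details

    def add_entity(self, name, definition):
        self.entities[name] = definition

    def add_flow(self, name, source, destination, contents):
        if source == self.application:
            entity = destination
        elif destination == self.application:
            entity = source
        else:
            raise ValueError("a context-level flow must touch the application")
        if entity not in self.entities:
            raise ValueError(f"unknown external entity: {entity}")
        self.flows[name] = (source, destination, contents)

    def entities_without_flows(self):
        """Entities whose interface (rule 4) has not been identified yet."""
        used = {end for s, d, _ in self.flows.values() for end in (s, d)}
        return sorted(set(self.entities) - used)

diagram = ContextDiagram("Order Processing")
diagram.add_entity("Customer", "Orders goods and services")
diagram.add_entity("Warehouse", "Stores and ships goods")
diagram.add_flow("Order", "Customer", "Order Processing",
                 "tentative: customer, items")
diagram.add_flow("Invoice", "Order Processing", "Customer",
                 "tentative: items, amounts")
remaining = diagram.entities_without_flows()
```

Here `remaining` names the Warehouse, a reminder that an entity drawn on the diagram still needs its data flows defined.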

Scoping may take place before analysis actually occurs and is usually part of the feasibility study as discussed in Chapter 6. Some organizations which might not perform feasibility analysis still require a bounding of the application. Review that portion of Chapter 6 if you do not remember the political and organizational issues involved. Here, we assume that boundaries are defined and that the application and its interfaces to external entities are reasonably well defined.

Definition of external entities is next. External entities are people, places, or things which interact with the application. Usually, we identify titles/roles (e.g., Customer), departments (e.g., Accounts Receivable), organizations (e.g., Medicare Administration), or applications (e.g., Accounts Receivable Application) as entities. The phrase 'interact with the application' has a very specific meaning. The entity is outside the control and/or processing being modeled for the current application. That is, external entity processing, procedures, and data are not subject to analysis or change. Relationships between external entities are not shown on the diagram(s) (i.e., external entities cannot connect to each other). For example, if you are modeling an order processing application that does not do inventory control, the warehouse would be on the context diagram. If inventory control and warehouse processing are within the scope of the application, the warehouse would not be on the context diagram.

After entities are identified and drawn on the diagram, they should be defined in the data dictionary. The entries for an entity include a name and definition (see sample in Figure 7-8). This step is important for two reasons: to develop a common vocabulary, and to develop documentation as analysis proceeds. Frequently, individuals might believe they have a common vocabulary because they use the same words in their discussions. Only when they develop a common definition of the terms can they be sure that their shared terminology also means they share the meaning of the terms (see Example 7-1). Finally, in organizations having a data administration function, a dictionary (or repository) of 'corporate' data is an integral part of the organization's data architecture (see Chapters 9 and 10 for more on this topic). The name and definition of each entity (and, eventually, each attribute) should be matched against the organizational definitions to ensure consistency with other uses of the same name, or uniqueness of the name if a new definition is developed.

Entity Name: Customer
Aliases: None
Definition: A company, government agency, nonprofit organization, or individual who orders goods and services from X Company
Relationship to Application: Order goods, return goods, receive invoice
Contact, if entity is an organization: None

Entity Name: Medicaid Administration
Aliases: Medicaid
Relationship to Application: Receives claims, sends claim reconciliation, payment
Contact, if entity is an organization: Mary Jones, 202-445-0011, NY State Claims Adjustor, Medicaid Administration, 1401 Avenue C, NE, Washington, D.C. 01010

FIGURE 7-8 Example of External Entity Description

There are several reasons for documenting definitions in the dictionary as work proceeds. First, the dictionary provides the basis for intraproject communication. Each time a definition is developed and added to the dictionary, the team builds more of a shared view of the application reflecting the dictionary contents. Second, documentation is best done as the project progresses to ensure that it gets done. If documentation is delayed until after implementation, it rarely includes the wealth of detail and history of decisions that can be incorporated if done instream.

A CASE OF NO SHARED MEANING

The XYZ Annuity Company was developing a new application to define the institutions which defined its customer base. The exercise was prompted partially by a lament from the head of marketing, who claimed, "There are 6,400, 7,500, or 9,650 institutions, depending on who I ask and which application they are getting the numbers from. Can't I have one number of institutions?"

A newly founded Data Administration team decided that the first "corporate" definition they would tackle was institution. The data analyst assigned first asked application developer colleagues, "What is an institution?"

The replies were varied and generally unsatisfactory:

• Anyone we do business with.

• An organization we do business with.

• Any legal entity we do business with.

• A school, research and development institution, not-for-profit foundation, or other organization which is approved by the IRS to contract for annuity business with XYZ Annuity.

• An organization that has a plan defining a group of annuity contracts.

Then the analyst asked the users, "What is an institution?"

• Some organization that remits annuity payments (a remittance clerk's definition)

• An organization with a plan defining a group of contracts (an accounting manager's attempt at a generic definition)

• An approved organization which may or may not have a contract plan (a marketing definition)

• An organization to whom annuity and pension product counseling is provided (a counselor's definition)

• A target audience for marketing and selling annuity products (a marketing definition)

The analyst then asked the senior manager in charge of institutional relations to please define an institution. His response was a three-page, single-spaced memo that defined six major variants and over 30 different situational definitions for an institution.

Three important ideas emerge here. First, all of these definitions are correct. Second, each definition has some generally accepted component. Definitions relate to perspective. A systems person defines an institution in relation to the application's use of the term. A user defines the term in relation to their job's use of the term. The manager tried to synthesize all perspectives and highlighted the variation and divergence that had evolved throughout the organization. Third, all of these definitions have some element that appears important to defining "institution."

When asked about the differences in the definitions, one user said, "Oh, yes, we know we don't all mean the same thing when we use the term institution. I even mean different things depending on the topic."

Resolution of the differences took over six months of part-time work, resulted in the definition of 20 new attributes of an institution, and required the approval of 72 managers in the process. Several applications under development that were using an institutional billing code as the primary key identifier underwent substantial redefinition as a result of the development of a shared term, 'institution.'

FIGURE 7-9 Example of Complex Context Diagram (panel titles: Summary Context, Inflows Context)

The next action in developing the context diagram is to define data flows between the application and each external entity. The questions you ask yourself to identify data flows are, "What information do I (as the application) need from this entity?" and "What information do I feed back or provide to this entity?" Frequently, but not always, input flows (to the application) are matched with output flows to the same entity. For instance, customers place orders; the application sends an invoice (and goods) back to the customer. Check for reciprocating input-output flows such as these. When you identify single flows to/from an entity, you want to double check by asking, "How do I know they got this output?" or "Do I have to tell them I got this input?" As you define each data flow, draw the directed arrow on the context diagram, and label the flow. For a complex application, you might need two levels of context diagrams (see Figure 7-9). One level summarizes all entities with directed arrows that are unlabeled. The other level shows input flows on one diagram, and output flows on the other diagram with labeled data flows on both diagrams.
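The check for reciprocating input-output flows can be automated as a prompt for the double-check questions. The sketch below is an illustration under assumptions: the `unreciprocated` function is hypothetical, and the "Pick List" flow to the warehouse is an invented example, not a flow named in the text.

```python
def unreciprocated(flows, application):
    """Map each entity with one-way traffic to the direction that exists.

    flows is a list of (source, flow name, destination) tuples.
    """
    inbound, outbound = set(), set()
    for source, _name, dest in flows:
        if dest == application:
            inbound.add(source)      # entity sends data to the application
        elif source == application:
            outbound.add(dest)       # entity receives data from the application

    report = {}
    for entity in inbound ^ outbound:   # entities in one set but not the other
        report[entity] = "input only" if entity in inbound else "output only"
    return report

order_flows = [
    ("Customer", "Order", "Order Processing"),
    ("Order Processing", "Invoice", "Customer"),
    ("Order Processing", "Pick List", "Warehouse"),   # invented for illustration
]
one_way = unreciprocated(order_flows, "Order Processing")
```

An entity that appears in the report is not necessarily an error, since the text notes that flows are frequently, but not always, matched. It is simply where to ask, "How do I know they got this output?" or "Do I have to tell them I got this input?"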

Data flows are information about some business event being tracked by the application. They do not identify physical items. For example, an invoice is information about an order that would also have actual goods. A data flow to a customer shows the invoice but not the physical goods.

Name: Order
Aliases: None
Timing: As Occurs
Contents: Customer Name + [Address | Customer ID] + Shipping Instructions + 1{Item name + (Item number) + (Color) + (Size) + Quantity ordered}m
Constraints:
    80% must be billed and shipped within 24 hours
    100% orders in by noon must be billed and shipped the same day

FIGURE 7-10 Example of Data Flow Dictionary Description

Last, for each data flow, create a definition in the data dictionary. The dictionary information provided for a data flow is its name, contents, and contents' source when it is not obvious (see Figure 7-10 for sample data flow description).
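The contents line in Figure 7-10 uses the composition notation common in the structured analysis literature (see DeMarco [1979]): + joins components, [ | ] marks a choice between alternatives, ( ) marks an optional component, and { } marks a repeating group, here with a lower bound of 1 and an upper bound of m. As a minimal sketch, assuming an order arrives as a Python dictionary with the hypothetical field names shown below, the composition can be checked like this:

```python
def order_problems(order):
    """Check an order against the Contents line of Figure 7-10."""
    problems = []
    if not order.get("customer_name"):
        problems.append("Customer Name is required")
    # [Address | Customer ID]: exactly one of the two alternatives
    if bool(order.get("address")) == bool(order.get("customer_id")):
        problems.append("give either Address or Customer ID, but not both")
    if not order.get("shipping_instructions"):
        problems.append("Shipping Instructions are required")
    # 1{ ... }m: the item group repeats at least once
    items = order.get("items", [])
    if len(items) < 1:
        problems.append("at least one item is required")
    for position, item in enumerate(items, start=1):
        # (Item number), (Color), and (Size) are optional, so they are not checked
        if not item.get("item_name"):
            problems.append(f"item {position}: Item name is required")
        if not item.get("quantity_ordered"):
            problems.append(f"item {position}: Quantity ordered is required")
    return problems

good_order = {
    "customer_name": "A. Customer",
    "customer_id": "C-100",
    "shipping_instructions": "Ground",
    "items": [{"item_name": "Widget", "color": "Red", "quantity_ordered": 2}],
}
```

Writing the composition down this precisely in the dictionary is what makes such a check possible later, during design.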

ABC Video Example Context Diagram

The scope of the project for the ABC Rental Processing system is to provide rental/return processing for videos, including customer maintenance, video inventory maintenance, historical information maintenance, and reports to management. At the end of the day, accounting totals of sales information are generated, but there is no automated accounting interface. There is no purchase order processing in this application. The application's main function is rental processing, so we will call it 'ABC Rental Processing.' We draw the circle for the application in the context diagram and label it 'ABC Rental Processing.'⁴

4 The names of items from a diagram are in italics to set them off from the rest of the discussion and, hopefully, minimize your confusion.

Then we define the entities. Possible entities are customer, video vendor, ABC management, ABC accountants, and the Internal Revenue Service (IRS). The IRS is omitted because there is no tax-related processing performed in the application. ABC accountants are included because they receive an end-of-day report of receipts. How management and/or accountants use that information is beyond the scope of the application. ABC management could conceivably be on the diagram. Now, we ask ourselves, "Do we have control over what ABC management does with respect to the ABC Rental Processing application?" The answer, in this case, is yes, because ABC is so small. In other circumstances, the answer could be no. For instance, with a large application generating reports for many levels of management or for other departments' management, the answer might be no. Here, ABC management is not on the context diagram; in other companies or contexts it might be.

The entities left are Customers, Video Vendors, and video. Customers should be obviously correct. All rental and return processing relates to interactions of the application with customers. ABC has no control over customers' rental choices.

Vendors as an entity might be less obvious. Even though there is no automated purchase order process, the videos entered into the application come from somewhere, so video vendors should be identified as the source of video information.

Last, we deal with video. Is video an entity that the application interacts with? The answer is yes. Is video an entity that the application can control? The answer is again yes. Video is not on the context diagram because it is in the application. In effect, the video is within the circle that describes the ABC Rental Processing.

As a result of this analysis, we add three external entity squares to the context diagram labeled Customer, Video Vendor, and Accountant (see Figure 7-11), and define the entities in the dictionary (see Figure 7-12).

FIGURE 7-11 Skeleton ABC Rental Processing Context Diagram (diagram labels: Customer, Video Vendor, Accountant, ABC Rental Processing)

Entity Name: Customer
Aliases: None
Relationship to Application: Rents and Pays for Videos, Provides New Customer Information, Returns Videos
Contact: None

Entity Name: Video Vendor
Aliases: Vendor
Relationship to Application: Provides New Videos
Contact: None

Entity Name: Accountant
Aliases: None
Relationship to Application: Part-time employee receives end-of-day reports
Contact: None

FIGURE 7-12 ABC Rental Processing Data Definitions for External Entities

Next, we define the data flows and document them in the dictionary. What happens in this application? When a customer selects a video, they first tell the clerk their phone number. The clerk uses the phone number to 'look up' the customer and validate their rentals. If the customer is new (i.e., not on file), the customer information is entered and stored. After phone number processing, the customer either gives the clerk the cardboard shell, or tells the clerk the video name (see Chapter 2). This sentence identifies a data flow: rental request. After entering the information into the computer system, the clerk needs to provide some record with customer signature that the rental took place. This record accounts for the transaction and establishes customer liability for the rental property. This information identifies a reciprocating outward flow to the customer: rental receipt. When the tape is returned, the charges are computed based on the due date of the rental(s). This identifies another incoming data flow for a video return. So we have identified four data flows between the ABC Rental Processing application and customers:

• New Customer to store customer information

• Rental request (analogous to placing an order) from the customer to create a video rental and payments

• Rental Receipt from the application to confirm the rental

• Video Return to determine late charges, if any, and payment due.

For these four flows, there are four arrows between customer and ABC Rental Processing. Three arrows are from customer for new customers, rental requests, and returns. One arrow is to customer for the rental receipt.


FIGURE 7-13 ABC Order Processing Context Diagram (ABC Rental Processing in the center. Flows from Customer: New Customer; Video Rental, Payment; Video Return. Flow to Customer: Rental Receipt. Flow from Video Vendor: New Video. Flow to Accountant: End of Day Rental Summary.)

The data flow relating to vendors is somewhat obscure, but is identified by the need to enter new video information. Since new video information comes from somewhere, its source must be identified as the entity. There is one data flow from vendor to ABC Rental Processing for video information. There are no data flows back to vendor because the scope does not include ordering videos from the vendor.

Last, we define the data flows to and from the accountant. The accountant does not feed any information into the application, and receives only an end-of-day rental summary. So, there is one data flow to accountant for the 'end-of-day rental summary.' Next, we draw the data flows on the context diagram and label them (see Figure 7-13).
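The flows identified so far can also be written down as plain data, which makes the arrow counts easy to confirm. The following is a minimal Python sketch, not part of the method itself; the tuple layout and the function name `arrows_per_entity` are assumptions made for illustration.

```python
from collections import Counter

APPLICATION = "ABC Rental Processing"

# Each flow is (source, data flow name, destination), as listed in the text.
CONTEXT_FLOWS = [
    ("Customer", "New Customer", APPLICATION),
    ("Customer", "Rental Request", APPLICATION),
    ("Customer", "Video Return", APPLICATION),
    (APPLICATION, "Rental Receipt", "Customer"),
    ("Video Vendor", "New Video", APPLICATION),
    (APPLICATION, "End-of-Day Rental Summary", "Accountant"),
]


def arrows_per_entity(flows):
    """Count the arrows to and from the application for each external entity."""
    counts = Counter()
    for source, _name, destination in flows:
        if destination == APPLICATION:
            counts[(source, "to application")] += 1
        else:
            counts[(destination, "from application")] += 1
    return counts


counts = arrows_per_entity(CONTEXT_FLOWS)
for (entity, direction), n in sorted(counts.items()):
    print(f"{entity}: {n} arrow(s) {direction}")
```

Under these assumptions the tally shows three arrows from Customer, one to Customer, one from Video Vendor, and one to Accountant, matching the counts given in the text.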

While we label the flows, we evaluate the names of the data flows to ensure their meaningfulness. Rental request implies a request for assistance in rental processing and is a weak name. Stronger, more meaningful names are 'Video rental' or 'Video rental information.' Either of these might be used. Here, we use Video rental since the word 'information' is not particularly meaningful. Also, rentals are always accompanied by payments which are added to the name to be more explicit.

Novice analysts frequently have trouble differentiating between the thing, and information about the thing. Keep in mind that what we document on DFDs is always information about the thing. So, when we name a data flow 'Video Rental' we really mean information about 'Video Rental.' That is why the word 'information' is weak in the data flow name. The other names: Rental Receipt, Video Return, New Customer, New Video (not New Video Information), and End-of-day Rental Summary are all acceptable. Again, there are no 'right' or 'wrong' names for data flows. Some names are more descriptive than others, and, therefore, stronger. Many companies define their own conventions, or local rules, for naming data flows, entities, and processes.

Last, we define data flows in the dictionary (see Figure 7-14). Keep in mind that just because the information is in the dictionary does not mean it is cast in concrete. It is subject to review and change throughout the life of the project. The goal is to define the application at a level of detail so that changes can be made before they become costly, that is, during analysis.

Upon completion of the context diagram, you are ready to do the next level of analysis, opening up the circle, to define a data flow diagram.


Name: New Customer
Aliases: None
Timing: As Occurs
Contents: Name + Address + Phone Number + Credit Card Type + Credit Card Number + Credit Card Expiration Date
Constraints: None

Name: Rental, Payment
Aliases: None
Timing: As Occurs
Contents: Phone Number + 1{Video ID}m + Total Amount of Order + Date
Constraints: None

Name: New Video
Aliases: None
Timing: As Occurs
Contents: Video ID + Video Name + Rental Price
Constraints: None

Name: Copy of Order
Aliases: Printed Order
Timing: One per rental transaction
Contents: Phone Number + Customer Name + Customer Address + 1{Video ID + Video Name + Rental Charge + Due Date}m + Total Amount + Total Amount Paid + Total Amount Due (should be zero)
Constraints: Must be signed by customer. Optional that customer takes a copy.

Name: End of Day Summary
Aliases: EOD Rental Summary
Timing: Close of Business
Contents: Videos Rented + Total Fees Collected + Videos Returned + On-Time Returns + Late Returns + Total Late Days + Late Fees Collected
Constraints: None

FIGURE 7-14 ABC Video Data Flow Definitions-Tentative
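A dictionary entry such as 'Rental, Payment' in Figure 7-14 can be held as a small data structure and used to check sample records. The sketch below is one possible Python representation, offered as an illustration only; the class `FlowDefinition`, the function `problems`, and the sample values are all made up for the example. The notation 1{Video ID}m is read here as "one or more video IDs."

```python
from dataclasses import dataclass, field


@dataclass
class FlowDefinition:
    name: str
    timing: str
    required: list                                  # elements joined by "+" in the dictionary
    repeating: dict = field(default_factory=dict)   # element -> minimum occurrences


RENTAL_PAYMENT = FlowDefinition(
    name="Rental, Payment",
    timing="As Occurs",
    required=["Phone Number", "Total Amount of Order", "Date"],
    repeating={"Video ID": 1},                      # 1{Video ID}m
)


def problems(definition, record):
    """List the ways a record departs from its dictionary definition."""
    found = []
    for element in definition.required:
        if element not in record:
            found.append(f"missing {element}")
    for element, minimum in definition.repeating.items():
        values = record.get(element, [])
        if len(values) < minimum:
            found.append(f"{element} needs at least {minimum} value(s)")
    return found


# Sample records with invented values.
good = {"Phone Number": "555-0100", "Video ID": ["V1", "V2"],
        "Total Amount of Order": 6.00, "Date": "2000-01-01"}
bad = {"Phone Number": "555-0100", "Video ID": []}

print(problems(RENTAL_PAYMENT, good))
print(problems(RENTAL_PAYMENT, bad))
```

Because the dictionary is subject to change throughout the project, keeping definitions in one place like this means a changed entry only has to be edited once.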

Develop Data Flow Diagram

Rules for Developing a Data Flow Diagram

To develop a data flow diagram, iterate through the following steps until a primitive level is reached:

1. Define the processes.

2. Define the files and other data flows required to support the processes.

3. Draw a Level 0 DFD. At level 0, ignore trivial error paths and data stores. If you define a validation process, you must eventually identify an error path. Define the error path at the primitive level. Similarly for data stores, define files when they are shared between processes. Introduce files that are only used within a given process at the level at which the file is shared between two or more subprocesses.

4. Balance the DFD with the context diagram. Compare the net inputs and outputs to external entities on the DFD to the net inputs and outputs on the context diagram. There should be a one-to-one correspondence between the diagrams.

5. Iterate through this procedure until the primitive level of DFD is reached for all processes. Always balance the current level DFD's net inputs and outputs with those of the previous level.
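The balancing check in step 4 is mechanical enough to sketch in code. The Python below is an illustration under assumptions, not a prescribed tool: flows are (source, name, destination) tuples, and the process names and the internal flow 'Customer Lookup' are placeholders invented for the example.

```python
def net_external_flows(flows, externals):
    """Keep flows that touch an external entity, ignoring what is at the other end."""
    net = set()
    for source, name, destination in flows:
        if source in externals:
            net.add((source, name, "in"))        # into the application
        if destination in externals:
            net.add((destination, name, "out"))  # out of the application
    return net


def balance(context_flows, level0_flows, externals):
    """Compare the net external flows of the two diagrams."""
    ctx = net_external_flows(context_flows, externals)
    lvl = net_external_flows(level0_flows, externals)
    return {"missing_from_level0": ctx - lvl, "not_on_context": lvl - ctx}


EXTERNALS = {"Customer", "Video Vendor", "Accountant"}
CONTEXT = [
    ("Customer", "New Customer", "ABC Rental Processing"),
    ("ABC Rental Processing", "Rental Receipt", "Customer"),
]
LEVEL0 = [
    ("Customer", "New Customer", "Create Customer"),
    ("Create Rental", "Rental Receipt", "Customer"),
    ("Create Rental", "Customer Lookup", "Create Customer"),  # internal, ignored
]

result = balance(CONTEXT, LEVEL0, EXTERNALS)
print(result)
```

Two empty sets mean the diagrams correspond one-to-one at the boundary. The same function can compare any level with the level above it, provided the flows crossing the parent process boundary are treated as the "external" ones.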

First, we will discuss how to identify the Level 0 processes that are within the circle of the context diagram, without defining any data stores. The difficulty of this activity varies with your understanding of the problem domain and the scope of the project. One of the hardest parts of this activity is to decide the 'right' level of abstraction. What is right in one instance may not be right in another. For instance, if you have a multidepartmental, multiapplication environment you are trying to describe, the Level 0 diagram might link departments and the net data flows of the context diagram (see Figure 7-15). If you have a multidepartmental, single application environment, you might identify major functions and their relationships (see Figure 7-16). Or, if you have a single department, single function application, such as ABC Rental Processing, you try to define the general functions to be performed. The approach in this text is to define the simple environment, discussing the common features for all levels of abstraction.

FIGURE 7-15 Multidepartment, Multiapplication Level 0 DFD (labels include 1.0 Counsel Patient, 2.0 Maintain Patient Records, 3.0 Create Government Reports, and 6.0 Accounting)

During the information gathering stage of the application, you discussed with users what they did and how they did it (see Chapter 4). The individual steps that each user performs in the tasks relating to the application are components of the applications' processes. There are a variety of ways to identify processes; some examples are:

1. Direct identification: If you have similar experience and either know the processes, or have articulate users who know the processes, identify them directly.

2. Top-down: Decompose the problem into its constituent parts. The functions at each level should completely define the problem and should be as independent of each other as possible. The resulting independent functions can be analyzed in isolation of the other parts to develop each part's subprocesses. Decomposition continues until atomic levels of processing are identified.

3. Bottom-up: Do bottom-up analysis starting with the details of task steps and procedures described by users, synthesizing and combining the steps to define processes.

4. Outward-in: Use context diagram entities and data flows to identify 'boundary' processes with which they directly interact. Work outward-in to define what other transformations are required to link the input and output boundary processes.

5. Functional sequence: Examine the input data flows from external entities to identify the 'first process' in a sequence of processes. From that first process, define the other transformations that are required to go through each function from beginning to end.

FIGURE 7-16 Multidepartment, Single-Application Level 0 DFD to Maintain Patient Records (labels include Doctor, Patient, Maintain Psychiatric Counseling Visit Information, Maintain Psychiatric Counseling Information, Maintain Patient Information, and the flows Visit Date, Time, Patient ID; Diagnosis Notes; New and Update Patient Information)

All of these approaches can work. None is more right than another. We all use one or more of these in performing analysis without thinking about how we actually do it. A good approach is to use two or three of the methods as a way of double-checking that all processes are defined and connected properly. For ABC, we will combine the last two approaches, using the information from the context diagram.

Once processes are identified, you draw them and connect them to the external entities via the named data flows. Other data flows and processes are identified to connect the initial ones defined until you feel the diagram completely describes the overall processing. Keep in mind while you are performing this activity that you do not pay attention to timing or sequencing of processes. You do not show start-up or shutdown activities on a data flow diagram. If you have end of day, end of month, or other periodic processing, the DFD shows the processes without necessarily identifying the timing of the processing. As the processes are drawn, name each with a verb and the data they create, and number them. Numbering of processes is not meant to sequence them, even though we unconsciously tend to do this.

Also, at Level 0, ignore exception processing. You might have a data flow named 'Valid X' without a matching 'Invalid X.' The exception process is added at the next lower level. This avoids unnecessary clutter at the highest level.

After the processes are identified, next define file locations on the Level 0 data flow diagram. You could leave files for a lower level of analysis as many texts and companies do by convention. In that case, you are ready to draw the diagram. Here, we will develop the thoughts that are used to identify data stores.

To identify data stores, first consider each process. Can the process be completed without reading or writing to a data store? If your answer is yes, then you do not need a file at this level. If the answer is no, you need one data store for every required read action and every required write action. Many times, the reads and writes are to the same data store. Then, you have one data flow per input/output action. As these required reads and writes are identified, you add to the DFD to include the data store name and data flow(s). When you do this part of the drawing, make sure that each flow and store has a name.
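The reasoning above, one data flow per required read or write, can be sketched as follows. This Python fragment is illustrative only; the process names, the store names, and which process reads or writes which store are assumptions chosen for the example, not results of the analysis.

```python
def store_flows(process, reads=(), writes=()):
    """Produce one data flow per required read action and per write action."""
    flows = [(store, process) for store in reads]      # store -> process
    flows += [(process, store) for store in writes]    # process -> store
    return flows


def stores_needed(io_actions):
    """Collect the distinct data stores named by any read or write."""
    stores = set()
    for reads, writes in io_actions.values():
        stores.update(reads)
        stores.update(writes)
    return sorted(stores)


# Hypothetical answers to "can this process complete without a data store?"
IO_ACTIONS = {
    "Create Rental": (["Customer File", "Video Inventory File"], ["Rental File"]),
    "Create EOD Report": (["Rental File"], []),
}

rental_flows = store_flows("Create Rental", *IO_ACTIONS["Create Rental"])
print(rental_flows)
print(stores_needed(IO_ACTIONS))
```

Note that the Rental File appears once in the list of stores even though two processes use it, reflecting the point that reads and writes are often to the same data store.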

Finally, when you have reviewed each process for determining whether to include data stores, review the diagram to make sure that its DFD syntax conforms to the rules. The first seven rules relate only to processes and their connectivity. Processes with connection errors are called 'pathological' processes because they do not follow the philosophy of DFDs that processes are connected via flows, files (data stores), or entities.

The next three rules check that all connections in the diagram are legal. The rule about no dangling arrows (see footnote 5) is our own. Work and teaching experience have proven that novices use dangling arrows to hide their lack of understanding of what they are doing. The final three rules deal with balancing, error handling, and the introduction of files.

The DFD syntax rules are:

1. All processes are connected to something else.

2. All processes have both inputs and outputs.

3. No processes have only outputs or only inputs.

4. Processes may connect to anything: other processes, data stores, or entities.

5. All processes have a unique name and number.

6. Each process number is used once in the diagram set.

7. Only subprocesses of a process shall follow the numbering scheme of the parent process.

8. Entities and data stores may connect only to processes. Another way to state this is that each data flow must have at least one end connected to a process.

9. Data flows are the only legal type of connection between entities, processes, and data stores.

10. Make sure there are no dangling arrows.

11. The net data flows to and from context diagram external entities must balance, that is, be present, in each level of DFDs.

12. Trivial errors and exceptions are not handled until Level 1 or lower in the DFD set.

13. Trivial data stores show up in the diagram set the first time they are referenced by a process.
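Several of these rules can be checked mechanically once a diagram is written down as data. The Python sketch below covers only rules 1 through 3, 8, and 10, and is an illustration under assumptions: the function name `check_dfd`, the node kinds, and the flow 'Customer Copy' are hypothetical.

```python
def check_dfd(kinds, flows):
    """Check a few syntax rules.

    kinds maps node name -> 'process', 'store', or 'entity';
    flows are (source, data flow name, destination) tuples.
    """
    errors = []
    inputs = {node: 0 for node in kinds}
    outputs = {node: 0 for node in kinds}
    for source, name, destination in flows:
        # Rule 10: an end that attaches to nothing is a dangling arrow.
        if source not in kinds or destination not in kinds:
            errors.append(f"rule 10: flow '{name}' has a dangling end")
            continue
        # Rule 8: at least one end of every flow must be a process.
        if kinds[source] != "process" and kinds[destination] != "process":
            errors.append(f"rule 8: flow '{name}' does not touch a process")
        outputs[source] += 1
        inputs[destination] += 1
    for node, kind in kinds.items():
        if kind != "process":
            continue
        if inputs[node] == 0 and outputs[node] == 0:
            errors.append(f"rule 1: process '{node}' is not connected")
        elif inputs[node] == 0 or outputs[node] == 0:
            errors.append(f"rules 2-3: process '{node}' lacks inputs or outputs")
    return errors


kinds = {"Customer": "entity", "Create Rental": "process",
         "Rental File": "store", "Print Rental": "process"}
flows = [
    ("Customer", "Video Rental, Payment", "Create Rental"),
    ("Create Rental", "Rental", "Rental File"),
    ("Customer", "Customer Copy", "Rental File"),   # entity to store: illegal
    ("Rental File", "Rental", "Print Rental"),      # Print Rental has no output
]
errors = check_dfd(kinds, flows)
for line in errors:
    print(line)
```

A check like this only confirms syntax. Whether the processes are the right ones is still a matter for walk-throughs with peers and users.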

When the Level 0 DFD is complete, walk through the DFD with your peers, then review it with your user. Keep in mind that you are teaching the users as well as having them review your work. If they do not understand what you are showing them, they cannot adequately comment on it. So, use a top-down approach to the presentation, too. First, show the users the context diagram. Define all of the items in the diagram. Once they agree on the external entities, show them a blowup of the context diagram that includes the inside of the circle: the major processes and the data flows connecting them to external entities (see Figure 7-17). Then, replace that diagram with a Level 0 DFD showing the entities and processes. Use overlays, adding the data stores and remaining data flows. Finally, review the detailed definitions from the data dictionary for each process, data flow, data store, and entity. If you take a step-by-step approach, users can more easily accept and assimilate the information.

FIGURE 7-17 Context Expansion of Level 0 Processes: Maintain Patient Records (labels include Doctor, Patient, Maintain Counseling Information, and Maintain Visit Information)

5 I realize that this is contrary to DeMarco, Yourdon, and many undergraduate texts. For novices, dangling arrows frequently mean you have no clue about what attaches at the other end. In addition, most companies want all terminators identified to ensure accuracy and to simplify quality assurance. Until you are proficient, draw the entire diagram!


Do not expect to have agreement on the first, or even second, review. One benefit of data flow diagrams is focusing thoughts on the problem. Users will frequently 'see' what is missing when they look at a diagram that they could not 'see' when they discussed the topic verbally. When they begin a sentence, "Well, what about ... " pay close attention; the subject is usually some variation, exception, or forgotten information that they did not discuss previously.

As you understand and users agree on the context and Level 0 processes (see Figure 7-18), begin work on the lower level DFDs. For each Level 0 process,

1. Draw the input and output flows and the icons to which they connect from the higher level diagram. This forms the skeleton of the diagram (see Figure 7-19). These are called the net inflows and outflows (see footnote 6).

2. Define the subprocesses by asking, "What are the steps required to do this process?" Then for each step, "Can I separate this from the other steps and do it in isolation?" For each subprocess you isolate, draw a process rectangle on the lower level diagram.

3. Identify whether data stores are required or not. Add them and, if they are new, name them.

4. Identify data flows to complete the diagram (see Figure 7-19). Make sure you provide all and only the information required to perform the process.

5. Review the diagram for unnecessary connections and, if found, remove them.

6. Update the data dictionary with all new information.
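Step 1, drawing the skeleton, amounts to collecting every flow that touches the chosen process on the higher level diagram. The Python sketch below shows one way to do that; it is an assumption-laden illustration in which the parent diagram, the data store names, and the store flow names are invented for the example.

```python
def skeleton(parent_flows, process):
    """Return the net inflows and outflows of one process on the parent diagram."""
    inflows = [(src, name) for src, name, dst in parent_flows if dst == process]
    outflows = [(name, dst) for src, name, dst in parent_flows if src == process]
    return inflows, outflows


# A hypothetical parent diagram: (source, data flow name, destination).
PARENT = [
    ("Customer", "Video Rental, Payment", "Create Rental"),
    ("Customer File", "Customer", "Create Rental"),
    ("Video Inventory File", "Video", "Create Rental"),
    ("Create Rental", "Rental Receipt", "Customer"),
    ("Create Rental", "Rental", "Rental File"),
    ("Customer", "New Customer", "Create Customer"),   # belongs to another process
]

inflows, outflows = skeleton(PARENT, "Create Rental")
print("net inflows: ", inflows)
print("net outflows:", outflows)
```

Every inflow and outflow found this way must appear somewhere on the lower level diagram, although it may attach to any of the subprocesses.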

The goal of subprocess identification is to decompose the upper level processes into what will eventually be programmable modules. A good, that is, correct, design has certain characteristics that are traceable back to a properly decomposed DFD. The two most important characteristics are maximal cohesion and minimal coupling. Cohesion measures the internal strength of a process (this is also called intraprocess strength). We want modules that result from process descriptions to have exactly the logic required to perform the task, and nothing more. Coupling measures the interprocess connections. Ideally, we want data flows and stores to contain exactly the information needed to trigger or perform each task, and nothing more. The questions and evaluation of processes in the decomposition process, if done properly, result in cohesive, minimally coupled processes.

6 Net, from accounting, means remaining after all necessary deductions. Here, net means remaining data flow and data store connections after the higher level process is removed. The net data flows in and out of a higher level process may connect to different subprocesses at the lower level.

FIGURE 7-18 Completed Level 0 DFD to Maintain Patient Records (labels include Doctor, Patient, 1.0 Maintain Visit Information, 3.0 Maintain Patient Information, Maintain Counseling Information, and the flows Diagnosis Notes; Visit Date, Time, Patient; New and Update Patient Information)

FIGURE 7-19 Skeleton Level 1 DFD with Net Inflows and Outflows for Process 1.0: Maintain Visit Information

FIGURE 7-20 Completed Level 1 DFD: Maintain Visit Information (labels include 1.1 Review and Code Visit Info, Unreconcilable Errors, Doctor, Patient File, and CCD Time Keeper)

Three types of quality checking are performed on the analysis results. First, correctness checking determines that the syntax and connections used in diagrams, charts, and so forth are accurately used. Next, completeness checking is performed with the users to validate the meanings of all terms and to verify the semantics used in all documentation. Last, consistency checking ensures consistency and correctness of all entries that span multiple diagrams, text, charts, and so on. Consistency checks evaluate the interitem syntax and semantics. These checks are first performed by the project team during walk-throughs or other quality assurance evaluations. Then, they may be reviewed by independent quality assurance analysts as an added check.

If you find data flows that are identical, with no transformations, going to many processes, reassess the process definitions (see Figure 7-21). On the other hand, if you have a transaction processing application in which each transaction has its own version of some process, this type of diagram is correct (see Chapter 8). If the processes all do different transformations and have either unique inputs or unique outputs, leave them separate. If the transformations have an if-then-else logic, they are at too low a level and should be combined (see Figure 7-22). If they all do different transformations to the incoming data, are the processes' outputs going to the same place? If so, you may have over-decomposed and should combine the processes. Figure 7-23 shows two possible corrections to the over-decomposition. Either correction may be acceptable depending on the 'Y.y' data complexity and their processing complexity. Semantic (i.e., interpreting problem meaning) DFD problems are discussed again in the next section.

At Level 0, we did not concern ourselves with exception processing. At the lower levels, when a data flow is named 'Valid X,' you must balance that flow with another one called 'Invalid X.' In other words, you do define errors and exceptions at the same level at which you define the split of valid and error/exception processing.

FIGURE 7-21 Example 1 of Excessively Detailed Processes (labels include AFTER and Y File)
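The pairing of 'Valid X' with 'Invalid X' at the lower levels is simple to verify from a list of flow names. The short Python sketch below is illustrative; the function name and the sample flow names are assumptions.

```python
def unbalanced_valid_flows(flow_names):
    """Return the 'Invalid X' names that are missing for each 'Valid X' flow."""
    names = set(flow_names)
    missing = []
    for name in sorted(names):
        if name.startswith("Valid "):
            partner = "Invalid " + name[len("Valid "):]
            if partner not in names:
                missing.append(partner)
    return missing


level1_flows = ["Valid Customer", "Invalid Customer", "Valid Video", "Rental"]
print(unbalanced_valid_flows(level1_flows))
```

Here the check reports that 'Invalid Video' has not been drawn yet, which is the kind of omission that is easy to overlook on a crowded diagram.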

Let's examine how to apply these thoughts to develop a set of DFDs for ABC Rental Processing.

ABC Example Data Flow Diagram

We said above that in ABC Rental processing we are combining the analysis of context with analysis of the sequence of actions for each data flow. So, we start with a customer placing a video rental request. Customer and video information trigger a 'Create rental' process. The first check in 'create rental' is to validate the customer; if the customer does not currently exist, we want to 'add new customer' to the company's files before rental processing. Here, we have a decision to make. We just described two input data flows to the create rental process. We need to decide if they are related or not. In this case, the issue is whether we can add new customers as a subprocess of rental processing, or whether they are separate. If we separate the two, we have the two data flows we defined. If we combine them, we only have one data flow that optionally contains new customer information with rental information. If you do not know how the user wants the processing performed, you go back and ask. So, we will set this issue aside for the moment and finish defining what it means to 'create rental' (see footnote 7).

7 Postponing decisions that are noncritical to the main logic is an important problem-solving behavior. Notice that we first identify alternatives and implications of the postponed item before setting it aside. If there are more side effects we have not identified, we are more likely to notice them with alternatives and implications than without.

FIGURE 7-22 Example of If-then-else Logic in DFD (BEFORE and AFTER panels)

After customer validation, we next have to validate the video and get a rental price. This requires reading some sort of video inventory file. Again, we ignore invalid video information for the moment. Once we have found the information on all the videos to be rented, we compute the total amount due. Again, we have a decision. At this point, how do we know whether late fees have been paid or not? Do we assume that people always return videos as they come into the store, and rent videos on their way out of the store? The rule is, never assume anything. If we know how to deal with this issue from the data gathering, we continue; otherwise, we add it to the list of questions for the user and continue.

After the rental amount is created (whatever it is and however it is computed), payment information is entered and customer change is computed. Then, the rental 'order' is written to a file and a paper copy is created for customer signing.

So, we have a process, 'Create rental,' and we have several subprocesses, 'Validate customer,' 'Validate video,' 'Compute rental total,' 'Process payment,' 'Write rental,' and 'Print rental.' We also have several questions and decisions that we deferred. We can create the Create rental process on the Level 0 diagram whether we deal with the deferred issues or not. But we cannot identify the other processes, with certainty, until the issues on new customers and late fees are decided. So, we review the interview information and go see Vic for the detailed answers.
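To make the subprocesses concrete, the sketch below strings three of them together in Python: validate video, compute rental total, and process payment. It is a rough illustration, not the design; the video file layout, the prices, and the function names are invented, invalid videos simply raise an error because exception paths are deferred in the text, and late fees are left out because that question is still open.

```python
from decimal import Decimal

# Hypothetical video inventory file: video ID -> name and rental price.
VIDEO_FILE = {
    "V1": {"name": "Video One", "price": Decimal("2.50")},
    "V2": {"name": "Video Two", "price": Decimal("3.00")},
}


def validate_videos(video_file, video_ids):
    """Look up each requested video; a KeyError means it is not on file."""
    return [video_file[video_id] for video_id in video_ids]


def compute_rental_total(videos):
    """Add up the rental prices of the validated videos."""
    return sum((video["price"] for video in videos), Decimal("0"))


def process_payment(total, amount_paid):
    """Return the customer's change, refusing a payment that is too small."""
    if amount_paid < total:
        raise ValueError("payment does not cover the rental total")
    return amount_paid - total


videos = validate_videos(VIDEO_FILE, ["V1", "V2"])
total = compute_rental_total(videos)
change = process_payment(total, Decimal("10.00"))
print(total, change)
```

Each function needs only the data passed to it, which is the kind of separation the decomposition questions are meant to produce.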

Mary goes back to Vic and says: "We are talking about the options for entering rentals and we have several questions. The first question is about new customers. One option is to separate the functions, that is, add new customers in a separate process from rental processing. A second option is to allow adding a new customer as part of video rental processing. A third option is to allow both. Do you have a preference?"

Vic: "I don't know. What will the cost differences be?"

Mary: "No matter what, you want to be able to add, change, and delete customers. It seems desirable to do that without being tied to the rental process. However, rental processing is 90% of your activity and you don't want to slow it down by having to leave that process to add a new customer. The slow-down for going from rental processing to add customer and back will range from 4 to 30 seconds depending on the PC's speed and the software we use. Unless you have a business reason for separating the two processes, I would suggest that you allow both. If we decide this direction now, there is no added cost. If we change direction in a few weeks, there will be a cost, as high as several thousand dollars."

FIGURE 7-23 Example 2 of Excessively Detailed Processes (BEFORE, AFTER 1, and AFTER 2 panels; the BEFORE panel marks processes that may have excessive detail and processes that definitely have excessive detail)

Vic: "OK, let's do both, then. It sounds more convenient this way anyway."

Mary: "OK, we will allow entry of new customers as a process to be run by itself, or as part of rental processing (see footnote 8). My second question relates to video returns. When we collected our information, we observed people returning videos in several ways. First, they can put them into a slot and pay the fee the next time they rent a video. Second, they can return them and pay when they come in to get a new rental. Third, they can return them and rent a new video both at the same time. Do you want all of these options in the new system?"

8 Notice that Mary reconfirms the decision by repeating the agreed upon solution.

Vic: "Yes, why wouldn't I?"

Mary: "It is easier for us if we have a somewhat fixed method of returns. But, if you want no changes, then we allow for all return methods. This may have a cost implication, but I can't tell right now. Should we talk about this again when I know what the cost of the options are?"

Vic was a little upset: "I told you at the beginning, NO bureaucracy and changes only if it improves convenience to my customers. If we don't allow them to return in all three of these ways, someone will get mad. Besides, don't customers pay when they rent? So, my only risk is on the 10% of customers who have late fees.

"Also, if I limit the ways they can return tapes, I lose my edge over Ajax Video's chain up the street. If there is a cost to allowing all of these things, why can't you tell now, and, if you can't tell now, when will you know?"

Mary tried to placate Vic somewhat but was still completely honest: "Usually, there is little incremental cost when all variations are known at this stage of the analysis. But I can't tell until we've proceeded a little further and have a sense of how many different programs will result from the most flexible design. I will know when we get to about two more levels of detail which will be in a few days. If there is no added cost, we will go for the flexibility. If there is an added cost, I will let you decide and give you an estimate for the different choices.

"Let me summarize: We will analyze for returns through the drop box, returns as a person coming in, or returns as part of rentals, and get back to you with cost implications, if any."

From the application perspective, maximum flexibility for both customer and return processing means, at least, that the rent and return screens and processing must be closely linked to each other. Now we need to guard against having the processes too closely coupled. Ideally, we want to accommodate Vic's wishes and still have processes separated as much as possible. To obtain this goal, we need to decide the minimum information needed to link customer and rental processing, and rental to return processing. Then, visualizing an implementation, we might be able to use, for example, windows for each process. We might open a new window to add a customer during rentals and maybe open another window to process returns during rentals. Also, with minimal coupling, we maintain separation even though the processes are interleaved (see footnote 9). This decision process is another example of how not top-down a top-down process is. We are going to an implementation level of detail to jump back up and define the data at the higher, more abstract level. Don't think this is the final answer. It is one way to reason through the problem and figure out how it might work at the computer level. Then, we back off to the logical level to describe that possible model.

We said before that the first step in create rental is to validate customer. If either the phone number or customer name is not retrieved, we know we have a new customer and can switch to that process. Once the new customer information is entered and saved, we can pass it back to rental processing as if it were in answer to an original request. Once we have the customer information in the create rental process, we can automatically check outstanding rentals. If there are any, we can ask if they want to return them or add the new rentals to the list. Our problem is solved unless Vic wants late fees processed whether or not the outstanding rentals have been physically returned. This decision, however, does not affect us until we try to define the details of processing. At the moment, we will assume late fees are only processed when the physical tape is returned.
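The hand-off just described, from create rental to create customer and back, can be sketched as three small functions. This Python fragment is a sketch under assumptions: the customer file is modeled as a dictionary keyed by phone number, and the names `find_customer`, `create_customer`, and `start_rental` are hypothetical.

```python
def find_customer(customer_file, phone):
    """Validate customer: return the record, or None if not on file."""
    return customer_file.get(phone)


def create_customer(customer_file, phone, name):
    """Create customer: usable on its own or from within rental processing."""
    customer_file[phone] = {"name": name, "open_rentals": []}
    return customer_file[phone]


def start_rental(customer_file, phone, new_customer_name=None):
    """First steps of create rental: validate, add if new, check open rentals."""
    customer = find_customer(customer_file, phone)
    if customer is None:
        if new_customer_name is None:
            raise LookupError("customer not on file; new customer information needed")
        # Switch to create customer, then continue as if the lookup had succeeded.
        customer = create_customer(customer_file, phone, new_customer_name)
    return customer, list(customer["open_rentals"])


customer_file = {"555-0100": {"name": "Pat", "open_rentals": ["V7"]}}
existing, outstanding = start_rental(customer_file, "555-0100")
added, none_open = start_rental(customer_file, "555-0199", "Lee")
print(outstanding, none_open)
```

The only data shared between the two processes is the customer record itself, which is one way to keep them minimally coupled while still letting them interleave.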

9 Interleaving means weaving pieces of multiple processes together to give the appearance of parallel processing. Each process progresses a little. First, we switch to a process and do some of its function. Then we switch to another, then back to the first process, and so on.

FIGURE 7-24 ABC Video Expanded Context Diagram (labels include Customer, Video Vendor, Accountant, the processes Create Customer, Create EOD Report, Create Rental, Process Returns, and Create Video, and the flows New Customer; Video Rental, Payment; Rental Receipt; Video Return; New Video; End of Day Rental Summary)

The result of this discussion so far is that we have three processes identified: create rental, create customer, and process returns. Each process could be initiated by the create rental process, or could be initiated by a customer action. We draw these processes (see Figure 7-24) and attach them to the correct data flows. Within the context circle expansion, do not show connections between processes. Processes still unaccounted for are 'create video' and 'Create end of day report' for summary totals. We know we have to get video information into the system, so we add that process and connect it to the data flow from video vendor. Since we must print an end-of-day summary for the accountant, we add the process to the diagram.

Figure 7-24 shows our high level processes of ABC Video Rental Processing, expanding the context diagram within the circle. The processes are shown in small circles or in rounded vertical rectangles, depending on local customs. This text uses rounded vertical rectangles. Notice that the data flows to/from each external entity are attached to a process, and all data flows are labeled and have a directional arrow showing which way the data is flowing. Also notice that the processes each have an action name beginning with a verb, and each process has a numeric identifier.

The next step is to expand to a Level 0 DFD, defining the data stores 10 in the application and linking processes, as required (see Figure 7-25). Data store identification usually occurs naturally during the identification of processes and subprocesses. For instance, what actions are done to enter a rental? First, you would check to verify that the customer is, in fact, a customer. This means checking some permanent 'list' or file for presence of the customer. Then, you would ask for each video they want to rent and verify the description and its price. To retrieve the description and price, we need a permanent file of the video inventory. When the rental is complete, it is stored somewhere (in a rental file), completing the process. Following this logic, we need at least three files at this level of analysis: customer file, video inventory file, and rental file. At this stage, we don't concern ourselves too much with the file contents, although we identify the contents throughout analysis as they become known. As attributes, or fields, are discussed, it is a good practice to add to an attribute list for each file. The linkage between create rental and create customer is shown on the DFD as a data flow. The details of initiating create customer processing when a customer is not found are deferred to the next level of detail.

10 Other names for data stores are files, relations, or databases. The term data store means data relating to this name and does not imply normalized form. Data stores can contain more than one data structure [Gane, 1990].

FIGURE 7-25 ABC Video First Cut Level 0 DFD

Before showing the DFD to Vic for his comments, we evaluate its level of abstraction and correctness (see Figure 7-25). Are create customer, create rental, create video, and process returns all on the same level of abstraction? The first clue that they are is that the first three processes all have the same verb. Process returns is the removal of rentals just as create rental is the creation of rentals; they are reciprocal processes. The reciprocal processes also appear to be at the same level. The name process returns is not the best we could choose to show reciprocity; return rental is a stronger name that does, so we change the process name.

Next we evaluate correctness of the diagram. Are all the connections legal? Yes. Are there any pathological connections? No. Is there a flow through the application? Yes, the main flow is for rental and return processing.
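The legality questions above can be checked mechanically. The following is a minimal Python sketch, assuming the common DFD rule that every data flow must have a process at one end (so entity-to-entity, entity-to-store, and store-to-store flows are illegal), and that a process should have at least one input and one output. The node names come from the ABC Level 0 DFD; the function name `check_connections`, the tuple layout, and the names of the flows into the data stores are illustrative assumptions.

```python
# Node kinds for a tiny DFD model (illustrative layout, not a CASE-tool format).
PROCESS, ENTITY, STORE = "process", "entity", "store"

nodes = {
    "Customer": ENTITY, "Video Vendor": ENTITY, "Accountant": ENTITY,
    "Create Rental": PROCESS, "Create Customer": PROCESS,
    "Return Rental": PROCESS, "Create Video": PROCESS,
    "Create EOD Report": PROCESS,
    "Customer File": STORE, "Video Inventory File": STORE,
    "Rental File": STORE,
}

# (source, target, flow name); store-side flow names are assumed for the sketch.
flows = [
    ("Customer", "Create Rental", "Video Rental, Payment"),
    ("Create Rental", "Customer", "Rental Receipt"),
    ("Customer", "Create Customer", "New Customer"),
    ("Create Customer", "Customer File", "Customer"),
    ("Customer", "Return Rental", "Video Return"),
    ("Return Rental", "Rental File", "Returned Rental"),
    ("Video Vendor", "Create Video", "New Video"),
    ("Create Video", "Video Inventory File", "Video"),
    ("Rental File", "Create EOD Report", "Today's Rentals"),
    ("Create EOD Report", "Accountant", "End-of-Day Rental Summary"),
]


def check_connections(nodes, flows):
    """Return a list of problems; an empty list means no rule was violated."""
    problems = []
    # Rule 1: every flow must touch at least one process.
    for src, dst, name in flows:
        if PROCESS not in (nodes[src], nodes[dst]):
            problems.append(f"illegal flow '{name}': {src} -> {dst}")
    # Rule 2: every process needs at least one input and one output.
    for node, kind in nodes.items():
        if kind != PROCESS:
            continue
        if not any(dst == node for _, dst, _ in flows):
            problems.append(f"{node} has no input")
        if not any(src == node for src, _, _ in flows):
            problems.append(f"{node} has no output")
    return problems
```

Calling `check_connections(nodes, flows)` on this model returns an empty list. Adding a flow that runs directly from Customer to Customer File would be reported as illegal, because neither end is a process.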

Now, we could return to Vic and ask his opinion, giving him a verbal presentation of the details underlying each of the processes and, in the details, getting verbal agreement to the next lower level of subprocesses.

At Level 1, we first decide which, if any, processes need decomposition. What happens when you create customer? A quick definition of fields and the type of validations required is necessary. According to the information (see Chapter 2), we need customer phone, customer name, customer address, and credit card ID, number, and expiration date. Validation for these fields is that the data are present and legal for the data type. For complex validation, you frequently use extra cross-reference files to contain the legal codes and their meanings.

TABLE 7-1 Decision Table for Decomposing Another Level of Detail

Conditions

Domain Knowledge: H H H H H L L L L

Language: 4GL 3GL 3GL 3GL 2GL 2GL 2GL 4GL 4GL 3GL 2GL

Similar Experience: Y N N Y N N

Simple Process/Few Files or Complex Process or Many Files: S C S C S C

Recommended Decomposition Levels

Level 0: X X X X X X X X X X X

Level 1: Opt. X X X X X X Opt. X X X

Level 2: Opt. Opt. X Opt. X X Opt. X X X

Level 3 ... n: X Opt. X Opt. X X

Legend:

H Extensive experience
L Little experience
4GL Fourth Generation Language, e.g., SQL
3GL Third Generation Language, e.g., COBOL
2GL Second Generation Language, e.g., Assembler
Y Yes
N No
S Simple
C Complex

Do we also need to provide modify and delete processing for customers? The answer is always yes, and query processing is needed as well. Now, we need to know the implementation language to decide whether or not to decompose further. The decision table shown in Table 7-1 summarizes the decision criteria and the most likely outcomes. Keep in mind that you can always go to another level of detail and can always get some benefit from the exercise. But why do the work if you don't have to?

FIGURE 7-26 ABC Video Final Level 0 DFD

We are planning to build this application for a LAN environment, using a 4GL, nonprocedural language. For create customer there are no other data stores needed for validation. There will be add, change, delete, and query processing. The corresponding decision cell (4GL, simple process, one file) shows Level 1 to be optional. The decision depends on who is doing the programming. Is the person experienced with similar applications? Is the person involved in analysis fully knowledgeable about the requirements for this application? If the answer to either of these questions is 'no,' the next level of DFD should be developed with the details entered in the dictionary.

For ABC Rental Processing, we will opt not to discuss development of the Level 1 DFD for create customer. We will change the process name to 'maintain customer' to denote the more general and expanded processing. The final Level 0 DFD is Figure 7-26; the Level 1 DFD is shown as Figure 7-27 for reference.

A similar set of arguments for Process 4.0, 'create video,' is possible. We also rename that process 'maintain video' to denote the expanded processing, and omit the Level 1 DFD.

FIGURE 7-27 ABC Rental Level 1 DFD for Maintain Customer

Both rental processing and return processing should be expanded regardless of the implementation language because they are fairly complex and we have not yet discovered how they work, although we have described rental processing in some detail. First we examine the DFD from our knowledge so far, then expand it as required (see Figure 7-26). In the Level 0 DFD, the create rental process interacts with customers twice and with all three data stores. To untangle and clarify the processing of these five interactions, we decompose the process further.

The first interaction is to get rental information from the customer. The 'rental information' includes customer ID (or name) and video IDs (or names). The customer ID is used to validate the customer and get the rest of the customer information for the rental. Similarly, the video ID is used to validate the video and get the rest of the video information for the rental. Customer ID is also used to check for late fees and to retrieve outstanding rentals. We also know that if the customer is not on file, we want to initiate process 3.0, maintain customer. When combined, this processing is fairly complex and somewhat extensive. It is complete when the clerk does something to show that entry of rentals is complete. We can group these processes together and call them 'get valid rental' (process 1.1) because once these actions are complete, the rental is ready for the next step of processing. The detailed steps we identified are either used to create another level of DFD or are documented in the dictionary for process 1.1.
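The steps grouped under 'get valid rental' can be sketched in code. The following Python fragment is only an illustration under stated assumptions: the three dictionaries and lists stand in for the customer, video, and rental files, the sample records are made up, and the names `get_valid_rental` and `CustomerNotOnFile` are hypothetical.

```python
# Hypothetical in-memory stand-ins for the customer, video, and rental files.
customer_file = {"555-0100": {"name": "Pat Lee"}}
video_file = {"V001": {"title": "Casablanca", "price": 2.00}}
open_rentals = [
    {"customer_id": "555-0100", "video_id": "V009", "late_fee": 1.50},
]


class CustomerNotOnFile(Exception):
    """Signals that processing should switch to maintain customer (3.0)."""


def get_valid_rental(customer_id, video_ids):
    # Validate the customer and get the rest of the customer information.
    customer = customer_file.get(customer_id)
    if customer is None:
        raise CustomerNotOnFile(customer_id)

    # Retrieve outstanding rentals and any late fees for this customer.
    outstanding = [r for r in open_rentals if r["customer_id"] == customer_id]
    late_fees = sum(r["late_fee"] for r in outstanding)

    # Validate each video and get the rest of the video information.
    lines = []
    for video_id in video_ids:
        video = video_file.get(video_id)
        if video is None:
            raise KeyError(f"video {video_id} not on file")
        lines.append({"video_id": video_id, **video})

    return {
        "customer_id": customer_id,
        "customer": customer,
        "lines": lines,
        "outstanding": outstanding,
        "late_fees": late_fees,
    }
```

The returned dictionary plays the role of the valid rental that is handed to the next step of processing. Raising an exception for an unknown customer is one way to model the switch to maintain customer; the caller decides how to resume the rental afterwards.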

FIGURE 7-28 ABC Rental Processing Level 1 DFD

A valid rental is totaled by adding all of the rental fees for the current set of entries and any late fees outstanding from past rentals. Once the total is displayed, the amount of money paid by the customer is entered into the system by the clerk. The total paid is subtracted from the total due to get the change due to the customer. When the change and total due amounts are both zero, the rental is complete and ready for the last part of the process. Because this stage is discrete, beginning with the successful validation and ending when the change and total due are zero, we group these actions together and call them 'process fees and money' (see Figure 7-28).
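The arithmetic in 'process fees and money' is small enough to sketch. The Python below is one possible reading, not the book's implementation: it assumes the clerk may enter several payments until the total due is covered, and it uses `Decimal` to avoid floating-point rounding of money. The function name and parameters are illustrative.

```python
from decimal import Decimal


def process_fees_and_money(rental_fees, late_fees, payments):
    """Return (total still due, change owed to the customer)."""
    # Total due is all rental fees for this rental plus outstanding late fees.
    total_due = sum(rental_fees, Decimal("0")) + late_fees
    change = Decimal("0")
    for paid in payments:              # each amount the clerk enters
        if paid >= total_due:
            change = paid - total_due  # money to hand back
            total_due = Decimal("0")
            break
        total_due -= paid              # partial payment: keep asking
    return total_due, change
```

For example, two rentals at 2.00 and 3.00 plus a 1.50 late fee give a total of 6.50; a payment of 10.00 leaves nothing due and 3.50 in change. The stage is finished once the change has been handed over and nothing remains due.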

Finally, a rental is completed by saving all the information in the rental file and printing the receipt for customer signature. When these actions are complete, the create and print rental process is complete (see Figure 7-28).

Notice that we have decomposed the data flows as well as the processes. Where we group rental and payment on the Level 0 diagram, we separate them on the Level 1 diagram. We add change to the process because now we are dealing with the details. Similarly, the data flows connecting to the data stores are decomposed to show details of data passing back and forth. On a DFD, we assume all data can be passed when the data flows are not labeled, and it is okay to summarize on Level 0. At Level 1, we become specific and show the interface accurately and in detail.

When you are drawing the DFD, you have to guard against being too detailed. This is difficult, especially for novice analysts. If your drawing has any of the symptoms below, it is too detailed, and you must combine processes to a higher level of abstraction. The semantic process problems to look for are listed with examples below. These problems violate one or more of the DFD Semantic Rules and Heuristics:

1. Processes that have only one data flow from the previous process as their input are probably overspecified. The solution is to combine the data flows (see Figure 7-29). Another solution may be the addition of a missing external entity (see Figure 7-30).

FIGURE 7-29 Example of Pathological Data Flow

2. When several processes have interactions with the same external entity and at least one process has no other interactions, check that the data flows and transformations are different. If any two processes have the same outflow or are closely related, that is, passing one's input data to the next, they are probably overspecified. It may be possible to localize all external entity interactions in one process, and to perform all processing on the information obtained in the other process (see Figure 7-31).

3. When several processes have interactions with the same file and at least one process has no other interactions, check that the file contents read/written and transformations are different. One goal of all applications is efficiency. If you read the same data more than once, it is inefficient. It is somewhat better to pass the data between processes. If you are identifying only logical processing and have the reading to show where data is used, make a note that during design you will need to redevelop the DFD to show physical reads of the file. It may save time to redevelop the DFD at this stage rather than wait. Several solutions are possible (see Figure 7-32). In the first solution, all file interactions are localized in one process; in the other, inputting from the external entity and file are in one process and outputting is in the other. Both of these solutions require rethinking of the functional decomposition.

4. If several processes have more than one write to the same file, check that the processes are distinct and that the data must be written disjointly. Again, to have efficient file processing, minimal reading and writing is desired. The alternatives are to localize reading and writing as in the first solution (see Figure 7-33), or to combine some of the processing but include writing in more than one process as in the second solution.

FIGURE 7-30 Example of Spontaneous Process

FIGURE 7-31 Example of Overspecified Entity Processes

5. Any imbedded if-then-else logic that describes process interaction is wrong. Remove the logic by consolidating the processes. The logic belongs inside the process box, not outside; one solution is shown as Figure 7-34. If this problem occurs, make a note to include the control on the structure chart for the if-then-else logic, as required.

6. Processes that do only one very minor process, for instance, check customer number for validity, may be overspecified. A better process would check the customer information, do a credit status check, and identify outstanding late fees (see Figure 7-35). This example is an improvement because it is reading and validating all customer data only once.

7. Make sure that no physical entities, such as cash register or bar code reader, have sneaked into the DFD. Also make sure that no immediate users of the application are identified on the DFD. The solution to this problem is to remove all physical entities on any diagram in which they occur (see Figure 7-36).

8. Make sure that data flow names are field contents being passed or some group name for field contents that clearly identifies the information (see Figure 7-37). Unnamed data flows are frequently masking overspecified processes. If you cannot develop a unique, meaningful name, reevaluate the processes to which the flow attaches.

9. Data stores may show up on diagrams multiple times with the same name. To show that you know it is repeated, place a vertical bar down the left side of the file symbol.

10. Similarly, data flow names may show up multiple times with the same name. This condition is okay if, and only if, the contents are identical. This condition is rare, so when multiple data flows with the same name are present, there is frequently an error. Double check any data flows with the same name and give any unique data flows their own descriptive name (see Figure 7-38).

11. To simplify the design phase activities, make sure that process names include the transformation name and identify the data being transformed.

12. If data stores have only one input or one output, check that it is correct. This condition may be okay on the input side as long as maintenance is performed in some other application, or for files that are cross-reference tables only. The condition for output-only connections may be correct, for instance, for temporal databases in which nothing is thrown away. Check the business rules relating to the data and verify the processing.

FIGURE 7-32 Example of Overspecified Read File Processing
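Two of the heuristics above lend themselves to a mechanical check: repeated data flow names (item 10) and data stores connected on only one side (item 12). The following is a minimal Python sketch, assuming flows are recorded as (source, target, name) tuples and reading item 12 as "a store that is only written or only read". The function names and the sample flows are hypothetical, and a flagged item is a prompt to check the business rules, not proof of an error.

```python
from collections import Counter


def stores_with_one_sided_access(stores, flows):
    """Heuristic 12: data stores that are only written to or only read from."""
    flagged = {}
    for store in stores:
        written = any(dst == store for _, dst, _ in flows)
        read = any(src == store for src, _, _ in flows)
        if written != read:
            flagged[store] = "input only" if written else "output only"
    return flagged


def repeated_flow_names(flows):
    """Heuristic 10: flow names used more than once deserve a second look."""
    counts = Counter(name for _, _, name in flows)
    return sorted(name for name, n in counts.items() if n > 1)


# Made-up sample diagram for illustration only.
sample_stores = ["Customer File", "Rental File", "Video File"]
sample_flows = [
    ("Maintain Customer", "Customer File", "Customer Info"),
    ("Customer File", "Create Rental", "Customer Info"),
    ("Create Rental", "Rental File", "Rental"),
    ("Maintain Video", "Video File", "Video Info"),
]
```

On the sample diagram, the rental and video files are flagged as input only, and 'Customer Info' is reported as a repeated name whose two uses should be compared for identical contents.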

For return processing, we need to walk through the process to determine whether we need subprocesses. A video ID is entered and used to retrieve the rental. The system assigns today's date as the return date. Late fees, if any, are computed. The total amount due is computed. The total amount due is displayed, an amount of money received from the customer is entered, and change is computed. When both total amount due and change are zero, payment processing is complete. If no late fees are owing or payment is complete, the open rental record is removed from the open rental file and history information is updated. If late fees are owed but not paid, the open rental record is rewritten with return date and late fee information. Return processing has several steps, but each is simple, requiring at most one file per step. There is little need for a Level 1 DFD for this process at this time.
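The return walk-through above can be sketched as follows. This Python fragment is an illustration under assumptions: the two-day rental period mirrors the rule "return date - rental date > 2" in the process dictionary entry for return rental, the per-day late fee is a made-up rate because the text does not give one, and the history update is left out. The open rental file is modeled as a dictionary keyed by video ID.

```python
from datetime import date
from decimal import Decimal

RENTAL_PERIOD_DAYS = 2                 # from "return date - rental date > 2"
LATE_FEE_PER_DAY = Decimal("1.00")     # hypothetical rate, not from the text


def late_fee(rental_date, return_date):
    """Late fee for a rental returned on return_date."""
    days_late = (return_date - rental_date).days - RENTAL_PERIOD_DAYS
    return LATE_FEE_PER_DAY * max(days_late, 0)


def return_rental(open_rentals, video_id, today, paid_in_full):
    """Apply the return rules to one open rental and return the late fee."""
    rental = open_rentals[video_id]
    fee = late_fee(rental["rental_date"], today)
    if fee == 0 or paid_in_full:
        # No late fees owing, or payment complete: remove the open rental.
        del open_rentals[video_id]
    else:
        # Late fees owed but not paid: rewrite with return date and fee.
        rental.update(return_date=today, late_fee=fee)
    return fee
```

A rental made on the 1st and returned on the 5th is two days late under these assumptions; if the fee is not paid, the record stays in the open rental file with the return date and fee recorded.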

Notice that process fees and money is identical to the same process in rental processing. We can develop a common, reusable module for both rental and return processes. Also, notice that we introduce history here. If we decide to have a history file, it would show at this level of DFD.

At this point, we are ready to reevaluate the new DFDs and proceed to development of dictionary entries for all DFD information. Check the final DFDs for legal connections, similar levels of abstraction, and balanced net inflows and outflows between levels. Then, continue to the data dictionary.
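The balancing check between levels can be illustrated with a short sketch. The Python below assumes flows are stored as (source, target, name) tuples and that a `composition` table records how a grouped Level 0 flow decomposes into Level 1 flows. The names `expand` and `is_balanced` are illustrative, and 'Paid Rental' is a hypothetical name for the flow between the last two subprocesses.

```python
def expand(names, composition):
    """Replace each grouped flow name with its lower-level components."""
    result = set()
    for name in names:
        result |= composition.get(name, {name})
    return result


def is_balanced(parent_flows, process, child_flows, children, composition):
    """True if the child diagram's net flows match the parent process."""
    parent_in = {n for s, d, n in parent_flows if d == process}
    parent_out = {n for s, d, n in parent_flows if s == process}
    # Only flows that cross the boundary of the child diagram count.
    child_in = {n for s, d, n in child_flows
                if d in children and s not in children}
    child_out = {n for s, d, n in child_flows
                 if s in children and d not in children}
    return (expand(parent_in, composition) == child_in
            and expand(parent_out, composition) == child_out)


level0 = [
    ("Customer", "Create Rental", "Rental, Payment"),
    ("Create Rental", "Customer", "Rental Receipt"),
]
level1 = [
    ("Customer", "Get Valid Rental", "Rental"),
    ("Get Valid Rental", "Process Fees and Money", "Valid Rental"),
    ("Customer", "Process Fees and Money", "Payment"),
    ("Process Fees and Money", "Create and Print Rental", "Paid Rental"),
    ("Create and Print Rental", "Customer", "Rental Receipt"),
]
children = {"Get Valid Rental", "Process Fees and Money",
            "Create and Print Rental"}
composition = {"Rental, Payment": {"Rental", "Payment"}}
```

With these flows the two levels balance. Adding a 'Change' flow from process fees and money to the customer at Level 1 makes the check fail until that flow is also accounted for at the parent level, which is the kind of discrepancy the balancing review is meant to surface.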

Develop Data Dictionary

In this section we briefly discuss the contents and rules, if any, for each type of dictionary entry. Then, we will document the information from the ABC rental application. Since you have seen examples of each type of entry, this section is short.


Data Dictionary Contents and Rules-Entities

FIGURE 7-33 Example of Overspecified Write File Processing

FIGURE 7-34 Example of If-then-else Logic in DFD

The contents of the dictionary for external entities are listed in Table 7-2. The most important are the name and the definition of the entity. In organizations with data administration functions, this information must conform to the 'corporate' dictionary definitions or must be reconciled with them to define new terms. The SEs work with users and data administrators to name and define the entities for the organization. IS personnel do not name and define the terms by themselves. Most external entities are people, job titles, organizations, or applications with which the application under development interacts. Choose a meaningful business name that describes the entity accurately and completely. If you have a data administration function, use their name. The definition should be a business definition and should be completely independent of any technology.

TABLE 7-2 Data Dictionary Entity Contents

Entity name

Aliases

Definition

Relationship to application

Contact, if entity is an organization

Make sure you include in the definition any aliases or names used in your application that do not conform to the corporate standard. Describe the entity's relationship to the application in terms of the nature and timing of the interaction. If the entity is an organization, include the name, address, and phone number of the person most frequently contacted.

Figure 7-39 shows the notation to be used in describing the contents of an entity to a dictionary. Keep in mind that this convention works well if you are using a manual method. Automated tools have their own format and notation for repository contents. There is one notational structure for each type of entry: optional information, multiple repeating information, required information, selection between attributes, and primary keys.

ABC Example Data Dictionary-Entities

The external entities in ABC Rental are customer, vendor, and accountant. The entries for each of these are shown in Table 7-3. If the accountant is an employee, you would not include his or her name in the dictionary. If the accountant is an outside firm, you would include the information.

FIGURE 7-35 Example of Excessive DFD Detail

FIGURE 7-36 Example of Physical Entities

Data Dictionary Contents and Rules-Processes

The contents of the dictionary for processes are listed in Table 7-4. For processes, we include the process number from the DFD to allow quality assurance, and to easily link back to the process model. If you use a computer-aided software engineering (CASE) tool, you usually have automatic linkage between the diagram and the dictionary entries. The name of the process should be exactly the same as the process name used in the DFD.

The process description details the steps to complete the process and can take several forms. The most common are pseudo-code and structured English, supplemented by decision trees or decision tables as needed. Pseudo-code uses the syntax from a language in abbreviated form for easy translation into the target language. Structured English is a computer-language independent description of a process using only simple verbs and terms from the dictionary; no adjectives or adverbs are used. Structured English is used here.

FIGURE 7-37 Example of Weak Data Flow Names

ABC Example Data Dictionary-Processes

The process entries for ABC are all included at the level 0 detail level (see Table 7-5). To document the entire application, you would create a data dictionary entry for each lower level process, then refer to that process in the higher level dictionary entries. In this way, the hierarchy of processing and linkages are documented.

Notice that there are some uneven levels of detail in the process entries. For instance, the process fees and money routine is fairly detailed, while the reference to create history in return rental is not detailed at all. You document the information you have, replacing the high level abstract thoughts with the details as you come to know them. The dictionary is constantly evolving and changing as more information becomes known.

Data Dictionary Contents and Rules-Data Stores

The data store defines persistent data; contents of a data store dictionary entry are listed in Table 7-6. There is a significant amount of detail that is eventually documented. You begin completing the information as it becomes known and complete the rest when it is available. Also, some of the information may not be relevant in your organization (for instance, if all projects always use DB2 relational files, you may not need detailed documents because the information already exists). The goal of the documentation is to present necessary information without much verbiage. Keeping that in mind, trim the dictionary entries to fit your situation.

FIGURE 7-38 Example of Nonunique Data Flow Names

ABC Example Data Dictionary-Data Stores

The dictionary entries for data stores are in Table 7-7. For now, we know very few of the details about, for instance, volume, growth, and security. Those entries are left blank.

Above, we said that you trim the contents of the dictionary entries to fit the project. In a consulting situation, such as the one Mary and Sam are in at ABC, the likelihood of them also maintaining the application is unknown. So, the more detailed the documentation, the more you simplify future maintenance.

Data Dictionary Contents-Data Flows

Data flow contents are important pieces of documentation because they cause the creation and change of files and determine the data each process actually accesses. The data flow contents are shown in Table 7-8. Contents have a primary key to uniquely identify the data. The difference between primary key for a data flow and for a data store is one of time. What period of time is the flow 'alive'? Data flows usually have a short life, which means that less data is required for a unique ID. For instance, the flow payment is a money amount, which is acceptable here. At the implementation level, that field might also require a terminal ID or a transaction ID to be unique; implementation requirements are not dealt with in analysis. Data flow constraints are most often present in real-time applications or in applications with contingent processing of data. The source of the data flow is a cross-reference back to the entity, process, or file from which it flows.

FIGURE 7-39 Data Dictionary Notation*

Symbol     Definition

=          is composed of

+          and

( )        Parentheses show an optional entry which may or may not be present

n{ }m      Braces show iteration; n is minimum entries, m is maximum entries. If no limit to entries, the maximum is shown as m.

[ ]        Square brackets identify selection from among alternatives

|          Vertical bar is a separator of alternative choices within square brackets

* *        Comment

___        Underline identifies a component of a primary key

* Adapted from Yourdon, Edward, Modern Software Engineering. Englewood Cliffs, NJ: Prentice-Hall, Yourdon Press, 1989, p. 191.
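The notation for optional, repeating, and key components can also be expressed as data and used to check a record against its dictionary definition. The following Python sketch is one possible encoding, not a standard tool format. The `Field` class, the `validate` function, the limit of ten videos per rental, and the choice of which attributes are optional are all assumptions made for illustration; selection among alternatives is not covered.

```python
from dataclasses import dataclass


@dataclass
class Field:
    name: str
    optional: bool = False   # ( ) in the notation
    min_count: int = 1       # n in n{ }m
    max_count: int = 1       # m in n{ }m
    key: bool = False        # underlined component of the primary key


def validate(record, definition):
    """Return a list of ways the record departs from the definition."""
    errors = []
    for field in definition:
        values = record.get(field.name)
        if values is None:
            if not field.optional:
                errors.append(f"{field.name} is required")
            continue
        if not isinstance(values, list):
            values = [values]          # a single value counts as one entry
        if not field.min_count <= len(values) <= field.max_count:
            errors.append(
                f"{field.name} must occur "
                f"{field.min_count}-{field.max_count} times"
            )
    return errors


# Hypothetical definition of a rental data flow.
rental_flow = [
    Field("Customer Phone", key=True),
    Field("Video ID", max_count=10),
    Field("Customer Address line 2", optional=True),
]
```

A record with a phone number and two video IDs passes with no errors, while a record with no phone and an empty video list is reported twice. Programmers could use a check of this kind to confirm that a module receives the data the dictionary says it should.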

ABC Example Data Dictionary-Data Flows

The data flows for ABC rental processing are shown in Table 7-9. There is nothing difficult about any of them. Keep in mind that these definitions are not cast in concrete; they can change whenever the need arises. It is important to keep this information up to date, because programmers use the dictionary to check that their modules are receiving the correct information.


TABLE 7-3 ABC Entity Dictionary Entries

Entity Name: Customer
Aliases: None
Definition: A Customer is any individual, organization, or other entity authorized by ABC management to rent videos.
Relationship: Rents and pays for videos
              Signs rental order
              Provides new customer information
              Returns videos
Contact: N/A

Entity Name: Video Vendor
Aliases: Vendor
Definition: A Video Vendor is any organization or individual from which ABC purchases or otherwise acquires videos.
Relationship: Provides new video information
Contact: N/A

Entity Name: Accountant
Aliases: None
Definition: The employee providing accounting services for ABC video.
Relationship: Gets end-of-day summary accounting reports
Contact: N/A

TABLE 7-4 Data Dictionary Process Contents

Process ID Number

Process Name

Process Description

Constraints (e.g., concurrence, sequential with another process, time-out, etc.)


Data Dictionary Contents-Attributes

Attributes, or fields, are facts about an entity. Attribute definitions are tedious and tend to be overdocumented unless you are using a CASE tool. As you can see from Table 7-10, there is a large amount of information about attributes that is needed to fully document them. In organizations with a data administration function, much of the information for the type of attributes used here would already be documented, and you would just copy that documentation.

TABLE 7-5 ABC Process Dictionary Entries

Process Number: 1.0
Process Name: Create Order
Description:
    For each customer,
        Enter customer ID (or name)
        Read customer file using customer ID (or name) as key
        If NOT present
            display 'Customer not currently on file, switching to create customer'
            Call New-customer routine.
        Display all customer information.
        Read Rental file using customer ID
        If rentals exist,
            display rentals
            If returns
                Call Return routine
            else
                continue
        else
            If late fees outstanding add late fees to total.
        For each video,
            Read inventory file using video ID (or description) as key
            If NOT present display 'Video not on file, switching to create video'
            Display video description (or number), price.
        Add all extended price to total.
        Perform process-money routine.
        Write order to order-file.
        Print order confirmation.
        Return.

    Process money routine
        Display total.
        Get amount.
        Subtract total from amount giving change.
        Display change.
        If change and total = zero,
            return,
        else
            go to process money.
Constraints: None

Process Number: 2.0
Process Name: Return Rental
Description:
    For each video,
        Enter video ID
        Retrieve rental
        If NO rental, display error message and return.
        Use Customer ID to retrieve other rentals.
        Display entire rental.
        Move today's date to return date.
        If return-date - rental-date > 2
            compute late charges
            display late charges
            add late charges to total.
        Create history.
        If new rentals,
            return
        else
            call process money routine.
Constraints: None


TABLE 7-5 ABC Process Dictionary Entries (Continued)

Process Number: 3.0
Process Name: Maintain Customer
Description:
    If new
        create new customer
    else if modify
        prompt customer ID
        retrieve customer record
        get changes and verify
        rewrite customer
    else if delete
        prompt customer ID
        retrieve customer record
        prompt "Are you sure you want to delete?"
        If yes,
            delete customer
        else
            continue
    else if query
        call query routine.
    Return.
Constraints: None

Process Number: 4.0
Process Name: Maintain Video
Description:
    If new
        create new video
    else if modify
        prompt video ID
        retrieve video record
        get changes and verify
        rewrite video
    else if delete
        prompt video ID
        retrieve video record
        prompt "Are you sure you want to delete?"
        If yes,
            delete video
        else
            continue
    else if query
        call query routine.
    Return.
Constraints: None

Process Number: 5.0
Process Name: Create EOD Report
Description:
    Read rental file
        count today's rentals
        total today's rental receipts
    Read cash register
        count today's returns
        count today's late returns
        total today's late fees
        count today's rentals
        total today's rental receipts
    Format and print end-of-day summary report.
Constraints: None

ABC Example Data Dictionary-Attributes

As the two examples provided in Table 7-11 show, the contents get quite long and take quite a bit of paper. In the interest of saving a few trees and keeping the dictionary usable, when using a paper dictionary, capture only the essential information about attributes and put it in a short-form attribute table as shown in Table 7-12. Essential information is usually the user name, system name, data type, data length, and edit rules. If there is other information required, such as security restrictions or cross-reference file names, you would add it for that attribute but not all of the others. The short form is used in this text to document ABC's attributes.

TABLE 7-6 Data Dictionary Data Store Contents

Data Store Name

Aliases

Definition

Data Attributes (Contents in normalized form)

Data Structure (e.g., relation, hierarchy)

Organization (e.g., VSAM entry sequenced)

Sequence and sequence attributes

Size of Relations/Records

Primary Key

Alternate Keys

Index Attributes

TABLE 7-6 Data Dictionary Data Store Contents (Continued)

Volume

Percent change per cycle

Frequency of cycle (e.g., as occurs, daily, weekly, etc.)

Growth percentage per year

Allowable actions (read, write, or read/write) by process

Security access restrictions

Backup/recovery requirements

Special processing considerations

If in a distributed environment, form of partitioning, schematic showing number/location of replications for each partition.

AUTOMATED SUPPORT TOOLS

Structured analysis and process methods, in general, are the oldest and most widely used methods. Because they are most widely used, a large number of CASE tools to support structured analysis are available on the market. All of the tools support DFDs; all have a dictionary (although they are not all 'active'). A table of representative CASE tools supporting structured analysis is listed below in Table 7-13.

If you did not get the impression that CASE tools represent a 'buyer beware' situation, perhaps some comments from a recent survey will show that they do. Data flow diagrams in 12 CASE environments were compared on DFD correctness checking. 11 The authors developed 19 rules by which automated DFDs might be evaluated. The most rules checked by any of the CASE tools evaluated was 13 (by two CASE tools); the fewest was three; the average was eight. The extent of intelligence in CASE tools obviously varies and is inconsistent with the collective wisdom about how DFDs should be developed and drawn.

11 See Vessey, Jarvenpaa, & Tractinsky [1992].

Thus, there are many CASE tools available which 'support' structured analysis. The tools vary widely in the diagrams supported and in the extent to which rules about developing DFDs and other diagrams are enforced.
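To make the idea of automated DFD rule checking concrete, here is a minimal sketch in Python. The two rules it applies (every process needs at least one inflow and one outflow, and every flow must touch a process) are commonly taught DFD rules chosen for illustration; they are not claimed to be among the 19 rules of the survey. The names Flow, check_dfd, and the sample nodes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    name: str
    source: str
    target: str


def check_dfd(kinds, flows):
    """Return a list of rule violations for a small DFD.

    kinds maps each node name to 'process', 'store', or 'entity'.
    """
    errors = []
    # Illustrative rule 1: a process must have an inflow and an outflow.
    for node, kind in kinds.items():
        if kind != "process":
            continue
        if not any(f.target == node for f in flows):
            errors.append(f"process '{node}' has no inflow")
        if not any(f.source == node for f in flows):
            errors.append(f"process '{node}' has no outflow")
    # Illustrative rule 2: data cannot move without a process on one end.
    for f in flows:
        if kinds[f.source] != "process" and kinds[f.target] != "process":
            errors.append(f"flow '{f.name}' does not touch a process")
    return errors


kinds = {"Customer": "entity", "Rent Video": "process", "Rental File": "store"}
flows = [
    Flow("Rental", "Customer", "Rent Video"),
    Flow("New Rental", "Rent Video", "Rental File"),
    Flow("Bad Flow", "Customer", "Rental File"),  # entity straight to store
]
violations = check_dfd(kinds, flows)
```

A tool that checks only a handful of such rules will accept diagrams that a tool checking more rules rejects, which is the variation the survey reports.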

SUMMARY

Process-oriented structured analysis originated in the work of DeMarco, Gane and Sarson, and Yourdon. In structured analysis, we first define the application context, then follow a top-down approach to progressively more detailed levels of process analysis. The application is documented


TABLE 7-7 ABC Data Store Contents

Data Store Name: Customer File
Aliases: None
Definition: A computer file of information about customers who are allowed to rent from ABC.
Data Attributes: Customer Phone = [Area code + exchange + number] + Customer Last Name + Customer First Name + Customer Address line 1 + Customer Address line 2 + Customer City + Customer State + Customer Zip+4 + Credit Card Type + Credit Card Number + Credit Card Expiration Date + Date of entry
Data Structure: Relational
Organization: Random
Sequence: Entry
Sequence Attributes: N/A
Record Size: 198 Bytes decompressed
File Size:
Primary Key: Customer Phone
Alternate Keys: Address line 1
Index Attributes: Customer last name, Customer zip, Credit Card Number, Address line 1
Volume:
Percent Change:
Cycle Frequency:
Growth:
Allowable actions by process:
Security Access:
Backup/Recovery:
Special processing:

Data Store Name: Rental File
Aliases: None
Definition: A computer file of rental orders outstanding. When a rental is made, it is added to the file. When it is returned, if there are no late fees, it is removed. If there are late fees, the rental stays on file until the late fees are paid.
Data Attributes: Customer Phone + Customer Last Name + Customer First Name + Rental Date + Video ID + Video Title + Date Due + Date Returned + Rental Price + Late Fees
Data Structure: Relational
Organization: Random
Sequence: Entry
Sequence Attributes:
Size: 134
Primary Key: Customer Phone + Video ID
Alternate Keys:
Index Attributes: Customer Last Name, Customer Phone, Video ID, Customer Phone + Video ID, Video Title
Volume:
Percent Change:
Cycle Frequency:
Growth:
Allowable actions by process: Rental = Add, Change, Read; Return = Change, Delete, Read
Security Access:
Backup/Recovery:
Special processing:

TABLE 7-8 Data Dictionary Data Flow Contents

Data Flow Name

Aliases

Timing (e.g., as occurs, daily, weekly, etc.)

Contents

Constraints (e.g., requires 5-second response; only occurs for sales orders, etc.)

Source

TABLE 7-9 ABC Data Flow Dictionary Entries

Data Flow Name: New Customer
Aliases: None
Timing: As occurs
Contents: Customer Phone = [Area code + exchange + number] + Customer Last Name + Customer First Name + Customer Address line 1 + Customer Address line 2 + Customer City + Customer State + Customer Zip+4 + Credit Card Type + Credit Card Number + Credit Card Expiration Date + Date of entry
Constraints: None
Source: Customer

Data Flow Name: Rental
Aliases: Rental Information
Timing: As Occurs
Contents: [Customer Phone | Customer Name] + 1{[Video ID | Video Name]}m
Constraints: None
Source: Customer

Data Flow Name: Payment
Aliases: Money
Timing: One per complete rental transaction
Contents: Total Paid
Constraints: None
Source: Customer

Data Flow Name: Copy of Order
Aliases: Printed Rental Order
Timing: One per complete rental transaction
Contents: = Rental
Constraints: None
Source: System

Data Flow Name: Return
Aliases: Video Return
Timing: As Occurs
Contents: Video ID + (Customer Phone)
Constraints: None
Source: Customer

Data Flow Name: Late Fee Payment
Aliases: None
Timing: As Occurs
Contents: Total Late Fee
Constraints: May be included within rental payment
Source: Customer
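The composition notation used for data flow contents can be read mechanically. As a minimal sketch, the Python function below checks a record against the Rental flow contents, [Customer Phone | Customer Name] + 1{[Video ID | Video Name]}m. It assumes that [a | b] means exactly one of the alternatives and that 1{...}m means one or more repetitions; the dictionary keys and the function name are hypothetical.

```python
def is_valid_rental(flow):
    """Check a dict against [Customer Phone | Customer Name] + 1{[Video ID | Video Name]}m."""
    # [a | b]: exactly one of the two alternatives identifies the customer.
    has_phone = "customer_phone" in flow
    has_name = "customer_name" in flow
    if has_phone == has_name:
        return False
    # 1{...}m: the bracketed group repeats at least once.
    videos = flow.get("videos", [])
    if len(videos) < 1:
        return False
    # Each repetition is itself a choice: [Video ID | Video Name].
    return all(("video_id" in v) != ("video_name" in v) for v in videos)


ok = is_valid_rental({"customer_phone": "4045551234", "videos": [{"video_id": 17}]})
no_videos = is_valid_rental({"customer_phone": "4045551234", "videos": []})
both_ids = is_valid_rental(
    {"customer_phone": "4045551234", "customer_name": "Smith", "videos": [{"video_id": 17}]}
)
```

Writing the check out this way shows what the dictionary entry leaves open, for example whether a customer may supply both a phone number and a name.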


TABLE 7-10 Data Dictionary Attribute Contents

Attribute User Name

System Name

Aliases

Attribute Definition

Data Type

Data Length

Allowable values and meanings

Creating Process(es)

Primary Data Store

Other files where stored

Flows where used

Edit/Validation Rules

Validation Method (e.g., cross-reference file, code check, etc.)

Security access restrictions

Special processing considerations

TABLE 7-11 Sample ABC Attribute Dictionary Entries

User Name: Customer Phone
System Name: CPhone
Aliases: None
Attribute Definition: The customer's phone number
Data Type: Numeric
Data Length: 10, Area code (3), exchange (3), and number (4)
Allowable values and meanings: Numeric
Creating Process(es): Add customer
Primary Data Store: Customer
Other Files: Rental File
Flows: New rental order, Customer record, Rental, Return rental information
Edit/Validation: Numeric
Validation Method: Software check
Security Access: None
Special processing: None

User Name: Video ID
System Name:
Aliases: None
Attribute Definition: The numeric identifier for a specific videotape. Uniquely identifies a copy of a group of tapes with the same title.
Data Type: Numeric
Data Length:
Allowable values and meanings: Numeric
Creating Process(es): 4.1 Create video
Primary Data Store: Video File
Other Files: Rental File
Flows: Video Information, Rental Information, Return rental information, New Rental Order
Edit/Validation: Numeric
Validation Method: Software check
Security Access: None
Special processing: None


TABLE 7-12 ABC Attributes-Short Form Dictionary

User Name System Name Data Type Length Edit/Validation Rules

Customer Phone CPhone N

Customer Last Name CLast A

Customer First Name CFirst A

Customer Address Line 1 CLine1 A/N

Customer Address Line 2 CLine2 A/N

Customer City City A

Customer State State A

Customer Zip Zip N

Credit Card Type CCType A

Credit Card Number CCNo N

Credit Card Expiration Date CCExp N

Date of Entry EntryDate N

Credit Rating CCredit A

via graphical forms including a context diagram, a leveled set of data flow diagrams, a data dictionary, and, optionally, a state-transition diagram. Diagram symbols and their meanings include (1) circle, entire application; (2) square, external entity; (3) rounded vertical rectangle, process; (4) open-ended rectangle, data store; and (5) directed arrow, data flow. Each diagram symbol has a formal definition that is documented in a data dictionary. DFDs identify processes and the flow of data through those processes to achieve some business function. DFDs start at a high level of abstraction to summarize the processing taking place. At successively more detailed levels, procedural and data details are added to describe the processing

10 Must be present, Check for numeric

50 Must be present, Check for alpha

25 Must be present, Check for alpha

50 Must be present

50 None

30 Must be present, Check for alpha

2 Post Office Abbreviation

10 Must be present, numeric

1 A=AmExpress V=Visa M=Mastercard

17 Must be present, numeric

8 Valid Date, Format YYYYMMDD

1 0 = OK, 1 = not OK

in more detail. Graphical representation replaces much of the text, but does not completely replace text descriptions of individual processes. The data dictionary (or repository) is used to maintain definitions of all DFDs and other analysis information, including files, fields, flows, and external entities, in addition to processes.

The reasoning process for defining the application context and the detailed levels of data flow diagrams was presented. The definitions and contents of data dictionary entries were described. All diagrams and dictionary entries were developed using the ABC rental processing application to show variations and nuances in the thought processes.
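Edit/validation rules such as those in the short-form dictionary of Table 7-12 map directly onto small checking routines. The Python sketch below covers three of the rules: a 10-digit numeric phone number that must be present, a one-character credit card type code (A, V, or M), and an 8-character date in YYYYMMDD format. The function names are hypothetical, and the sketch assumes values arrive as strings.

```python
import re
from datetime import datetime


def check_cphone(value):
    """CPhone: must be present, numeric, length 10."""
    return bool(re.fullmatch(r"\d{10}", value))


def check_cctype(value):
    """CCType: A = AmExpress, V = Visa, M = Mastercard."""
    return value in {"A", "V", "M"}


def check_date_yyyymmdd(value):
    """Eight digits that also form a real calendar date."""
    if not re.fullmatch(r"\d{8}", value):
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return False
    return True


phone_ok = check_cphone("4045551234")
type_ok = check_cctype("V")
date_ok = check_date_yyyymmdd("19920229")
```

Keeping each rule in its own routine mirrors the dictionary, where each attribute carries its own validation entry.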

TABLE 7-13 CASE Support for Structured Analysis

Product

Analyst/Designer Toolkit

Anatool

Deft

Design/1

The Developer

Excelerator, Telon

IEW

MacAnalyst, MacDesigner

Maestro

MetaSystem Tool Set

Company

Yourdon, Inc. New York, NY

Advanced Logical SW Beverly Hills, CA

Deft Ontario, Canada

Arthur Anderson, Inc. Chicago, IL

ASYST Technology, Inc. Napierville, IL

Intersolv Cambridge, MA

Knowledgeware Atlanta, GA

Excel Software Marshalltown, IA

SoftLab San Francisco, CA

Meta Systems Ann Arbor, MI


Technique

Context Diagram Data Flow Diagram (DFD) State-Transition Diagram

DFD Structured English

DFD

DFD Warnier-Orr Diagram

DFD Matrix Diagram (for decision tables and real-time systems)

DFD State-Transition Diagram Matrix graph (for real-time systems)

DFD Database diagram

DFD Decision Table State Transition Diagram Structured English

DFD


TABLE 7-13 CASE Support for Structured Analysis, Continued

Product

Multi-Cam

PacBase

ProKit Workbench

ProMod

Silverrun

SW Thru Pictures

System Engineer

Teamwork

Transform

Visible Analyst

vs Designer

Company

AGS Management Systems King of Prussia, PA

CGI Systems, Inc. Pearl River, NY

McDonnell Douglas St. Louis, MO

Promod, Inc. Lake Forest, CA

Computer Systems Advisers, Inc. Woodcliff Lake, NJ

Interactive Dev. Env. San Francisco, CA

LBMS Houston, TX

CADRE Tech. Inc. Providence, RI

Transform Logic Corp. Scottsdale, AZ

Visible Systems Corp. Newton, MA

Visual Software Inc Santa Clara, CA

Technique

DFD State-Transition Diagram Matrix graph (for real-time systems)

Context Diagram DFD

DFD

DFD State-Transition Diagram

User-Controlled Modeling

Data Structure DFD State Transition Diagram

DFD

Decision Table DFD State Transition Diagram

Uses ProKit, Excelerator

DFD

DFD Ward-Mellor Diagram for real-time systems

REFERENCES

Curtis, B., M. I. Kellner, and J. Over, "Process modeling," Communications of the ACM, Vol. 35, #9, September 1992, pp. 75-90.

DeMarco, Tom, Structured Analysis. New York: Yourdon Press, 1979.

Frances, B., "A window into CASE," Datamation, March 1, 1992, pp. 43-44.

Gane, C., and T. Sarson, Structured Systems Analysis: Tools and Techniques. Englewood Cliffs, NJ: Prentice-Hall, 1979.

Gane, Chris, Computer-Aided Software Engineering: The Methodology, The Products and the Future. Englewood Cliffs, NJ: Prentice-Hall, 1990.

Krasner, J., J. Terrel, A. Lindhan, P. Arnold, and W. H. Ett, "Lessons learned from a software process modeling system," Communications of the ACM, Vol. 35, #9, September 1992, pp. 91-100.

Lee, T., "Bridging the CASE/OOP gap," Datamation, March 1, 1992, pp. 63-64.

Lindholm, E., "A world of CASE tools," Datamation, March 1, 1992, pp. 75-81.

Martin, James, Systems Design from Provably Correct Constructs. Englewood Cliffs, NJ: Prentice-Hall, 1985.

McClure, C., The Three R's of Software Automation: Re-Engineering, Repository and Reusability. Englewood Cliffs, NJ: Prentice-Hall, 1992.

McMenamin, Stephan M., and John F. Palmer, Essential Systems Analysis. NY: Yourdon Press, 1984.

Slater, D., "PacBase, IEF lead rising CASE satisfaction," Computerworld, August 3, 1992, p. 81.

Sullivan, Louis, "The tall building artistically considered," Lippincott's Magazine, March 1896.

Vessey, I., S. Jarvenpaa, and N. Tractinsky, "Evaluation of vendor products: CASE tools as methodology companions," Communications of the ACM, Vol. 35, #4, April 1992, pp. 90-105.

Yourdon, Edward, Modern Structured Analysis. Englewood Cliffs, NJ: Prentice-Hall, Yourdon Press, 1989.

KEY TERMS

attribute
balancing
bottom-up
cohesion
completeness checking
consistency checking
context
context diagram
correctness checking
coupling
cross reference file
data attribute
data dictionary
data flow
data flow diagram (DFD)
data store
direct identification
elementary components
external entity
field
file
function
functional sequence
level 0 DFD
level 1 ... n DFD
leveled set of DFDs
net inflows and outflows
outward-in
primitive level
process
process description
pseudo-code
quality assurance
structured decomposition
structured English
structured systems analysis
systems model
systems theory
top-down

EXERCISES

1. Complete the level 1 DFD for 2.0 Rental return process and discuss it in class. Compare several of the answers. Are they the same? Why, or why not?

2. Make a list of outstanding and deferred issues to discuss with Vic.

The next three questions have bothered my students for several years. For each question, identify and discuss the issues and ramifications of each decision, technical issues, user issues, legal or other issues.

3. How should customers be identified to the application? What are the security issues? What are the bureaucracy issues? Is there a way to 'minimize bureaucracy' and still have good security?

4. Should late fees relate to a person or a tape or a rental? What are the issues? How do you decide? Can Vic be helpful in deciding this issue?

5. Where should history get created: at tape rental time or at tape return time? Can Vic be helpful in deciding this issue? How do you decide?


STUDY QUESTIONS

1. Define the following terms: balancing, context, data flow, data store, direct identification, external entity, function, net inflow, top-down.

2. How do you define the scope of a project? Who should define the scope?

3. What is a leveled set of DFDs? How do you know you have that?

4. Why is the strategy of using net inflows and outflows from the previous level of DFD as a starting point for a new level of detail a good idea?

5. Is structured process analysis more like analyzing with a zoom feature on a set of photos or more like analyzing a geologic formation?

6. Define structured decomposition. Why do you use this technique?

7. What is the purpose of the data dictionary?

8. Discuss the reasoning process used in structured analysis. Does it guarantee that everyone will get the same analytical result? If not, why not?

9. How might the process of structured analysis be improved to be more rigorous, i.e., guarantee the same results regardless of who performs the analysis?

10. Evaluate the following diagram. What type of diagram is it? What is its purpose? Label errors and list all reasons why they are wrong. Redraw the diagram correctly.

[Diagram: external entity Customer; processes 1.0 Maintain Customer and 1.1 Customer Reporting; data flows Cust Info, Error, Valid Cust, and Cust Info Records; data store Customer File]

11. What are the major diagrams in the analysis phase? How are they derived?

12. List and briefly describe the five approaches to identifying processes.

13. Describe all data dictionary entries and give an example of each.

14. Why might CASE tools be useful in structured analysis?

15. Draw and identify five common DFD errors and their corrections.

16. Discuss the three types of quality checks done on DFDs.

* EXTRA-CREDIT QUESTIONS

1. The example used in Figures 7-15 through 7-20 refers to a psychiatric clinic and processing performed for Medicaid claim processing. Perform a structured analysis of this problem as described in the Appendix Case: The Child Development Clinic. Refer to the figures in the text to help you if you get stuck.

2. Perform a structured analysis of any of the problems in the Appendix. Decide what information in the problem description is relevant to an automated application. Then, build a context diagram, a leveled set of DFDs, and a data dictionary.

CHAPTER 8

PROCESS-ORIENTED DESIGN

INTRODUCTION

Structured design is the art of designing system components and the interrelationships between those components in the best possible way to solve some well specified problem. The main goal of design is to map the functional requirements of the application to a hardware and software environment. The results of structured design are programming specifications and plans for testing, conversion, training, and installation. In addition, the design may result in prototyping part or all of the application. This section discusses the mapping process and the development of program specifications. The other topics are discussed in Chapter 14.

The goals of structured design, as first documented by Yourdon and Constantine [1979], have not changed much over the years. They are to minimize cost of development and maintenance. We can minimize the cost of development by keeping parts manageably small and separately solvable. We can minimize the cost of maintenance by keeping parts manageably small and separately correctable. In design we determine the smallest solvable parts as a way of managing application complexity.

Conceptual Foundations

The concept 'form follows function' that informed analysis is again the basis for structured design. The application processes determine the form of the application. The divide and conquer principle guides the definition of the smallest solvable parts while keeping the general goals of maintainability and low cost in mind. Partitioning and functional decomposition are the basic activities used in dividing processes into modules. The basic input-process-output (IPO) model from the DFD results in a structure chart that adds a control component to the IPO model (see Figure 8-1).

Principles of good structured design are information hiding, modularity, coupling, and cohesion. Information hiding means that only data needed to perform a function is made available to that function. The idea is a sound one: You cannot mess up what you don't have access to. Modularity is the design principle that calls for design of small, self-contained units that should lead to maintainability. Following systems theory, each module should be a small, self-contained system itself. Coupling is a measure of intermodule connection with minimal


[Figure: the IPO model becomes the Control IPO, or CIPO, model]

FIGURE 8-1 Input-Process-Output Model and Structure Chart

coupling the goal (i.e., less is best). Cohesion is a measure of internal strength of a module with the notion that maximal, or functional, cohesion is the goal. These principles are related to the process of design in the next section.
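As a minimal sketch of information hiding in Python, the late-fee routine below receives only the two dates it needs rather than the whole rental record, so it cannot alter the customer or video fields. The function name, the dictionary keys, and the fee of 1.00 per day are hypothetical and are not taken from the ABC case.

```python
from datetime import date


def late_fee(date_due, date_returned, fee_per_day=1.00):
    """Compute a late fee from the two dates alone."""
    days_late = (date_returned - date_due).days
    # An early or on-time return yields no fee.
    return max(days_late, 0) * fee_per_day


rental = {
    "customer_phone": "4045551234",
    "video_id": 17,
    "date_due": date(1992, 3, 1),
    "date_returned": date(1992, 3, 4),
}
# The caller passes only what the function needs, not the whole record.
fee = late_fee(rental["date_due"], rental["date_returned"])
```

The narrow parameter list also keeps coupling low: a change to the layout of the rental record does not touch late_fee.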

DEFINITION OF STRUCTURED DESIGN TERMS

The major activities of structured design are:

1. Transform or transaction analysis of DFD

2. Refine and complete structure chart

3. Identify load units and program packages

4. Define the physical database

5. Develop program specifications

The terms associated with each of these activities are defined in this section and summarized in Table 8-1.

In design we partition the application to divide subprocesses into codifiable program modules. Partitioning is the divide and conquer strategy by which we divide existing subprocesses from the DFD into groups for implementation. The two methods of partitioning used are transform analysis and transaction analysis.

DFD processes transform data from one form to another; these transformations will eventually be automated by programs each containing several modules. Transform analysis is the process of identifying the clusterings of subprocesses based on their major functions. The functions are either input, output, or transform-oriented. The input-oriented processes are called afferent flows. Afferent means bringing inward to a central part. Afferent processes read data and prepare it for processing. The output-oriented processes are called efferent flows, where efferent means moving away from the central part. Efferent processes write, display, and print data. The remaining processes are collectively called the central transform. The central transform processes have as their major function the change of information from its incoming state to some other state.

An example of a data flow diagram with its afferent and efferent flows and its central transform identified is shown in Figure 8-2. Notice that multiple afferent or efferent flow streams may be found.

TABLE 8-1 Structured Design Concept Definitions

Stepwise refinement: The process of defining functions that will accomplish a process; includes definition of modules, programs, and data

Program morphology: The shape of a program, including the extent of fan-out, fan-in, scope of control, and scope of effect

Data structure: The definition of data in an application includes logical data definition and physical data structure

Modularity: A property of programs meaning they are divided into several separate addressable elements

Abstraction: Attention to some level of generalization without regard to irrelevant low-level details

Information hiding: Design decisions in one module are hidden from other modules

Cohesion: A measure of the internal strength of a module

Coupling: A measure of the intermodule strength of a module

The streams are partitioned off from the rest of the diagram by drawing arcs showing where they end.
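The partitioning step can be pictured as cutting an ordered chain of processes at two boundary flows. The Python sketch below assumes a single linear stream and assumes the designer has already judged where the afferent stream and the central transform end; the code only records that judgment. The flow names Good Input, Solution, and Printed Solution follow the discussion of Figure 8-2, while the process names and the function partition are hypothetical.

```python
def partition(chain, last_afferent_flow, last_transform_flow):
    """Split an ordered chain of (process, output flow) pairs into three streams."""
    streams = {"afferent": [], "central transform": [], "efferent": []}
    current = "afferent"
    for process, out_flow in chain:
        streams[current].append(process)
        # Crossing a boundary flow moves us into the next stream.
        if out_flow == last_afferent_flow:
            current = "central transform"
        elif out_flow == last_transform_flow:
            current = "efferent"
    return streams


chain = [
    ("Get Input", "Raw Input"),
    ("Edit Input", "Good Input"),
    ("Compute Solution", "Solution"),
    ("Format Solution", "Formatted Solution"),
    ("Print Solution", "Printed Solution"),
]
streams = partition(chain, "Good Input", "Solution")
```

A real DFD may have several afferent or efferent streams, in which case each stream would be cut separately in the same way.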

Examples of transform-centered applications include accounting, personnel, payroll, or order entry-inventory control. For these applications, getting data into and out of the system is secondary to the file handling and manipulation of numbers that keep track of the information. In accounting, for instance, balancing of debits and credits takes place at end-of-day, end-of-month, and end-of-year processing. These periodic process transformations summarize and move data, erase some information, archive other information, and write data to the general ledger to summarize the details in the receivables and payables subledgers. All of these transforms process data that is already in the files.


These processes are the heart of accounting processing. Without these processes, the application would be doing something else.

Not all applications are transform-centered. Some applications do simple processing but have many different transaction types on which the simple processes are performed. These systems are called transaction-centered. Transaction analysis replaces transform analysis for transaction-centered applications with partitioning by transaction type, which may not be obvious from DFDs. Figure 8-3 shows an example of a partitioned DFD for a transaction-centered application. This detailed DFD looks like it contains redundancy because many of the same processes appear more than once. Look closely and you see that each set of processes relates to a different type of transaction.
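In code, a transaction-centered partition often appears as a dispatcher that routes each transaction to the handler for its type. The Python sketch below is one hedged illustration; the transaction types, handler names, and record fields are hypothetical and loosely modeled on the ABC rental flows.

```python
def rent(txn):
    return f"rented video {txn['video_id']}"


def give_back(txn):
    return f"returned video {txn['video_id']}"


def pay_late_fee(txn):
    return f"late fee paid: {txn['amount']:.2f}"


# One entry per transaction type: the partitioning made explicit.
HANDLERS = {"rental": rent, "return": give_back, "late_fee": pay_late_fee}


def process_transaction(txn):
    """Route a transaction to the handler for its type."""
    handler = HANDLERS.get(txn["type"])
    if handler is None:
        raise ValueError(f"unknown transaction type: {txn['type']}")
    return handler(txn)


result = process_transaction({"type": "rental", "video_id": 17})
```

Each handler may repeat similar steps, which is the apparent redundancy seen in a transaction-centered DFD.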

When the high-level partitioning is done, the information is transferred to a first-cut structure chart. We will develop the structure chart from Figure 8-2. A structure chart is a hierarchic, input-process-output view of the application that reflects the DFD partitioning. The structure chart contains one rectangle for each lowest level process on the DFD. The rectangles are arranged in a hierarchy to show superior control and coordination modules. Individual process modules are the lowest in their hierarchy. The rectangles in the hierarchy are connected via undirected lines that are always read top-down and left to right. The lines imply transfer of processing from the top to the bottom of the hierarchy. Diamonds overlay the connection when a conditional execution of a module is possible using if-then-else logic. Reused modules are shown in one of two ways. Either they are repeated several times on the diagram and have a slash in the lower left corner to signify reuse, or they are connected to more than one superior module via the linking lines.

The identification of afferent flows, efferent flows, and transforms results in chains of processes, each with its own 'net output.' If we look at Figure 8-2 again, we see the net afferent output is data flow Good Input. For the central transform, the net output is Solution. For the efferent flows, the net output is Printed Solution. These net outputs are used to determine the initial structure of the structure chart, using a process called factoring.


[Figure: a DFD partitioned into an afferent flow (input stream), a central transform, and efferent flows (output stream)]

FIGURE 8-2 Transform-Centered DFD Partitioned

Factoring is the process of decomposing a DFD into a hierarchy of program components that will eventually be programmed modules, functions, or control structures. Each stream of processing is analyzed to determine its IPO structure. When the structure is identified, the processes are placed on the structure chart and named until all low-level DFD processes are on the structure chart (see Figure 8-4).

Next, data and control information are added to the structure chart. Data couples identify the flow of data into and out of modules and match the data flows on the DFD. Data couples are identified by a directed arrow with an open circle at the source end (see Figure 8-5). The arrowhead points in the direction the data moves.

Control couples identify the flow of control in the structure. Control couples are placed to show where the control data originates and which module(s) each couple affects. A control couple is usually a program switch whose value identifies how a module is activated. Control couples are drawn as

[Figure: a transaction-centered DFD partitioned into afferent flow, central transform, and efferent flow for each transaction type; labels include Trans, Customer, Updated File, Thing Data, and Process Coordinator]

FIGURE 8-3 Transaction-Centered DFD Partitioned

directed arrows with a closed circle at the source end (see Figure 8-6). The arrowhead points in the direction the control travels. If a control couple is set and reset in the same module, it is not shown on the diagram. A control couple that is set and reset in one place, but used in another, is shown. If a control couple is set in one module and reset in another, it is shown as both input and output. Control is 'designed into' the application by you, the SE, based on the need for one module to control the processing of another module. The goal is to keep control to a minimum. Figure 8-4 shows the completed structure chart for the DFD in Figure 8-2.
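The difference between the two kinds of couple shows up clearly in code. In the Python sketch below, a subordinate module returns a record (a data couple) together with an end-of-file switch (a control couple) that the superior module uses to decide whether to continue. The module names and the use of record length as the 'processing' are hypothetical.

```python
def get_record(source):
    """Subordinate module: return (record, end_of_file).

    record is a data couple; end_of_file is a control couple.
    """
    record = next(source, None)
    return record, record is None


def process_all(lines):
    """Superior module: loops until the control couple says stop."""
    source = iter(lines)
    total = 0
    record, end_of_file = get_record(source)
    while not end_of_file:
        total += len(record)  # stand-in for real processing of the data couple
        record, end_of_file = get_record(source)
    return total


total_length = process_all(["ab", "cde"])
```

The switch is set in get_record and used in process_all, so it would be drawn on the structure chart; a flag set and reset entirely inside one module would not.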

Next, we evaluate and revise the structure chart to balance its morphology. Morphology means form or shape. The shape of the structure chart should be balanced to avoid processing bottlenecks. Balance is determined by analyzing the depth and width of the hierarchy, the skew of modules, the span of control, the scope of effect, and the levels of coupling and cohesion. When one portion of the structure chart is unbalanced in relation to the rest of the diagram, you


FIGURE 8-4 First-Cut Structure Chart

modify the structure to restore the balance, or pay closer attention to the unbalanced portion to ensure an efficient production environment.

The depth of a hierarchy is the number of levels in the diagram. Depth by itself is not a measure of good design nor is it a goal in itself. Rather, it can indicate the problem of too much communication overhead and not enough real work taking place (see Figure 8-7). Conversely, adding a level of depth can be a cure for too wide a hierarchy.

[Figure: a data couple drawn as an arrow with an open circle at its source end, shown for incoming and outgoing data; the line between the modules shows transfer of processing]

FIGURE 8-5 Data Couple Notation

The width of the hierarchy is a count of the modules directly reporting to each superior, higher-level module (see Figure 8-8). Span of control is another term for the number of immediate subordinates and is a synonym for the width of the hierarchy. Width relates to two other terms: fan-out and fan-in. Fan-out is the number of immediate subordinate modules. Too much fan-out can identify a processing bottleneck because a superior module is controlling too much processing.

[Figure: a control couple drawn as an arrow with a closed circle at its source end, shown for incoming and outgoing control information; the line between the modules shows transfer of processing]

FIGURE 8-6 Control Couple Notation

FIGURE 8-7 Excessive Depth of Hierarchy

While there is no one number that says 'how wide is too wide,' seven ±2 is the generally accepted guideline for number of fan-out modules. One solution to fan-out processes that are functionally related is to factor another level of processing that provides middle-level management of the low-level modules. Another solution to fan-out problems that are factored properly, but not functionally related, is to introduce a new control module at the IPO level.

FIGURE 8-8 Excessive Width of Hierarchy


Fan-in, on the other hand, is the number of superior modules (i.e., immediate bosses) which refer to some subordinate module (see Figure 8-9). Fan-in can be desirable when it identifies reusable components and reduces the total amount of code produced. The major tasks with fan-in modules are to ensure that they perform a whole task, are highly cohesive, and are minimally coupled.
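Depth, fan-out, and fan-in can all be computed once a structure chart is written down as a mapping from each module to its immediate subordinates. The Python sketch below assumes that representation; the module names Produce X, Get Input, Make X, Put X, Format, and Write echo the HIPO example later in this chapter, while Read, Edit, Report Error, and the three function names are hypothetical.

```python
chart = {
    "Produce X": ["Get Input", "Make X", "Put X"],
    "Get Input": ["Read", "Edit"],
    "Make X": ["Report Error"],
    "Put X": ["Format", "Write"],
    "Edit": ["Report Error"],
}


def fan_out(chart, module):
    """Number of immediate subordinates (span of control)."""
    return len(chart.get(module, []))


def fan_in(chart, module):
    """Number of immediate superiors that call the module."""
    return sum(module in subs for subs in chart.values())


def depth(chart, module):
    """Number of levels from the module down to its deepest subordinate."""
    subs = chart.get(module, [])
    return 1 + max((depth(chart, s) for s in subs), default=0)


top_fan_out = fan_out(chart, "Produce X")
shared_fan_in = fan_in(chart, "Report Error")
chart_depth = depth(chart, "Produce X")
```

Here Report Error has a fan-in of two, which marks it as a candidate reusable module, and the fan-out of three at the top is well within the seven ±2 guideline.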

Skew is a measure of balance or lopsidedness of the structure chart (see Figure 8-10). Skew occurs


FIGURE 8-9 Example of Fan-In

when one high-level module has many subordinate levels and some or most of the other high-level modules have few subordinate levels. Skew can indicate incorrect factoring. If factoring is correct, then skew identifies a driver for the application that might require special consideration. If the skew is on the input side, we say the application is input driven or input-bound. Similarly, if the skew is on the output side, the application is output-bound. If the input and output are skewed with little transform processing, the application is I/O-bound (for input/output). Finally, if the application has little input or output, but lots of processing, the application is process-bound. The special considerations of each of these occurrences deal with ensuring correct language selection and meeting I/O and process time constraints.

The scope of effect of a module identifies the collection of modules that are conditionally processed based on decisions by that module (see Figure 8-11). The scope of effect can be identified by counting the number of modules that are directly affected by the process results of another module. High scope of effect relates to fan-out, fan-in, and coupling in that it may identify potential problems with debugging and change management. Ideally, the scope of effect of any one module should be zero or one. That is, no more than one other module should be affected by any processing that takes place in any other module.

The last measures of structure morphology which are analyzed throughout the remainder of structured design are coupling and cohesion. Cohesion is a measure of the intramodule strength. Coupling is a measure of the intermodule linkage. Maximal, functional cohesion and minimal coupling are the ideal relationships. Coupling and cohesion are related inversely (see Figure 8-12). If cohesion is high, coupling is low, and vice versa; but, the relationship is not perfect. That means that if you have strong cohesion, you may still have strong coupling due to

[Figure: three structure charts, labeled Input-Bound ('Input-Skewed'), Process-Bound ('Process-Skewed'), and Output-Bound ('Output-Skewed')]

FIGURE 8-10 Examples of Skewed Structure Charts


[Figure: a structure chart contrasting a normal connection with a pathological connection that carries an End-of-file Sw]

FIGURE 8-11 Example of Scope of Effect

[Figure: cohesion (low to high) plotted against coupling (low to high)]

FIGURE 8-12 Relationship between Coupling and Cohesion

poor design. So, attention to both coupling and cohesion is required.

Factoring and evaluation are followed by functional decomposition, which is the further division of processes into self-contained IPO subprocesses. Balanced structure chart subprocesses might be further decomposed to specify all of the functions required to accomplish each subprocess. Fan-out, span of control, and excessive depth are to be avoided during this process (see footnote 1). The decision whether

1 Some companies have as a local convention (a policy in their company) that a lower-level DFD is developed to describe programmable individual functions before partitioning. This is decomposition at the DFD level and has the same effect as decomposition here.

to decompose further or not relates to the details needed for the implementation language and how well the SEs understand the details.

Structure charts are only one of many methods and techniques for documenting structured design results. Most of the alternatives would replace, rather than supplement, structure charts. Each technique has its own slightly different way of thinking about the processes to finalize a design, even though the goals are the same. Several alternatives are IBM Hierarchic input-process-output diagrams (HIPO) (see Figure 8-13), Warnier diagrams (see Figure 8-14), Nassi-Schneiderman diagrams (see Figure 8-15), and flow charts (see Figure 8-16).

To complete design, program specifications (specifications is abbreviated to 'specs') must be developed, but before specs can be developed, several other major activities are required. First, the physical database must be designed. Then, program package units are decided. Several activities not discussed here (these are covered in Chapter 14) are performed, including verification of adequate design for inputs, outputs, screens, reports, conversion, controls, and recoverability.

In all methods of documentation, the starting point is a structure chart.

FIGURE 8-13 Other Structured Program Documentation Methods: IBM's Hierarchic Input-Process-Output (HIPO) Diagram Example. Visual Table of Contents: 1.0 Produce X; 1.1 Get Input; 1.2 Make X; 1.3 Put X; 1.3.1 Format; 1.3.2 Write. The IPO chart has three columns: INPUT (Input Data), PROCESS (Get Input, Make X, Put X), and OUTPUT (Output File, Output Report).


FIGURE 8-14 Warnier Diagram¹ (example: Produce X consists of Get Input (n), Make X (n), and Put X; Put X consists of Format and Write to Disk (0,1))

Legend: (1) execute the stated number of times, here 1; (0,1) execute zero or one times; (n) execute n times; ⊕ between two names means either/or; a bar over a name means the name is not to be performed.

¹ Warnier, J-D., Logical Construction of Systems. NY: Van Nostrand Reinhold Company, 1981.

[Figure 8-15, first panel: General Form of Nassi-Schneiderman diagram, showing a Do while construct]

Physical database design is concurrent with factoring and decomposition. Several common physical database design activities are:

• design user views (if this is not already done)
• select the access method
• map user views to the access method and storage media
• walk-through the database design
• prototype the database
• document and distribute access information to all team members
• train team members in access requirements
• develop a test database
• develop the production database

Keep in mind that many other activities may be involved in designing a physical database for a specific implementation environment.

While the details of physical database design and decomposition are being finalized, project team members are also thinking about how to package the modules into program units. A program unit or a program package is one or more called modules,

FIGURE 8-15 Nassi-Schneiderman² Diagram Example (example panel: Do until end-of-file = 1)

² Nassi, I., and B. Schneiderman, "Flowchart techniques for structured programming," ACM SIGPLAN Notices, Vol. 8, #8, August 1973, pp. 12-26.

FIGURE 8-16 Flowchart Symbols, Structured Constructs, and Example. Flowchart symbols: Process or Module; Input/Output; Terminator, i.e., start/stop; Secondary Storage, e.g., disk. Structured constructs: Sequence; Selection (If ... then ... else); Iteration.


functions, and in-line code that will be an execute unit to perform some atomic process. In nonreal-time languages, an execute unit is a link-edited load module. In real-time languages, an execute unit identifies modules that can reside in memory at the same time and are closely related, usually by mutual communication. The guiding principles during these design activities are to minimize coupling and maximize cohesion (see Tables 8-2 and 8-3 for definition of the seven levels of coupling and cohesion).

An atomic process is a system process that cannot be further decomposed without losing its system-like qualities. An execute unit is a computer's unit of work (i.e., a task). A module is a 'small program' that is self-contained and may call other modules. Modules may be in-line, that is, in the actual program, or may be externally called modules. In-line code is the structured program code that controls and sequences execution of modules and functions. For instance, a 'read' module might do all file access; a screen interaction module might do all screen processing and have submodules that perform screen input and screen output.

A function is an external 'small program' that is self-contained and performs a well-defined, limited procedure. For example, a function might compute a square root of a number. Functions usually do not call other modules but there is no rule against it. Even though the definitions of modules and functions are similar, they are different entities. Functions sometimes come with a language, for instance, the mathematical and statistical functions that are part of Fortran. Modules are usually user-defined and have a broader range of applicability, such as a

TABLE 8-2 Definition of Cohesion Levels

Type of Cohesion: Definition

Functional: Elements of a procedure are combined because they are all required to complete one specific function. This is the strongest type of cohesion and is the goal.

Sequential: Elements of a common procedure are combined because they are in the same procedure and data flows from one step to the next. That is, the output of one module, for example, is passed in sequence as input to the next module. This is a strong form of cohesion and is acceptable.

Communicational: Elements of a procedure are combined because they all use the same data type. Modules that all relate to customer maintenance (add, delete, update, query) are related through communication because they all use the Customer File.

Procedural: Elements of a common procedure are combined because they are in the same procedure and control flows from one step to the next. This is weak cohesion because passing of control does not mean functions in the procedure are related.

Temporal: Statements are together because they occur at the same time. This usually refers to program modules, for example, 'housekeeping' in COBOL programs to initialize variables, open files, and prepare for processing. Temporal cohesion is weak and should be avoided wherever practical.

Logical: The elements of a module are grouped by their type of function. For instance, all edits, all reads from files, or all input operations are grouped. This is undesirable cohesion and should be avoided.

Coincidental: This is the random or accidental placement of functions. This lowest level of cohesion occurs when there is no real relationship between elements of a module. This is undesirable cohesion and should be avoided.


TABLE 8-3 Definition of Coupling Levels

Level of Coupling: Definition

Indirect relationship: No coupling is possible when modules are independent of each other and have neither a need nor a way to communicate. This is desirable when modules are independent. An example of no direct relationship is a date translate routine and a net present value routine. There is no reason for them to be related, so they should not be related.

Data: Only necessary data are passed between two modules. There are no redundant parameters or data items. This is the desirable form of coupling for related modules.

Stamp: The module is given access to a complete data structure such as a physical data record when it only needs one or two items. The module becomes unnecessarily dependent on the format and arrangement of data items in the structure. Usually, stamp coupling implies external coupling. The presence of unneeded data violates the principle of 'information hiding', which says that only data needed to perform a task should be available to the task.

Control: Control 'flags' are shared across modules. Control coupling is normal if the setting and resetting of the flag are done by the same module. It is a pathological connection, to be avoided if practical, when one module sets the flag and the other module resets the flag.

External: Two modules reference the same data item or group of items such as a physical data record. In traditional batch applications, external coupling is unavoidable since data are passive and do not directly relate to modules. External coupling is to be minimized as much as possible and avoided whenever practical. External coupling violates the principle of information hiding.

Common: Modules have access to data through global or common data areas. This is frequently a language construct problem but it can be avoided by passing parameters with only a small amount of additional work. Common coupling violates the principle of information hiding.

Content: One module directly references and/or changes the insides of another module, or normal linkage mechanisms are bypassed. This is the highest level of coupling and is to be avoided.

screen interaction module. Functions are usually reusable across applications without alteration; modules are not.
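Three of the coupling levels in Table 8-3 can be contrasted in a short Python sketch. The `CustomerRecord` layout, its field names, and the credit calculation are hypothetical and serve only to show how the same computation looks under data, stamp, and common coupling.

```python
from dataclasses import dataclass


@dataclass
class CustomerRecord:
    # Hypothetical record layout, used only for this illustration.
    customer_id: str
    name: str
    credit_limit: float
    balance: float


def available_credit(credit_limit: float, balance: float) -> float:
    """Data coupling: only the two items the module needs are passed."""
    return credit_limit - balance


def available_credit_stamp(record: CustomerRecord) -> float:
    """Stamp coupling: the whole record is passed although two fields are
    used, so this module now depends on the record's layout."""
    return record.credit_limit - record.balance


# A global data area shared by any module that chooses to read it.
CURRENT_CUSTOMER = CustomerRecord("C001", "Pat", 500.0, 120.0)


def available_credit_common() -> float:
    """Common coupling: the module reaches into the global data area."""
    return CURRENT_CUSTOMER.credit_limit - CURRENT_CUSTOMER.balance
```

All three versions return the same value for the same customer; they differ only in how much of the surrounding data each module can see, which is the concern that information hiding addresses.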

When program packages are decided, program specifications are developed. Program specifications document the program's purpose, process requirements, the logical and physical data definitions, input and output formats, screen layouts, constraints, and special processing considerations that might complicate the program. Keep in mind that the term program might also mean a module within a program or an externally called function. There are two parts to a program specification: one identifies interprogram (including programs in other applications) relationships and communication; the other documents intraprogram processing that takes place within the individual program. Another term for interprogram relationships is interface.

PROCESS DESIGN ACTIVITIES

The steps in process design are transform (or transaction) analysis, develop a structure chart, design the physical database, package program units, and write program specifications. Each of these steps is discussed in this section.

Since both transform and transaction analysis might be appropriate in a given system, the first activity is to identify all transactions and determine if they have any common processing. This activity can be done independently from the DFD and functional analysis, or it can be done as a side activity while you are doing functional analysis as the primary activity. If you cannot tell which is more appropriate, do a rough-cut structure chart using both methods and use the one which gives the best overall results in terms of coherence, understandability, and simplicity of design.

Transaction Analysis

Rules for Transaction Analysis

The basic steps in transaction analysis are to define transaction types and processing, develop a structure chart, and further define structure chart elements. A detailed list of transaction analysis activities follows.

1. Identify the transactions and their defining actions.

2. Note potential situations in which modules can be combined. For instance, the action is the same but the transaction is different; this identifies a reusable module.

3. Begin to draw the structure chart with a high-level coordination module as the top of the transaction hierarchy. The coordination module determines transaction type and dispatches processing to a lower level.

4. For each transaction, or cohesive collection of transactions, specify a transaction module to complete processing it.

5. For each transaction, decompose and create subordinate function module(s) to accomplish the function(s) of the transaction. If a transaction has only one unique function, then keep the unique action as part of the transaction module identified in the previous step.

6. For functions that are not unique, decompose them into common reusable modules. Make sure that the usage of the module is identical for all using transactions. Specifically identify which transactions use the module.

7. For each function module, specify subordinate detail module(s) to process whole detail steps as appropriate. If there is only one functional detail step, keep it as part of the function module defined in step 5.
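The shape that these steps produce can be sketched in Python. Everything named here is hypothetical: the three customer transactions, the in-memory `customers` dictionary, and the `check_name` module are assumptions chosen to show a coordination module (step 3), one transaction module per transaction (step 4), and one reusable module shared by two transactions (step 6).

```python
def check_name(data):
    """Reusable module used identically by the add and change transactions."""
    name = data.get("name", "").strip()
    if not name:
        raise ValueError("customer name is required")
    return name


def add_customer(customers, data):
    """Transaction module for 'add'."""
    customers[data["id"]] = check_name(data)
    return "added"


def change_customer(customers, data):
    """Transaction module for 'change'."""
    if data["id"] not in customers:
        return "not found"
    customers[data["id"]] = check_name(data)
    return "changed"


def delete_customer(customers, data):
    """Transaction module for 'delete'; it has one unique action, kept in-line."""
    removed = customers.pop(data["id"], None)
    return "deleted" if removed is not None else "not found"


# One entry per transaction type identified in step 1.
TRANSACTION_MODULES = {
    "add": add_customer,
    "change": change_customer,
    "delete": delete_customer,
}


def coordinate(customers, transaction):
    """Coordination module: determine the transaction type and dispatch."""
    module = TRANSACTION_MODULES.get(transaction["type"])
    if module is None:
        return "rejected: unknown transaction type"
    return module(customers, transaction["data"])
```

A call such as `coordinate(customers, {"type": "add", "data": {"id": "1", "name": "Ann"}})` passes through the coordination module to `add_customer`, which in turn uses the shared `check_name` module.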

A typical transaction application is money transfer for banks. Transactions for money transfer all have the same information: sending bank, receiving bank, sender, receiver, receiver account number, and amount. There might be other information, but this is required. What makes money transfer a transaction system is that transactions can come from phone, mail, TWX/Telex, fax, BankWire, FedWire, and private network sources. Each source of transaction has a different format. Phone, mail, and fax are all essentially manual so the application can require a person to parse the messages and enter them in one format. The other three are electronic messaging systems to be understood electronically. TWX/Telex, which are electronic free-form messages, may have field identifiers but have no required order to the information. A summary DFD for a money transfer system might look like Figure 8-17, which shows a deceptively simple process. What makes the process difficult is that the data entry-parse-edit processes are different for each message type, having different edit criteria, formats, and acceptance parameters.

The partitioning for the transaction DFD can be either a high-level summary or detailed. The summary partition (see Figure 8-17) shows afferent flows on the summary DFD, which is annotated that structuring is by transaction type. The detailed DFD (see Figure 8-18) shows each type of transaction with its own set of afferent and efferent flows.

To create a first-cut structure chart, one control module is defined for each transaction's afferent stream and efferent stream; there may be only one transform center. For each transaction, the afferent data flows are used to define data couples. The control couples relate to data passed between modules. When control is within a superior module, it is shown via a diamond to indicate selection from among the transaction subprocesses (see Figure 8-19).

FIGURE 8-17 Summary Money Transfer DFD Partitioned (labels: Customer; Afferent Flow, with Raw and Id'd Trans; Central Transform, process 4.0, with Edited Trans; Efferent Flow, with Trans Ack)
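The money transfer example can be sketched as one afferent module per source feeding a single edit step and one transform center. This is a sketch under stated assumptions: the field names, the `tag=value;` message layout for free-form messages, and the acknowledgment text are all invented for illustration and do not represent any real TWX/Telex, BankWire, or FedWire format.

```python
# The six required items named in the text, under hypothetical field names.
REQUIRED = (
    "sending_bank", "receiving_bank", "sender",
    "receiver", "receiver_account", "amount",
)


def parse_keyed_entry(fields):
    """Afferent module for manual sources (phone, mail, fax): a person has
    already keyed the message into the common format."""
    return dict(fields)


def parse_free_form(message):
    """Afferent module for a free-form message whose tagged fields may
    arrive in any order. The 'tag=value;' layout is an assumption."""
    items = (item for item in message.split(";") if "=" in item)
    pairs = (item.split("=", 1) for item in items)
    return {tag.strip().lower(): value.strip() for tag, value in pairs}


# One afferent module per transaction source.
AFFERENT = {"phone": parse_keyed_entry, "telex": parse_free_form}


def edit(transfer):
    """Common edit step: every required item must be present."""
    missing = [name for name in REQUIRED if not transfer.get(name)]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    edited = dict(transfer)
    edited["amount"] = float(edited["amount"])
    return edited


def process_transfer(source, raw):
    """Transform center: the same processing regardless of source."""
    transfer = edit(AFFERENT[source](raw))
    return f"ACK {transfer['sender']} -> {transfer['receiver']}: {transfer['amount']:.2f}"
```

The design choice illustrated is that format differences stay inside the afferent modules, so the central transform sees one transaction layout.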

ABC Video Example Transaction Analysis

The first step to determining whether you have a transaction application or a transform centered application is to identify all sources of transactions and their types. Table 8-4 contains a list of transactions for ABC Video. As you can see from the list, there are maintenance transactions for customer and video information, there are rental and return transactions, and there are periodic transactions. The only common thread among the transactions is that they share some of the same data. The processing in which they are involved is different and there are no commonalities except reading and writing of files. Therefore, we conclude that ABC Video Rental processing is not a transaction-centered application and move to transform analysis to complete the structure chart.

Transform Analysis

Rules for Transform Analysis

In transform analysis we identify the central transform and afferent and efferent flows, create a first-cut structure chart, refine the chart as needed at this high level, decompose the processes into functions, and refine again as needed. These rules are summarized as follows:

1. Identify the central transform.
2. Produce a first-cut structure chart.
3. Based on the design strategy, decompose the processes into their component activities.
4. Complete the structure chart.
5. Evaluate the structure chart and redesign as required.


FIGURE 8-18 Detailed Money Transfer DFD Partitioned (afferent and efferent streams are shown separately for each transaction type: Mail, Phone Transaction; TWX/Telex Transaction; BankWire Transaction; FedWire Transaction)

To properly structure modules, their interrelationships and the nature of the application must be well understood. If a system concept has not yet been decided, design cannot be finalized until it is. The concept includes the timing of the application as batch, on-line, or real-time for each process, and a definition of how the modules will work together in production. This activity may be concurrent with transform analysis, but the concept must be decided before processes can be structured and packaged for an efficient production environment. This activity is specific to the application and will be discussed again for ABC rental processing.

First, we identify the central transform and afferent and efferent flows. Look at the DFD and locate each stream of processing for each input. Trace each stream until you find the data flow that identifies valid, processable input that is the end of an afferent stream. The afferent and efferent arcs refer only to the processes in the diagram. During this part of the transform analysis, files and data flows are ignored except in determining afferent and efferent flows.

After identifying the afferent flows, trace backward from specific outputs (files or flows to entities) to identify the efferent flows. The net afferent and

FIGURE 8-19 Sample Transaction Control Structure

TABLE 8-4 ABC Transaction List

Transaction               General Process    Data
Add Customer              Maintenance        Customer
Change Customer           Maintenance        Customer
Delete Customer           Maintenance        Customer
Query Customer            Periodic           Customer
Add Video                 Maintenance        Video
Change Video              Maintenance        Video
Delete Video              Maintenance        Video
Query Video               Periodic           Video
Rent Video                Rent/Return        Video, Customer, History
Return Video              Rent/Return        Video, Customer, History
Assess special charges    Rent/Return        Video, Customer, History
Query                     Periodic           Customer
Create History            Periodic           Video, Customer, History
Generate Reports          Periodic           Video, Customer, History


FIGURE 8-20 Master File Update DFD Partitioned (labels: Afferent Flows; Process Coordination; Central Transform, including 6.0 Update Master from Tran; New Master Record; Efferent Flow)

efferent outputs are used to determine the initial structure of the structure chart, using a process called factoring. Factoring is the act of placing each unbroken, single strand of processes into its own control structure, and of creating new control processes for split strands at the point of the split. The new control structure is placed under the input, process, or output controls as appropriate.

A master file update is shown as Figure 8-20 to trace the streams. In this diagram, we have two afferent data streams which come together at Match Trans to Master. The first input, Trans Data, flows through process Get Trans and through Edit Trans to become Edited Trans. Successfully edited transaction parts flow through Collect Transactions to become Logical Trans Record.

The second input stream deals with the master file. The Master Record is input to Get Master Record; successfully read master records flow through the process. Once the Logical Trans Record and Master Record are both present, the input transformations are complete. These two afferent streams completely describe inputs, and the arc is drawn over the Logical Trans Record and Master Record data flows (see Figure 8-20).

The two streams of data are first processed together in Match Trans to Master. Information to be updated flows through Update Master from Trans to become Updated Master. The error report coming from the match process is considered a trivial output and does not change the essential transform nature of the process. The argument that Match Trans to Master is part of the afferent stream might be made. While it could be treated as such, the input data is ready to be processed; that is, transactions by themselves, master records by themselves, and transactions with master records might all be processed. Here, we interpret the first transformation as matching.

The data flow out of Update Master from Trans is a net outflow, and Write New Master is an efferent process. The efferent arc is drawn over the data flow Updated Master.

Next, we factor three basic structures that relate to input-process-output processing (see Figure 8-21). If there is more than one process in a stream, getting the net output data may require some interprocess coordination. The coordination activities are grouped and identified by a name that identifies the net output data. So, in the example, the input stream is Get Input; the transform stream is Process; the output stream is Write New Master. Each stream represents the major elements of processing. Because the process and input streams both are compound, each has at least two streams beneath them, one for each sequential process stream to reach the net output data.

Notice that the DFD process names identify both data and transformation processes. Make sure that the lowest-level names on the structure chart are identical to the name on the data flow diagram to simplify completeness checking.
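The factored structure for the master file update can be sketched as Python functions whose names follow the modules on the structure chart. The record layout (a `key` plus an `amount` or `balance`) and the in-memory lists standing in for files are assumptions; as a simplification, only matched and updated master records are written to the new master.

```python
def get_complete_transactions(raw_trans):
    """Afferent stream 1: Get Trans, Edit Trans, Collect Transactions."""
    edited = [t for t in raw_trans if t.get("key") and "amount" in t]  # Edit Trans
    logical = {}
    for trans in edited:  # Collect Transactions into logical records by key
        logical.setdefault(trans["key"], []).append(trans["amount"])
    return logical


def get_master_records(master_file):
    """Afferent stream 2: Get Master Record."""
    return {record["key"]: dict(record) for record in master_file}


def get_input(raw_trans, master_file):
    """Control module for the input streams."""
    return get_complete_transactions(raw_trans), get_master_records(master_file)


def match_trans_to_master(logical_trans, masters):
    """Central transform, first step: pair transactions with master records."""
    matched = [(masters[k], amounts) for k, amounts in logical_trans.items() if k in masters]
    errors = sorted(k for k in logical_trans if k not in masters)  # error report
    return matched, errors


def update_master_from_trans(master, amounts):
    """Central transform, second step: apply the transactions."""
    updated = dict(master)
    updated["balance"] = master["balance"] + sum(amounts)
    return updated


def process(logical_trans, masters):
    """Control module for the transform stream."""
    matched, errors = match_trans_to_master(logical_trans, masters)
    return [update_master_from_trans(m, a) for m, a in matched], errors


def write_new_master(updated_records, new_master):
    """Efferent stream: Write New Master."""
    new_master.extend(updated_records)


def master_file_update(raw_trans, master_file):
    """Top-level module coordinating input, process, and output."""
    logical_trans, masters = get_input(raw_trans, master_file)
    updated, errors = process(logical_trans, masters)
    new_master = []
    write_new_master(updated, new_master)
    return new_master, errors
```

Keeping the function names identical to the process names on the DFD makes the completeness check described above a matter of comparing two lists of names.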

Notice also that there is transformation processing within the afferent and efferent streams. Modules frequently mix input/output and transform processing, and there is no absolute way to distinguish into

FIGURE 8-21 Master File Update Structure Diagram (modules: Master File Update; Get Input, with Get Complete Transaction (Get Trans, Edit Trans, Collect Transactions) and Get Master Record; Process, with Match Trans to Master and Update from Master; Write New Master. Couples shown include Card Image, Edited Card, Edited Trans, Master Record, Updated Master Record, and Master EOF.)


which stream the module belongs. The rule of thumb is to place a module in the stream which best describes the majority of its processing.

Once the module is on the structure chart, we specifically evaluate it to ensure that it meets the principles of fan-out, span of control, maximal cohesion, and minimal coupling. If it violates even one principle, experiment with moving the module to the alternative streams and test if it better balances processing, without changing the processing. If so, leave it in the new location; otherwise note that the unbalanced part of the structure chart may need special design attention to avoid production bottlenecks.

Decompose the structure chart entries for each process. The three heuristics to guide the decomposition are:

• Is the decomposition also an IPO structure? If yes, continue; if no, do not decompose it.

• Does the control of the decomposed processing change? If yes, do not decompose it. If no, continue.

• Does the nature of the process change? That is, if the process is a date-validation, for instance, once it is decomposed is it still a date-validation? If no, continue. If yes, do not decompose it. In this example, I might try to decompose a date-validation into month-validate, day-validate, and year-validate. I would need to add a date-validate to check all three pieces together. Instead of a plain date-validate, I have (a) changed the nature of the process, and (b) added control logic that was not necessary.
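The date-validation heuristic can be made concrete with a small Python sketch. The function names are hypothetical; the point is that three independent piece-checks accept a date that a single date-validation rejects, so an extra combining check would be needed anyway.

```python
import calendar


def month_ok(month: int) -> bool:
    return 1 <= month <= 12


def day_ok(day: int) -> bool:
    return 1 <= day <= 31


def year_ok(year: int) -> bool:
    return year >= 1


def validate_date(year: int, month: int, day: int) -> bool:
    """A single date-validation: the day limit depends on month and year."""
    if not (year_ok(year) and month_ok(month)):
        return False
    days_in_month = calendar.monthrange(year, month)[1]
    return 1 <= day <= days_in_month


# Each piece passes on its own, yet 30 February is not a date.
pieces_pass = year_ok(2023) and month_ok(2) and day_ok(30)
whole_passes = validate_date(2023, 2, 30)
```

Here `pieces_pass` is true while `whole_passes` is false, which is why splitting the validation into month, day, and year modules changes the nature of the process and forces added control logic.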

The thought process in analyzing depth is similar to that used in analyzing the number of organizational levels in reengineering. We want only those levels that are required to control hierarchic complexity. Any extra levels of hierarchy should be omitted. Now let us turn to ABC rental processing to do transform analysis and develop the structure chart.

ABC Video Example Transform Analysis

The decisions about factoring are based on the principles of coupling and cohesion, but they also require a detailed understanding of the problem and a design approach that solves the whole problem. In ABC Video's case, we have to decide what the relationships of rent, return, history, and maintenance processing are to each other. If you have not done this yet, now is the time to do it. Before we continue with design of transform analysis, then, we first discuss the design approach and rationale.

DESIGN APPROACH AND RATIONALE. In Chapter 7, Table 7-5 identified the Structured English pseudo-code for ABC's rental processing, but we did not discuss it in detail. Now, we want to examine it carefully to determine an efficient, cohesive, and minimally coupled decomposition of the process. When we partition the ABC Level 0 DFD from Figure 7-26, customer and video maintenance are afferent streams, reports are efferent, and rental and return are the central transforms (see Figure 8-22). We will attend only to create and return rentals since they are the essence and hardest portion of the application.

There is a design decision to have return processing as a subprocess of rental processing that needs some discussion. Then we will continue with the design. The overall design could be to separate rentals and returns as two different processes, but are they? Think in the context of the video store about how the interactions with customers take place. Customers return tapes previously taken out. Then they select tapes for rental and pay all outstanding fees, including current and past returns that generate late fees. To have late fees, a tape must have been returned.² Rentals and returns are separated in time; they have separate actions taken on files. ABC has any combination of rentals with returns (with or without late fees) and open rentals. All open rentals are viewed during rental processing, but need not be during return processing. Adding a return date and late fees is a trivial addition. Returns could be

² In a real video rental system, you would also have a delinquent or exceptional charges process to add fees for lost and damaged tapes. We do not consider that complexity here as it does not materially add to the discussion.

FIGURE 8-22 ABC Video Level 0 DFD Partitioned (same as Figure 7-26). Labels: Create Rental and Return Rental (Central Transform); Maintain Video (Afferent); Vendor; Order; Return, Payment; New Video; End of Day Rental Summary.

independent of rentals, so there are three design alternatives:

• Returns are separated from rentals.
• Rentals are a subset of returns.
• Returns are a subset of rentals.

If returns are separated from rentals, there would be two payment processes: one for the return and one for the rental. If a rental includes a return, this is not 'minimal bureaucracy' and is not desirable. However, since returns can be done independently from rentals, the system should not require rental processing to do a return. This alternative is an acceptable partial solution, but the rest of the solution must be included.

The second alternative is to treat rentals as part of the return process. This reasoning recognizes that a rental precedes a return. All returns would need a rental/no rental indicator entry and assume that more than 50% of the time, rentals accompany returns.


Which happens more frequently: returns with rentals, or rentals without returns? Let's say Vic does not know, and reason through the process. Since returns can be any of three ways, only one of which is with rentals, coupling them as rental-within-return should be less efficient than either of the other two choices.

Last, we can treat returns as part of the rental process. If returns are within rentals, we have some different issues. What information identifies the beginning of a rental? What identifies the beginning of a return? A customer number could be used to signify rental processing and a video number could signify a return. If we do this, we need to make sure the numbering schemes are distinct and nonoverlapping. We could have menu selection for both rental and return that determines the start of processing; then return processing also could be called a subprocess of rentals. Either of these choices would work if we choose this option. For both alternatives, the software needs to be reevaluated to maximize reusable modules because many actions on rentals are also taken on returns, including reading and display of open rentals and customer information.

Having identified the alternatives and issues, we conduct observations and collect data to justify a selection. The results show that 90% of returns, or about 180 tapes per day, are on time. Of those, 50% are returned through the drop box, and 50% (90 tapes) are returned in person with new rentals. The remaining 10% of returns also have about 50% (10 tapes) accompanying new rentals. So, about 100 tapes a day, or 50% of rentals, are the return-then-rent type. These numbers justify having returns as a subprocess of rentals. They also justify having returns as a stand-alone process. We will allow both.
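The arithmetic behind these observations can be checked with a few lines of Python. Two figures are derived rather than stated in the text and should be read as assumptions: the total of 200 returns per day (implied by 180 tapes being 90%) and the total of 200 rentals per day (implied by 100 tapes being 50% of rentals).

```python
on_time_returns = 180                              # stated: 90% of all returns
total_returns = on_time_returns * 100 // 90        # implied total: 200
late_returns = total_returns - on_time_returns     # remaining 10%: 20

on_time_with_rental = on_time_returns * 50 // 100  # 90 returned in person with new rentals
drop_box_returns = on_time_returns - on_time_with_rental  # 90 through the drop box
late_with_rental = late_returns * 50 // 100        # 10 late returns with new rentals

return_then_rent = on_time_with_rental + late_with_rental  # 100 tapes a day
implied_daily_rentals = return_then_rent * 100 // 50       # 200, if 100 is 50% of rentals
returns_without_rental = total_returns - return_then_rent  # 100 stand-alone returns
```

Half of the returns arrive with a new rental and half arrive alone, which is the numerical basis for supporting both return-within-rental and stand-alone return processing.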

Deciding to support both separate and return-within-rental processing means that we must consciously decide on reusable modules for the activities the two functions both perform: reading and display of open rentals and customer information, payment processing, and writing of processing results to the open rental files. We will try to design with at least these functions as reusable modules.

DEVELOP AND DECOMPOSE THE STRUCTURE CHART. To begin transform analysis, we start with the last DFD created in the analysis phase, and the data dictionary entries that define the DFD details. Figure 7-28 is reproduced here as Figure 8-23, with a first-cut partitioning to identify the central transform.

First, we evaluate each process. We will use the pseudo-code that is in the data dictionary (see Figure 8-24). The DFD shows three rental subprocesses: Get Valid Rental, Process Fees and Money, and Create and Print Rental. Each of the subprocesses might be further divided into logical components. Try to split a routine into a subroutine for each function or data change. First, evaluate the potential split to make sure the subroutines are all still needed to do the routine. This double-checks that the original thinking was correct. Then, evaluate each potential split asking if adding the subroutine changes the control, nature, or processing of the routine. If yes, do not separate the routine from the rest of the logic; if no, abstract out the subroutine.

For ABC, Get Valid Rental is the most complex of the routines and is evaluated in detail. Get Valid Rental has three subroutines that we evaluate: Get Valid Customer, Get Open Rentals, and Get Valid Video. These splits are based on the different files that are read to obtain data for processing a rental. Without all three of these actions, we do not have a valid rental, so the original designation of Get Valid Rental appears correct. Figure 8-25 shows refined pseudo-code for ABC rental processing with clearer structure and only structured constructs. Subroutines are shown with their own headings.

FIGURE 8-23 ABC Video Level 1 DFD Partitioned (same as Figure 7-28; data stores shown: Customer File, Video File, Rental File)

If we are to accommodate returns during rental processing, we have to decide where and how returns fit into the pseudo-code. We want to allow return dates to be added to open rentals. We also want to allow returns before rentals and returns within rentals. This implies that there are two places in the process where a rental Video ID might be entered: before or after the Customer ID. If the Video ID is entered first, the application would initiate the Return process; from there, we need to allow additional rentals. If the Customer ID is entered first, the application would initiate rental; from there, we need to allow returns. To allow both of these actions to lead to rental and/or return processing, we need to add some control structure to the pseudo-code (see Figure 8-26). The control structure also changes the resulting structure chart somewhat even though the DFDs are not changed.
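The two entry points can be sketched in Python. This is only an illustration of the control structure, not ABC's design: the rule that Video IDs start with "V" and the function names are assumptions made for the example.

```python
from typing import Callable, Dict, List


def classify_entry(entry: str) -> str:
    # Assumption for illustration: Video IDs start with "V";
    # anything else is treated as a Customer ID.
    return "video" if entry.upper().startswith("V") else "customer"


def start_return(entry: str, log: List[str]) -> None:
    # Return path: record the return first, then allow additional rentals.
    log.append(f"return {entry}")
    log.append("offer additional rentals")


def start_rental(entry: str, log: List[str]) -> None:
    # Rental path: start with the customer, then allow returns.
    log.append(f"rental for {entry}")
    log.append("offer returns")


def get_valid_rental(entry: str) -> List[str]:
    # Control module: one selection decides which process is initiated.
    log: List[str] = []
    routes: Dict[str, Callable[[str, List[str]], None]] = {
        "video": start_return,
        "customer": start_rental,
    }
    routes[classify_entry(entry)](entry, log)
    return log
```

Whichever ID is entered first, both paths remain open to the clerk; only the order of processing changes.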

Next, we evaluate the refined pseudo-code and inspect each subroutine individually to determine if further decomposition is feasible (see Figure 8-27). For Get Valid Customer, does the processing stay the same? That is, are the detail lines of procedure information the same? By adding the subroutine we want to add a level of abstraction but not new logic. In this case, the answer is yes. Now look at the details of Get Valid Customer. The subprocesses are Get Customer Number (a screen input process), Read and Test Customer File (a disk input process with logic to test read success and determine creditworthiness), and Display Customer Info (a screen output process). Again, we have decomposed Get Valid Customer without changing the logic or adding any new functions.

The results of the other evaluations are presented. Walk through the same procedure and see if you develop the same subroutines. Here we used the pseudo-code to decompose, but we could have used text or only our knowledge of processing to describe this thinking. When the decomposition is complete for a particular process stream, it is translated to a structure chart.
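As a rough illustration of decomposing without adding logic, the Get Valid Customer split might look like the following Python sketch. The in-memory CUSTOMER_FILE, its field names, and the credit test are hypothetical stand-ins for the real file and rules.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the Customer File.
CUSTOMER_FILE: Dict[str, Dict[str, str]] = {
    "2015551234": {"name": "Pat Lee", "credit": "ok"},
}


def get_customer_number(raw_input: str) -> str:
    # Screen input process: normalize what the clerk typed.
    return raw_input.strip()


def read_and_test_customer_file(number: str) -> Optional[Dict[str, str]]:
    # Disk input process: test read success and creditworthiness.
    record = CUSTOMER_FILE.get(number)
    if record is None or record["credit"] != "ok":
        return None
    return record


def display_customer_info(record: Dict[str, str]) -> str:
    # Screen output process.
    return f"{record['name']} (credit {record['credit']})"


def get_valid_customer(raw_input: str) -> Optional[str]:
    # Control module: sequences the three subprocesses and adds no new logic.
    number = get_customer_number(raw_input)
    record = read_and_test_customer_file(number)
    return display_customer_info(record) if record else None
```

The control module only sequences its subordinates, which is the test applied in the text: a level of abstraction is added, but the detail lines of processing are unchanged.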

Complete the Structure Chart

Rules for Completing the Structure Chart

Completion of the structure chart includes adding data and control couples and evaluating the diagram.


Get Valid Rental.
    For all customer
        Get customer #
        Read Customer File
        If not present,
            Cancel
        else
            Create customer
        Display Customer info.
    Read Open-Rentals
    For all Open Rentals,
        Compute late fees
        Add price to total price
        Display open rentals
        Display total price.
    For all video
        Read Video file
        If not present
            Cancel this video
        else
            Create Video
        Display Video
        Add price to total price
        Display total price.

Process Fees and Money.
    Get amount paid.
    Subtract total from amount paid giving change.
    Display change.
    If change = zero and total = zero,
        mark all items paid
    else
        go to process fees and money.

Create and Print Rental.
    For all open rentals
        if item paid
            rewrite open rental.
    For all new rentals
        write new open rental.
    Print screen as rental confirmation.

FIGURE 8-24 ABC Rental Pseudo-code

Get Valid Rental.
    Get Valid Customer.
        For all customer
            Get customer #
            Read Customer File
            If not present,
                Cancel
            else
                Create customer
            Display Customer info.
    Get Open Rentals.
        Read Open-Rentals
        For all Open Rentals,
            Compute late fees
            Add price to total price
            Display open rentals
            Display total price.
    Get Valid Video.
        For all video
            Read Video file
            If not present
                Cancel this video
            else
                Call Create Video
            Display Video
            Add price to total price
            Display total price, change.

Process Fees and Money.
    Get amount paid.
    Subtract total price from amount paid giving change.
    Display total price, change.
    If change = zero and total = zero,
        mark all items paid
    else
        go to process fees and money.

Create and Print Rental.
    Update Open Rentals.
        For all open rentals
            if item paid
                rewrite open rental.
    Create New Rentals.
        For all new rentals
            write new open rental.
    Print screen as rental confirmation.

FIGURE 8-25 ABC Rental Pseudo-code Refined

Get Valid Rental.
    Get entry.
    If entry is Video
        Call Return
    else
        Call Rental.

Rental.
    Get Valid Customer.
        For all customer
            Get customer #
            Read Customer File
            If not present,
                Cancel
            else
                Create customer
            Display Customer info.
    Get Open Rentals.
        Read Open-Rentals
        For all Open Rentals,
            Compute late fees
            Add late fees to total price
            Display open rentals
            Display total price.
    Get Valid Video.
        For all video
            Read Video file
            If not present
                Cancel this video
            else
                Call Create Video
            Display Video
            Add price to total price
            Display total price, change.
    Process Fees and Payment.
    Create and Print Receipt.

Return.
    Get Open Rental.
        Read Open-Rentals
        Read Customer
        Display Customer
        Display Open Rental
        Add return date.
    Using customer ID, Read Open Rentals.
    For all Open Rentals
        Display open rentals.
    For all return request
        If rental
            Add return date to rental.
            Compute late fees
            Add late fees to total price
            Display total price.
    Call Get Valid Video.
    Call Process Fees and Payment.
    Call Create and Print Receipt.

FIGURE 8-26 Get Valid Rental Pseudo-code with Control Structure for Returns

Structure chart completion rules are:

1. For each data flow on the DFD add exactly one data couple. Use exactly the same data flow name for the data couple.

2. For each control module, decide how it will control its subs. If you need to refine the pseudo-code to decide control, do this. Add control couples to the diagram when they are required between modules.

3. For modules that select one of several paths for processing, show the selection logic with a diamond in the module with the logic attached to the task transfer line.

Rules of thumb for developing the structure chart are:

1. Evaluate the diagram for cohesion. Does each module do one thing and do it completely?

2. Evaluate the diagram for fan-out, fan-in, skew, and redesign as required, adding new levels of control. Note skewed processing for attention during program design.

3. Evaluate the diagram for minimal coupling. Is the same data used by many modules? Do control modules pass only data needed for processing? Do control modules minimize their scope of effect?

These are all discussed in this section.

First, the structure chart is drawn based on the decomposition exercises. Then data couples are added to the diagram for each data flow on the DFD. If the structure chart is at a lower level of detail, use the data flow as a starting point and define the specific data passed to and from each module. Show all data requirements for each module completely. Make sure that all names are exactly as they are in the dictionary.

Get Valid Rental.
    Get entry.
    If entry is Video
        Call Return
    else
        Call Rental.

Rental.
    Call Get Valid Customer.
    Call Get Open Rentals.
    Call Get Valid Video.

Return.
    Call Get First Return.
    Call Get Open Rentals.
    If rental
        Call Get Valid Video.

Process Fees and Money.

Create and Print Rental.
    Update Open Rentals.
    Create New Rentals.
    Print receipt.

Get Valid Customer.
    Get customer #
    Read Customer File
    If not present,
        Create Customer.
    If CCredit not zero, display CCredit
    Display Customer info.

Get Open Rentals.
    Read Open-Rentals
    For all Open Rentals,
        Compute late fees
        Add late fees to total price
        Display open rentals
        Display total price, change.
    For all return request
        Call Update Returns.

Get Valid Video.
    For all video
        Read Video file
        If not present
            Cancel this video
        else
            Call Create Video
        Display Video
        Add price to total price
        Display total price, change.

Get First Return.
    Read Open-Rentals
    Read Customer
    Display Customer
    Display Open Rental
    Call Update Returns.

Update Returns.
    Move return date to rental.
    Update video history.
    Compute late fees.
    Add late fees to total price.
    Display total price.

Process Fees and Money.
    Get amount paid.
    Subtract total price from amount paid giving change.
    Display total price, change.
    If change = zero and total = zero,
        mark all items paid
    else
        go to process fees and money.

Update Open Rentals.
    For all open rentals
        rewrite open rental.

Create New Rentals.
    For all new rentals
        write new open rental.

FIGURE 8-27 Complete Pseudo-code for Rentals and Returns

Next, for each control module, decide how it will control its subprocesses and add the control couples to the diagram. Decide whether the logic will be in the control module or in the subprocess. If the logic is in the control module, the goal is for the controller to simply call the subordinate module, pass data to transform, and receive the transform's data back. If any other processing takes place, rethink the control process because it is not minimally coupled.

Pathological Control Structure

    Control module:
        ...
        If data = x move 1 to go-sw.
        If data = y move 2 to go-sw.
        If data = z move 3 to go-sw.
        Call GO-MOD
        ...

    Couples passed to GO-MOD: Go-Data, go-sw

    GO-MOD:
        ...
        If go-sw = 1 do go-1.
        If go-sw = 2 do go-2.
        If go-sw = 3 do go-3.
        ...
        Return

Solution 1

    The control module selects (diamond) among Go-1, Go-2, and Go-3 and passes Go-Data to the module it calls.

Solution 2

    The control module passes Go-Data to GO-MOD.

    GO-MOD:
        ...
        If data = x do go-1.
        If data = y do go-2.
        If data = z do go-3.
        ...
        Return

FIGURE 8-28 Pathological Control Structure and Two Solutions

A control couple might be sent to the subprocess for it to determine what to do. This may or may not be acceptable. Where is the control couple 'set' and 'reset'? If in the control module, this is acceptable. If somewhere else, rethink the control process and simplify it. Any time you must send a control couple for a module to decide which action to take, you identify a potential problem. Either the lower-level module is doing too many things (otherwise it would not need to decide what to do), or the control is in the wrong module.

An example of this problem and two solutions are illustrated in Figure 8-28. If the lower level is doing too many things, then decompose them to create several single-purpose modules. If the lower level is not doing multiple functions, then move control for the module into the module itself. In both cases, the goal of minimal coupling is attained.
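The difference between the pathological structure and the two solutions can be sketched in Python. The function names mirror the labels in Figure 8-28; the bodies of go_1, go_2, and go_3 are placeholders invented for the example.

```python
def go_1(data: str) -> str:
    return f"go-1 processed {data}"


def go_2(data: str) -> str:
    return f"go-2 processed {data}"


def go_3(data: str) -> str:
    return f"go-3 processed {data}"


# Pathological: the controller sets a switch (control couple) that the
# subordinate must decode before it can do anything.
def go_mod_with_switch(go_sw: int, data: str) -> str:
    if go_sw == 1:
        return go_1(data)
    if go_sw == 2:
        return go_2(data)
    return go_3(data)


def control_pathological(data: str) -> str:
    go_sw = {"x": 1, "y": 2, "z": 3}[data]
    return go_mod_with_switch(go_sw, data)


# Solution 1: the controller selects and calls single-purpose modules directly.
def control_solution_1(data: str) -> str:
    if data == "x":
        return go_1(data)
    if data == "y":
        return go_2(data)
    return go_3(data)


# Solution 2: the controller passes only data; the decision moves into GO-MOD.
def go_mod(data: str) -> str:
    if data == "x":
        return go_1(data)
    if data == "y":
        return go_2(data)
    return go_3(data)


def control_solution_2(data: str) -> str:
    return go_mod(data)
```

All three versions produce the same result. What changes is how much one module must know about the other: in both solutions only data crosses the module boundary, and the go-sw control couple disappears.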

Next, the diagram is evaluated for cohesion, coupling, hierarchy width, hierarchy depth, fan-out, fan-in, span of control, and skew. Evaluate the diagram for cohesion (see Table 8-2 for definitions of the cohesion types). Check that each module does one thing and does it completely. If several modules must be taken together to perform a whole function, the structure is excessively decomposed. Regroup the processes and restructure the diagram.

Evaluate the diagram for width, depth, fan-out, fan-in, and skew. These are visual checks to see if some portion of the structure is inconsistent with the rest of the structure. The inconsistency does not necessarily mean that the diagram is wrong, only that there may be production bottlenecks relating to the out-of-balance processes. For a wide structure, double-check that the subprocesses really all relate to one and only one process. If not, add a new control module; otherwise, leave the structure as is.

For deep structures, check to see if each level of depth is performing some function beyond control. Ask yourself why all the levels are needed. If there is no good reason, get rid of the level and move its functions either up or down in the hierarchy, preferably up. Ask yourself if fewer levels can accomplish the same process. If the answer suggests reducing the levels of hierarchy, restructure the diagram and keep only essential levels.

For fan-in modules, check that each using module has the same type of data being passed and expects the same type of results from the fan-in module. If there are any differences, then either make the using modules consistent, or add a new module to replace the fan-in module for the inconsistent user module.
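These visual checks can also be made mechanically once the structure chart is written down as data. The sketch below is illustrative only: the dictionary loosely follows the call structure in Figure 8-27, and the root name "Rent/Return" is an assumption.

```python
from typing import Dict, List

# Hypothetical structure chart: each module maps to the modules it calls.
CHART: Dict[str, List[str]] = {
    "Rent/Return": ["Get Valid Rental", "Process Fees and Money",
                    "Create and Print Rental"],
    "Get Valid Rental": ["Rental", "Return"],
    "Rental": ["Get Valid Customer", "Get Open Rentals", "Get Valid Video"],
    "Return": ["Get First Return", "Get Open Rentals", "Get Valid Video"],
    "Create and Print Rental": ["Update Open Rentals", "Create New Rentals"],
}


def fan_out(chart: Dict[str, List[str]], module: str) -> int:
    # Number of subordinates a module controls directly.
    return len(chart.get(module, []))


def fan_in(chart: Dict[str, List[str]], module: str) -> int:
    # Number of modules that call this module.
    return sum(module in subs for subs in chart.values())


def depth(chart: Dict[str, List[str]], module: str) -> int:
    # Number of levels from this module down to its deepest subordinate.
    subs = chart.get(module, [])
    return 1 + max((depth(chart, s) for s in subs), default=0)
```

Here fan_in(CHART, "Get Open Rentals") is 2, which flags it as a reused module whose callers should pass the same type of data, and the depth below Get Valid Rental compared with the other first-level modules makes the input skew visible.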

Skewed diagrams identify a fundamental imbalance of the application that may have been hidden before: that it is input-bound, output-bound, I/O-bound, or process-bound. Skew is not necessarily a problem that results in restructuring a diagram. When skewed processing is identified, you should verify that it is not an artifact of your factoring. If it is, remove the skew from the diagram by restructuring the modules.

Skew is not always a problem. When a skewed application is being designed, the designers normally spend more time designing the code for the bound portion of the problem to ensure that it does not cause process inefficiencies. For instance, Fortran is notoriously inefficient at physical input/output (i.e., reading and writing files). For anything but a process-bound application, Fortran is not the best language to use. For a process-bound Fortran application with many I/Os, another language, such as assembler or Cobol, might be used to make read/write processing efficient. The opposite is true of Cobol. Cobol is not good at high-precision scientific and mathematical processing. In a Cobol application, process-bound modules and their data would be designed either for another language or to minimize the language effects.

Finally, evaluate the diagram for minimal coupling. First look at data couples. If you see the same data all over the diagram, there may be a problem. Either you are not specifying the data at the element level, or data coupling is the least coupling you will be able to attain. Make sure that only needed data is identified for passing to modules. Data coupling is not the best coupling, but it is tolerable.

Next, look at control couples one last time. Make sure that they are set and reset in the same or directly related modules, and make sure that, if passed, they are passed for a reason. If either of these conditions is violated, change the coupling.

To summarize so far, decide the system concept; partition the DFD; develop a first-cut structure chart; decompose the structure chart using pseudo-code of the functions as needed to guide the process; add data couples; add control couples; evaluate and revise as needed.

ABC Video Example Structure Chart

ABC's structure chart will begin with the Level 1 DFD factoring and progress to provide the detail for modules as expressed in the pseudo-code. There are three first-level modules: Get Valid Rental, Process Fees & Money, and Create and Print Rental (see Figure 8-29). To get the next level of detail, we use the pseudo-code or decomposed structure charts. In our case, we use the pseudo-code. In Figure 8-27, the high-level pseudo-code has only module names. We simply transfer those names to modules on the structure chart, attending to the control logic present in the diagram.

For each if statement, we need to decide whether that statement will result in a direct call (our choice, here) or whether it will result in a control couple being passed. Direct calls are preferred to minimize coupling. When a direct call is used, the module is executed in its entirety every time it is called.

FIGURE 8-29 Rent/Return First-Cut Structure Chart (legend entries: Reused from another diagram; Reused on this diagram)

We identify reused modules by a slash in the lower left corner of the rectangles to show the complete morphology of the diagram. The first-cut structure chart shows that the processing is skewed toward input. Because there are three data stores affected by every process, there is no way to get rid of the skew without getting rid of the control level. Is the control level essential? If we omit the control level, is the processing the same? Do we violate fan-out if we remove the control level? The answers are no, mostly, and no, respectively. If we remove the control level, its logic must go somewhere. The logic can move up a module and not violate fan-out. The change may have a language impact, so we will not change it until we decide program packages. We note it for attention during packaging and programming. There are no other obvious problems with the first-cut structure chart. Since we have developed it bottom-up, using the pseudo-code as the basis, it is as good as our pseudo-code.

Next, we add the data and control couples needed to manage processing. The final diagram is shown in Figure 8-30, which we evaluate next.

Each module appears to do only one thing. The diagram is input-skewed as already discussed. The span of control and fan-out seem reasonable.

FIGURE 8-30 Completed Rent/Return Structure Chart (legend: OR = Open Rental; C = Customer; V = Video; VR = Valid Rental; Upd. OR = Updated OR)

The reused modules each have the same input data. The hierarchy is not unnecessarily deep, although the control code for Get Valid Rental, Rent, and Return might be able to be combined depending on the language. Coupling is at the data level and is acceptable. Next, we turn to designing the physical database.

Design the Physical Database

Physical database design takes place concurrently with factoring and decomposition. A person with special skills, usually a database administrator (DBA), actually does physical database design. In companies without job specialization, a project team member acts as the DBA to design the physical database. Physical database design is a nontrivial task that may take several weeks or even months.

Rules for Designing the Physical Database

The general physical database design activities are summarized below. Keep in mind that many other activities may be involved in designing a physical database that relate to a specific implementation environment.

1. Define user views based on transaction types and data accessed for each transaction.

2. Identify access method if choices exist.

3. Map user views to access method and storage technology to optimize disk space and to minimize access time.

4. Build prototype and test, revising as indicated.

5. Develop database for application testing.

6. Document physical database design and distribute user view information to all project team members.

7. Work with conversion team to build production databases.

Designing user views means analyzing the transactions or inputs of each process to define which database items are required. In general, the data items processed together should be stored together. These logical design activities constrain the physical design and help the person mapping to hardware and software.

In selecting the access method, the physical data designer seeks the best match between available access methods and access requirements. Access method choices usually are data sequenced (i.e., indexed), entry sequenced (i.e., direct), inverted lists, or some type of B-tree processing. Each DBMS and operating system has its own access method(s) from which the selection is made. The details of these access methods are beyond the scope of this text.³

User views are mapped to the access method and a specific storage medium. Media mapping seeks to optimize access time for individual items and sets of items. It also seeks to minimize wasted space while providing for growth of the database. Since media have become one of the major expenses in the computing environment, there may be political issues involved with physical database design. At this point, a database walk-through reviews all database design before a prototype is built.

The DBA documents data access requirements and trains team members in them. The DBA, working from the application specification, maps data requirements to user views to processes. Each process, then, has specific data items assigned. Every team member must know exactly what data items to access and how to access them. If a module or program accesses the wrong data item, an inconsistent database might result. Also, minimal data coupling requires that each process access only data that it requires. Incorrect use of access methods can lead to process bottlenecks or an inconsistent database. To assure that programs are using the data correctly, the DBA may participate in walk-throughs to monitor data access.

³ For more on access methods and storage considerations, see Fabbri and Schwab [1992], Codd [1990], Bohl [1981], and Claybrook [1983] in the references.

The DBA works with the test team to load the data needed for testing. The DBA also works with the conversion team to load the initial production database. These activities may be trivial or may require hiring of temporary clerks to input information to the database. The DBA and the two teams work together to verify the correctness of the data, to provide program test database access to the rest of the development team, and to provide easily accessed backup when the test database is compromised. After the test database is loaded, the backup and recovery procedures, transaction logic procedures, and other database integrity procedures are all finalized and tested.

To summarize, a person who intimately knows the technical production data environment acts as a DBA, mapping the database to a physical environment and building both test and production databases. The DBA provides training and guidance to the other team members for data access, and participates in data-related walk-throughs.

ABC Video Example Physical Database Design

In order to do the physical database design, a DBMS must be selected. We will design as if some SQL engine were being used. SQL's physical design is closely tied to the logical design, so the design activity becomes less sensitive to the DBMS software. In addition, SQL data definition is the same in both mainframe and micro environments, so the design activity does not need to be sensitive to the hardware platform. The amount of storage space (i.e., number of tracks or cylinders) will vary, of course, since disks on PCs do not yet hold as much information as mainframe disks.

Beginning with the logical design from Table 7-7, we define the relations and data items that are required to develop user views. Remember from database class that the logical database design can map directly to the physical database. The relations defining the actual database may or may not be accessed by users. For security reasons, user views may be used to control access to data, and only the DBA would even know the real relation names.

To define user views, we examine each process and identify the data requirements. List the requirements by process (see Table 8-5). Match similar data requirements across processes to identify shared user views. The problem is to balance the number of views against the number of processes. Ideally a handful of user views are defined; a heuristic for large applications is about 20 user views. Beyond that, more DBAs are required and database maintenance becomes difficult. In a large application, keeping the number of user views manageable may be difficult and require several design and walk-through iterations.

For ABC rental processing, we need a user view for each major data store: Customer, Video Inventory, and Open Rentals. We also need user views for the minor files: Video History, Customer History, and End Of Day Totals. If data coupling and memory usage are not an issue, using a SQL database, we can create one user view for each of Customer, Video, and Open Rental, and create one joined user view using the common fields to link them together. The individual views are used for processes that do not need all of the data together; the joined view can be used for query processing and for processes that need all of the data. The resulting data definitions for customer, video, open rentals, and the related user views are shown in Table 8-6. We also need separate user views for the history files and EOD totals. They are included in the table.
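The idea of individual views plus one joined view can be tried out with Python's built-in sqlite3 module. This is a simplified sketch, not the Table 8-6 design: the column lists are shortened, the Copy table is omitted, and the view names CustView and RentView and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (Cphone TEXT PRIMARY KEY, Clast TEXT NOT NULL);
CREATE TABLE Video (VideoID TEXT PRIMARY KEY, VideoNam TEXT NOT NULL,
                    RentPrice REAL);
CREATE TABLE Rental (
    Cphone   TEXT NOT NULL REFERENCES Customer,
    VideoID  TEXT NOT NULL REFERENCES Video,
    RentDate TEXT NOT NULL,
    PRIMARY KEY (Cphone, VideoID)
);

-- Individual view for processes that need only customer data.
CREATE VIEW CustView AS SELECT Cphone, Clast FROM Customer;

-- Joined view linking the three tables on their common fields.
CREATE VIEW RentView AS
    SELECT c.Cphone, c.Clast, v.VideoID, v.VideoNam, v.RentPrice, r.RentDate
    FROM Rental r
    JOIN Customer c ON c.Cphone = r.Cphone
    JOIN Video v ON v.VideoID = r.VideoID;

INSERT INTO Customer VALUES ('2015551234', 'Lee');
INSERT INTO Video VALUES ('V000001', 'Casablanca', 2.50);
INSERT INTO Rental VALUES ('2015551234', 'V000001', '1994-01-15');
""")

# A process that needs all of the data reads the joined view.
joined_rows = conn.execute(
    "SELECT Clast, VideoNam, RentPrice FROM RentView").fetchall()

# A process that needs only customer data reads the narrower view.
customer_rows = conn.execute("SELECT Cphone, Clast FROM CustView").fetchall()
```

A process such as Get Valid Customer would be given only CustView, which keeps its data coupling to what it actually requires, while query processing uses RentView.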

At this point, with SQL software, we are ready to prototype the database. If either access method selection or storage mapping is an issue, a prototype should be built. Otherwise, the next step is to map user views to access methods and storage media. This activity depends on the implementation environment and is beyond this text. The database may be walked through again at this point to verify processing requirements for the database. The database is then prototyped and documented. The information needed for each program is included in program specifications. Team members are usually given an overview of the database environment either as part of the last walk-through or as a separate training session. When the prototype appears complete and workable, test and production databases are developed.

Design Program Packages

Rules for Designing Program Packages

The activities for grouping modules into program packages are listed below; as you can see, they are general guidelines, not rules. There are no rules for packaging because it is an environment-dependent activity. Packages for firmware or an 8K microcomputer are entirely different from packages for a mainframe. Also, the implementation language determines how and when some types of coupling are done. With these ideas in mind, the guidelines apply common sense to identifying program execute units.

1. Identify modules that perform functionally related activities, are part of iteration units, or which access the same data. The related modules identified should be considered for packaging together for execution.

2. Develop pseudo-code for the logic functions being performed. Use only structured programming constructs: iteration, selection, and sequence. Document complex logic using decision tables or decision trees.

3. Logically test the user views developed with the DBA to reevaluate their usefulness for each program package.

4. Design each module to have one entry and one exit.

5. Design each module such that its contents are unchanged from one execution to the next.

6. Design and document messages for called modules. Reevaluate the messages to minimize coupling.

7. Draw a diagram of the module and all other modules with which it interacts.

TABLE 8-5 ABC Data Requirements by Process

Process: Get Valid Customer; Get Open Rentals; Get Valid Video; Get First Return; Get Valid Video; Update Rentals; Process Fees and Money; Create Video history; Create Customer history; Update Open Rentals; Create New Rentals; Print receipt

Customer: Customer Phone, Name, Address, Credit Rating

Video Inventory: Video ID, Copy ID, Video Name, Rental Price

Open Rental (entry appears three times): Customer Phone, Video ID, Copy ID, Video Name, Rent Date, Return Date, Late Days, Fees Owed

Other:
    End of Day Totals: Total Price + Rental Information
    Video History File: Year, Month, Video ID, Copy ID
    Customer History File: Customer Phone, Video ID
    Customer Phone, Name, Address; for each Video: Video ID, Copy ID, Video Name, Rent Date, Return Date, Late Days, Fees Owed, Total Price


TABLE 8-6 SQL Data Definitions and User Views

Create Table Customer
    (Cphone     Char(10)      Not null,
     Clast      VarChar(50)   Not null,
     Cfirst     VarChar(25)   Not null,
     Cline1     VarChar(50)   Not null,
     Cline2     VarChar(50)   Not null,
     City       VarChar(530)  Not null,
     State      Char(2)       Not null,
     Zip        Char(10)      Not null,
     CCtype     Char(1)       Not null,
     Ccno       Char(17)      Not null,
     Ccexp      Date          Not null,
     CCredit    Char(1),
     Primary Key (Cphone));

Create Table Video
    (VideoID    Char(7)       Not null,
     VideoNam   VarChar(50)   Not null,
     VendorNo   Char(4),
     TotCopies  Smallint,
     RentPrice  Decimal(1,2),
     Primary Key (VideoID));

Create Table Copy
    (VideoID    Char(7),
     CopyID     Char(2),
     DateRecd   Date,
     Primary Key (VideoID, CopyID),
     Foreign Key (VideoID) References Video);

Create Table Rental
    (Cphone     Char(10)      Not null,
     RentDate   Date          Not null,
     VideoID    Char(7)       Not null,
     CopyID     Char(2)       Not null,
     RentPaid   Decimal(2,2)  Not null,
     FeesOwed   Decimal(2,2),
     Primary Key (Cphone, VideoID, CopyID),
     Foreign Key (VideoID) References Video,
     Foreign Key (VideoID, CopyID) References Copy,
     Foreign Key (Cphone) References Customer);

Create View VidCrsRef as
    select Video.VideoID, CopyID, VideoNam, RentPrice
    from Video, Copy
    where Video.VideoID = Copy.VideoID;

Create View RentRef as
    select Customer.Cphone, Clast, Cfirst, Rental.VideoID, Rental.CopyID,
           VideoNam, RentPaid, RentPrice, FeesOwed
    from Customer, VidCrsRef, Rental
    where VidCrsRef.VideoID = Rental.VideoID
      and VidCrsRef.CopyID = Rental.CopyID
      and Customer.Cphone = Rental.Cphone;

A program package is a collection of called modules, called functions, and in-line code that does some atomic process, and that will become an execute unit. The hierarchy of criteria for designing packages is to package by function, by iteration clusters, or by need to access the same data. At all times, you must keep in mind any production environment constraints that must also be part of the design. For instance, if the application will be on a LAN, you may want to design packages to minimize the possibility of multiple users for a process.

Functional grouping is, by far, the most important. Functional grouping ensures high cohesion for the program. Any modules that are required to perform some whole function should be grouped together. The other two considerations frequently apply to functional groups as well.

If a group of activities repeats as part of an iterative sequence, all activities in the group should be together in the program package. Individual modules can be coded and unit tested alone, but they should be packaged for integration testing and implementation.

Grouping modules that access the same data minimizes physical reading and writing of files. The major goal is to read the same data record in any one pass of the processes no more than once. We want to minimize physical I/O because it is the slowest process the computer performs. Grouping modules by data accessed minimizes the frequency of reading. Real-time applications, especially, are vulnerable to multiple reads and writes of the same data, which slow down response time.

Grouping modules by data access is a form of data coupling that minimizes the chance of unexpected changes to data. If we do not package modules together, but only read and write data once, the major alternative to common packaging is to use global data areas in memory. Global data is not protected and is vulnerable to corruption.
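A small Python sketch shows the effect of packaging by data accessed. The RentalFile class is a stand-in that counts physical reads, and the late fee of 1.00 per day is an assumption made only for the example.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


class RentalFile:
    """Stand-in for the Open Rental file that counts physical reads."""

    def __init__(self, records: Dict[str, List[Record]]) -> None:
        self._records = records
        self.reads = 0

    def read(self, cphone: str) -> List[Record]:
        self.reads += 1
        return list(self._records.get(cphone, []))


def compute_late_fees(rentals: List[Record]) -> float:
    # Assumed rate: 1.00 per late day.
    return sum(float(r["late_days"]) * 1.0 for r in rentals)


def display_open_rentals(rentals: List[Record]) -> List[str]:
    return [str(r["video"]) for r in rentals]


def unpackaged(file: RentalFile, cphone: str) -> Tuple[float, List[str]]:
    # Each module fetches its own copy of the data: two physical reads.
    fees = compute_late_fees(file.read(cphone))
    titles = display_open_rentals(file.read(cphone))
    return fees, titles


def open_rentals_package(file: RentalFile, cphone: str) -> Tuple[float, List[str]]:
    # Packaged together: one physical read, passed to both modules as data.
    rentals = file.read(cphone)
    return compute_late_fees(rentals), display_open_rentals(rentals)
```

Both versions return the same results, but the packaged version reads the record once and passes it as a parameter, so no global data area is needed to share it.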

When the packages are complete, develop Structured English pseudo-code for the logic functions being performed. Use only structured programming constructs: iteration, selection, and sequence. Document complex logic using decision tables or decision trees. Include control structures and names for all modules. Pseudo-code may have been done as part of analysis, or earlier in design, as we did for ABC rental and return. Incidental activities, or less crucial activities, may have been overlooked or not refined. Pseudo-code is completed now and structured for use in program specifications.

Decision tables and trees might be used to document complex decisions. While a discussion of them is beyond this text, an example of each is shown in Figure 8-31.

As we design the program packages, we logically test the user views developed with the DBA to reevaluate their usefulness. The questions to ask are: Is all the needed data available? Is security adequate? Is extra data present? If any of these answers indicate a problem, discuss it with the DBA and determine his or her reasons for the design. If the design should change, the DBA is the person to do it.

Design each module to have one entry and one exit. Multiple entrances and exits to program modules imply problems because of the selection and go to logic required to implement them. If each module is kept simple with one of each, there are fewer testing, debugging, and maintenance problems.
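In a language with early returns, the guideline can be illustrated as follows. Both functions are hypothetical; the late-fee rule is invented only to give the module something to decide.

```python
def late_fee_multi_exit(days_late: int, rate: float) -> float:
    # Three exits: each early return is another path to trace and test.
    if days_late <= 0:
        return 0.0
    if rate <= 0:
        return 0.0
    return days_late * rate


def late_fee_single_exit(days_late: int, rate: float) -> float:
    # One entry, one exit: the result is set by selection and returned once.
    fee = 0.0
    if days_late > 0 and rate > 0:
        fee = days_late * rate
    return fee
```

The two functions compute the same fee. The single-exit form keeps all of the selection logic in one condition, which is the simplification the guideline is after.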

Ideally, each module should have its internal data contents the same before and after a given execution. That is, the state and contents of the module should be unchanged from one execution to the next. This does not mean that no changes take place during an execution, only that all traces of changes are removed when the execution is complete. When a module must maintain a 'memory' of its last actions, coupling is not minimized.
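The contrast between a module with 'memory' and one without can be sketched in Python. The class and function names are hypothetical.

```python
class RunningTotalModule:
    """Keeps a 'memory' between calls, so each result depends on call history."""

    def __init__(self) -> None:
        self.total = 0.0

    def add_price(self, price: float) -> float:
        self.total += price
        return self.total


def add_price(total: float, price: float) -> float:
    # No retained state: the caller owns the total and passes it in each time.
    return total + price
```

Calling RunningTotalModule.add_price twice with the same argument gives two different answers, so every caller is coupled to what earlier callers did. The stateless add_price gives the same answer for the same inputs on every execution.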

Design and document messages for called modules. Messages should contain, at most, calling/called module names, data needed for execution, control couples, and variable names for results of execution.
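One way to document such a message is as a small record type. The sketch below uses a Python dataclass; the field names and the example values are assumptions for illustration, not a prescribed format.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class ModuleMessage:
    calling_module: str                  # who sends the message
    called_module: str                   # who receives it
    data: Dict[str, Any]                 # only the data needed for execution
    control_couples: Dict[str, Any] = field(default_factory=dict)
    result_names: List[str] = field(default_factory=list)


# Hypothetical message from Rental to Get Valid Customer.
msg = ModuleMessage(
    calling_module="Rental",
    called_module="Get Valid Customer",
    data={"Customer Phone": "2015551234"},
    result_names=["Name", "Address", "Credit Rating"],
)
```

Writing the message down this way makes unnecessary fields easy to spot. Here no control couple is passed at all, which is the preferred case.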

You might draw a diagram of the module and all other modules with which it interacts to facilitate visual understanding of the module and its role in the application.

ABC Video Example Program Package Design

Working with the final structure chart in Figure 8-30, our biggest decision is whether or not to package all of rental/return processing together, and how. Do we write one program with performed modules, one with called modules, or a combination of the two?

ABC is going to be in a SQL-compatible database environment, on a LAN, and requires access by PCs. The choice of language is not limited with these requirements, but packaging without knowing the language is not recommended. For this exercise, we will assume that Focus,4 the 4GL, will be used.

Focus' application generator, called the "Dialogue Manager," allows both in-line and called modules to be used. Calling modules written in non-Focus languages is allowed but can be tricky. The language has its own DBMS that is SQL-compatible, but it is not fully relational. It falls in the category of DBMSs called 'born again relational'; that is, the DBMS is hierarchic, networked, or relational at the DBA's discretion. Relationality is allowed but not required in Focus. Focus does not support the integrity rules. We will not redesign the database here since the SQL code above could be recoded without design changes in the Focus DBMS language.

At this point, we need to step back and decide how to package the entire application. What kind of 'glue' will hold customer maintenance, video maintenance, end-of-day, and rental/return processing together? We do not discuss screen design here because it is not in the methodology (it is in Chapter 14), but we would finalize screens while these decisions are being made. We need all of the above functions to do this application, so all of the functions must be available in a unified environment; that is, all functions must be available for execution within the same run environment. Screens are the 'glue' that users see; they unify application processing. The code behind the screens may or may not be unified, depending on the design techniques and language. With Focus, unification is done through the Dialogue Manager.

4 Focus is a trademark of Information Builders Inc., New York. Focus is representative of PC-based application generators, including Rbase, Dbase IV, Informix, etc.

Decision Table Format:

  Conditions: possible occurrences      Rules: specific occurrences
  Actions: possible outcomes            Entries: specific outcomes for rule combinations

Decision Table Example (a dash marks a condition with no entry):

  Rule                      1    2    3    4    5    6    7    8    9    10

  Conditions
  Customer                  Old  Old  Old  Old  Old  Old  Old  Old  New  New
  Open Rentals              Y    Y    Y    Y    N    N    N    N    -    -
  Returns                   Y    Y    N    N    Y    Y    N    N    -    -
  New Rentals               Y    N    Y    N    Y    N    Y    N    Y    N

  Actions
  Create Customer           N    N    N    N    N    N    N    N    Y    Y
  Check Late Fees           Y    Y    Y    Y    Y    Y    N    N    Y    N
  Process Return            Y    Y    N    N    Y    Y    N    N    N    N
  Process New Rental        Y    N    Y    N    Y    N    Y    N    Y    N
  Process Fees and Money    Y    Y    Y    Y    Y    Y    Y    N    Y    N
  Update Open Rental        Y    Y    Y    Y    Y    Y    N    N    N    N
  Create Open Rental        Y    N    Y    N    Y    N    Y    N    Y    N
  Print Receipt             Y    Y    Y    Y    Y    Y    Y    N    Y    N

Decision Tree Format: Tree structure showing conditions and actions.

Decision Tree Example: (tree diagram not reproduced in this text)

FIGURE 8-31 Decision Table and Decision Tree
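The decision table in Figure 8-31 can also be held as data and looked up at run time. The Python sketch below is one possible encoding, not part of the ABC design: each rule column becomes one dictionary entry, and the string of Y/N entries follows the order of the actions in the figure. It assumes that the blank Open Rentals and Returns entries belong to the two New customer rules and mean "not applicable."

```python
# Actions in the order they appear in Figure 8-31.
ACTIONS = (
    "Create Customer", "Check Late Fees", "Process Return", "Process New Rental",
    "Process Fees and Money", "Update Open Rental", "Create Open Rental",
    "Print Receipt",
)

# One rule per table column:
# (customer, open rentals, returns, new rentals) -> action entries.
# None marks a condition left blank in the table (assumed "not applicable").
RULES = {
    ("Old", "Y", "Y", "Y"): "NYYYYYYY",
    ("Old", "Y", "Y", "N"): "NYYNYYNY",
    ("Old", "Y", "N", "Y"): "NYNYYYYY",
    ("Old", "Y", "N", "N"): "NYNNYYNY",
    ("Old", "N", "Y", "Y"): "NYYYYYYY",
    ("Old", "N", "Y", "N"): "NYYNYYNY",
    ("Old", "N", "N", "Y"): "NNNYYNYY",
    ("Old", "N", "N", "N"): "NNNNNNNN",
    ("New", None, None, "Y"): "YYNYYNYY",
    ("New", None, None, "N"): "YNNNNNNN",
}


def actions_for(customer, open_rentals, returns, new_rentals):
    """Return the names of the actions marked Y for the matching rule."""
    if customer == "New":
        key = ("New", None, None, new_rentals)   # blank conditions are ignored
    else:
        key = (customer, open_rentals, returns, new_rentals)
    entries = RULES[key]
    return [action for action, entry in zip(ACTIONS, entries) if entry == "Y"]
```

Keeping the rules as data means that a change in business policy changes one table entry rather than nested selection logic.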

4GL and PC-DBMS languages are deceptively simple. Performing trivial tasks is easy, but building an application requires expertise. Focus is no different. The complexities with Focus relate to when, where, and how often the databases are opened and processed, how the databases are related, and how many concurrent users are allowed. The concurrent environment increases DBA complexity but changes the answers to the questions about databases; it does not change the application code. So, we will assume one user at a time for processing.

Skeleton Focus code for the application is shown in Figure 8-32. Each DFD Level 0 process is accounted for at this level; we even have a query function that is new. Most applications require interactive file query, and we have not talked about it at all as part of the rental/return application. The trend in business today is for users to develop their own reports and queries using some 4GL. When the language has a built-in query facility, you can add it to the processing without any analysis or design work, as shown here with Focus. User-developed queries allow users to 'stay in touch' with their data and remove a major design burden from IS personnel.

Now that the application is accommodated within one execute environment, we return to the problem of how to package rent/return processing. The ideal is to code and unit test each lowest level box on a structure chart as an independent module. Then, using the 'call' feature of the language, build a control structure, based on the design of the control and coordination boxes on the structure chart, that calls modules as needed for execution. We will use this approach here, as Figure 8-32 shows for the application and Figure 8-33 shows for rental and return processing.

The alternative to called modules is in-line code that is 'performed' or executed as a pseudo-called module. This choice is selected with 3GL languages such as Cobol, Fortran, or PL/1 because it can be easier to code, test, and maintain.

Specify Programs

Rules for Specifying Programs

The specification documents all known information about programs. Program specifications document the program's purpose, process requirements, the logical and physical data definitions, input and output formats, screen layouts, constraints, and special processing considerations that might complicate the program. Keep in mind that the term program might also mean a module within a program or an externally called function, or even a code fragment (e.g., DB call). A program specification should include the items shown in Table 8-7. As with program packaging, there are no 'rules.' Rather, there are items that should be included if they relate to the item being specified.

There are two parts to a program specification: one identifies interprogram relationships and communication, the other documents intraprogram processing that takes place within the individual program. Interfaces to other programs generally document who, what, when, where, and how communication takes place. Who identifies who initiates the communication and who, in the real world, is responsible for the interface. What identifies the message content that is used for communication. When identifies the frequency and timing of the interface. Where locates the application and system in a hardware environment; where becomes complicated and is crucial to processing of distributed applications. How describes the nature of the interface: internal message, external diskette, and so forth.

Internal program processing information includes the data, processes, formats, controls, security, and constraints that define a particular program.


Focus Code                                  Explanation

-Set &&Globalvariables                      Set variables needed for intermodule communication.
-Include Security                           Check password in a security module.
-Run                                        Check password before any other processing.
-*                                          Comment indicator
-Mainline                                   A label identifying the main routine.
-Include Mainmenu                           The call statement in Focus is 'INCLUDE.' Mainmenu is a module name.
-Run                                        Perform Mainmenu before any other processing.
-If &&Choice eq 'R' goto RentRet else       Interrogate the choices from Mainmenu to decide what to do next.
-If &&Choice eq 'V' goto Vidmain else
-If &&Choice eq '0' goto EndOfDay else
-If &&Choice eq 'Q' goto Query else
-If &&Choice eq 'S' goto StopSystem else
-Goto Mainmenu;                             If in error, go back to the Mainmenu screen.
-*
-RentRet                                    RentReturn Label
-Include RentRet                            Call Rent/Return processing.
-Run
-Goto Mainline                              When Rent/Ret is complete, return to the Mainmenu.
-*
-Vidmain                                    Video Maintenance Label and Processing
-Include Vidmain
-Run
-Goto Mainline
-*
-Custmain                                   Customer Maintenance Label and Processing
-Include Custmain
-Run
-Goto Mainline
-*
-Query                                      Query Label and Processing
-Include Tabltalk
-Run
-Goto Mainline
-*
-EndOfDay                                   End-of-Day Label and Processing
-Include Endofday
-Run
-Goto Mainline
-*
-StopSystem                                 Stop System Label
-End                                        End Processing

FIGURE 8-32 ABC Video Processing Focus Mainline
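For readers who do not know Focus, the control structure of Figure 8-32 can be approximated in a general-purpose language. The Python sketch below is only an analogy under stated assumptions: the list of choices stands in for the Mainmenu screen, the stub functions stand in for the included modules, and the choice codes are copied from the figure as printed. None of this is Focus syntax.

```python
def run_mainline(choices, modules):
    """Dispatch each menu choice to its module until 'S' (stop) is chosen.

    choices: iterable of one-character menu selections (stands in for the screen).
    modules: dict mapping a choice code to a callable module.
    Returns the names of the modules executed, in order.
    """
    executed = []
    for choice in choices:
        if choice == "S":                # corresponds to the StopSystem label
            break
        module = modules.get(choice)
        if module is None:               # invalid choice: show the menu again
            continue
        executed.append(module())        # call the module, then return to the mainline
    return executed


# Stub modules (hypothetical); each returns its own name so the flow can be traced.
def rent_ret():
    return "RentRet"


def vid_main():
    return "Vidmain"


def end_of_day():
    return "Endofday"


def query():
    return "Tabltalk"


MODULES = {"R": rent_ret, "V": vid_main, "0": end_of_day, "Q": query}
```

The mainline knows only the choice codes and the module names. Each module can be coded and unit tested on its own, which is the packaging goal described above.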


RentRet Focus Mainline Code

-Set &&Globalvariables
-*Rental and Return Processing
-Crtform Line 1
 ABC Video Rental Processing System <d.&date"
 Rentals and Returns"
- Scan or enter a card or video: <&&Entry"
-If &&entry like 't&' goto Return else
-If &&entry like 'c&' goto Rental else
-Include Entryerr;
-Run
-Return
-Include ValidCus
-Include Open Rent
-Include ValidVid
-Goto exit
-Run
-Rental
-Include FirstRet
-Include Open Rent
-Crtform Line 15
- Do you want to do rentals? <&&Rentresp/1"
-If &&Rentresp ne 'y' goto exit else
-Include ValidVid
-Goto exit
-Run
-Exit
-End

FIGURE 8-33 ABC Rent/Return Focus Mainline

Frequently, program specifications also include a flowchart of the program logic, a system flowchart showing the system names of the files, and a detailed specification of timing and other constraints.

ABC Video Example Program Specification

The program specification for one program to perform Get Valid Customer is shown as an example (see Table 8-8). Since this is a compilation of already known information, there is no discussion.
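To show how a programmer might read the narrative in Table 8-8, the following Python sketch follows its logic step by step. It is an interpretation, not the Focus module: the dictionary stands in for the Customer file, the two callables stand in for the screen prompt and the CreatCus module, the override reply is taken to be the letter O, and the credit rating is assumed to be a single character compared as text.

```python
def get_valid_customer(cphone, customers, ask_override, create_customer):
    """Return (status, customer) following the Table 8-8 narrative.

    customers: dict keyed by CPhone (hypothetical stand-in for the Customer file).
    ask_override: callable given the rating; returns 'O' (override) or 'C' (cancel).
    create_customer: callable given CPhone; returns a new customer record.
    """
    customer = customers.get(cphone)             # Read Customer Using CPhone
    if customer is not None:                     # read is successful
        if customer["CCredit"] > "1":            # i.e., not (CCredit le '1')
            reply = ask_override(customer["CCredit"])
            if reply == "C":
                return ("cancelled", None)       # cancel processing
            if reply != "O":
                return ("credit error", None)    # any other reply is an error
    else:
        customer = create_customer(cphone)       # unknown phone: create customer
    return ("valid", customer)                   # ValidCus set; data returned
```

Writing the logic out this way exposes questions the specification leaves open, for example what the caller should do with a "credit error" result.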

TABLE 8-7 Program Specification Contents

Identification
Purpose
Characteristics
Reference to Applicable Documents
    DFD and Structure Chart (possibly also System Flowchart and Program Flowchart)
Narrative of procedures in Structured English, Decision Tables, Decision Trees
Automated Interface Definition
    Screen Interface
        Screen Design, Dialog Design, Error Messages
    Application Interface
        Communications Messages, Error Procedures
        Frequency, Format, Type, Responsible person
Input, Output, and System Files
    Logical data design
        User views, internal name, graphic of physical data structure
        List of physical data structures
Tables and Internal Data
    Internal name, graphic of physical data structure
    List of physical data structures
Reports
    Frequency, Format, Recipients, Special processing

TABLE 8-8 ABC Example Get Valid Customer Program Specification

Identification: Get Valid Customer (ValidCus)
Purpose: Retrieve Customer Record and verify credit worthiness
Characteristics: Focus Included module
References: See System Specification, Pseudo-code for CustMain
DFD: Attached as Appendix 1
Structure Chart: Attached as Appendix 2

Narrative:
    Accept CPhone
    Read Customer Using CPhone
    If read is successful
        If CCredit le '1'
            continue
        else
            Display "Customer has a credit problem; rating = <CCredit"
            Display "Override or cancel? : <&Custcredit"
            If &Custcredit eq 'C'
                include Cancell
                Return
            else
                If &Custcredit eq 'O'
                    continue
                else
                    include crediterr
                    return
    else
        Include CreatCus.
    Set &&ValidCus to 'Yes.'
    Set global customer data to values for all fields.
    Return.

Screen Interface
    Screen Design: None
    Dialog Design: None
    Error Messages:
        "Customer has a credit problem; rating = <CusCredit"
        "Override or cancel? : <&Custcredit"

Application Interface: None
Input: Customer File
User views: Customer
Internal data names: Customer Contents in Data Dictionary
Tables and Internal Data: Global fields correspond to all Customer File fields. Set all fields to customer record values upon successful processing.
Reports: None

Appendix 1: Data Flow Diagram
    (Diagram not reproduced. Processes: 1.0 Accept Input, 2.0 Read Customer, 3.0 Check Credit, 4.0 Set Data Values. Data flows: CPhone or CustID (Clast, Cfirst), Cust Record, Valid Cust Record, Return Ack.)

Appendix 2: Structure Chart
    (Diagram not reproduced. Modules: Get Valid Customer, Accept CPhone, Get Customer, Check Credit, Set Data Values. Couples: CPhone, Cust Record, CCredit, CreditErr.)

Appendix 3: User View with Data Names

    Table Customer
        (Cphone     Char(10)       Not null,
         Clast      VarChar(50)    Not null,
         Cfirst     VarChar(25)    Not null,
         Cline1     VarChar(50)    Not null,
         Cline2     VarChar(50)    Not null,
         City       VarChar(530)   Not null,
         State      Char(2)        Not null,
         Zip        Char(10)       Not null,
         CCtype     Char(1)        Not null,
         Ccno       Char(17)       Not null,
         Ccexp      Date           Not null,
         CCredit    Char(1),
         Primary key (Cphone));

AUTOMATED SUPPORT FOR PROCESS-ORIENTED DESIGN

Automated support in the form of CASE tools is also available, although fewer products support structured design than support structured analysis. Several entries provide Lower CASE support that begins with program specification or code generation (see Table 8-9).

TABLE 8-9 CASE Tools for Structured Design

Product                          Company                                       Technique

Analyst/Designer Toolkit         Yourdon, Inc., New York, NY                   Structure Chart
Anatool, Blue/60, MacDesigner    Advanced Logical SW, Beverly Hills, CA        Structure Charts; Structured English
The Developer                    ASYST Technology, Inc., Napierville, IL       Structure Chart; Operations Process Diagram; Systems Flowchart
Excelerator                      Index Tech., Cambridge, MA                    Structure Chart; Flowchart
IEW, ADW (PS/2 Version)          Knowledgeware, Atlanta, GA                    Structure Chart
Maestro                          SoftLab, San Francisco, CA                    Nassi-Schneiderman; Hierarchic input-process-output charts (HIPO); User Defined Functions
MacAnalyst, MacDesigner          Excel Software, Marshalltown, IA              Decision Table; Structured English; Structure Chart
Multi-Cam                        AGS Mgmt Systems, King of Prussia, PA         Structure Chart
PacBase                          CGI Systems, Inc., Pearl River, NY            Process Decomposition; Structure Chart; Flowchart
ProKit Workbench                 McDonnell Douglas, St. Louis, MO              Structure Chart
ProMod                           Promod, Inc., Lake Forest, CA                 Module Networks; Function Networks; Structure Chart
SW Thru Pictures                 Interactive Dev. Env., San Francisco, CA      Control Flow; Structure Chart
System Architect                 Popkin Software and Systems, Inc., NY, NY     Flowchart; Structure Chart
Teamwork                         Cadre Technologies, Inc., Providence, RI      Control Flow; Decision Table; Structure Chart
Visible Analyst                  Visible Systems Corp., Newton, MA             Structure Chart
Telon, and other products        Intersolv, Cambridge, MA                      Code Generation for Cobol-SQL, C and others
VS Designer                      Visual Software, Inc., Santa Clara, CA        Structure Chart; Warnier-Orr

STRENGTHS AND WEAKNESSES OF PROCESS ANALYSIS AND DESIGN METHODOLOGIES

The objectives of structured analysis and design are reasonably clear; the manner of obtaining the objectives is much less clear. Structured methods rely on the individual SE's expertise to design the technical details of the application. For implementation-specific details, that makes sense, but the heuristics for evaluation cannot be applied in every situation. Consequently, the SE must know which situations apply and which do not. More than with the other methods discussed in this book, you must know when to adhere to, bend, and break the rules of structured methods.

The methodology's ability to result in minimal coupling and maximal cohesion is low because of its reliance on the SE's ability. If coupling and cohesion are not optimal, maintenance will cost more than it should, and the application will be difficult to test. In 1972, D. Parnas wrote about maximal cohesion and minimal coupling as desirable characteristics of programs. In 1968, Dijkstra wrote about the problems with 'go to' statements in programs and proposed goto-less programming. In 1966, Böhm and Jacopini proposed structured programming's minimalist contents as sequence, selection (e.g., if ... then ... else), and iteration (e.g., do while and do until). By the time structured analysis and design were documented in books, the notions of coupling and cohesion were understood fairly well, but how to obtain them was not.

General statements about keeping the pieces small and related to one part of the problem domain rely on the analyst to know what to do and when to start and stop doing it. Unfortunately, only experience can guide such vague suggestions. While novices can learn to rely on the methodology to guide their actions, they have no basis for evaluating the correctness or incorrectness of their work. Thus, the apprenticeship approach, with a junior person working with a more senior one to learn how to evaluate designs, is required. The more complex the application, the more important having experienced senior analysts becomes.

Another problem is that structured design does not encompass enough of the activities to make it a complete methodology. We must have screen designs in order to develop a program specification. We must know the details of interfaces to other applications and messages to/from them to be able to develop program specifications. Structured methods do not pay any attention to either of these issues. To develop an application, the SE needs to analyze requirements and design for control, input, output, security, and recoverability. None of these are encompassed in the process-oriented methods. To summarize, process methods are useful in analyzing and designing applications that are procedural in nature, but the methods omit a great many required analysis and design activities.

SUMMARY

In this chapter, structured design, which follows structured analysis in development, was discussed. The results of structured analysis (a set of leveled data flow diagrams, a data dictionary, and procedural requirements) are the inputs to the design process. The major results of structured design are program specifications, which detail the mapping of functional requirements into the production hardware and software environment.

First, using either transaction or transform analysis, the DFD is partitioned into afferent, efferent, and central transform processes. The streams of processing are factored to develop a structure chart. The processes are further decomposed into system-like subprocesses until further decomposition would change the nature of the process. Data requirements are documented in data couples; control is documented in control couples. The chart is evaluated for fan-out, fan-in, skew, cohesion, coupling, scope of effect, and scope of control. The structure chart is revised and reevaluated as required.
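Two of these measures, fan-out and fan-in, are simple counts that can be checked mechanically once a structure chart is written down as data. The Python sketch below is a minimal illustration; the chart it uses is hypothetical and is not the ABC structure chart from this chapter.

```python
def fan_out(chart):
    """Number of modules each module calls directly."""
    return {module: len(children) for module, children in chart.items()}


def fan_in(chart):
    """Number of modules that call each module directly."""
    counts = {module: 0 for module in chart}
    for children in chart.values():
        for child in children:
            counts[child] = counts.get(child, 0) + 1
    return counts


# Hypothetical structure chart: each module maps to the modules it calls.
CHART = {
    "Rental/Return": ["Get Valid Customer", "Process Return", "Process New Rental"],
    "Get Valid Customer": [],
    "Process Return": ["Print Receipt"],
    "Process New Rental": ["Print Receipt"],
    "Print Receipt": [],
}
```

Here Rental/Return has a fan-out of 3 and Print Receipt has a fan-in of 2, which marks it as a shared module. Cohesion, coupling, and scope of effect still require the designer's judgment.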

The physical database is designed. Data needs for each data flow in the application are listed by process. Data similarities are matched and used to define user views. The access method and physical mapping are then decided. Physical database design walk-throughs may be held to validate the design. Test and production databases are created.

Program packages are decided based on the application concept and timing. The packages define which modules will communicate and how. Pseudo-code for processes is finalized and uses only structured programming constructs: iteration, sequence, and selection. Decision tables and trees are used, as necessary, to document complex decisions.

Finally, program specifications are written to document all known information about each module, function, or program. Specifications include data, process, interface, constraint, and production information needed for a programmer to code and unit test the work.

REFERENCES

Alexander, Christopher, Notes on the Synthesis of Form. Cambridge, MA: Harvard University Press, 1971.

Böhm, Corrado, and Giuseppe Jacopini, "Flow diagrams, Turing machines, and languages with only two formation rules," Communications of the ACM, Vol. 9, #5, May 1966, pp. 366-371.

Couger, J. D., M. A. Colter, and R. W. Knapp, Advanced System Development/Feasibility Techniques. NY: John Wiley & Sons, 1982.

Curtis, B., M. I. Kellner, and J. Over, "Process modeling," Communications of the ACM, Vol. 35, #9, September 1992, pp. 75-90.

DeMarco, T., Structured Analysis and System Specification. NY: Yourdon, Inc., 1978.

Dijkstra, Edsger W., "Go to statement considered harmful," Communications of the ACM, Vol. 11, #3, March 1968, pp. 147-148.

Flaatten, P. O., D. J. McCubbrey, P. D. O'Riordan, and K. Burgess, Foundations of Business Systems, 2nd ed. NY: The Dryden Press, 1992.

Frances, B., "A window into CASE," Datamation, March 1, 1992, pp. 43-44.

Krasner, J., J. Terrel, A. Lindhan, P. Arnold, and W. H. Ett, "Lessons learned from a software process modeling system," Communications of the ACM, Vol. 35, #9, September 1992, pp. 91-100.

Lindholm, E., "A world of CASE tools," Datamation, March 1, 1992, pp. 75-81.

McClure, C., The Three R's of Software Automation: Re-Engineering, Repository and Reusability. Englewood Cliffs, NJ: Prentice-Hall, 1992.

McMenamin, S. M., and J. F. Palmer, Essential Systems Analysis. NY: Yourdon, Inc., 1984.

Olle, T. W., J. Hagelstein, I. G. MacDonald, C. Rolland, H. G. Sol, F. J. M. Van Assche, and A. A. Verrijn-Stuart, Information Systems Methodology: A Framework for Understanding. Wokingham, England: Addison-Wesley, 1988.

Page-Jones, M., The Practical Guide to Structured System Design, 2nd ed. Englewood Cliffs, NJ: Prentice-Hall, 1988.

Parnas, David L., "On the criteria to be used in decomposing systems into modules," Communications of the ACM, Vol. 15, #12, December 1972, pp. 1053-1058.

Swartout, W., and R. Balzer, "On the inevitable intertwining of specification and implementation," Communications of the ACM, Vol. 25, #7, July 1982, pp. 438-440.

Yourdon, E., and L. L. Constantine, Structured Design: Fundamentals of a Discipline of Computer Program and Systems Design. Englewood Cliffs, NJ: Prentice-Hall, 1979.

Yourdon, E., Modern Structured Analysis. Englewood Cliffs, NJ: Prentice-Hall, 1989.

BIBLIOGRAPHY

Bohl, M., Introduction to IBM Direct Access Storage Devices. Chicago, IL: SRA, 1981. This booklet gives the clearest explanation of VSAM and the differences between data sequenced and entry sequenced storage options that I have seen.

Claybrook, B., File Management Techniques. NY: John Wiley & Sons, 1983. This book provides a good general discussion of indexed, direct, and inverted list files.

Codd, E. F., The Relational Model for Database Management, Version 2. Reading, MA: Addison-Wesley Publishing Co., Inc., 1990. Codd, the father of relational database theory, argues the merits of an almost direct translation of the logical database to the physical database.

Fabbri, A. J., and A. R. Schwab, Practical Database Management. Boston, MA: PWS-Kent Publishing Co., 1992. This book discusses physical mapping for relational databases and has some discussion of the issues involved for hierarchic and network databases.

KEY TERMS

afferent, afferent flows, atomic process
central transform, cohesion, coincidental cohesion, common coupling, communicational cohesion, content coupling, control coupling, coupling
data coupling, depth of hierarchy
efferent, efferent flows, executable unit, external coupling
factoring, fan-in, fan-out, function, functional cohesion, functional decomposition
HIPO
I/O-bound, indirect coupling, information hiding, in-line code, input-bound, interface
logical cohesion
modularity, module, morphology
Nassi-Schneiderman diagrams
output-bound
partitioning, physical database design, procedural cohesion, process-bound, program package, program specification, program unit
scope of effect, sequential cohesion, skew, span of control, stamp coupling, structure chart, structured design
temporal cohesion, transaction analysis, transaction-centered, transform analysis
Warnier Diagram, width of hierarchy

EXERCISES

1. Complete the design for the ancillary processes of ABC rental: customer maintenance, video maintenance, and end-of-day processing. Develop structure charts, including all of the required data and control couples. Evaluate the diagrams and revise as required. Refine the pseudo-code for these functions from Chapter 7. Develop program specifications and identify how the modules will be packaged. Make sure that you state your assumptions about the production environment clearly as part of the explanation of your decisions.

2. What is the linkage between structured analysis and structured design? How do you use the information and documentation from analysis to develop an application design? Do you think analysts and designers should be separate people? Why, or why not?

STUDY QUESTIONS

1. Define the following terms: cohesion, coupling, decomposition, factor, function, input-bound, module, morphology, partitioning, program package, program unit, transaction analysis, transform analysis.

2. How does systems theory relate to structured design?


3. How do you know the difference between a transform-centered application and a transaction-centered application?

4. What is the role cohesion plays in the partitioning process? in the decomposition process? in physical database design? in deciding program packages? in program specification?

5. What is the role coupling plays in the partitioning process? in the decomposition process? in physical database design? in deciding program packages? in program specification?

6. What are the major diagrams in the design phase? How are they derived? How do they relate to the work done in structured analysis?

7. What is the reasoning process for packaging program elements?

8. What is the purpose of Structured English? What are alternatives? For what are Structured English and its alternatives used? Why?

9. List the contents of a program specification.

10. Who usually does physical database design? Why would a specialist perform this task? Can SEs do physical database design as well? Why or why not?

11. Partition the following DFD and draw a structure chart. Identify potential afferent and efferent flows. (There are several alternatives for afferents.) Label the flows you decide best describe the processes you see. List other information you need to decide what the best partitioning should be.

12. Evaluate the following structure chart. Describe the morphology. Is this diagram final or does it have problems? If so, what are the problems and how would you fix them?


* EXTRA-CREDIT QUESTION

1. Perform transform analysis on a case in Appendix A. Design the processing for the central transform from the high-level DFD. Develop lower level DFDs as required to assist your thinking. Factor and develop a first-cut structure chart. Develop pseudo-code for the processes you define. Refine the pseudo-code and finalize the structure chart, giving reasons for your design decisions. Develop program specifications and identify how the modules will be packaged. Make sure that you state your assumptions about the production environment clearly as part of the explanation of your decisions.

CHAPTER 9

DATA-ORIENTED ANALYSIS

INTRODUCTION

Unlike process orientation, data-oriented analysis is not the result of the vision of a small set of people. Rather, it is the collective wisdom of many sources: computer vendors, MIS researchers, and consultants. The philosophy that underlies the data-oriented approach is that data are stable and more unchanging than processes. Processes can be revised with every reorganization. Data entities, on the other hand, rarely change in the lifetime of a business. Attributes of entities also rarely change. Even though the values of data do change constantly, the structure of the data does not. If data are stable, then they should be examined closely and first.

Data-oriented methodologies teach that data redundancy is to be minimized to best manage it in an organization. Database management software is assumed, but not required, in this approach. Data administration, that is, the conscious management of data as a resource of the business, is also assumed.

Information engineering (IE) is the methodology we use to discuss data-oriented analysis. IE teaches that to know which data should be the focus, we need architectures of data, business functions, and even organizational technology to guide the process. Architectures are conceptual descriptions of the items they define. Architectures are developed at the enterprise level (see Chapter 5). Data and functional architectures are defined further during business area analysis, then are divided into application areas and prioritized. Therefore, multiple application areas can result from one or more business areas.

IE methodology defines activities from the strategic organizational level through to implementation of individual applications. The major phases of information engineering are:

1. Enterprise Analysis
2. Business Area Analysis
3. Business System Design
4. Construction
5. Maintenance

In this chapter we discuss the Business Area Analysis (BAA) component of information engineering, which contains the activities that are most similar to analysis in other methodologies. IE analysis is called Business Area Analysis (BAA), rather than just analysis, because the focus is on business data and functions required to do the work. A departure from process-oriented analysis is that information engineering specifically ignores the current business organization, applications, and procedures. IE focuses on how the business should work, rather than on how it does work. Reengineering of the organization and its applications are common adjunct activities to information engineering (see Chapter 5). In the next section, we describe the conceptual foundations of data-oriented analysis. Then, the terminology of business area analysis is defined. This is followed by the rules and examples of how to conduct each activity.

CONCEPTUAL FOUNDATIONS

Data-oriented analysis is based mainly on theories about data. Process activities are based on the same systems theory which was the basis for the process development paradigm of Chapters 7 and 8.

The data-related theories are semantic information theory and relational database theory. Semantic information theory seeks to understand the meaning behind the data in applications and is most obvious in the depiction of meaning underlying entity relationship diagrams. By understanding the entities, or things, in the application, we know more about their domains, that is, the allowable sets of values they may take. Eventually, rules about domain matching and entity integrity are applied to include domain processing along with data processing of the individual attributes of entities. Relationships between entities are as important as entities and domains. By knowing allowable business relationships, we can constrain processing naturally, by applying business rules, without regard to organizational design. Relationship cardinality, or number, is important to knowing how many of each related item should be evaluated. Cardinality prescribes either individual entity instances or sets of instances for processing. By knowing more about the meaning underlying the data in an application, constraints can be automated and made more general, thus simplifying the application development process.

Relational database theory is based on mathematical set theory (or relational calculus), which describes allowable operations on sets of data items. Relational theory was developed to support provably correct processing of data items, something that cannot be guaranteed by either hierarchic or network database architectures. Set theory is the basis for relational theory, which replaces the notion of 'record' processing with 'set' processing. Record processing constrains languages and applications to one-at-a-time record read-manipulate-write processing actions even though most records receive identical treatment in programs. By specifying the rules for processing once and applying those rules to the set of data records, or tuples as they are called in relational theory, the individual program no longer does any read-write processing; it is performed by the DBMS. Applying set theory, the result of any operation is always a set. Thus, using mathematically based rules, the results of database processing can be known in advance and are provable.
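The contrast between record processing and set processing can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `orders` relation, its columns, and the `select` and `project` helpers are hypothetical names invented for the sketch and do not correspond to any particular DBMS.

```python
# A relation modeled as a set of tuples: (order_id, customer_id, amount).
# The data and helper names are hypothetical, chosen only for illustration.
orders = {
    (1, "C1", 40),
    (2, "C2", 15),
    (3, "C1", 25),
}

def select(relation, predicate):
    """Return the subset of tuples that satisfy the predicate (a set)."""
    return {row for row in relation if predicate(row)}

def project(relation, *positions):
    """Return a set holding only the chosen column positions of each tuple."""
    return {tuple(row[i] for i in positions) for row in relation}

# Record processing: the program reads and tests one record at a time.
large_by_loop = set()
for row in orders:
    if row[2] >= 25:
        large_by_loop.add(row)

# Set processing: the rule is stated once and applied to the whole set.
large_by_rule = select(orders, lambda row: row[2] >= 25)

# The result of each operation is again a set, so operations compose.
customers_with_large_orders = project(large_by_rule, 1)
```

Because `select` and `project` both take a set and return a set, the output of one can always be fed to the other, which is the property the text describes as the result of any operation always being a set.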

Process activities performed are attributed to consulting practices that work and build on the systems theory underlying the process development paradigm. Some problems with DFDs are:

• DFDs do not accommodate time.
• DFDs have no implied sequence to processing.
• DFDs assign media to data early in analysis without any real deliberation.

These problems are eliminated in process data flow diagrams (PDFDs) that are built during IE analysis. Process methods of decomposition rely on analyst experience in process orientation. Data methods, such as information engineering (IE), provide a business-oriented approach to defining processes. Structured process constructs (selection, iteration, and sequence) are not consciously considered in process methods until structured design. Structured constructs are used in IE analysis to describe process relationships.

DEFINITION OF BUSINESS AREA ANALYSIS TERMS

The tasks performed during business area analysis (BAA) are:

1. Data modeling
2. Data analysis
3. Functional decomposition (i.e., process modeling)
4. Process dependency analysis
5. Process data flow diagramming
6. Process/data interaction mapping and analysis

Throughout the analysis, a data dictionary or repository is assumed to be used for documentation. The final step of BAA is completion of the repository for all information found during analysis.

For data modeling, the two major activities are the creation and refinement of an entity-relationship diagram (ERD) and entity structure analysis, along with an accompanying repository. When complete, the ERD describes the normalized data environment and data scope of the application. Each part of an ERD requires definition. An entity type (shortened to entity in this discussion)1 is some person, object, concept, application, or event from the real world about which we want to maintain data (see Figure 9-1). There are three kinds of entities: fundamental, attributive, and associative. A fundamental entity, for instance, an order, is independent of all other entities and can be defined without thinking about other entities. An attributive entity is an entity whose existence depends on the presence of a fundamental entity. If order is the fundamental entity, then order item would be an attributive entity related to order (see Figure 9-2). Technically, you wouldn't have an order without any items, but you cannot have an order item without an order. Attributive entities contain repeating information relating to a fundamental entity. An associative entity is used to simplify and define complex relationships between entities. All entities are drawn on the entity-relationship diagram (ERD) as rectangles.2

1 Technically, a customer is an entity who is uniquely described by a set of attributes. The set of all customers describes an entity type which is described by having the same attributes. A specific entity, e.g., customer 'Wells,' is an entity instance. In this text we use entity to be synonymous with entity type.

2 One method of diagramming is to show relationships with a diamond bisecting the line connecting entities. An associative entity, promoting a many-to-many relationship, is drawn as a rectangle with the diamond inside.

Entity Type    Insurance                   ABC Video               Manufacturing
Person         Policyholder                Customer                Customer
Object         Policy                      Video                   Bill of Lading
Concept        Policyholder Services       Accounting Department   Order
Event          Purchase of Policy          Rental of Video         Shipment of Goods
Organization   State Bureau of Insurance   Vendor                  IRS, OSHA

FIGURE 9-1 Entity Type Examples

Entity         ABC Video                 Human Resources         Manufacturing
Fundamental    Customer                  Employee                Work Order
Attributive    Customer Rental History   Employee Work History   Work Order Detail Items
Associative    Vendor-Video              Employee-Job History    Work Order Item-Finished Part

FIGURE 9-2 Entity Examples

Cardinality    Number   Education Examples                Manufacturing Examples
One-to-One     1:1      Student to Transcript             Work Order Detail Item to Machine/Day/Time Operator
                        Course Section to Room/Day/Time
One-to-Many    1:N      Course to Section                 Work Order to Work Order Detail Item
                        Transcript to Course              Customer Order to Work Order
                        Course to Room/Day/Time           Salesman to Customer
                        Students to Major
                        Advisor to Student
Many-to-Many   N:M      Student to Course                 Part to Work Order Detail Item
                        Professor to Course               Vendor to Inventory Part
                        Professor to Section

FIGURE 9-3 Relationship Cardinality Examples

A relationship is a mutual association between two or more entities. It is shown as a line connecting the entities. A relationship has cardinality, or the number of the relationship. Cardinalities may be one-to-one, one-to-many, or many-to-many (see Figure 9-3). Cardinality is shown on a diagram by crows' feet to indicate a 'many' relationship and a single line to indicate a singular relationship.

Refinement of the ERD has two activities: attributes are defined, and the ERD is normalized. Attributes are named properties or characteristics of an entity which take on values. We use the terms attribute, field, and data item as synonyms. An instance is one occurrence of an attribute or relation. For example, an instance of the attribute customer-ID is the number 2922951.

Normalization is the refinement of data relationships to remove repeating information, partial key dependencies, and nonkey dependencies. Normalization can be directly applied to the ERD or can use a tabular method of data analysis. The direct method proceeds by examination of the relationship cardinalities and the attributes of entities. For 1:n relationships, and for entities with repetitive information in the entity, we create (or validate) attributive entities. For an m:n relationship, the relationship is promoted to create an associative entity. A synonym for associative entity is relationship entity. The cardinalities of m:n are reversed to create two 1:m relationships (see Figure 9-4).
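Promoting an m:n relationship can be sketched as follows. The student and course data, the `promote` helper, and the `enrollments` name are assumptions made for this illustration; they do not come from the ABC Video case.

```python
# Hypothetical data: a many-to-many relationship between students and courses,
# recorded as (student_id, course_id) pairs.
takes = {("S1", "MATH101"), ("S1", "HIST200"), ("S2", "MATH101")}

def promote(pairs):
    """Promote an m:n relationship to an associative entity.

    Each pair becomes one instance of the associative entity, keyed by the
    combination of the two fundamental entities' keys.
    """
    return [{"student_id": s, "course_id": c} for s, c in sorted(pairs)]

enrollments = promote(takes)

def enrollments_for(key, value):
    """Follow one of the two resulting 1:m relationships."""
    return [e for e in enrollments if e[key] == value]

# One student relates to many enrollments; one course relates to many enrollments.
s1_enrollments = enrollments_for("student_id", "S1")
math_enrollments = enrollments_for("course_id", "MATH101")
```

Each instance of the associative entity refers to exactly one student and one course, so the single m:n relationship has become two 1:m relationships that meet at the new entity.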

The tabular method is recommended when data and relationships are not clearly specified. The tabular method forces explicit definition of all attributes and their relationships. When repeating information, partial key dependencies, and nonkey dependencies are removed, each relation's data are fully functionally dependent on the primary key. An example is shown in Figure 9-5. By removing repeating information (first normal form), we create attributive entities (for 1:n relationships) and associative entities (for m:n relationships). In Figure 9-5, we create the items from a purchase order as an attributive entity. By removing partial key (second normal form) and nonkey (third normal form) dependencies, we create new fundamental entities. In the example, the new fundamental entities relate to items and vendors.
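The purchase order normalization can also be traced in code. The sketch below is a minimal illustration, assuming a single purchase order held as a Python dictionary; the attribute names follow Figure 9-5, while the values and the `normalize` function are hypothetical.

```python
# One unnormalized purchase order with a repeating group of items.
# Attribute names follow Figure 9-5; the values are made up for illustration.
unnormalized = {
    "po_number": 1001, "po_date": "1993-03-01",
    "vendor_id": "V7", "vendor_name": "Acme Video", "vendor_address": "12 Main St",
    "ship_terms": "FOB", "payment_terms": "Net 30",
    "items": [
        {"item_number": "I1", "description": "Blank tape", "quantity": 10, "price": 2.0},
        {"item_number": "I2", "description": "Tape case", "quantity": 5, "price": 1.0},
    ],
}

def normalize(po):
    """Split one purchase order into third normal form relations."""
    # First normal form: the repeating group becomes its own relation,
    # keyed by PO number plus item number.
    # Second normal form: description depends on item number alone, so it
    # moves to an inventory item relation.
    # Third normal form: vendor name and address depend on vendor ID,
    # a nonkey attribute, so they move to a vendor relation.
    purchase_order = {k: po[k] for k in
                      ("po_number", "po_date", "vendor_id", "ship_terms", "payment_terms")}
    vendor = {k: po[k] for k in ("vendor_id", "vendor_name", "vendor_address")}
    po_items = [{"po_number": po["po_number"], "item_number": i["item_number"],
                 "quantity": i["quantity"], "price": i["price"]} for i in po["items"]]
    inventory_items = [{"item_number": i["item_number"], "description": i["description"],
                        "price": i["price"]} for i in po["items"]]
    return purchase_order, vendor, po_items, inventory_items

purchase_order, vendor, po_items, inventory_items = normalize(unnormalized)

# Extended price is derived, so it is computed on demand rather than stored.
extended_prices = [i["quantity"] * i["price"] for i in po_items]
```

The four results correspond to the Purchase Order, PO Vendor, PO Item, and Inventory Item relations; extended price is left out of storage because it can be derived from quantity and price.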

Upon completion of data modeling, entity structure analysis is performed to determine whether a class structure applies. This analysis evaluates each entity to determine if the same processes and attributes apply to all entities of a given type. If contingent data usage applies, then classes are defined and a data hierarchy depicting the structure is developed.

FIGURE 9-4 Direct Normalization of ERD (Before: two fundamental entities. After: an associative entity, created by promoting the A/B relationship, placed between the two fundamental entities, and an attributive entity created to accommodate B's repeating information.)

Next, business functions are identified as a prelude to process modeling. A business function is a group of activities that accomplish some complete job that is within the mission of the enterprise. Business functions are ongoing and are not related to organization structure. Functions describe what is done in the organization from a high level of abstraction. Business function analysis is usually performed at the enterprise level, but can be the first activity of process modeling, if required. Representative or generic functions that may be present in a business are listed below. Some of the functions are specializations, for instance, public protection is usually a government function. Specialized functions included are for banking, retail, governments, schools, and manufacturing. Other functions are general, like Finance, which every organization has.

Accounting
Alumni Affairs
Audit
Community Programs
Control and Measurement
Customer Relations
Data Administration
Distribution
Engineering Support
Facilities, Equipment, and Supplies Administration
Finance
Funds Management
Funds Transfer
Health and Hospitals Services
Human Resources Administration
Information Systems
Judicial Management
Legal Services Management
Manufacturing
Marketing
Material Acquisition (Purchasing)
Operations Planning
Product Management
Product/Customer Service
Public Aid
Public Facilities Management
Public Protection Management
Public Relations
Public Service
Research and Development
Research
Sales
Scheduling
Service Offering, e.g., Instruction in a school
Student Management

Unnormalized          First Normal Form     Second Normal Form    Third Normal Form       Relation Name*

Purchase Order (PO)   PO Number                                   PO Number               Purchase Order
  Number
PO Date               PO Date                                     PO Date
PO Vendor ID          PO Vendor ID                                PO Vendor ID
PO Vendor Name        PO Vendor Name
PO Vendor Address     PO Vendor Address
PO Ship Terms         PO Ship Terms                               PO Ship Terms
PO Payment Terms      PO Payment Terms                            PO Payment Terms

                                                                  PO Vendor ID            PO Vendor
                                                                  PO Vendor Name
                                                                  PO Vendor Address

*PO Item Number       PO Number             PO Number             PO Number               PO Item
POI Description       PO Item Number        PO Item Number        PO Item Number
POI Quantity          POI Description       POI Quantity          POI Quantity
POI Price             POI Quantity          POI Price             POI Price
POI Extended Price    POI Price             POI Extended Price    POI Extended Price X
                      POI Extended Price

                                            Item Number           Item Number             Inventory Item
                                            Description           Description
                                            Price                 Price

*X indicates deleted items or relations. Relations are deleted if they are duplicates, are consolidated if they have identical keys or are proper subsets, or are named. Attributes are deleted if they are derived by the application. POI Extended Price is derived by multiplying POI Quantity by POI Price.

FIGURE 9-5 Tabular Normalization Example

Sample business functions for ABC Video are shown in Figure 9-6.

When the functions applicable to application development are identified, functional decomposition is performed. Functional decomposition starts at the business function level to identify the major activities of the function, and progresses to identify the processes and subprocesses for each function (see Figure 9-6). An activity is some procedure within a business function that can be identified by its input data and output data, which differ. The activity level must fully define the function. That is, the activity level is complete when all possible procedures performed within the scope of the function are present in the diagram. Full definition is required to ensure complete data, process, impact, and organization design analysis.

FIGURE 9-6 ABC Video Business Functions and Activities (levels labeled Business Function and Business Area Activities)

Activity names are usually of the form verb-object, where the verb identifies the major transformation and the object identifies what is transformed. Exceptions to this rule are accepted when a name is conventionally called by a different form, for instance, Cash Management is more common usage than Manage Cash.

Activities are decomposed into their processes. A business process identifies the details of an activity, fully defining the steps taken to accomplish the activity. Again, full definition is required to ensure completeness of the ensuing analysis. Procedural steps named by processes are repeated and have definable beginnings and endings. Decomposition continues until the elementary, or atomic, level of each process is identified. An elementary process is a procedure that cannot be decomposed further without making the procedure lose its identity. Thus, an elementary process is the smallest unit of work users identify.

Figure 9-7 is a sample decomposition showing processes that define the two purchasing activities within ABC Video. Don't forget that the business activities and processes in a decomposition fully define the scope of the parent business function.
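A decomposition of this kind maps naturally onto a tree. The sketch below uses the function, activity, and process names from Figure 9-7; the nested-dictionary representation and the two helper functions are assumptions of the sketch, and the lowest level shown is treated as elementary purely for illustration.

```python
# Function -> activities -> processes, using the names in Figure 9-7.
# The nested-dictionary representation is an assumption of this sketch.
decomposition = {
    "Purchasing": {
        "Place Order": [
            "Identify Items & Vendors",
            "Call Vendor to Verify Availability and Price",
            "Create and Mail Order",
            "File Order Copy by Vendor",
        ],
        "Monitor Order Receipt": [
            "Identify Late or Problem Orders",
            "Call Vendor and Inquire or Reconcile",
            "Verify Receipts Against Orders",
            "Send Invoices to Accountant",
        ],
    }
}

def elementary_processes(node):
    """Return the leaf (elementary) processes beneath a node, in order."""
    if isinstance(node, str):
        return [node]
    children = node.values() if isinstance(node, dict) else node
    leaves = []
    for child in children:
        leaves.extend(elementary_processes(child))
    return leaves

def undefined_activities(tree):
    """List activities that have no processes yet, i.e., incomplete definition."""
    return [activity
            for activities in tree.values()
            for activity, processes in activities.items()
            if not processes]

all_processes = elementary_processes(decomposition)
```

A check such as `undefined_activities` is one simple way to flag a decomposition that does not yet fully define its parent function.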

Decomposition results are used to develop a process dependency diagram. A process dependency diagram, like an ERD for data, identifies the sequence and types of relationships between processes. Process relationships describe logical connections that include cardinality, sequence, iteration, and selection components (see Figure 9-8). Thus, the process dependency diagram shows the logic of sequence, iteration, and selection for each process. The process dependency diagram is then expanded to include entities and data stores to emulate a data flow diagram from process analysis. The result is a process data flow diagram (PDFD).

Purchasing
    Place Order
        Identify Items & Vendors
        Call Vendor to Verify Availability and Price
        Create and Mail Order
        File Order Copy by Vendor
    Monitor Order Receipt
        Identify Late or Problem Orders
        Call Vendor and Inquire or Reconcile
        Verify Receipts Against Orders
        Send Invoices to Accountant

FIGURE 9-7 ABC Video Partial Functional Decomposition of Purchasing

Connections between procedural steps in a PDFD are due to data passing from one step to the next and causing it to activate. This type of connection is called a process data trigger. A trigger identifies the arrival of some data that causes a business process to execute. Process data triggers (or just data triggers) identify data that flow from one process to another to start execution of the receiving process. In a PDFD, the directed lines between processes signify a data trigger. In addition, external events can cause a process to activate. An event trigger signifies data from some business transaction that causes processing to take place. Event triggers are drawn on the PDFD by large arrows with words inside the icons to name the events. For instance, the arrival of a new video releases list (see Figure 9-9) is an event that triggers the purchasing process.
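The two kinds of trigger can be modeled as simple lookups. In the sketch below the process names come from the ABC Video Create Order example, but the straight-line chain of data triggers is a simplifying assumption (selection and iteration are left out), and the event name and the `activated_by` helper are hypothetical.

```python
# Data triggers between the Create Order processes, simplified to a straight
# sequence; selection and iteration between processes are omitted here.
data_triggers = {
    "Identify Items & Vendors": "Call Vendor to Verify Availability and Price",
    "Call Vendor to Verify Availability and Price": "Create and Mail Order",
    "Create and Mail Order": "File Order Copy by Vendor",
}

# Event triggers: an external business event starts a process.
event_triggers = {
    "New video releases list arrives": "Identify Items & Vendors",
}

def activated_by(event):
    """Return the processes that execute, in order, once an event occurs."""
    process = event_triggers.get(event)
    chain = []
    while process is not None and process not in chain:
        chain.append(process)
        process = data_triggers.get(process)
    return chain

chain = activated_by("New video releases list arrives")
```

The event trigger starts the first process, and each later process is started by data arriving from its predecessor.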

Because the components of the process dependency diagram are different from those of a DFD, the PDFD that results from process dependency analysis is also different. Several key differences are important. First, there is a sequence to the process data flow. The directed arrows on Figure 9-9 indicate that some output from a process causes the execution of the next process. Variations in the directed arrow lines define variations in the sequence. Second, the media that connect processes are not implied as in a DFD.3 The information that passes between processes is identified, but the form of the data is not. For example, the Identify Items and Vendors process in Figure 9-9 generates data that passes to later processes. The shared data might be mental, paper, an automated data flow, or a file. The decision of media, or stored form, of data is deferred until design unless it is fixed. Data files, such as Vendor and Order files on Figure 9-9, are identified because they are known. Third, data and event triggers identify the cause of execution of each process. In a DFD, this information either is characterized as a data flow or is hidden within process logic.

3 Remember, DFDs require identification of either a data flow or a data store as the data linkages between processes.

FIGURE 9-8 ABC Create Order Process Dependency Diagram (processes: Identify Items & Vendors; Call Vendor to Verify Availability and Price; Create and Mail Order; File Order Copy by Vendor. Connection labels: Until no more items; Until no more videos; If vendor, price known; Until no more vendors.)

The last step of BAA is the development and analysis of an entity/process matrix, also known as a CRUD matrix. If no enterprise level ERD exists first, then an ERD is created. The entity/process matrix lists entities across the top and business processes down the side (see Figure 9-10). Each cell of the matrix, then, points to a process-entity combination. For each cell, the systems engineers define Create (C), Retrieve (R), Update (U), Delete (D), or no (blank) responsibility of each process for each entity. Subject area databases are defined by analyzing logical groupings of processes and entities based on their affinity. Affinity means 'attraction' or 'closeness.' Affinity analysis clusters processes which share data creation authority for an entity. These logical groupings become the basis for database design. In Figure 9-10, a partial example of Create Order and Monitor Order Receipt processes, and the entities they use, are analyzed in an entity/process matrix. The matrix shown is clustered by entity affinity and is ready for analysis. After analysis, the processes and entities are sorted to show affinity based on the actions taken on the same entities (see Figure 9-11).

FIGURE 9-9 ABC Create Order Process Data Flow Diagram (the processes and connection labels of Figure 9-8, with the Vendor File and Order File added)

Two sets of analysis are performed on the results of affinity analysis. The first analysis is to determine the adequacy of organization design based on who creates and has responsibility for data. Each cluster of processes is related back to the organization (in a similar matrix). Ideally, processes that share data responsibility should be in the same organization and report to the same manager. For instance, the ABC Purchasing processes show three potential groupings. If each process is evaluated with all of the data it uses, the three groupings meld into one based on the criterion that 70% or more of the data are commonly shared. If all of these processes report to one manager, the organization is probably adequate. If the three possible groupings all report to different managers, the organization should probably be redesigned.
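The mechanics of affinity analysis can be illustrated with a small matrix. In the sketch below, the rows for Create & Mail Order and Call Vendor & Inquire on Order use the four entries shown for them in Figure 9-10; the column placement of the other two rows is an assumption. Measuring "commonly shared" data as the fraction of entities used by both processes is also an assumption of this sketch, not a rule stated in the text.

```python
# A small entity/process (CRUD) matrix. Process and entity names come from
# Figure 9-10; cells for the last two rows are hypothetical placements.
crud = {
    "Create & Mail Order":            {"Purchase Order": "CRUD", "PO Item": "CRUD",
                                       "Inventory Item": "R", "Vendor": "R"},
    "Call Vendor & Inquire on Order": {"Purchase Order": "RU", "PO Item": "RU",
                                       "Inventory Item": "R", "Vendor": "R"},
    "Verify Receipts against Order":  {"Purchase Order": "RU", "PO Item": "RU"},
    "Identify Items & Vendors":       {"Inventory Item": "R", "Vendor": "CRU"},
}

def creators(matrix):
    """Map each entity to the processes that hold Create responsibility."""
    result = {}
    for process, cells in matrix.items():
        for entity, actions in cells.items():
            if "C" in actions:
                result.setdefault(entity, []).append(process)
    return result

def shared_fraction(matrix, p1, p2):
    """Fraction of the entities used by either process that both use."""
    e1, e2 = set(matrix[p1]), set(matrix[p2])
    return len(e1 & e2) / len(e1 | e2)

def same_cluster(matrix, p1, p2, threshold=0.7):
    """Apply a 70% shared-data criterion to a pair of processes."""
    return shared_fraction(matrix, p1, p2) >= threshold

creating = creators(crud)
```

Grouping by `creators` follows the idea of clustering on data creation authority, while `same_cluster` shows how a shared-data threshold could merge processes that use largely the same entities.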

Entities (columns): Purchase Order, PO Item, Inventory Item, Vendor
Processes (rows):

Identify Items & Vendors R CRU

Call Vendor to Verify Avail/Price RU

Create & Mail Order CRUD CRUD R R

File Order Copy by Vendor R R

Identify Late & Problem Orders R R R RU

Call Vendor & Inquire on Order RU RU R R

Verify Receipts against Order RU RU

Send Invoices to Accountant RD RD

FIGURE 9-10 Create and Monitor Order Receipt Entity/Process Matrix

The second analysis looks at the data entities by process cluster to define subject area databases. A subject area database is normalized across the organization and provides shared support for one or more business functions. At the application level, one subject database is assumed. In the example in Figure 9-11, one database would support the purchasing function; the database would have at least two user views to package Purchase Order with Purchase Order Item and Inventory Item with Vendor. A fourth user view linking all entities might be used for retrieval processing.

At the organization level, if the process groupings are logical and useful, they are the basis for reaffirming the scope of applications. At the business area level, the groupings of processes should be consistent with the scope of the activities defined for the application. If they are not consistent, then management review and rescoping of the project are required.

The last step of BAA is to finalize all information found during the analysis in a data dictionary or CASE repository. Since dictionaries were discussed in detail in Chapter 7, in this chapter we will document the information found using the same format as in Chapter 7, but will not comment again on the format of entries.

To summarize, business area analysis begins with an entity-relationship diagram that is fully identified, normalized, and analyzed for class structure. Business functions are identified and decomposed to create process hierarchy, process dependency, and process data flow diagrams. The business processes from the decomposition are coupled to entities from the ERD. Data-related responsibilities are described for each process. Affinity analysis of the CRUD matrix is used to decide organizational and database groupings for further design and management action. Next, we turn to a detailed description of how to perform each activity, exemplified by the ABC Video Rental Processing application.

FIGURE 9-11 ABC Purchasing Process Affinity Analysis (the entity/process matrix of Figure 9-10 with the processes re-sorted in this order: Create & Mail Order; Call Vendor & Inquire on Order; Verify Receipts against Order; Send Invoices to Accountant; File Order Copy by Vendor; Identify Late & Problem Orders; Identify Items & Vendors; Call Vendor to Verify Avail/Price. The resulting clusters are marked Subject Area 1 and Subject Area 2.)

BUSINESS AREA ANALYSIS ACTIVITIES

Develop Entity-Relationship Diagram

Rules for Entity-Relationship Diagram

The steps to building an entity relationship diagram (ERD) are as follows:

1. Define fundamental entities and their primary keys.

2. Define the relationships between the fundamental entities.

3. Identify all attributes of entities, including primary keys.

4. Add attributive entities, where needed, to simplify one-to-many relationships.

5. Promote all many-to-many relationships to define associative entities.

6. Normalize the fundamental entities, analyzing if there are other entities which are hidden in the current definitions. Place new entities in the ERD. Define the new entities' attributes and primary keys.

7. Analyze the entities and their relationships to determine if a class structure is needed. If some instances of entities have identifiable differences in processing, data stored, or relationship participation, classes probably are needed.

The first step is to define fundamental entities and their primary keys. Identifying entities is a difficult process until you have done it several times. It is easy to talk about entities, but less easy to define them. Part of the difficulty is that entities are context related. An entity for one company/application may not be an entity in another company/application. When in doubt, define more entities rather than fewer. You can always eliminate unnecessary entities when the information for deciding becomes clear.

It is important to define each entity using terms that apply for all of its uses in the company. Such definitions may not match current definitions of the entity in use in the organization. An example in defining the terms from an educational pension firm (see Example 9-1) shows the difficulty dealing with current thinking about entities and their definition. Current thinking is frequently imprecise, muddled, and even inconsistent, as the example shows. Unraveling the spaghetti of definitions embedded in the various terms used to describe institutions, colleges, campuses, plans, and their relationships took three people much of six months, working with 10 user departments for the information.

    EXAMPLE 9-1 ENTITY DEFINITION IN XYZ ANNUITY

    In Example 7-1, we discussed how at the annual meeting of the XYZ board of directors in 1991, the marketing director said that she had four different, irreconcilable counts of the number of institutions the company serviced. What was worse was that there was a defendable definition of each number.

    The board thought that was terrible and ordered a redevelopment of the Institutional Processing application to resolve the problem. When Diane Smith, the software engineer, began work on the application, her first task was to develop an ERD for the information, without regard to the current files (12), applications (6), interfaces (4), procedures (28), or time relationships currently used in the organization. Just in sheer numbers, this was a significant amount of information to be ignored.

    Twenty-two different people were interviewed, resulting in 22 different definitions of an institution. They included such descriptions as:

    • an organization that pays in to a pension plan for its employees
    • an organization that requires counseling about products and services provided by us
    • an organization we target for marketing campaigns
    • an organization that defines a pension plan
    • an organization that is subject to a pension plan that may or may not be of its own definition
    • an organization that is subject to legally defined pension plans by the state government in which it resides
    • an organization that receives information about pension plans of the suborganizations for which it administers plans

    Working with a data administrator, Diane and the key users unraveled the spaghetti of definitions to uniquely define major entities for the organization. The following definitions, which took six months to attain, fully explain all variations of XYZ Annuity's institutional processing.

    XYZ Annuity Entity Definitions

    State Optional Pension Plan (SOPP): An optional pension plan(s) defined by law, governing institution(s) specified in the law. SOPP institutions must adhere only and completely to the legal requirements of the SOPP.

    Institution: A legal entity that is governed by an SOPP or, if not, may define its own pension plan(s) subject to Internal Revenue Service limitations.

    Campus: A legal entity that is a subunit of an institution. If an institution defines its own plan, one of the plan items specifies whether or not campuses are bound by its definition. If a campus is not bound by the institution plan, it may own its own plan(s).

    Plan: A legal description of the product(s) offered, eligibility and waiting period requirements, and other pension plan provisions. Usually, and always after 1992, each plan defines the offering for one product.

    Product: A pension service offered by Educational Pension Trust, including individual annuity, group annuity, supplemental retirement, or group supplemental retirement accounts. Each product defined requires definition of one or more investment types allowed.

    Investment type: Annuity, Money Market, Educational Pension Trust Stock fund, and Educational Pension Trust Bond fund.

    These six definitions sound simple enough to be obvious, but they began with a total of 120 different interpretations.

ERDs depict the big picture, capturing the organization and its constituent activities. For this diagram, we must constantly remember to ask: What processes and activities are legal in the context of the business? Not: What is legal based on today's procedures in our company?

In general, entities define something about which the business keeps information. An entity can be a person, object, application, concept, or event about which the application maintains information. For example, customer, order, and inventory are all entities. Entity names are usually nouns; however, NOT all nouns are entities. First, define a list of possible entities. Then, examine each entry and ask yourself the following:

1. Is this a noun? If yes, continue. If not, either rename it or strike it from the list.

2. Is this potential entity (replace with the name of the potential entity) unique with a clearly defined purpose? If yes, continue. If no, either define the item uniquely from the context that led to its being on the list, or strike it from the list.

3. Can this potential entity take a value? If yes, it is an attribute; strike it from the list. If no, continue.

4. Does the business area need to keep information about this potential entity? If yes, continue. If no, ask why it is on the list. If it triggers processes, continue. If it is a different form of some other entity (for instance, an order report is a paper version of an order), strike it from the list. If it is unique but does not fit the other criteria, leave it on the list for now.

5. Give a formal name to the entity and define its primary key.

6. Draw one rectangle for each entity to begin developing the ERD.
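The screening questions can be written down as a small decision function. This is a sketch only: the `Candidate` fields, the outcome labels, and the sample candidates are hypothetical, and in practice each answer comes from analyst judgment rather than from stored flags.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    """Answers to the screening questions for one potential entity."""
    name: str
    is_noun: bool = True
    is_unique: bool = True          # unique, with a clearly defined purpose
    takes_value: bool = False       # if it can take a value, it is an attribute
    needs_information: bool = True  # the business area keeps data about it
    triggers_process: bool = False
    other_form_of: str = ""         # e.g., an order report is a form of order

def screen(c):
    """Return 'entity', 'attribute', 'strike', or 'review' for a candidate."""
    if not c.is_noun or not c.is_unique:
        return "review"      # rename or redefine it, or strike it
    if c.takes_value:
        return "attribute"
    if c.needs_information or c.triggers_process:
        return "entity"
    if c.other_form_of:
        return "strike"
    return "review"          # unique but fits no other criterion: keep for now

candidates = [
    Candidate("Customer"),
    Candidate("Customer Name", takes_value=True),
    Candidate("Order Report", needs_information=False, other_form_of="Order"),
]
results = {c.name: screen(c) for c in candidates}
```

Candidates that come back as 'entity' are the ones that receive a formal name, a primary key, and a rectangle on the ERD.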

Once you are comfortable with the entities, begin defining their relationships. Relationship names, describing entity associations, are usually verbs; however, NOT all verbs describe relationships. The goal is for all rules of association to be unambiguous. First, define possible relationships. In general, ask yourself how entities relate to each other. If I have Entity A, do I also have Entity Bs? If so, how many are legal? Ask the question without regard to each entity's current usage in the company. As with entities, relationships should define what is legal within a business context. Sometimes, ignoring current definition is extremely difficult because we and users internalize such definitions and use them to narrow our focus and simplify the world.

Examine each possible relationship and ask yourself the following:

1. Is this a verb? If yes, continue. If not, either rename it or strike it from the list.

2. Is this verb an action? If no, continue. If yes, remind yourself that relationships do NOT describe processes or processing. If the verb is a process, strike it from the list.

3. Is this potential relationship (replace with the name of the potential relationship, e.g., place as in customers place orders) unique with a clearly defined purpose? If yes, continue. If no, either define the item uniquely from the context that led to its entry on the list, or strike it from the list.

4. Is this potential relationship needed to fully describe the business area's data? If yes, continue. If no, ask yourself why it is on the list. If it is not clear what the reason is, continue, leaving the relationship to be reevaluated when more information is known. If the reason is not related to the business area, strike it from the list.

Once you define a relationship, draw one or more lines to connect the entities participating in the relationship. Mark the diagram with a verb to describe each direction of the relationship. The convention is to read the relationship above the line from left-to-right, and the relationship below the line right-to-left. For instance, customer places order and order is placed by customer (see Figure 9-12). The words are placed differently depending on the placement of the entities on the diagram. By convention, the active verb (in this example, 'places') is positioned on top of the line with the acting entity ('customer' here) on the left of the diagram.

Next determine the number, or cardinality, of the relationship. The number of the relationship is one of three possibilities: one-to-one, one-to-many, or many-to-many. A one-to-one relationship defines a situation in which every entity A relates to one and only one entity B. In a one-to-many relationship every entity A relates to zero to n, that is, any number of entity Bs, while each entity B relates to only one entity A. In a many-to-many relationship each A can relate to any number of Bs, and each B can relate to any number of As. Decide cardinality by asking the same questions of each side of the relationship: If I have one entity A, how many entity Bs can I have associated with it at any point in time? Conversely, if I have one entity B, how many entity As can I have associated with it at any point in time?
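The two questions translate directly into a small classification rule. The function below is a hypothetical helper written for this illustration; it takes the answer to each question as a maximum count, with `None` standing for 'no limit'.

```python
def cardinality(max_b_per_a, max_a_per_b):
    """Classify a relationship from the answers to the two questions.

    Each argument is the largest number of related instances allowed at one
    point in time: 1 for 'one', or any larger number (or None for 'no limit')
    for 'many'.
    """
    def many(n):
        return n is None or n > 1

    a_side, b_side = many(max_a_per_b), many(max_b_per_a)
    if a_side and b_side:
        return "many-to-many"
    if a_side or b_side:
        return "one-to-many"
    return "one-to-one"

# One customer may place many orders; one order belongs to one customer.
customer_order = cardinality(max_b_per_a=None, max_a_per_b=1)
# A student takes many courses; a course contains many students.
student_course = cardinality(max_b_per_a=None, max_a_per_b=None)
# One student has one transcript, and one transcript belongs to one student.
student_transcript = cardinality(max_b_per_a=1, max_a_per_b=1)
```

The sketch reports a relationship as one-to-many whichever side holds the 'many', so the caller still needs to record which entity is on the 'one' side.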

An example of the time issue relates to student registration and tracking. A student may take many classes in one semester; this describes a 1:n relationship. Over time, students take many courses and courses contain many students; this is an m:n relationship. Which is correct? The m:n relation that describes the student-course relationship without regard to time is correct within the context of a student registration application.

FIGURE 9-12 Placement of Words on Entity-Relationship Diagrams
(Diagram: with Customer on the left and Order on the right, 'Places' is above the line and 'Is Placed by' is below it; with Order on the left and Customer on the right, 'Is Placed by' is above the line and 'Places' is below it.)

FIGURE 9-13 Relationship Cardinality and ERD Representation
(Diagram: Customer and Order joined by a line labeled 'Places' and 'Is Placed by'.)

Draw crows' feet to show the cardinality of each relationship. Crows' feet are reverse arrowheads that indicate a 'many' numbered relationship.4 The example in Figure 9-13 shows the relationship of customer to orders as one-to-many. That is, for any one customer, one to many orders may be associated with it. Conversely, the relationship of orders to customers is one-to-one. That is, any given order is associated with one, and only one, customer.

4 This is the IE convention. There are other techniques for drawing ERDs, such as Chen's [1976]. Chen uses multiple arrowheads for 'many' relationships and uses diamonds to identify relationships with only one verb. The logic of both approaches is identical. Martin's notation is used here because it is automated in CASE tools.

Lastly, for each entity in a relationship, we decide whether the entity is required or optional in the relationship. In a required relationship, the entity must be present for the other entity to exist. In an optional relationship, the entity described may or may not exist when the other entity exists. Either an 'O' or a vertical bar, '|', shows each side of a relationship as optional (O) or required (|).

Returning to the order example, customers place orders at their discretion, so orders are optional. Customers are required to have been identified as customers to place orders, so customer is required (see Figure 9-14).

In the order-item relationship, orders do not exist if there are no items ordered, so order is required. The vertical bar ('|') bisects the relationship line close to the order entity. Examining the items, we have a similar relationship. For an order to exist, there must be at least one item, so item is also required. Both sides of the relationship line have a vertical bar (see Figure 9-14). 'Read' this entire relationship as follows:

Orders contain items. For each order, there are one or more order-items. For an order to exist, at least one order-item is required. An order-item is contained in an order. For an order-item to exist, an order is required. For each order-item, there is one, and only one, order.

FIGURE 9-14 Required/Optional Relationship Representation
(Diagram: Customer 'Places' Order and Order 'Is Placed by' Customer; Order 'Contain' Order Item and Order Item 'Is Contained by' Order, with a vertical bar at both ends of the Order to Order Item line.)
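The way a relationship is 'read' from its cardinality and required/optional marks can be captured in a small data structure. This is a minimal sketch rather than part of the IE method; the class name `Direction` and its fields are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Direction:
    """One direction of a relationship, read from source to target."""
    source: str
    verb: str
    target: str
    many: bool      # True when a crow's foot is drawn at the target end
    required: bool  # True for a vertical bar, False for an 'O'

    def read(self) -> str:
        if self.many:
            low = "one" if self.required else "zero"
            count = f"{low} or more"
        else:
            count = "one, and only one," if self.required else "zero or one"
        noun = self.target + ("s" if self.many else "")
        return f"Each {self.source} {self.verb} {count} {noun}."


contains = Direction("order", "contains", "order-item", many=True, required=True)
contained = Direction("order-item", "is contained in", "order", many=False, required=True)
places = Direction("customer", "places", "order", many=True, required=False)

for direction in (contains, contained, places):
    print(direction.read())
```

Each relationship line on the diagram corresponds to two `Direction` values, one for the verb above the line and one for the verb below it.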


M:N Relationship: Vendor (m) to Part (n)

Promoted and Transformed Relationship: Vendor (1) to Vendor-Part (m), and Vendor-Part (n) to Part (1)

FIGURE 9-15 Many-to-Many Relationship Promotion and Transformation

Similarly, associative entities are created by promoting m:n relationships, joining the primary keys of each participating entity to identify the association. Other fields might also be needed to provide unique identification. The m:n relationship is converted into two 1:m relationships in the promotion process (see Figure 9-15).
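As an illustration of promotion, the sketch below turns raw vendor-part pairs into rows of an associative entity whose key joins the two participating keys. The class name `VendorPart`, the helper functions, and the sample identifiers are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VendorPart:
    """Associative entity: its key joins the keys of Vendor and Part."""
    vendor_id: str
    part_id: str


def promote(pairs):
    """Turn raw (vendor_id, part_id) m:n pairs into associative rows."""
    rows = {VendorPart(vendor_id, part_id) for vendor_id, part_id in pairs}
    return sorted(rows, key=lambda row: (row.vendor_id, row.part_id))


def parts_for(vendor_id, rows):
    """Follow the 1:m relationship from Vendor to Vendor-Part."""
    return [row.part_id for row in rows if row.vendor_id == vendor_id]


def vendors_for(part_id, rows):
    """Follow the 1:m relationship from Part to Vendor-Part."""
    return [row.vendor_id for row in rows if row.part_id == part_id]


# Hypothetical supply data; the repeated pair collapses to one row.
supplies = [("V1", "P1"), ("V1", "P2"), ("V2", "P1"), ("V1", "P1")]
vendor_parts = promote(supplies)
```

Each `VendorPart` row belongs to exactly one vendor and one part, which is why the single m:n relationship becomes two 1:m relationships.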

After all known relationships are defined and entered on the diagram, we define attributes for the entities and normalize them (Steps 3-6 of list on p. 339). The goal of this part of the exercise is to define hidden attributive and associative entities. In the example above (see Figure 9-14), Order and Customer are fundamental entities. Order-Item is an attributive entity. If it had not already been identified, either normalization would identify it, or it would be identified by answering the question: Can any of the attributes relating to entity Order occur more than once? If the answer to this question is yes, there are attributive entities to be identified.

Direct normalization of ERDs is possible but requires detailed understanding of data. When you have an ERD but are less comfortable about your understanding of the data and their relationships, tabular normalization can be used to complement, validate, or replace direct normalization. Tabular normalization requires complete definition of data and relationships, and results in exactly the same entities as direct normalization.

To use tabular normalization, first describe each entity and all entity attributes. Cluster attributes depending on whether they are singular or multiple occurrences. (Tabular normalization rules are summarized in Table 9-1.) Then, proceed to remove repeating groups. For each repeating group create a new relation. The key of the new relation is the key of the repeating group and the original key. To remove partial key dependencies, create new relations of any attributes and the part of the key to which they relate. The key of the new relation is the part of the original primary key that functionally defines the relationship. Finally, remove nonkey dependencies by creating new relations from the nonkey attributes that are related. The key to the new relation is the attribute(s) that define the functional relationship. In the tabular method, multivalued dependencies are treated as single attribute, repeating groups in the nonnormalized set-up stage.
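The first step, removing a repeating group, can be sketched in a few lines. This is an illustration under stated assumptions: the function name, the dictionary layout, and the order data are hypothetical, and only one repeating group per record is handled.

```python
def to_first_normal_form(record, key, repeating, group_key):
    """Split one unnormalized record into 1NF relations.

    record    -- dict of single-valued fields plus one list of dicts
    key       -- name of the record's primary key field
    repeating -- name of the field holding the repeating group
    group_key -- name of the key field inside each repeating row
    """
    # Relation built from the single, nonrepeating fields; its key is unchanged.
    parent = {k: v for k, v in record.items() if k != repeating}
    # New relation for the repeating group, with the original key appended.
    children = []
    for row in record[repeating]:
        child = {key: record[key]}
        child.update(row)
        children.append(child)
    # Key of the new relation: original key plus the repeating-group key.
    child_key = (key, group_key)
    return parent, children, child_key


# Hypothetical unnormalized order with a repeating group of items.
order = {
    "order_no": 1001,
    "customer_id": "C7",
    "items": [
        {"item_no": "A1", "quantity": 2},
        {"item_no": "B4", "quantity": 1},
    ],
}

order_rel, order_item_rel, order_item_key = to_first_normal_form(
    order, key="order_no", repeating="items", group_key="item_no"
)
```

Here `order_rel` keeps the order-level facts, while `order_item_rel` holds one row per item, each identified by order number plus item number.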

TABLE 9-1 Normalization Rules

For Unnormalized Data

1. Identify all attributes that relate to an entity. Keep in mind that there are several types of attributes.

• Nonrepeating, primary key attribute(s). A nonrepeating attribute is a single fact about an entity type. A primary key is a unique identifier for all attributes associated with an entity type.

• Nonrepeating, nonkey attribute(s) are single facts about an entity type.

• Repeating attribute(s) are facts that may have more than one occurrence for a specific value of an entity's primary key. Repeating attributes may be single repeating facts, such as the date of birth of offspring; or may be groups of repeating facts, such as date of birth and name of offspring. Repeating attributes are either repeating key attributes or repeating nonkey attributes. Repeating nonkey attributes are listed with their primary key identifier.

2. List all attributes that relate to an entity together. Indent repeating information. Skip a line or leave a space between entities and between repeating groups. Repeating groups might have only one attribute that repeats; this is also called a multivalued dependency. Place an asterisk at the first attribute of each repeating group to show its beginning.

3. Underline the primary key field(s) of the unnormalized relations, including keys of both singular groups and repeating groups.

4. Proceed to first normal form.

First Normal Form (1NF)-The Goal of 1NF Is to Remove Repeating Groups

1.1. Examine each relation. If the relation has no repeating groups, it is in 1NF. Draw an arrow from the unnormalized column to the normalized column to show that the analysis is complete, and continue.

1.2. If the relation has repeating groups, build a relation from the single nonrepeating fields. The key of the relation is the key of the original relation. Continue.

1.3. Next, for each repeating group, build a new relation of the repeating information. Append the key of the original relation to the repeating information. The key of this relation is the key of the original relation plus the key of the repeating group.

Second Normal Form (2NF)-The Goal of 2NF Is to Remove Partial Key Dependencies

2.1. Examine each relation independently. If the 1NF relation does not have a compound key, it is in 2NF. Draw an arrow from the relation through the 2NF column to show that it is complete, and continue.

2.2 If the 1NF relation has a compound key, ask the following question for each nonkey field: Does the data field relate to the whole key? In other words, do you need to know the whole key to know the value of the attribute, or do you only need part of the key to know the value of the attribute? If the answer is that you need the whole key for all fields, the relation is in 2NF. Draw an arrow from the relation through the 2NF column to show that it is complete, and continue.

2.3 If by knowing a part of the key we know the value of one or more data fields, then we will build two new types of relations. First, build a relation with the nonkey data field(s) that are wholly dependent on the compound key. The key of this relation is the key of the 1NF relation.

2.4 Second, build one new relation for each partial key identified. The new relation(s) include the nonkey data field(s) and the part of the original key on which they are fully dependent.


Third Normal Form (3NF)-The Goal of 3NF Is to Remove Nonkey Dependencies

3.1 If a 2NF relation has only one nonkey data field, it is in 3NF; go to optimization.

3.2 If all data fields in the relation(s) are dependent upon the key and nothing but the key, then the relation is in 3NF. The question here is: Do nonkey fields relate to the key, or do they really relate to each other?

3.3 If a nonkey dependency exists, build one relation of the nonkey data field(s) that are dependent on the 2NF key (this includes the nonkey field that is the key in the step below).

3.4 Build one new relation for each nonkey dependency identified. The new relation(s) include the nonkey data field(s) and the nonkey field on which they are dependent. The key of this relation is the nonkey field from the original relation on which the other field( s) is dependent.

Now, check for anomalies, that is, conditions that will still cause errors. This is one way of double-checking that your original relationships were correctly defined. Ask two questions.

1. Given a value for a key(s) of a 3NF relation, is there just one possible value for the data? If the answer is NO, then multivalued dependencies exist. Check that the correct data relationships are defined, then treat the multi- valued single fact as a single-attribute repeating group and renormalize the data.

2. Are all attributes directly dependent upon their related key(s)? If the answer is NO, then transitive dependencies exist. Treat the transitive dependency like a nonkey dependency and renormalize the data.

Finally, synthesize and integrate the relations.

1. Remove any fields that are computed in the application. This does not mean that these attributes are not stored in the physical database; it means that they are not logically required to define the entity.

2. If two or more relations have exactly the same primary key, combine them into one relation. Make sure that each attribute occurs only once.
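The 2NF and 3NF rules in Table 9-1 both come down to grouping each nonkey field with the field(s) that determine it. The sketch below assumes the analyst has already answered the "whole key or part of the key?" and "key or another nonkey field?" questions and recorded the answers in a dictionary; the code does not discover dependencies on its own. The function name and all field names are hypothetical.

```python
def split_by_determinant(key, depends_on):
    """Group nonkey fields by the field(s) that determine them.

    key        -- tuple of key field names for the relation being examined
    depends_on -- dict mapping each nonkey field to the tuple of fields
                  that determines it (whole key, part of the key, or a
                  nonkey field)
    Returns a dict mapping each determinant, which becomes the key of a
    relation, to the sorted nonkey fields that belong in that relation.
    """
    # The original key always heads a relation of its own.
    relations = {tuple(key): []}
    for field, determinant in depends_on.items():
        relations.setdefault(tuple(determinant), []).append(field)
    return {det: sorted(fields) for det, fields in relations.items()}


# Partial key dependency (2NF): description and price need only item_no.
order_item = split_by_determinant(
    key=("order_no", "item_no"),
    depends_on={
        "quantity": ("order_no", "item_no"),
        "item_description": ("item_no",),
        "price": ("item_no",),
    },
)

# Nonkey dependency (3NF): customer_name depends on customer_id, which
# stays in the order relation and becomes the key of the new relation.
order = split_by_determinant(
    key=("order_no",),
    depends_on={
        "order_date": ("order_no",),
        "customer_id": ("order_no",),
        "customer_name": ("customer_id",),
    },
)
```

In the second example `customer_id` remains in the order relation as a cross reference, matching rule 3.3, and also serves as the key of the new relation, matching rule 3.4.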

At third normal form, synthesis of the resulting relations is performed to

• combine relations that have identical primary keys but different nonkey attributes

• eliminate relations which are exact duplicates, or proper subsets, of other relations

• combine relations for which the primary key of one is a proper subset of the primary key of another
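The first two synthesis rules can be sketched as a merge keyed on the primary key. This is a simplified illustration: the function name is an assumption, the relation contents are borrowed from the ABC Video example later in this chapter, and the third rule (one key being a proper subset of another) is deliberately left to the analyst's judgment.

```python
def synthesize(relations):
    """Combine relations that share exactly the same primary key.

    relations -- list of (key, attributes) pairs, where key is a tuple of
                 field names and attributes is an iterable of nonkey fields
    Duplicates and proper subsets with the same key disappear because
    their attributes fold into one set, so each attribute occurs once.
    """
    merged = {}
    for key, attributes in relations:
        merged.setdefault(tuple(key), set()).update(attributes)
    return {key: sorted(attributes) for key, attributes in merged.items()}


third_normal_form = [
    (("video_id",), ["video_name", "rental_rate"]),
    (("video_id",), ["video_name", "entry_date", "rental_rate"]),
    (("video_id", "copy_id"), ["date_received", "status"]),
]
result = synthesize(third_normal_form)
```

The first relation is a proper subset of the second, so only one Video relation survives; the Copy relation has a different key and is left alone.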

After normalization and synthesis are complete, new entities (or relations) and their relationships to the fundamental entities are added to the diagram as needed to fully depict the information.

Next, the entities and relationships are analyzed to determine if a class structure is needed. The reasoning process is as follows:

1. Does this entity occur in this, and only this, form (i.e., with all attributes) for every legal occurrence of the relationship being examined? If the answer is yes, continue. If the answer is no, you must define subclasses that describe the contingencies of existence for the entity. This procedure is described in the next section.

2. Does this relationship hold for all occurrences of the entity? If yes, continue. If no, follow the reasoning below to determine the subclasses of the entity and their relationships.

3. Is this entity ever optional? If no, continue. If yes, follow the reasoning below to determine the subclasses of the entity and their relationships.

4. Can only a subset of occurrences of an entity participate in a given relationship? If no, continue. If yes, follow the reasoning below to determine the subclasses of the entity and their relationships.

5. Have several types, or kinds, or categories of an entity been identified? If no, continue. If yes, follow the reasoning below to determine the subclasses of the entity and their relationships.

6. Are words like "either, or, sometimes, usually, generally, in certain cases" ever used in describing entity behavior? If no, continue. If yes, follow the reasoning below to determine the subclasses of the entity and their relationships.

To determine subclasses, you must determine which information is kept (or which processing is done) for which type (or subclass) of orders. Ask questions about every possible variation of information and processing until 'if-then-else' logic surfaces. Use the alternative situations to define the subclasses. That is, define one subclass for the if logic and another subclass for each else-if branch. Ask questions of each type of information about every entity. Don't stop just because you find one subclass; there may be others. When you have found all subclasses, verify them with the user and modify the diagram accordingly.

For instance, in an order fulfillment application, a legal entity relationship describes 'customers place orders' (see Figure 9-16), but that information may not be the same for all customers' orders. Does the time of day or time of month affect the relationship? Do the shipping address differences affect the relationship? Does the sold-to/ship-to arrangement affect the relationship? Does the type of goods ordered affect the relationship? Does the type of payment affect the relationship? In this example, we will say that Cash Order information kept includes Customer ID and Total Amount, where Credit Order information kept includes Customer ID, Name, Sold-to/Ship-to addresses, Order Date, Shipping Terms, for each item (Item Number, Item Description, Quantity, Price, Extended Price), Sales Tax, and Total. Here, we know there are subclasses because different sets of data are kept. The next issue is to decide the entity(s) to which the subclasses relate.

FIGURE 9-16 Examples of Subclasses in Customer-Order Relationship
(Diagram: before subclasses, Customer is related to Order; after subclasses, Customer is divided into Cash and Credit subclasses and Order is divided into Cash and Credit subclasses.)

To decide which subclasses apply to orders, we ask if the entity Order is affected differently by cash and credit sales. If the answer is yes, different information is kept for each. In this example, there would be subclasses for Cash Order and Credit Order. Then, we ask if the entity Customer is affected differently for cash and credit customers. Are all customers either cash or credit? What are the rules for buying on credit? The common answer is applied here. Some customers are only cash, thus creating a cash customer subclass. Some customers are qualified to buy on credit, but they are not required to buy on credit. That is, credit customers can pay either by cash or by credit. Therefore, knowing which type of order a customer will create is only possible if the customer is a cash customer. Depending on the application, these customer subclasses might be important. Here, we will say they are. The ERD is altered to show the subclasses of each entity class and how they now relate to each other. Notice that the simple before diagram in Figure 9-16 is more complex with subclass additions.
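The cash and credit subclasses can be pictured as a small class hierarchy. The sketch below is one possible rendering, not the book's design: class and field names are assumptions derived from the attribute lists above, and Extended Price and Total are computed rather than stored, which is a choice made here for brevity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Order:
    customer_id: str


@dataclass
class CashOrder(Order):
    total_amount: float


@dataclass
class OrderItem:
    item_number: str
    item_description: str
    quantity: int
    price: float

    @property
    def extended_price(self) -> float:
        return self.quantity * self.price


@dataclass
class CreditOrder(Order):
    name: str
    sold_to_address: str
    ship_to_address: str
    order_date: str
    shipping_terms: str
    sales_tax: float
    items: List[OrderItem] = field(default_factory=list)

    @property
    def total(self) -> float:
        return sum(item.extended_price for item in self.items) + self.sales_tax


def allowed_order_types(customer_subclass: str):
    """Cash customers create only cash orders; credit customers may do either."""
    if customer_subclass == "cash":
        return (CashOrder,)
    if customer_subclass == "credit":
        return (CashOrder, CreditOrder)
    raise ValueError(f"unknown customer subclass: {customer_subclass}")


# Hypothetical credit order with one line item.
credit = CreditOrder(
    customer_id="C7", name="A. Buyer", sold_to_address="1 Main St",
    ship_to_address="1 Main St", order_date="(date)", shipping_terms="FOB",
    sales_tax=0.5, items=[OrderItem("A1", "Widget", 2, 4.5)],
)
```

The `allowed_order_types` function encodes the rule that the order type is predictable only for cash customers.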

To summarize, first define entities, then relationships, then attributes. Promote the many-to-many relationships to associative entity status and modify the diagram to reflect the new entities. Add attributive entities as required for repeating information relating to entities. Identify all new attributes of all entities. If necessary, do tabular normalization of the relations. Analyze each entity to determine if subclasses are required and modify the diagram to describe them. These activities are best documented in a CASE tool with repository (or dictionary or encyclopedia) entries made as the work progresses. At the end of ERD creation, you have not only the ERD, but also the repository definitions for all items in the ERD.

ABC Video Example Entity-Relationship Diagram

The first step (refer to the list on p. 339) in developing the ERD is to identify fundamental entities. A first-cut definition of the potential fundamental entities in ABC rental processing includes: customer, video, rental, printed rental, clerk, and system. These entities are identified from the ABC Rental Processing requirements in Chapter 2. Next we analyze each potential entity to see if it really is in the business area and application.

Customer is a noun. It uniquely defines the people who rent and return videos. By itself, customer does not take on a value; rather, each customer is described by a set of attributes. The business must keep information about customers renting videos to do business with them. The formal name is Customer.

Video is a noun. It uniquely defines an item from inventory that is available for rent. By itself, video does not take on a value; it has descriptive attributes. The business must keep information about videos to conduct its business. The formal name is Video.

Rental is a noun that uniquely describes videos rented by customers for a specific period. By itself, rental does not take on a value; it combines attributes of Customer and Video with attributes of its own. The business must keep information about rentals to provide an audit trail for tax purposes. The formal name is Rental.

A Printed Rental is a noun that describes videos rented by customers for a specific time period. A printed rental is not unique since its definition mirrors that of rental. However, it is unique in that it shows the customer signature. If there is a legal dispute over charges, the business is required legally to provide documentation that rental took place, and the customer knowingly rented. The business does not keep information about a printed rental, though; the information is about a rental.5 Printed rentals are another medium or form of Rental. Printed Rental is stricken from the list.

Clerk is a noun uniquely describing the person who initiates processing for the application. By itself, clerk does not take on a value. The business does not need to know who did the entry of information unless Vic changes the requirements of the application. Since clerk is not required, we strike it from the list.

5 If customer signature was kept, or if we just left printed rental on the list, when the data were normalized we would find that the primary keys to the printed rental and the rental were identical. That would lead us to combine the fields in one relation called Rental.

FIGURE 9-17 ABC Rental Processing-First-Cut Entity-Relationship Diagram
(Diagram: Customer 'Request' Rental and Rental 'Is Requested by' Customer; Rental 'Contains' Video and Video 'Is Contained in' Rental.)

Similarly, system is a noun that uniquely describes the hardware/software environment that will do rental processing. The system has no personal values, and neither do we maintain information about the system. System is stricken from the list.

Next we draw a rectangle for each of the three entities that remain: Customer, Video, and Rental. Figure 9-17 shows the entities and relationship(s) between the fundamental entities. Customers request Rentals. Rentals contain Videos. The relationship names are unique verbs describing the interactions. The line connecting Customer and Rental contains crows' feet at the Rental side to show a one-to-many relationship. That is, each Customer may place one or more Rentals. Each Rental is placed by one and only one Customer.

Similarly, each Rental contains one or more Videos; each Video can be rented by one and only one Rental at a time. We have a problem with the clause 'at a time' in this definition. Relationships are supposed to be defined without regard to time. How do we account for this problem? We might defer a decision on how to deal with this until some later time, making a note of the need for 'date' as an identifier for the video-rental relationship. Or, we might remove time from the definition, creating the ERD in Figure 9-18, which shows a many-to-many relationship between rental and video. That is, each video may be rented more than once, and each rental may contain more than one video. We take this option at the moment, knowing that it is an incomplete definition of the relationships, which needs to be refined.

FIGURE 9-18 ABC Rental Processing-Second-Cut Entity-Relationship Diagram
(Diagram: Customer 'Request' Rental and Rental 'Is Requested by' Customer; Rental 'Contains' Video and Video 'Is Contained in' Rental.)

FIGURE 9-19 ABC Rental Processing-Third-Cut Entity-Relationship Diagram
(Diagram: Customer 'Request' Rental and Rental 'Is Requested by' Customer; Rental 'Contains' Copy and Copy 'Is Contained by' Rental; Copy 'Is Described by' Video and Video 'Describes' Copy.)

Next we decide the nature of the relationships, whether they are required or optional. A Customer must exist to place Rentals. A Video must exist to be contained in a Rental. Does this make sense; must you have both a Customer and a Video to do a Rental? Yes, this makes sense. Now analyze the other side of the relationships. Are Rentals required for Customers to exist? No, Customers do not necessarily have rentals every day. Are Rentals required for a Video to exist? No, Videos can exist without being related to a Rental. Both relationships of Rental to the other entities are optional.

Identify attributes and associative entities. The m:n relationship of Videos and Rentals should be promoted to make an associative entity. The new entity relates to each physical tape being rented. Thus, we have a Video entity and a Copy entity. We can reason through this creation in another way. Video information is not detailed enough to keep track of every physical tape in inventory because each video may have many copies. This leads us to add information about copies. Referring to the case in Chapter 2, we find that Vic wants to be able to track the status of any tape. The minimum copy information needed is Video ID, Copy ID, Date Received, and Status. Other information might be considered, for instance, current month rental counts, but we defer this for the moment. Figure 9-19 shows the ERD to this point.

Does the insertion of Copy take care of the many-to-many relationship? We can still have a Copy on many rentals over time and Rentals can contain many items, so the answer is no. Next we look at the Rental to further examine its details. Rentals are similar to orders. Just as an order has one or more items, each rental can have one or more rental items. There is a one-to-many relationship of Rental to Rental-Item which we add to the diagram. By itself, this does simplify the many-to-many relationship; a Rental-Item belongs to a specific Rental and relates to a specific inventory Copy. Now the entities and relationships look clean with all many-to-many relationships promoted, and all apparent one-to-many relationships explained (see Figure 9-20).

To confirm the original and promoted entities, we will normalize the data using the tabular method. For tabular normalization to proceed, first define the attributes of each entity. From the functional requirements in Chapter 2, list all attributes of each entity. The list is shown in Table 9-2, in unnormalized form, with the copy information identified as repeating within video information. Make a separate list for each entity. For each entity, list together attributes that occur only once. Indent repeating groups under the related entity, making sure that all information for each group is together. Underline primary keys for both nonrepeating and repeating information. Remember, the primary key uniquely identifies its information.

Next apply the rules in Table 9-1 to remove repeating groups, partial key dependencies, and nonkey dependencies. Synthesize the 3NF results to ensure minimal redundancy. The result of ABC's normalization is shown in Table 9-3. Now we have six relations to be synthesized and evaluated.

FIGURE 9-20 ABC Rental Processing Fourth-Cut Entity-Relationship Diagram
(Diagram: Customer 'Request' Rental and Rental 'Is Requested by' Customer; Rental 'Contains' Rental-Item and Rental-Item 'Is Contained by' Rental; Rental-Item 'Refers to' Copy and Copy 'Is Referenced by' Rental-Item; Copy 'Is Described by' Video and Video 'Describes' Copy.)

TABLE 9-2 List of ABC Video Entity Attributes

Column headings: Unnormalized Form (Repeating Groups, All Primary Keys Identified); First Normal Form (1NF), Repeating Groups; Second Normal Form (2NF), Partial Key Dependencies; Third Normal Form (3NF), Nonkey Dependencies; Delete.

Unnormalized Form:

    Customer Phone
    Customer Name
    Customer Address
    Customer City
    Customer State
    Customer Zip
    Customer Credit Card Number
    Credit Card Type
    Credit Card Expiration Date

    Customer Phone
    All Customer Info from Above
    Rental Date
    Total Rental Fees
        *Video ID
        Copy ID
        Video Name
        Rental Date
        Return Date
        Rental Rate
        Late Fee Due
        Fees Due

    Video ID
    Video Name
    Entry Date
    Rental Rate
        Copy ID
        Date Received
        Status

In the synthesis step, several pieces of information are deleted. 'All Customer information' is not required to maintain rental information; only a Customer Phone or ID is required as a cross reference, or foreign key, to the Customer relation. Total Rental Fees are calculated and, therefore, not required. The entire relation containing Video ID, Video Name, and Rental Rate is deleted because it exactly duplicates information already in the next relation, which has more attributes.

To reconcile the 3NF results to the ERD, we look at the Rental relationships again, and use some 'out of the box' thinking. The relationship we identified for Rental to Rental-Item is similar for many business transactions: orders, confirmations, shipping papers, back-orders, and invoices. The question here is: Do we need both entities? We require the Rental-Item entity information because it documents the business transaction. The question then is: Do we need the Rental information separated? Is it uniquely different? Customer Phone is also in Rental-Item; Rental Date is also related to each video the customer has. We eliminate Total Rental Fees as a computed field, but we need to decide again whether we will ever need this information stored in a file.

TABLE 9-3 ABC Video Normalization Results

Column headings: Unnormalized Form (Repeating Groups, All Primary Keys Identified); First Normal Form (1NF), Repeating Groups; Second Normal Form (2NF), Partial Key Dependencies; Third Normal Form (3NF), Nonkey Dependencies; Delete. A relation with no entry under a later form carries forward unchanged.

Unnormalized Form:

    Customer Phone, Customer Name, Customer Address, Customer City, Customer State, Customer Zip, Customer Credit Card Number, Credit Card Type, Credit Card Expiration Date

    Customer Phone, All Customer Info from Above, Rental Date, Total Rental Fees
        *Video ID, Copy ID, Video Name, Rental Date, Return Date, Rental Rate, Late Fee Due, Fees Due

    Video ID, Video Name, Entry Date, Rental Rate
        Copy ID, Date Received, Status

First Normal Form (1NF):

    Customer Phone, All Customer Info from Above, Rental Date, Total Rental Fees

    Customer Phone, Video ID, Copy ID, Video Name, Rental Date, Return Date, Rental Rate, Late Fee Due, Fees Due

    Video ID, Video Name, Entry Date, Rental Rate

    Video ID, Copy ID, Date Received, Status

Second Normal Form (2NF):

    Customer Phone, Video ID, Copy ID, Rental Date, Return Date, Late Fee Due, Fees Due

    Video ID, Video Name, Rental Rate

FIGURE 9-21 Partial Rental Screen

    Customer: xxxxxxxxxxxxxxxxxxxx, xxxxxxxxxxxx
              xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
              xxxxxxxxxxxxxxxxxx, xx xxxxx
              (xxx) xxx-xxxx

    Open Rentals:

    Video  Copy  Description                 Rental  Pd  Late   Pd  Other     Pd
    xxxxx  xx    xxxxxxxxxxxxxxxxxxxxxxxxxx    9.99  x    9.99  x     999.99  x
    xxxxx  xx    xxxxxxxxxxxxxxxxxxxxxxxxxx    9.99  x    9.99  x     999.99  x
    xxxxx  xx    xxxxxxxxxxxxxxxxxxxxxxxxxx    9.99  x    9.99  x     999.99  x
    xxxxx  xx    xxxxxxxxxxxxxxxxxxxxxxxxxx    9.99  x    9.99  x     999.99  x

    New Rentals

    xxxxx  xx    xxxxxxxxxxxxxxxxxxxxxxxxxx    9.99  x
    xxxxx  xx    xxxxxxxxxxxxxxxxxxxxxxxxxx    9.99  x
    xxxxx  xx    xxxxxxxxxxxxxxxxxxxxxxxxxx    9.99  x
    xxxxx  xx    xxxxxxxxxxxxxxxxxxxxxxxxxx    9.99  x
    xxxxx  xx    xxxxxxxxxxxxxxxxxxxxxxxxxx    9.99  x

    Total                                     99.99      99.99      9,999.99

    Total Amount Due   9,999.99
    Amount Paid        9,999.99
    Balance            9,999.99

Continuing this reasoning, think of the processing to be done. When a customer requests a video, the system should display all open rentals regardless of when they were rented. The Rental-Item information will be listed down the screen in rows, one row per video (see Figure 9-21).6 A total of all open rental fees plus any new fees will be near the bottom. From where will customer information come? If we keep the Rental relation/entity, we either choose one from potentially several for display, or, if Customer Phone is entered first, we ignore them all. This sounds like a kludge, that is, a mess! If we delete this entity/relation, can we get the information another way? That is, can we recreate this relationship if we need to for any reason? The answer in this case is yes. Rental-Items all have Customer Phone, and the first one accessed can be used to retrieve customer information. We conclude that we can eliminate the 'Rental' entity entirely. The completed, revised ERD is shown as Figure 9-22. Rental-Item is renamed to Open Rental to avoid confusion about its contents.

6 This is another example of the jumping between levels of detail required to complete each logical step in the process.
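The claim that the Rental entity can be recreated from Open Rental rows can be checked with a short sketch. Everything here is illustrative: the rows are invented, the field names are lowercase versions of the Table 9-3 attribute names, and the function name is an assumption.

```python
from collections import defaultdict

# Hypothetical Open Rental rows (formerly Rental-Item).
open_rentals = [
    {"customer_phone": "555-0100", "video_id": "V10", "copy_id": 1, "fees_due": 3.0},
    {"customer_phone": "555-0100", "video_id": "V22", "copy_id": 2, "fees_due": 2.5},
    {"customer_phone": "555-0199", "video_id": "V10", "copy_id": 2, "fees_due": 3.0},
]


def recreate_rentals(rows):
    """Rebuild per-customer rental information from Open Rental rows.

    Total Rental Fees is computed here rather than stored, which is why
    it could be dropped as an attribute during synthesis.
    """
    rentals = defaultdict(lambda: {"items": [], "total_rental_fees": 0.0})
    for row in rows:
        rental = rentals[row["customer_phone"]]
        rental["items"].append((row["video_id"], row["copy_id"]))
        rental["total_rental_fees"] += row["fees_due"]
    return dict(rentals)


rentals = recreate_rentals(open_rentals)
```

Because every row carries Customer Phone, grouping on that field yields the same information the separate Rental entity would have held.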

After removing the Rental entity, the relationships and entities now appear minimal. That is, we must keep all of these entities. If we remove any one of these entities, we cannot recreate the desired information for the removed entity, nor can we completely describe all data relationships. With these entities, we can represent the entire problem data space, and we can accommodate all the processing required in the application. The ERD now appears complete. Keep in mind that complete does not mean cast in concrete; the ERD can be modified as required to accommodate new information.

The reasoning process we used to eliminate the rental entity can be used on any similar entities, for instance, orders. In other applications the higher level entity analogous to the Rental entity here might be required. You cannot eliminate an entity when any of the following conditions is true:

• The entity has unique information of its own.

• The entity, or its attributes, cannot be recreated through combining other entities.

• The entity is required for legal purposes.
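The three conditions above amount to a simple decision rule. The function below is a minimal sketch with assumed parameter names; answering each question still requires the analysis described in the text.

```python
def can_eliminate(has_unique_information: bool,
                  can_be_recreated: bool,
                  legally_required: bool) -> bool:
    """Return True only when none of the three blocking conditions holds."""
    return (not has_unique_information
            and can_be_recreated
            and not legally_required)


# ABC Video's Rental entity: no unique data, recreatable from Open Rental
# rows, and not itself required for legal purposes.
rental_can_go = can_eliminate(False, True, False)

# A hypothetical entity that must be kept for legal reasons.
legal_record_can_go = can_eliminate(False, True, True)
```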

Customer

-f- Is

Request Requested

(D /~

Business Area Analysis Activities 355

An accurate and complete ERD is crucial to developing an application that solves a real-world problem. During development of the ERD, pay par- ticular attention to entity definitions, making sure they are distinct, simple, and precise. Analyze each entity for selectivity in processing, data, or timing to determine if a class structure is warranted. Also analyze each entity for its actual need. If an entity can be recreated from other information, has no unique attributes of its own, and is not required for legal purposes, omit it. Analyze every possible rela- tionship to determine relationship existence. When defining relationship cardinality and required/ optional status, make sure time and current proce- dures are ignored. Do pay attention to legal require- ments and business requirements in defining cardinality and required status.

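FIGURE 9-22 ABC Revised, Complete Entity-Relationship Diagram

[Diagram not reproduced. Labels recoverable from the scan: entities Customer, Open Rental, Copy, and Video; relationship labels Request / Is Requested, Refers to / Is Referenced by, and Is Described by / Describes.]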


Decompose Business Functions

Rules for Decomposing Business Functions

If a functional decomposition was not yet developed at the enterprise level, it is created now. If a decomposition was developed at the organization level, it is further decomposed here to define details of processes. In either case, decomposition is independent of the ERD; it can be done before, during, or after the ERD. IE recommends the ERD first, but while you gather data for the ERD, you invariably get process information. Many practitioners concentrate on data first, but begin to build the decomposition simultaneously. The steps to functional decomposition are:

1. Define the enterprise for which the diagram is being developed. Place the enterprise name in a rounded rectangle at the top of the diagram.

2. Define business functions of the enterprise. Using consistent parts of speech for each name, place the functions in rounded rectangles on the second row of the diagram. Do not pay attention to current organization, policy, or procedures in defining functions. Use current business practices in the industry to guide the definition and placement of functions (and activities and processes).

3. Define the activities that fully define each function. Name them using consistent parts of speech, usually of the form verb-noun. For each function, create a separate diagram with a row depicting the activities of the function.

4. For each activity, fully define the processes that describe work performed for each activity. Name each process using the form verb-noun. Add processes under their respective activities on the diagram in the sequence in which they are performed.

5. Continue to decompose the processes and add them to diagrams depicting successive levels of detail until the definitions are atomic.

6. Verify all diagrams with the user.

7. Define the detailed procedures for accomplishing each process and document functions, activities, processes, and procedures in the repository.

First, identify (or verify) the functions applicable to the BAA activity. An easy way to check functions is to review the list of generic functions on p. 332 and, for each, determine its applicability to the situation. Name each function so it relates to the business context of the BAA. For example, if the function deals with finance, but in the client context finance includes both Finance and Accounting, use the latter function name. Name each function with a noun, preferably a nonqualified noun. For example, Finance is preferred to Corporate Finance. If users have not participated in this activity, verify the list with a user. Place each function on the decomposition under the enterprise identifier in rounded rectangles (see Figure 9-23).

Next, for each function, define the major activities and place them under the function they describe. The diagram resembles an organization chart. When complete, the activities should fully describe each function. Do not pay attention to organizational boundaries or current organization policies and procedures. Do pay attention to legally required actions, actions that specifically relate to goals of the organization, and industry practices that are required. Do identify timing, cardinality, or current business practices for each activity.

Activity names do not have a specific form, but should be consistent in the part of speech used for all names. In the example above, the function Finance might include several activities, such as Corporate Finance, Regional (or Subsidiary-name) Finance, International Finance, Analysis and Reporting, Planning, Budgeting, and Funds Management. In this example, Funds Management might have been called Manage Funds, but the inconsistent part of speech makes this a weak name.

FIGURE 9-23 Placement of Functions, Activities, and Processes on a Functional Decomposition Diagram

Next, for each activity, decompose the activity to define the processes that fully describe the activity. Processes may have their own subprocesses. Continue decomposing until the elementary, or atomic level, of process is identified. Recall that an elementary process is the smallest unit of work users can identify. Name each process with a verb-object name. Within Funds Management, for example, processes might include: Manage Overnight Funds, Manage Cash, Manage Payroll Accounts, and Manage Savings Accounts. Each of these processes can be further decomposed to identify the details of the procedure used to perform this process. Continuing with this example, Manage Overnight Funds might include: Identify Funds, Identify Options, Analyze Options, Place Funds, Complete Accounting Entries. Each of these is a process, too, but these processes cannot be further decomposed without requiring interrogation of multiple processes to locate all of its component parts. Therefore, these processes are atomic, or elementary. That is, each can be performed as a unit, but cannot be further decomposed without losing its unit identity.
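The function, activity, and process levels described above can be pictured as a simple tree. The following is a minimal sketch in Python, not part of the IE methodology itself; the class name `Node`, its fields, and the method `leaf_processes` are assumptions made for illustration, and the data is the Funds Management example from the text.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One box on a functional decomposition diagram (illustrative names)."""
    name: str
    level: str  # "function", "activity", or "process"
    children: list[Node] = field(default_factory=list)

    def add(self, name: str, level: str = "process") -> Node:
        """Attach a child box under this one and return it."""
        child = Node(name, level)
        self.children.append(child)
        return child

    def leaf_processes(self) -> list[str]:
        """Return processes that have not been decomposed any further."""
        if not self.children:
            return [self.name] if self.level == "process" else []
        leaves: list[str] = []
        for child in self.children:
            leaves.extend(child.leaf_processes())
        return leaves


# The Finance example from the text, decomposed as far as the text goes.
finance = Node("Finance", "function")
funds = finance.add("Funds Management", "activity")
overnight = funds.add("Manage Overnight Funds")
for step in ["Identify Funds", "Identify Options", "Analyze Options",
             "Place Funds", "Complete Accounting Entries"]:
    overnight.add(step)
for name in ["Manage Cash", "Manage Payroll Accounts", "Manage Savings Accounts"]:
    funds.add(name)

print(finance.leaf_processes())
```

A leaf in this sketch is only a candidate elementary process. Manage Cash, for example, is a leaf here simply because the text does not decompose it further, so the list is a prompt for more questions to the user rather than a finished result.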

The difficulties of process decomposition lie in achieving parallel levels of abstraction and completeness. The goal is to maintain consistency within a level of process decomposition. The SE and user must work together during this definition because the levels of detail are beyond IS knowledge. Only job incumbents know exactly what they do and how they do it. The user is the main person defining the decomposition, but the SE is the person who actually abstracts the diagram and systems information from the user-supplied information. The user relates each process to all of the other processes, describing each in detail.

Some clues to consistency of abstraction are amount of work, user comfort, same type of inputs and outputs, and timing. If all processes appear to do similar amounts of work, they are probably at a similar level of abstraction. If the user feels comfortable that the information is similar, it probably is.

If the processes have similar types of inputs and outputs, that is, they have no error processing and no exception processing at the same level, then they are probably at a comparable level of abstraction. Similarly, if one process has error and exception processing, the others also should have error and exception processing at the same level.

For concurrent processes, each process must be performed completely independently of all other concurrent processes. If concurrent processes are independent, then the abstraction level is probably okay. If concurrent processes have dependencies, then determine the relationship between the processes. Either the dependent process is, in fact, a subprocess, or the processes are not concurrent.

During process identification and definition, mark the diagram for processes that are used in more than one place. This identifies both potential reusable processes for the design activity and possible job consolidation for organizational analysis. Make sure that the names assigned to reusable processes are exactly the same and actually perform the same work.

The larger the organization, the more likely you will need more than one level of process decomposition to describe fully the processes of each activity. Continue to decompose levels of subprocesses until you reach processes that can no longer be described as performing some whole action.

ABC Video Example Process Decomposition

To begin, we ask ourselves what are the functions of ABC that relate to this BAA. The functions of ABC are Purchasing, Rental Processing, Accounting and Personnel/Payroll as shown in Figure 9-6. This application is concerned only with Rental Processing, so we decompose only the Rental Processing function.

First, we define the activities of Rental Processing and place them on the diagram in rounded rectangles. Return to the case in Chapter 2 and outline the major activities. If you have difficulty finding activities, look at the entities and define the actions taken for each entity. Obvious activities relate to customer and video maintenance and actual rent/return processing. Can you identify any others? If not, add these to the diagram and decompose them. Activity identification is not a one-time activity; it is ongoing and other activities might become obvious as you work through the processes. Keep in mind that when you identify activities with a user, it is from their experience and not from written text, so it is somewhat more direct.

Both maintenance activities are decomposed into create, read, update, and delete processes (CRUD). Notice that the activity names are of the form verb-object. The resulting additions to the decomposition are shown in Figure 9-24.

Next, we must decide if rent and return are one activity or two. This is the same issue we dealt with in process design (in Chapter 8); here we will have slightly different results because the reasoning process is different. The questions here are: Can we define rental without reference to return? Can we also define return without reference to rental? And, does this completely define rent/return processing? The first two answers are yes, the third is no. Both rentals and returns must accommodate the other process as a subprocess for completeness. Therefore, rent and return processing must be combined as one activity.

An easy way to decompose these processes and be reasonably sure we are complete and correct is to decompose the four options separately. The options are rent without return, return without rent, rent with return, and return with rent. A table listing the four options and their subprocesses is shown as Table 9-4. Several issues can be identified for discussion. First, is Check for Late Fees the same level of abstraction as the other processes? Second, is Print Receipt the same type of process and does it belong on the table? Third, does this look complete? For instance, where are Create Customer and Create Video when the items are not found in a database? Last, can we consolidate these four lists to develop one list for the decomposition diagram?

FIGURE 9-24 Decomposition for Customer and Video Maintenance

First, Check for Late Fees appears to be at a lower level of detail than the other processes. To check this, walk through the process. To check for late fees, data from an open rental must be in memory.

If the Return Date is not equal to zero, subtract Rental Date from Return Date to get Number Of Days Rented.

If Number Of Days Rented is greater than the allowed amount (here we use two), multiply (Number Of Days Rented - 2) by $2.00 (the late charge) to get Late Fees.

If Late Fees are greater than zero, display Late Fees and add Late Fees to Total Amount Due.

This is all logic; there is no reading or writing to files. Thus, this is a simple process that borders on being too small to be called a process. This logic could be included in another process if and only if the other process has the same execution pattern for each pass of the logic. This means we next look at how often Check for Late Fees is executed. Check for Late Fees is in every list, but is it executed for every rental and return? The answer is that for all open rentals, this process would execute to check for fees owed whether there are current returns or not. Also, for all returns, after the return date is added, the process Check for Late Fees should be executed.
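The late-fee walk-through above translates almost line for line into code. The sketch below is one possible rendering, assuming dates are held as `datetime.date` values and that a missing return date (the text's "Return Date equal to zero") is represented by `None`; the function and constant names are illustrative, while the two-day allowance and the $2.00 charge come from the text.

```python
from datetime import date
from decimal import Decimal
from typing import Optional

ALLOWED_DAYS = 2                # days allowed before late fees start (from the text)
LATE_CHARGE = Decimal("2.00")   # late charge per extra day (from the text)


def check_late_fees(rental_date: date, return_date: Optional[date]) -> Decimal:
    """Return the late fees owed on one open rental already held in memory."""
    if return_date is None:     # assumption: None stands for a zero return date
        return Decimal("0.00")
    days_rented = (return_date - rental_date).days
    if days_rented > ALLOWED_DAYS:
        return (days_rented - ALLOWED_DAYS) * LATE_CHARGE
    return Decimal("0.00")


# Usage: display the fee and add it to the total only when it is above zero.
total_amount_due = Decimal("0.00")
late_fees = check_late_fees(date(2024, 3, 1), date(2024, 3, 6))
if late_fees > 0:
    print(f"Late fees: ${late_fees}")
    total_amount_due += late_fees
```

The function touches no files, which matches the observation that the process is pure logic and small enough to be folded into another process if the execution pattern allows.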

Next we review the logic to see if exactly the same procedure is followed in both cases. The answer to this issue depends on when late fees are considered. So far, we have talked about late fees for tapes with return dates only. You may be tempted to charge fees every day as they accrue, whether the tape is returned or not. If you do this, you need very complex logic to identify what fees are accrued, what fees are paid, and what fees are still owed. Complex logic is frequently wrong and is always error prone. If possible, use the KISS (Keep It Simple, Stupid) method and charge fees only when a return date is present. To continue this thinking, what rental attributes do we need to deal with late fees? Do we need a late fees field? A flag when late fees have been paid? The case does not tell us what Vic wants; so we need to talk to him about this.

TABLE 9-4 Decomposition of Rental Options

| Rental Without Return | Return Without Rental | Rental With Return | Return With Rental |
|---|---|---|---|
| Get Customer ID | Get Return Video IDs | Get Customer ID | Get Return Video IDs |
| Get Valid Customer | Get Open Rentals | Get Valid Customer | Get Open Rentals |
| Get Open Rentals | Get Valid Customer | Get Open Rentals | Get Valid Customer |
| Check for Late Fees | Add Return Date | Get Return Video IDs | Add Return Date |
| Get Valid Videos | Check Late Fees | Add Return Date | Check Late Fees |
| Process Payment and Make Change | Update Open Rentals | Check for Late Fees | Get Valid Videos |
| Create Open Rental | Update/Create History | Get Valid Videos | Process Payment and Make Change |
| Print Receipt | Process Payment and Make Change | Process Payment and Make Change | Create Open Rental |
| | Print Receipt | Create Open Rental | Update Open Rental |
| | | Update Open Rental | Update/Create History |
| | | Update/Create History | Print Receipt |
| | | Print Receipt | |

In this case, Vic and the accountant decide that, for accounting purposes, they want to know all charges applied to a rental. Information to be kept includes: regular fees, regular fees payment, late fees, late fee payment, any extraordinary fees, and extraordinary fee payment. Notice they do not care about payment dates. We have two choices for dealing with late fee data. First, we can compute fees and add them to the file when paid or second, keep two sets of fields, one for the fee and a flag for fee payment. The data and processing for the first option are simpler, but this now makes the processes creating and updating open rentals dependent on successful Process Payment and Make Change. This is not only an acceptable tradeoff, but a better business practice since we do not want to update with unsuccessful payment processing. We note the new attributes and add them to the repository.
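To make the tradeoff concrete, the sketch below shows the first option: charge attributes are written to the rental only after payment has succeeded. It is an illustration under stated assumptions, not the repository definition. The record name `RentalCharges`, its field names, and the `payment_ok` flag are all hypothetical; only the six kinds of information come from the discussion with Vic and the accountant.

```python
from dataclasses import dataclass
from decimal import Decimal

ZERO = Decimal("0.00")


@dataclass
class RentalCharges:
    """Charge attributes kept for one rental (field names are assumed)."""
    regular_fees: Decimal = ZERO
    regular_fees_paid: Decimal = ZERO
    late_fees: Decimal = ZERO
    late_fees_paid: Decimal = ZERO
    extraordinary_fees: Decimal = ZERO
    extraordinary_fees_paid: Decimal = ZERO


def record_late_fee(charges: RentalCharges, fee: Decimal, payment_ok: bool) -> bool:
    """Option one: add the late fee to the rental only when it has been paid."""
    if not payment_ok:
        return False            # payment failed, so the open rental is untouched
    charges.late_fees += fee
    charges.late_fees_paid += fee
    return True


charges = RentalCharges()
record_late_fee(charges, Decimal("6.00"), payment_ok=False)  # nothing recorded
record_late_fee(charges, Decimal("6.00"), payment_ok=True)   # fee and payment recorded
```

The `payment_ok` parameter is where the new dependency shows up: the update cannot run until Process Payment and Make Change has reported its result.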

The second issue deals with Print Receipt. Is Print Receipt the same type of process and does it belong on the table? The printed rental orders could be considered an output data flow of Process Payment and Make Change rather than requiring its own process. Since ABC defines printing of orders as a separate process required of the application, we could leave it on the list. Unfortunately, the methodology does not give guidance in the issue of whether to include or omit data printing processes. In general, if the printing is incidental to another process, that is, it is a record of the processing, then it is not separate. A print process should be distinct if it fulfills legal obligations, or is independent of all other processes, or is contingent on other processing. On the job, the SE, with the analysis team, decides which method of defining inputs and outputs will be used, then is consistent in their definitions. A related issue is the relationship of Print Receipt to the other processes. Does it follow payment processing, does it follow and confirm file creation and updating, or is it independent? At the moment, printing appears related to payment processing only, but here again is something we need to ask Vic. A similar problem arises with data entry procedures that we will discuss later.

Here is a sample dialogue between Mary and Vic to resolve the relationship issue.

Mary: "We are trying to decide about when to print receipts and how receipts relate to the rest of the process. Can you tell me the legal requirements and if you have any other business requirements?"

Vic: "Hm, now, we write down all the customer numbers, video numbers, amount paid, and reasons for each transaction. We don't really give customers a receipt in the manual system. I'm not sure what the legal requirements are; I'll get the accountant in here, too."

The accountant comes in and is asked the same question. She says, "It would be nice to have a paper copy of each transaction in which money is processed so I can locate errors when I do the bookkeeping. Trying to find an error by querying the computer might be longer than just adding up the days' receipts in different categories. It would also provide IRS documentation if you don't plan to do that on the computer. Do you?"

This leads to a discussion of the tax processing possible and the potential costs to the project, which are negligible at this stage. The final decision is to require receipts not only for transactions in which money is processed, but to offer a receipt as an option to the customer for nonmoney transactions.

The discussion then digresses into the issue of how long records must be kept on the rental file. If all money-related transactions are printed, records of paid transactions could be deleted. Vic wants access to transaction data for historical analysis but thinks the history files will answer most of his questions. In his manual system, Vic purges the files once a year at tax time, but he says there is too much paper to look at any paper records unless a customer actually disputes a charge. In any case, Vic, the accountant, and Mary jointly decide to purge the transaction files monthly and move deleted records to an off-line archive file. This discussion causes a new activity to be added to the decomposition under Periodic Processing.

Next, Mary broaches the subject of keeping track of file updates and printing the receipt only when the file updates (or creates) are successful. Vic has two concerns. First, he needs the ability to fix a file problem if one occurs, and he wants that ability to be independent of the rental process. Second, he is leery about using the receipt as notification of a problem. "If users think there is a problem with the computer system, they might not trust the information we give them about late fees and other charges." Vic decides that printing is independent of file updates and that an operator message should be displayed for errors in writing to files.

The third issue is to evaluate the completeness of the processing defined. In particular, where are Create Customer and Create Video when the 'valid' items are not found in a database? From a simple evaluation of process names, the processing appears complete. To resolve the issue about the two create processes, we look specifically at those processes. Again, there are two options for dealing with the need to create customers and videos: It can be a separate process or it can be a subprocess to the associated Get Valid ... process. The question to answer is: How important, in the rent/return activity, are create customer and video? The answer is that they are not very important. They are performed on an exception basis to allow processing continuity. Both processes are important to the related file maintenance activity. A related issue is the name given to the processes: Get Valid Customer and Get Valid Video. The implication from these names is that both valid and invalid conditions are dealt with within the procedure; only valid customers and videos will be passed for further processing. A missing condition would lead to the initiation of the create procedure. The resolution, then, is to leave the process definitions as they are and to treat the Creates as subprocesses under the associated Get Valid process. Figure 9-25 shows the details of the two get valid processes for the next level of decomposition.

FIGURE 9-25 Partial Decomposition with Details of Get Valid Processing
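The resolution reached here, Create as an exception-path subprocess of Get Valid, can be sketched as follows. This is a hedged illustration only: the in-memory dictionary standing in for the customer file, the `credit_ok` attribute, and the `create_customer` callback are assumptions, not names from the ABC case.

```python
from typing import Callable, Optional

Customer = dict  # stand-in record type for the sketch


def get_valid_customer(
    customer_id: str,
    customers: dict[str, Customer],
    create_customer: Callable[[str], Optional[Customer]],
) -> Optional[Customer]:
    """Return a valid customer record, creating one when none is found."""
    customer = customers.get(customer_id)
    if customer is None:
        # Exception path: Create Customer runs as a subprocess of Get Valid.
        customer = create_customer(customer_id)
        if customer is None:
            return None                  # creation abandoned; nothing is passed on
        customers[customer_id] = customer
    if not customer.get("credit_ok", True):
        return None                      # invalid customers are not passed on
    return customer


store = {"C1": {"name": "Ann", "credit_ok": True}}
existing = get_valid_customer("C1", store, lambda cid: None)
created = get_valid_customer("C2", store, lambda cid: {"name": "New", "credit_ok": True})
```

Callers receive either a valid record or nothing, which is the behavior the process name implies; the same shape would apply to Get Valid Video.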

The final issue is to consolidate these four lists to develop one list, completing the decomposition diagram. The consolidated list is shown, with sequence implied but without selection, in Figure 9-26, the final decomposition diagram. The fourth activity, Periodic Processing, has been added. At the moment, this activity includes archival, end-of-day, and query processing. Other processes may be added as we continue through analysis and design. We will use the separate lists of processes again in the next activity, developing the process dependency diagram.

To summarize, process decomposition can be performed independently of ERD development. This step concentrates on activities, processes, and subprocesses of all functions in the BAA. First, all activities are defined, then the processes for each activity are identified and defined. Both activities and processes are defined without regard for current organization, timing of processing, or current procedure. Emphasis is on processes and procedures that are required to fulfill business obligations. The final decomposition should be validated through user review.

FIGURE 9-26 Completed Decomposition Diagram

[Diagram not reproduced. Process labels recoverable from the scan: Get Valid Customer, Get Valid Video, Get Return ID, Add Return Date, Get Open Rentals, Check Late Fees, Update Open Rental, Update/Create History, Print Receipt.]

Develop Process Dependency Diagram

Rules for Developing Process Dependency Diagram

Process dependency relates processes and shows cyclical, logical, and data connections between processes. For each activity and level of processes decomposed, we examine the processes and sequence them by order of occurrence: what happens first, then second, and so on. A diagram using rounded rectangles for each process and arrows to connect them shows the sequencing of the processes. Processes that are independent of other processes are placed on the diagram but not connected to anything. One diagram is created for each activity. The steps for creating the process dependency diagram (PDD) are as follows:

1. For each activity, draw the processes on a sheet of paper.

2. Examine each process to determine how it is initiated. For processes that pass data to begin work, connect the process to its data receivers. These connections depict the sequence of processing.

3. For all connected processes, examine each to determine the cardinality of execution. Define iterative processing and document it on the diagram. Be careful to uncouple to the maximum extent possible based on business requirements.

4. For all connected processes, examine each to determine selection in processing. For mutually exclusive processes, alter the diagram to depict exclusivity. For all selected processes, add the selection conditions under which processing takes place.

5. For all connected processes, examine each to determine Boolean connections. Alter the diagram to include required Boolean logic.

6. Review all connections with the users to verify correctness.

FIGURE 9-27 Types of Process Dependency Connections

[Diagram not reproduced. Panels recoverable from the scan: Sequential Connectors (Singular Connection, Multiple Input Connections, Multiple Output Connections); Iterative Connections (a loop labeled "Until No More Open Rentals" leading to Get Return ID); Martin's Iterative Connections (Get Open Rentals to Get Return ID).]

The types of connections between processes in a process dependency diagram differ from those of data flow diagrams discussed in Chapter 7. In process dependency, four types of connections are allowed: sequence, iteration, selection, and Boolean (see Figure 9-27). All connections identify the data passing between processes by writing its name, when known, above the line.

FIGURE 9-27 Types of Process Dependency Connections (Continued)

[Diagram not reproduced. Panels recoverable from the scan: Simple Selection Connections (If A / If not A); Mutually Exclusive Selection Connections (If A, else if B, else if C, else); Boolean Connections (Or process, And processes).]

Sequential connections may be singular or multiple, with many processes feeding another process, possibly feeding the same data (as in reusable processes) (see Figure 9-27). Multiple entries into (or exits from) a single process do not imply any relationship between the multiple processes. That is, no control structure is required to ensure correct order of execution of the processes. In fact, multiple processes could be concurrent, if needed.

Iterative connections between processes are shown with feedback loops, with an indication of how many iterations are performed. A popular alternative is Martin's notation of iteration which uses cardinality indicators, i.e., crows' feet. This notation implies a coupling between processes that may not exist, so the decoupled, more standard iteration loop is used in this text. Both Martin's notation and the decoupled notation are in Figure 9-27.

Selection, or conditional, connections show the alternative choices connected by a solid circle to differentiate them from Boolean connections. The if-then-else logic conditions are written on each line.

Boolean connections identify 'and' or 'or' types of connections. Boolean connectors use connected lines with an open circle at the junction for optional (or) processes. Simple connected lines join (or split) to show multiple ('anded') entry to (or exit from) processes. That is, any processes not identified as optional or selected are assumed to be executed following the preceding process.

A comment about multiple required process connections is required. Two options for multiple processes, one using multiple directed lines, and one using multiple lines joining or splitting into one line, are discussed. These notations have specifically different connotations. The first, multiple directed lines, shows that any or all of the multiple processes can be executed and that control over that execution is imbedded in the processes. The second, multiple lines joining or splitting into one line, specifically identifies 'anded' processes and may require logic to ensure that all are executed.

In addition to showing the logical connections between processes, the lines connecting processes identify process data triggers, that is, data flows from a process to its dependent processes. The last step to the dependency diagram is to identify, as much as possible at this stage, the data that triggers the dependent processes. Attribute names, relation names, or other identifier names are written on the connective lines.
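One way to keep track of connection types and data triggers while drawing is to record each line of the diagram as data. The sketch below is an assumption-laden illustration, not a notation defined by the methodology: the class names `Kind`, `Connection`, and `DependencyDiagram` and the helper `unknown_triggers` are invented here, while the four connection types and the idea of naming the triggering data on each line come from the text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Kind(Enum):
    SEQUENCE = "sequence"
    ITERATION = "iteration"
    SELECTION = "selection"
    BOOLEAN = "boolean"


@dataclass(frozen=True)
class Connection:
    source: str
    target: str
    kind: Kind = Kind.SEQUENCE
    trigger: Optional[str] = None     # data written above the line, when known
    condition: Optional[str] = None   # loop limit or if/else condition


@dataclass
class DependencyDiagram:
    activity: str
    processes: set[str] = field(default_factory=set)
    connections: list[Connection] = field(default_factory=list)

    def connect(self, source, target, kind=Kind.SEQUENCE, trigger=None, condition=None):
        """Add a line to the diagram, registering both processes."""
        self.processes.update({source, target})
        self.connections.append(Connection(source, target, kind, trigger, condition))

    def unknown_triggers(self) -> list[Connection]:
        """Non-iterative connections whose triggering data is not yet known."""
        return [c for c in self.connections
                if c.trigger is None and c.kind is not Kind.ITERATION]


pdd = DependencyDiagram("Rent/Return Processing")
pdd.connect("Get Customer ID", "Get Valid Customer", trigger="Customer ID")
pdd.connect("Get Open Rentals", "Get Open Rentals", Kind.ITERATION,
            condition="until no more open rentals")
pdd.connect("Process Payment and Make Change", "Print Receipt")  # trigger not yet known
```

Listing `pdd.unknown_triggers()` gives the analyst a ready set of questions for the next user interview.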

ABC Video Example Process Dependency Diagram

The dependency diagrams for ABC vary in their complexity. The maintenance diagrams are simple because all processes are independent (Figure 9-28). Similarly, the periodic processes are also unrelated (see Figure 9-29). The processes of rental and return are complex and are discussed in detail.

The discussion in the preceding sections identified a dependency of processes in rental/return processing that we need to carry forward: Print Receipt is dependent on Process Payment and Make Change.

There are other dependencies as well. To show the logic behind the final diagram, we show the process dependencies for each process alternative as we did above. The four choices, again, are rentals with and without returns, and returns with and without rentals.

FIGURE 9-28 Maintenance Process Dependency Diagram

[Diagram not reproduced. It shows four unconnected processes: Create, Delete, Update, and Retrieve.]

The first diagram lists the processes for rentals without returns and, informally, draws connections between them. We draw the informal diagram because changes are expected and changing an informal diagram is easier than changing a formal one.

When you start considering the processes, two apparent features are: first, they are not all sequential and second, they are not all done only once. Repetitive processes are Get Open Rentals (with Check for Late Fees in an iterative loop), and Get Valid Videos. Both of these are performed until there are no more of the items being got. Draw a circular line from the process to itself to show iteration. Is there any data passed from one iteration to the next? No specific data except an indicator to keep going and, maybe, a memory address at which to store the next data, but this is an implementation detail. We do not identify it now.

FIGURE 9-29 Periodic Process Dependency Diagram

[Diagram not reproduced. It shows three unconnected processes: Query, End of Day, and End of Month.]

FIGURE 9-30 Get Customer ID and Get Valid Customer Process Dependency Diagram

[Diagram not reproduced. It shows Get Customer ID, Get Valid Customer, and Get Open Rentals, with a loop labeled "Until No More Open Rentals".]

Next, walk through the processing to identify dependencies, drawing the appropriate connections as you proceed. Get Customer ID provides the external display and entry processing and passes a Customer ID to Get Valid Customer (see Figure 9-30). We cannot go directly to any other process because we only want to process valid customers. Consequently, the only dependency is Get Valid Customer. Get Valid Customer retrieves a customer record, creating a new one if required, checks credit status, and passes a valid customer record to the next process: Get Open Rentals (see Figure 9-30). We already said Get Open Rentals iterates until there are no more open rentals, and it proceeds to Get Valid Videos after it is complete (see Figure 9-31). Could we go directly from Get Valid Customer to Get Valid Videos? That is, could Get Valid Videos and Get Open Rentals be concurrent? We need to jump into implementation details again to decide, since there is no business reason why these processes cannot be concurrent. What do we do in these processes? The open rentals procedure reads a file, checks late fees, composes and displays a line, and adds to a total field, as required. The video process gets Video IDs, reads the Video and Copy relations, composes and displays lines, and adds to a total field. Both processes need access to the screen. So, we need a protocol, or set of rules, to govern where and when a process can display information. They cannot both try to use the same line. The easy method is to force one process to be first and to create an artificial dependency. This is what we will do: Get Open Rentals will be first, because we can also have the clerk verify late fees with the customer. So Get Valid Videos will be dependent for screen location information (and memory location information, too) on Get Open Rentals.

To continue, Get Valid Videos iterates until there are no more videos to be rented, then proceeds to Process Payment and Make Change (see Figure 9-32). We decided above that both Create Open Rental and Print Receipt are dependent on payment processing but independent of each other, so we draw two lines from Process Payment to each of these processes (see Figure 9-33). Putting all of these dependencies together, we arrive at Figure 9-34.
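The sequencing just described for Rentals Without Returns can be checked mechanically by treating each dependency as an edge in a graph. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); the dictionary layout and the helper `depends` are assumptions for illustration, and iteration loops are deliberately left out because a topological sort handles sequencing only.

```python
from graphlib import TopologicalSorter

# Each process maps to the set of processes it depends on (Rentals Without Returns).
depends_on = {
    "Get Customer ID": set(),
    "Get Valid Customer": {"Get Customer ID"},
    "Get Open Rentals": {"Get Valid Customer"},
    "Check for Late Fees": {"Get Open Rentals"},      # runs inside the open-rental loop
    "Get Valid Videos": {"Get Open Rentals"},         # artificial dependency (screen use)
    "Process Payment and Make Change": {"Get Valid Videos"},
    "Create Open Rental": {"Process Payment and Make Change"},
    "Print Receipt": {"Process Payment and Make Change"},
}

# One valid order of execution; independent processes may appear in either order.
order = list(TopologicalSorter(depends_on).static_order())


def depends(process: str, other: str) -> bool:
    """True if `process` depends, directly or transitively, on `other`."""
    stack, seen = list(depends_on[process]), set()
    while stack:
        current = stack.pop()
        if current == other:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(depends_on[current])
    return False


print(order)
print(depends("Print Receipt", "Create Open Rental"))  # independent of each other
```

Because Create Open Rental and Print Receipt do not depend on each other, either may come first in a valid ordering, which is what the two separate lines from payment processing express.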

It is important to practice walking through the diagrams one step at a time to identify dependencies, considering each process alone and as possibly connected to all other processes. Ask if all of these processes get done once, or are some iterated? Can groups of processes that iterate together be identified?

FIGURE 9-31 Get Open Rental and Check for Late Fees Dependent Processes

[Diagram not reproduced. It shows Get Open Rentals and Check Late Fees, with loops labeled "Until No More Open Rentals".]

FIGURE 9-32 Get Valid Videos Dependent Processes

[Diagram not reproduced. Loop labels recoverable from the scan: "Until No More Open Rentals" and "Until No More Valid Videos".]

Draw each connection as you consider it so it is translated properly. Finally, identify the data triggers when they are known so that you know what you don't know when your diagram is 'complete,' meaning it contains all known information.

Figure 9-34 shows the dependency diagram for Rentals Without Returns. Try to develop the process dependency diagram for each of the options on your own, without looking at the answers. The other process option dependency diagrams are shown as Figures 9-35 through 9-37. Keep in mind that iteration and sequencing required for each alternative way of processing are important because we eventually must consolidate into one diagram, that is, a composite of the individual diagrams. We only discuss the difficult connections and connections that change the way we think about the rent/return process (i.e., may alter our mental model).

FIGURE 9-33 Process Payment and Make Change Dependent Processes

FIGURE 9-34 Rentals without Returns Dependency Diagrams

In Figure 9-35, we have several differences from the first dependency diagram (Figure 9-34). First, Get Valid Customer is only done for the first return, so we need a selection connector. If the Video ID (or Open Rental) is not the first, we proceed to Add Return Date and Check for Late Fees. We have two choices on the iteration grouping shown (see Figure 9-38). The first strategy shows coupled logic with the loop encompassing Get Request, Get Open Rentals, Get Valid Customer, Add Return Date, and Check for Late Fees. The second strategy shows several loops that may look more awkward, but reflect only the coupling the processes require. This logic is uncoupled and shows three iterative loops, all iterating for all returns. The first loop is Get Open Rentals. The second loop is Add Return Date; the third is Check for Late Fees. Both of these alternatives would be acceptable in program specifications, and selection of one over the other might be based on the common iteration pattern. At the logical requirement level, however, the preferred method is the more loosely coupled one because the iteration cycles may change when we consolidate the diagrams. If the diagrams were already consolidated, we could go to the program design level of detail to choose the iteration grouping. The more uncoupled dependency is shown in the completed diagram, Figure 9-35. The decisions about preferred looping are deferred until design.
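The two iteration groupings can be compared in a small sketch. This is only an illustration under stated assumptions: the dictionary fields, the `due_date` attribute, and the flat daily late fee are hypothetical and do not come from the ABC Video case. The point is that both groupings produce the same business result, which is why the choice can wait until design.

```python
from datetime import date


def check_late_fee(rental, today, daily_fee=1.0):
    """Hypothetical late-fee rule: a flat fee per day past an assumed due date."""
    end = rental["return_date"] or today
    days_late = max(0, (end - rental["due_date"]).days)
    return days_late * daily_fee


def process_returns_coupled(open_rentals, returned_ids, today):
    """Tightly coupled: one loop adds return dates and checks late fees."""
    fees = {}
    for rental in open_rentals:
        if rental["video_id"] in returned_ids:
            rental["return_date"] = today
        fees[rental["video_id"]] = check_late_fee(rental, today)
    return fees


def process_returns_uncoupled(open_rentals, returned_ids, today):
    """Loosely coupled: each process iterates on its own."""
    # Loop 1: Get Open Rentals
    rentals = list(open_rentals)
    # Loop 2: Add Return Date
    for rental in rentals:
        if rental["video_id"] in returned_ids:
            rental["return_date"] = today
    # Loop 3: Check for Late Fees
    return {r["video_id"]: check_late_fee(r, today) for r in rentals}
```

Because the loosely coupled version keeps each process in its own loop, a later change to one iteration cycle (for example, after the diagrams are consolidated) does not force a change to the others.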

The second difference in Figure 9-35 is that a receipt has selection criteria applied to its creation. A receipt must be printed whenever a payment is made, and may be printed upon request of a customer with returns but no payments. This selection is shown on the diagram.

Figure 9-36 shows rentals with returns. In this procedure, we have two iterative cycles: one for return processing and one for new rental-item processing. These, in effect, consolidate the previous two diagrams. Notice here that the initial input is from Get Customer ID so returns do not include the selective execution of Get Valid Customer. Also notice that we again have coupling options for return processing (shown in Figure 9-38). In the selected option we have three iteration cycles. Get Open Rentals is performed for all open rentals. Get Return IDs and Add Return Date are performed together for all returns, which may be a subset of open rentals. Check for Late Fees is performed for all open rentals whether or not returned today. The final difference is for history processing, which is selected for open rentals with Return Date equal to today's date.

FIGURE 9-35 Returns Without Rentals Dependent Processes

The last procedure is for returns with rentals (see Figure 9-37), which is similar to Figure 9-36 except for the initial entry of information. If a return is first, Get Return ID is the first process and we need the selective execution of Get Valid Customer.

Now, we are ready to consolidate the diagrams into one (see Figure 9-39). The obvious complexity is in dealing with all of the return options, so they will be done first. If we look at the first step of each procedure, the differences are that for returns, Video IDs are entered first and for rentals, Customer IDs are entered first. If these are separate processes, we have a problem knowing which is, in fact, being executed. If we consolidate these processes, we can use program logic to figure out which numbers are for videos and which for customers. This means that Get Return ID and Get Customer ID are replaced with one process we will call Get Request. This process will select either Get Valid Customer or Get Open Rentals depending on the data entered. This change is reflected back to the decomposition diagram also.

Next, for returns, we need selective execution of Get Valid Customer. To consolidate, we need to know whether we are processing a rental or a return. Two changes are required. First, Get Request must call either Get Valid Customer or Get Open Rentals, depending on the type of entry. This is indicated by the selection logic in Figure 9-39. Second, Get Request has to pass some indicator to Get Open Rentals that it is the caller; this means data is triggering the process.
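The selection logic in Get Request might be sketched as follows. The rule used to tell the two kinds of entry apart is an assumption made only for this illustration (a Customer ID is taken to be a ten-digit phone number, and anything else is treated as a Video ID); the function name `get_request` and its `first_entry` parameter are likewise hypothetical. The returned request type stands in for the indicator that Get Request passes along.

```python
def get_request(entry, first_entry=True):
    """Classify an entered ID and select the processes to run next.

    Assumption for illustration only: a Customer ID is a ten-digit phone
    number; any other entry is treated as a Video ID (a return).
    """
    digits = entry.replace("-", "").replace(" ", "")
    if digits.isdigit() and len(digits) == 10:
        # Rental: validate the customer, then display open rentals.
        return "rental", ["Get Valid Customer", "Get Open Rentals"]
    # Return: find the open rental first; the customer is validated
    # only for the first return entered.
    steps = ["Get Open Rentals"]
    if first_entry:
        steps.append("Get Valid Customer")
    return "return", steps
```

For example, `get_request("404-555-0100")` selects the rental path, while `get_request("V00123")` selects the return path and includes Get Valid Customer because it is the first entry.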

FIGURE 9-36 Rentals with Returns Dependency Diagrams

The last return issue is what to do with Check for Late Fees. There are three options. First, include it in both Get Open Rentals and Add Return Date to ensure complete processing of late fees for old and new returns. Second, leave it separate and execute it for all open rentals, including those returned today, as a separately iterated process. The first option guarantees double processing for all returns when rentals are also done. The second option requires somewhat complex logic for memory loop processing. Both options are acceptable technically and from a business perspective. The last option is to defer a decision. Since we have no business basis for a decision, we leave the process on the diagram and defer any decision about grouping until design. The final dependency diagram is shown in Figure 9-39 and reflects all of the decisions discussed above.

The dependency diagram for periodic activities is in Figure 9-40. The diagram is somewhat strange because there is no necessary connection between any of the processes. It reflects the independence of the processes, and is the basis for the PDFD which completes the dependency diagram. This type of independence also identifies possible concurrent processes and is considered a normal diagram. Notice that even though these processes would be connected on a menu for processing, no menu selection options are shown at this logical level. The reason is that the business requires no menu.

FIGURE 9-37 Returns With Rentals Dependent Processes

To summarize, to develop the dependency diagram, list the processes for an activity, in sequence. Then, examine each process to determine its relation to all other processes. If complex processing is involved, as we have here, separate out the options and develop dependency diagrams for each option. Be careful to couple processes based on business requirements rather than on convenience. Convenience is decided in design. Consolidate any options and only change processes if required to support integrated process interactions.

Notice also that in going back to the client for information during this procedure, we obtained information we would otherwise not have about the need for periodic purging of the file and requirements for IRS documentation.

Develop Process Data Flow Diagram

Rules for Developing Process Data Flow Diagram

FIGURE 9-38 Alternative Coupling Strategies in Return Processing (Tightly Coupled Alternative; Loosely Coupled Alternative, Preferred)

This is a three-step process:

1. For each process dependency diagram, examine every process to determine if external events provide information used in the execution of the work. For each external event, add an event trigger and identify the event (or the data it provides).

2. For entities from the ERD, examine their use by processes in each diagram. For known connections, add one file for each entity to the diagram and connect them to processes with arrows depicting the direction of data flow. For all files, when a relation is not the unit of data retrieved, list the attributes that make up the data flow.

3. Review the triggers and files with the user to verify correctness.

Using the process dependency diagram, first add the information about triggers, that is, the data or events that trigger each process. If arrival of information from another process is the trigger, identify the data on the lines connecting the rectangles. Use large arrow outlines for event triggers. Use single-directed lines for data triggers (see Figure 9-41).

FIGURE 9-39 Consolidated Process Dependency Diagram

FIGURE 9-40 Periodic Activities Dependency Diagram (Query, EOD Process, EOM Process)

Each process must be triggered, or initiated, by either an event or arrival of data. If you have a process without either data or event as input, then you have missed information during data gathering and should return to the user to obtain the information.

Identifiers for both data and events should link directly to some entity. The trigger may be the arrival of some entity or may be some partial data from an entity. If the identified data does not map directly to an entity from the ERD, you are also missing information and should return to the user to obtain the information.
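The two checks just described (every process has a trigger, and every trigger maps to an ERD entity) can be expressed as a small validation sketch. The `Process` class, the function `find_trigger_gaps`, and the trigger-to-entity mapping are hypothetical names introduced for illustration; a CASE tool would hold this information in its own repository format.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    name: str
    event_triggers: list = field(default_factory=list)  # external events
    data_triggers: list = field(default_factory=list)   # data from other processes


def find_trigger_gaps(processes, erd_entities, trigger_to_entity):
    """Return (processes with no trigger, triggers that map to no ERD entity)."""
    untriggered = [
        p.name for p in processes
        if not p.event_triggers and not p.data_triggers
    ]
    unmapped = sorted({
        t
        for p in processes
        for t in p.event_triggers + p.data_triggers
        if trigger_to_entity.get(t) not in erd_entities
    })
    return untriggered, unmapped
```

Either list being non-empty is a signal to return to the user: the first names processes whose initiation is unknown, and the second names trigger data that the ERD does not yet account for.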

FIGURE 9-41 Trigger Identification on Process Data Flow Diagram (Event Trigger; Data Trigger)

Next, data files are identified, if they are known. Not all files are necessarily identified at this point of the analysis. However, most information that is required in persistent files will have been identified as entities on the ERD. The files are connected to processes with the appropriate arrow signifying the direction of data flow. If a unit of data other than a logical relation is required, the lines connecting files should be labeled with their contents.

PDFD validation is performed last to guarantee that all functional requirements are satisfied by the processes depicted. The validation walk-through uses the original text or functional specification, the decomposition from the specification, and any additional user requirements or information obtained throughout the analysis activities.

ABC Video Example Process Data Flow Diagram

To complete the ABC PDFD, begin with the final process dependency diagram. We examine each process sequentially, adding events and data files as needed to complete the logical processing. For each process, ask: How does this process know to execute? What information does it need? Where does the information come from? Ask these questions without paying attention to current connections from other processes. For each process, when you have the answers, look at the current connections and decide if they completely define the required list of information. If not, define the external 'triggers' (either data or events) that initiate the processing. The individual chunks of the diagram on which we are working are shown with the discussion. They are integrated into one diagram at the end of the discussion.

The first process, Get Request, requires input of either a phone number or a video ID to begin execution. The information is provided by the customer and entered into the computer (either by scanning or typing) by the clerk. Since the information is externally generated by a rental or return request, the data being entered is an event. That is, arrival of a Customer ID or Video ID into the computer triggers the Get Request process which begins the sequence of processes for rental/return processing. The hollow arrow is added to the diagram to show the arrival of the Request event (see Figure 9-42).

After the request is entered, the process determines which data were entered and passes control to the appropriate process. If the Customer ID was entered, the Get Valid Customer process would be triggered. For that process, customer information from storage is required for validation and credit checking. A file symbol for a customer information file with a line indicating data into the process is added to the diagram (see Figure 9-42). Since there is a possibility that the customer is new, an arrow going to the Customer File is also shown.

Next in rental processing, the Open Rentals File should be read to retrieve all information about Open Rentals relating to the present customer. These are formatted and displayed. The file is added as input to the Get Open Rentals process (see Figure 9-42).

If the request entered had been a Video ID, the Get Open Rentals process followed by the Get Valid Customer process would have been triggered. The control of these processes is shown by the selection arrows from the dependency diagram. The data and trigger requirements do not change. We might make a note that, in this execution of Get Valid Customer, we do not allow new customers to be added. That is, it should be logically impossible for a new customer to have a return.

FIGURE 9-42 Process Request 'Chunk' of PDFD

Return processing takes place next with two possible variations. First, if the first Request was a Video ID, there already is a return Video ID in memory and no Get Return ID is triggered. Instead, Add Return Date/Late Fees is triggered. The second variation is in the rental process; returns are entered after open rentals are displayed. The Get Return ID process is triggered but it now needs the Video ID, external information, to process. The event trigger added to the process, then, contains Video IDs (see Figure 9-42). The data is made available to the Add Return Date process. Get Return ID and Add Return Date iterate until the Return ID is ended (exactly how is decided during design). Then all Open Rentals (and returns) are Checked for Late Fees.

Next, the Get Valid Video process executes to identify videos requested for rental. The information needed for this process comes from an external trigger, the customer-supplied Video ID. The ID is validated by reading the Video File and a Copy File. For rental/return processing, the Video and Copy files are always used together, so they are shown in one file symbol. By doing this, we are reminded that we need a user view that connects the two relations for rental processing. Customers are to be reminded when they have already rented a particular video; therefore, the Customer History File is also read during this process. Its file symbol is added to the diagram.

FIGURE 9-43 Rental and Payment 'Chunk' of PDFD

The Total Amount Due is passed to trigger Process Payment and Make Change. The Total Amount Due is displayed from the previous process and awaits the external entry of Customer Payment Amount to compute change. This requires an event trigger for Customer Payment Amount (see Figure 9-43). The formula used is Total Amount Due - Customer Payment Amount = Balance. When the Balance is zero, the rental/return process is complete and all files may be updated as required. Each line of the rental/return, signifying either an existing Open Rental with/without Return Date or a new Open Rental, is processed separately to determine the next process to execute. This represents the normal process; now we must also think about exceptions: What if the balance does not go to zero? Can a customer ever overpay and leave so fast that they are owed change? Can a customer ever owe money and leave without paying? If the answer to either of these questions is yes, we also need an optional event trigger End of Payment that forces completion of the Process Payment process and shows that the customer is owed, or owes, money. For the present, we assume that we iterate through Process Payment until Balance equals zero. Finally, Process Payment needs to provide information for End of Day totals. A file for EOD data will be created from this process. Notice that this file is not on the ERD, but should it be? It does not represent an entity that the company keeps information about, or does it? When the ERD was developed, we focused on the rent/return processes only and ignored nonrental activities. By ignoring accounting and its needs for rent/return data, we missed the entity for financial information relating to rent/return. We should recheck with the accountant, but it appears that the EOD Accounting Information should be an entity that is added to the ERD, connected to the Open Rental entity.
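The payment iteration can be sketched as a loop over Customer Payment Amounts that applies the formula Total Amount Due - Customer Payment Amount = Balance until the Balance reaches zero. The function name `process_payment`, the use of `Decimal`, and the treatment of an overpayment as change are assumptions for this illustration only. If the payments run out while a Balance remains, the sketch simply returns that Balance, which corresponds to the exception case that an End of Payment trigger would have to handle.

```python
from decimal import Decimal


def process_payment(total_amount_due, payment_amounts):
    """Apply payments until Balance is zero; return (balance, change)."""
    balance = Decimal(total_amount_due)
    change = Decimal("0")
    for amount in payment_amounts:
        # Balance = Total Amount Due - Customer Payment Amount
        balance -= Decimal(amount)
        if balance <= 0:
            change = -balance        # any overpayment is returned as change
            balance = Decimal("0")
            break                    # normal completion: Balance equals zero
    return balance, change
```

For example, two payments of 5.00 against a Total Amount Due of 7.50 leave a Balance of zero and 2.50 in change, while a single payment of 5.00 leaves a Balance of 2.50 still owed.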

FIGURE 9-44 File and Update 'Chunk' of PDFD

The last 'chunk' of the PDFD is for file update and print processes. For each new rental, Create Open Rental writes the new rental to the Open Rental File (see Figure 9-44). For each unpaid return (i.e., existing Open Rental with Return Date not null), the Open Rental is rewritten to the file (see Figure 9-44). For each paid returned video, some logical delete indicator must be set and the Open Rental is then rewritten to the file. If an existing Open Rental had no processing, no action is required, but it might be easier, and more consistent, to only look at return and paid criteria to determine correct writing. No business criteria exist for this option; therefore, this is a design decision not made here.
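The file update rules for the Open Rental File can be summarized as a small decision function. This is a sketch under assumptions: the function name, its parameters, and the action labels are hypothetical, and an existing rental with no Return Date is treated as having had no processing, which is one of the choices the text leaves to design.

```python
def open_rental_action(is_new, return_date, paid):
    """Choose the Open Rental File action for one rental/return line."""
    if is_new:
        return "create"                    # Create Open Rental writes the new rental
    if return_date is None:
        return "none"                      # existing rental, no processing (assumed)
    if paid:
        return "mark deleted and rewrite"  # paid return: set logical delete indicator
    return "rewrite"                       # unpaid return: rewrite with Return Date
```

Looking only at the return and paid criteria, as the text suggests, would replace the `"none"` branch with an unconditional rewrite; the sketch keeps the branch to show where that design decision sits.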

FIGURE 9-45 Consolidated Process Dependency Diagram

Next, history file processing updates both Video History and Customer History. Notice that the details of history processing require another level of decomposition, which is left as a student exercise. Last, the receipt is printed. The data trigger from Process Payment initiates this process; the physical output is not on the PDFD.

The composite PDFD with all of the chunks integrated is shown as Figure 9-45. Before you look at the other diagrams, try to develop one or all of them yourself. The remaining PDFDs are shown as Figure 9-46, for query processing, and Figure 9-47, for customer maintenance processing. The PDFDs for Video Maintenance processing are a practice exercise at the end of the chapter.

Next we evaluate the PDFD for completeness based on the decomposition information from the client and the original statement of the problem (Chapter 2).

Errors to watch for are:

1. Processes on the decomposition that are not self-contained are not processes. For instance, 'end process' is a system action, not a business process. You do not need a process to which all other processes feed to show a termination point on a PDFD (see Figure 9-48).

2. Processes on the PDFD that are not identical to the processes on the decomposition (see Figure 9-49).

3. If process data trigger contents cannot be identified, there may be no dependency. Reevaluate the relationship, talking to the user if necessary, to determine what data are required for the dependent process.

4. If a process data trigger occurs at a different time, the connection is probably wrong. For instance, you might be tempted to connect query processing to rental/return in some way (see Figure 9-50). These are disjoint activities and the only connection is through the database.

5. For query processing, do not try to simulate a menu selection process in the PDFD (see Figure 9-51). Each type of query has its own event trigger requesting information. Each type of query is distinct and separate from all other queries. The queries may share files.

6. When more than one activity is shown on a PDFD, problems are encouraged and the diagram no longer clearly delineates any process (see Figure 9-52). Place at most one activity decomposition on a page. Use one side of the paper only. Keep in mind that most of the time, the information will be on a CASE tool until printed for documentation, so there is not really much wasted paper.

FIGURE 9-46 Periodic Activities PDFD

FIGURE 9-47 ABC Customer Maintenance PDFD

FIGURE 9-48 Nonprocess Problem

FIGURE 9-49 Names Do Not Match Decomposition (Get Request versus Do Data Entry)

FIGURE 9-50 Data Trigger Timing Problem

FIGURE 9-51 Simulated Menu Problem

Develop and Analyze Entity/Process Matrix

Rules for Developing and Analyzing an Entity/Process Matrix

This matrix is composed of the results from the ERD and process decompositions; it requires neither the process dependency nor the PDFD for completion. Along the left margin, list each lowest-level process from the process decomposition diagram. Use the lowest-level processes, such that all elemental processes for the organization and application area are present. Along the top, list the normalized entities from the ERD, with one entity in each column.

Completely identify which processes are allowed to Create, Retrieve, Update, and Delete (CRUD) each entity. Enter one or more of the letters as allowed by the current organization's policies and procedures for each entity.

When the matrix is complete, processes are grouped by their affinity, or closeness, in processing entities. If you do this step manually, group processes that share create responsibility first. If the number of clusters is reasonable for the size of the project, stop. When you have analyzed the entire matrix, rearrange the matrix by its clusters. You may have several clusters that overlap. That is normal and not a cause for worry. If you have only one cluster, reanalyze as necessary using first update processing, then delete processing, then retrieval, as the clustering criteria. When you obtain a reasonable number of clusters, go to the next step of the analysis. A reasonable number may be one to five for a small application, such as ABC, or seven or more for a large application.

To perform manual affinity analysis, perform the following procedure (here we do create affinity only). Keep in mind that you are 'normalizing' process-entity relationships. Look at each process and its entities. For an entity/process cell, look down the column and identify other cells in which create processing is done (C). Make an erasable, colored mark in those cells. Do this for all entities for each process.

Even though it is an extra step, and quite a bit of work, create an interim matrix for each potential cluster. This interim matrix makes your visual inspection of relationships easier and actually speeds the affinity analysis. Iterating, build the interim matrix, analyze it as described below, and add the resulting cluster(s) to the new process/entity matrix. Iteration is required because the interim clusters may change as the relationships of each potential cluster are analyzed.

FIGURE 9-52 Combined Process Problem

To analyze a potential cluster, start at the first process and look at the data it shares with the next process in the list. Do these two processes share 80% or more (see footnote 7) of their data creation (or update, delete) responsibility? If yes, mark the original matrix to show they are together and add the processes and their entities to the interim matrix. If the percentage is less than 80%, circle the second process for potential deletion from this cluster. If the percentage affinity is between 50% and 80%, look at the next process that might be related. Do either of the two first processes share more than 80% of their data with the third process? If all three share, cluster them all. If the third strongly relates to the second process but not the first, or the third relates to the first but not the second, still cluster all of them for the moment. Continue to do this type of stepwise comparison of processes using entity affiliation to determine process affinity. Each successive process's functions on the entity are compared to all previous processes, not just the first. If a new process is strongly related to all of them, the cluster remains intact. If the new process strongly relates to some subset of processes, keep it. If the new process strongly relates to only one process, consider those two processes as a second cluster and set them aside (i.e., create a new interim matrix) with their data for analysis of that cluster. As you complete analysis of a cluster, add it to the new process/entity matrix. Return to the original matrix and draw a line through each process that has been added to a cluster, to ensure that there is no replication of a process and to ensure that processes that are not currently in a cluster are added to a cluster eventually.

7. 80% is not a hard number. You adjust the percentage affinity needed to find multiple clusters. If all processes share all responsibilities, the organizational processes must first be redesigned, then this analysis is repeated.

As you create the new process/entity matrix, leave several lines and columns of space between each cluster. At the end of the analysis, reanalyze processes that have not been assigned to any cluster. Look at the interim matrices to which each odd process was compared. Find the cluster to which it has the most affinity, and add the process to that cluster. If there is no affinity, leave the process separate.
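The manual procedure can be approximated in a short sketch. Several simplifications are assumed here and are not part of the procedure as described: affinity between two processes is measured as the number of entities they both create divided by the number either creates, a process joins the first cluster in which it meets the threshold with at least one member, and the interim matrices and the 50% to 80% review step are omitted. The function names and the matrix layout (process name mapped to a dictionary of entity name and CRUD letters) are hypothetical.

```python
def affinity(matrix, p, q, code="C"):
    """Share of entities on which both p and q perform `code` (overlap / union)."""
    a = {entity for entity, ops in matrix[p].items() if code in ops}
    b = {entity for entity, ops in matrix[q].items() if code in ops}
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_processes(matrix, threshold=0.8, code="C"):
    """Greedy single pass: join the first cluster with a strongly related member."""
    clusters = []
    for process in matrix:
        for members in clusters:
            if any(affinity(matrix, process, m, code) >= threshold for m in members):
                members.append(process)
                break
        else:
            clusters.append([process])  # no affinity found: start a new cluster
    return clusters
```

As footnote 7 suggests, the `threshold` argument is the value to adjust when the first pass yields only one cluster or far too many; passing `code="U"`, `"D"`, or `"R"` repeats the analysis on update, delete, or retrieval responsibility.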

When affinity analysis is complete, the clustered processes are ready for analysis to determine if they can be in the same execution unit, and the data shared by a cluster should be analyzed to determine the physical design of the database and the needed user views of data. These activities are done during the design phase.

ABC Video Example Entity/Process Matrix

To develop the entity/process matrix, list the lowest-level processes from the decomposition diagram down the left column. Then, list the entities across the top, one per column. For each process, add which functions it has for each entity. Possible functions are create, retrieve, update, or delete (CRUD, respectively). The ABC process/entity matrix is shown as first completed in Table 9-5. In completing the table, refer back to the PDFD. Use the arrows and names of the processes to identify the type of processing. For entities with arrows only going into a process, the correct code is 'R,' for retrieval. For entities with arrows only going out of a process, the correct codes are 'C,' 'U,' or 'D,' for create, update, or delete, depending on the processing to take place. For example, Create Video/Copy has a 'C' under Copy and a 'C' under Video because it creates both.


First, we check each entity to see if all possible processing is accounted for. Customer, Copy, and Video all have CRUD processing and appear complete. Open Rental has CRUD processing during rent/return and R processing as part of Query. Open Rental, too, looks complete.

EOD Financial Information is created but not ever deleted, which is wrong. It should be archived or deleted at the end of the processing day. We will assume archival at the moment and add both process and data changes back to all other diagrams as needed. We can further assume that the delete process and create process follow. This is an example of what happens with late-identified entities. The processing is not as thoroughly analyzed and some information could 'fall through the cracks.' The entity/process matrix helps assure that processing is completely defined.

Finally, history is only created and retrieved. We have not yet defined history, so the decision may not be final. In general, history files are only created and retrieved. They are permanent records of past transactions or business states, so they cannot be updated or deleted and still be known as 'history.'
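The entity-by-entity completeness check lends itself to a small sketch. The function name `missing_operations` and the matrix layout (process name mapped to a dictionary of entity name and CRUD letters) are assumptions for illustration. The `required` argument allows a different expectation per kind of entity, for example `"CR"` for history files, which are only created and retrieved.

```python
def missing_operations(matrix, entities, required="CRUD"):
    """Map each entity to the required CRUD letters that no process performs."""
    gaps = {}
    for entity in entities:
        performed = set()
        for row in matrix.values():
            performed.update(row.get(entity, ""))
        missing = "".join(op for op in required if op not in performed)
        if missing:
            gaps[entity] = missing
    return gaps
```

Run against a matrix in which EOD data are only created and retrieved, the check reports update and delete as missing for EOD, which is the kind of gap that prompted the archival assumption above.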

When the matrix is complete, we cluster the processes by their entity affinity (see Table 9-6). There are five possible clusters, but most of the data are used by many of the processes. So, the clustering shown is to give an example of how, at the application level, affinity analysis can be used to determine user views for DBMS access. For ABC, because of the extensive retrieval activities, one subject database would be defined. EOD data are kept separate from all other data. Customer, Video, Copy, and Open Rental data are all used to create individual relations and joint user views.

First we analyze organizational sufficiency as it relates to processes. Is each entity created by one process only? The answer here is yes. Do all processes creating, updating, and deleting an entity report to the same manager? Here the answer is again, yes. So the organization is sufficient.
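The first sufficiency question (is each entity created by one process only?) can be checked mechanically from the matrix. The function name and the matrix layout (process name mapped to a dictionary of entity name and CRUD letters) are hypothetical; the second question, about reporting lines, needs organizational data that the matrix does not hold and is not sketched here.

```python
def entities_with_multiple_creators(matrix):
    """Return {entity: [creating processes]} for entities created by more than one process."""
    creators = {}
    for process, row in matrix.items():
        for entity, ops in row.items():
            if "C" in ops:
                creators.setdefault(entity, []).append(process)
    return {entity: procs for entity, procs in creators.items() if len(procs) > 1}
```

An empty result corresponds to the answer "yes" given above; any entity that does appear in the result points to shared create responsibility that should be reviewed with the client.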

Next we look at the entities and their usage to determine subject area databases. Again, ABC is a small company, so the data are mostly in one

(Text continues on page 386)

384 CHAPTER 9 Data-Oriented Analysis

TABLE 9-5 ABC Video Process/Entity Matrix

Entities = Open Customer Video Rental Processes Customer Video Copy Rental History History Archive EOD

Get Request R

Get Valid Customer R

Get Open Rentals R R R

Add Return Date

Check for Late Fees

Get Valid Videos R R R

Process Payment and Make Change U

Create Open Rental C

Update Open Rental U

Update/Create CU

History

Print Receipt

Query Customer R R R

Query Rental R R R R R

Query Video R R R R

Query History R R R R R R R

Query EOD R

EOD Processing CRD

Rental Archive D C

Processing

Create Customer C

Update Customer U

Delete Customer D

Create Video/Copy C C

Update Video/Copy U U

Delete Video/Copy D D

Business Area Analysis Activities 385

TABLE 9-6 ABC Video Process/Entity Matrix Affinity Clusters

Entities = Processes

Create Customer

Delete Customer

Get Valid Customer

Query Customer

Update Customer

Create Video/Copy

Update Video/Copy

Delete Video/Copy

Get Open Rentals

Get Valid Videos

Query History

Query Video

Create Open Rental

Update Open Rental

Query Rental

Rental Archive

Processing

Update/Create

History

Process Payment and Make Change

EOD Processing

Query EOD

Add Return Date

Check for Late Fees

Get Request

Print Receipt

Open Customer Video Copy Rental

R R

C C

U U

D D

R R R

R R

R R R R

R R R

R R R R

Customer Video Rental History History Archive

R R R

CU CU

EOD

CRD


database. Video, Customer, Open Rental, and End of Day information will be stored together. Video and Customer are separate from the other entities because they are only modified by one process. Historical information could be stored in a separate database or set of files. In most companies and applications, history files are kept separate from the other databases and files. For applications with huge amounts of data, history is even on a different storage medium. History is frequently kept on tape and the current databases are on disk. Depending on volume, we would consider tape storage for history here, too.

Data Entities and Final Attributes

Entity: Customer
Attributes: Customer Phone ID, Customer Name, Customer Address, Customer City, Customer State, Customer Zip, Customer Credit Card Number, Credit Card Type, Credit Card Expiration Date

Entity: Open Rental
Attributes: Customer Phone ID, Video ID, Copy ID, Rental Date, Return Date, Late Fee Due, LF Paid, Rental Fees Due, RF Pd, Other Fees Due, OF Paid

Entity: Video
Attributes: Video ID, Video Name, Entry Date, Rental Rate

Entity: Copy
Attributes: Video ID, Copy ID, Date Received, Status

Entities: Customer History, Video History, EOD, Rental Archive
Attributes: To Be Defined

Activities for Rental/Return and All Processes

Activity: Rent/Return
Processes: Get Request, Get Valid Customer, Get Open Rentals, Add Return Date, Check for Late Fees, Get Valid Videos, Process Payment and Make Change, Create Open Rental, Update Open Rental, Update/Create History, Print Receipt

Activity: Periodic Processing
Processes: Query Customer, Query Rental, Query Video, Query History, Query EOD, EOD Processing, Rental Archive Processing

Activity: Customer Maintenance
Processes: Create Customer, Update Customer, Delete Customer

Activity: Video Maintenance
Processes: Create Video/Copy, Update Video/Copy, Delete Video/Copy

Deferred Items for Decision During Design

Item: Check for Late Fees
Decision: Separate, or consolidated with either or both of Get Open Rentals and Add Return Date

Item: History
Decision: Is video history updating going to be done to a history file or to the current Copy relations? This requires monthly update. The history file requires further decisions about what is on the file.

FIGURE 9-53 Summary Repository Entries

The entity/process matrix analysis completes the BAA. At this time, all entities, processes, attributes, and their interrelationships should be known based on business requirements. All information is documented in a repository for use in the next phase. The repository entries, without details, are presented in Figure 9-53.

SOFTWARE SUPPORT FOR DATA-ORIENTED ANALYSIS

There are many CASE tools that support data modeling and other data-oriented analysis tasks. Tools that also support Information Engineering are integrated toolsets that cover the complete development life cycle. CASE support for Information Engineering is the best of any methodology. Two CASE environments support the entire IE life cycle from enterprise analysis through maintenance. The two tools are IEF™⁸ and ADW. The CASE tools are by no means perfect, however: interphase linkages are weak; numerous bugs plague new releases of ADW; and old releases of IEF were designed so rigidly that all graphical and definitional forms were required to use the tool effectively. The positive aspect of both tools is that they can feed code generators that can automate development of as much as 70% of the necessary program code. Both ADW and IEF support all of the graphical and definitional forms discussed in this chapter. The list in Table 9-7 includes many other CASE tools that support some, but not all, graphical, documentation, or mental models of IE and data-oriented analysis.

8 IEF™ is a trademark of the Texas Instruments Co., Dallas, TX. ADW™ is a trademark of Knowledgeware, Inc., Atlanta, GA.

SUMMARY

Data-oriented methodologies are based on the notion that data are more stable than processes in business. Organizations and procedures change regularly; the data on which they work do not. Data-oriented methodologies, then, concentrate on data as the initial focus of study. The theory underlying data methods applies semantic modeling to data and system theory to business functions.

Information Engineering's business area analysis (BAA) is the example of data-oriented methodology described here. BAAs begin with an entity-relationship diagram that is fully identified and normalized. Business functions are decomposed to create process decomposition, process dependency, and process data flow diagrams.

Business processes from the decomposition are coupled to the entities from the ERD to form an entity/process matrix, also called a CRUD matrix (for create-retrieve-update-delete). The CRUD matrix defines responsibility for actions on each entity for each process. Affinity analysis of the CRUD matrix clusters processes and data into groups. The affinity groupings are used to decide the need for additional project scoping, future applications, and alternatives for subject database design. All information from the BAA is documented in a repository.

REFERENCES

Date, C. J., An Introduction to Database Systems, Vol. 1, 5th edition. Reading, MA: Addison-Wesley, 1990.

Finkelstein, Clive, An Introduction to Information Engineering: From Strategic Planning to Information Systems. Reading, MA: Addison-Wesley, 1989.

Knowledgeware, Inc., Information Engineering Workbench™/Analysis Workstation, ESP Release 4.0. Atlanta, GA: Knowledgeware, Inc., 1987.

Loucopoulos, Pericles, and Roberto Zicari, Conceptual Modeling, Databases and CASE: An Integrated View of IS Development. NY: John Wiley & Sons, 1992.

Martin, James, Information Engineering: Book 2, Planning and Analysis. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1991.

Martin, James, and Carma McClure, Diagramming Techniques for Analysts and Programmers. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1985.

Texas Instruments, A Guide to Information Engineering Using the IEF. Dallas, TX: Texas Instruments, 1988.

KEY TERMS

activity
affinity
affinity analysis
architectures
associative entity
attribute
attributive entity
business area analysis (BAA)
business function
business process
business redesign
cardinality
CRUD matrix
data administration
data trigger
direct method of normalization
elementary process
entity
entity-relationship diagram (ERD)
entity type
entity/process matrix
entity structure analysis
event trigger
functional decomposition
fundamental entity
instance
many-to-many relationship
normalization
one-to-many relationship
one-to-one relationship
optional relationship
process data flow diagram (PDFD)
process data trigger
process dependency diagram
process relationship
relational database theory
relationship
relationship entity
required relationship
subject area database
tabular method of normalization
trigger

TABLE 9-7 Data-Oriented Analysis CASE Support

Product                     Company
Analyst/Designer Toolkit    Yourdon, Inc., New York, NY
Anatool                     Advanced Logical SW, Beverly Hills, CA
Bachman                     Bachman Info Systems, Cambridge, MA
CorVision                   Cortex Corp., Waltham, MA
Deft                        Deft, Ontario, Canada
Design/1                    Arthur Anderson, Inc., Chicago, IL
ER-Designer                 Chen & Assoc., Baton Rouge, LA
Excelerator                 Index Tech., Cambridge, MA
IEF                         Texas Instruments, Dallas, TX
IEW, ADW                    Knowledgeware, Atlanta, GA

Technique

Entity-Relationship Diagram (ERD)

ERD

Bachman ERD

ERD

Functional Decomposition ERD Entity Hierarchy Process Hierarchy Process Dependency Process Data Flow Diagram Entity/Process Matrix

Functional Decomposition ERD Entity/Process Matrix


TABLE 9-7 Data-Oriented Analysis CASE Support (Continued)

Product             Company
Maestro             SoftLab, San Francisco, CA
Multi-Cam           AGS Mgmt Systems, King of Prussia, PA
ProKit Workbench    McDonnell Douglas, St. Louis, MO
SW Thru Pictures    Interactive Development Environments, San Francisco, CA
System Engineer     LBMS, Houston, TX
Teamwork            CADRE Tech., Inc., Providence, RI
Telon               Pansophic Systems, Inc., Lisle, IL
The Developer       ASYST Technology, Inc., Naperville, IL
VS Designer         Visual Software Inc, Santa Clara, CA

Technique

ERD

ERD

ERD Entity Life History Diagram

ERD

ERD Organization Chart Operations Process Diagram Matrix Diagram

Process Flow Diagram

EXERCISES

1. Complete the PDFD for Video Maintenance.

2. The Get Valid Video process has as its subprocesses: Get Video Data, Create Video File, and feeds into the Check Previous Rental process. Do a process dependency diagram for these subprocesses. Then add event triggers and data files to complete the PDFD.

STUDY QUESTIONS

1. Define the following terms:

affinity
associative entity
attributive entity
CRUD matrix
elementary process
entity
entity-relationship diagram
functional decomposition
m:n relationship
normalization
possible number of entity relationships
possible nature of entity relationships
promoted relationship
trigger

2. Compare data flow diagrams from Chapter 7 to process data flow diagrams in this chapter. List five similarities and five differences between them.

3. Find a small company and develop an entity-relationship diagram of their data. For each entity, develop an attribute list and normalize the data. Discuss the problems you have in developing the answer with your class.

4. What is a 'promoted relationship' in an ERD and what is the result of the promotion?

5. Normalization assumes that you know the relationships of data within and between entities. What happens if you do not have the data relationships correctly specified in normalization?

6. What does normalization, as performed during analysis, define? What does it not define?

7. What is the purpose of an entity/process matrix?

8. Describe the analysis of an entity/process matrix.

9. What is the significance of subject area databases? What do subject area databases have in common with normalization?

10. What is the importance of an organizational ERD? What problems might arise when you begin the ERD definition for an application during the business area analysis?

11. Describe the relationships between the diagrams developed throughout IE-BAA. That is, how is each diagram used in the creation of successor diagrams?

12. What is the purpose of functional decomposition?

13. What are the three conditions under which you cannot eliminate an entity?

14. On a process dependency diagram, what is the significance of directed lines connecting two processes? Does this meaning change when the processes are connected on a PDFD? If so, how?

15. When should printed items be included on a PDFD?

16. What is a functional decomposition in IE? Define the diagram and the contents at each level of detail. How do you know when the decomposition is complete, i.e., when to stop?

17. What are the steps to developing a PDFD?

18. Define the allowable inputs to a process on a PDFD.

19. What is a CRUD matrix? How is it used?

20. What are the allowable connections on a process dependency diagram?

21. What are the allowable connections on a process data flow diagram?

22. List four problems and their solutions when developing a PDFD.

* EXTRA-CREDIT QUESTIONS

1. Develop an IE analysis for the accounting (or purchasing) function at ABC Video. Refer to other books to obtain details about accounting applications. One such book is Online Business Computer Applications, 3rd edition, Alan L. Eliason, NY: MacMillan, 1991.

2. Compare IE to process analysis. What are the similarities? What are the differences? How are the same terms used differently? Which method has the least ambiguity? Which method results in a more complete analysis?

3. Do an entity-relationship diagram for the AOS Tracking System problem in the Appendix. Normalize the data. Compile a list of issues for future resolution dealing with the data. The issues should relate to how many relations are needed, how the data will be used, and how to minimize the number of relations without having many unused attributes in each relation.

4. Do a process decomposition diagram and a PDFD for the AOS Tracking System described in the Appendix.

5. What do you not know after BAA is complete?

CHAPTER 10 DATA-ORIENTED DESIGN

INTRODUCTION

Data-oriented design uses data as the basis for clustering processes, building databases, and identifying potential distribution of the application. In this chapter, we continue the discussion of Information Engineering as the example of data-oriented methodology. Since IE has several 'incarnations' that differ slightly, it is important to note that IE in this chapter is consistent with the Martin [1992], Texas Instruments [1988], and Knowledgeware™¹ versions.

CONCEPTUAL FOUNDATIONS

Information Engineering is the closest to a complete methodology of the methods in common use. It borrows from research and practice to build a complete view of the application and its environment. Structured programming tenets describe the importance of limiting program structure, as much as possible, to selection, iteration, and instruction sequence components. 'Go to' statements should be minimized. Modules should have one entry and one exit. In IE design, these tenets are practiced in structuring the application as well as the program modules.

1 Knowledgeware™ is a product of Knowledgeware, Inc., Atlanta, GA.

Subject area database design is based on theories of relational database and practice of data design. Data should be clustered with the processes that create the data. Those processes determine 'subject areas' of data. Subject databases are stored in the same database environment and their processes are in integrated applications. These topics were discussed in Chapter 5 and are not repeated here. During analysis, the data entities are normalized and relations are identified (Chapter 9). Normalized data is the starting point for physical database design. Physical database design may automate the normalized relations directly or may denormalize for performance purposes. Also, in organizations with many using locations and potential for distribution of data and processes, a strategy for distribution is defined. These two activities, potential denormalization and distribution, are based on practical guidelines rather than theory.

From practice, we know that there is more to implementing an application than designing program specifications and a database. We need to design screens and a screen dialogue, guard against unauthorized access and unwanted damage to the data, provide for conversion from the old to the new method of data storage, design and plan application implementation, install hardware, design and plan application tests, and develop training programs for users. While all of these tasks are discussed in some books on IE, these activities are done regardless of methodology, and to discuss them as pertaining only to IE would be misleading. For this reason, the topics in this chapter include screen dialogue design, hardware planning, and providing for data security, recovery, and audit controls, in addition to procedure and database design. Human interface design, conversion, and training are discussed in Chapter 14; testing is the subject of Chapter 17.

DEFINITION OF INFORMATION ENGINEERING DESIGN TERMS

A full list of the activities in IE design is given here; included are references to chapters in which some topics are discussed.

1. Design security, recoverability, and audit controls

2. Design human interface structure

• Develop menu structure

• Define screen dialogue flow

3. Data analysis

• Reconfirm subject area database definition

• Denormalize to create physical database design

• Conduct distribution analysis and recommend production data distribution strategy

4. Develop an action diagram and conduct reusability analysis

5. Plan hardware and software installation and testing

6. Design conversion from the old to the new method of data storage (Chapter 14)

7. Design and plan application tests (Chapter 17)

8. Design and plan implementation (Chapter 14)

9. Develop, schedule, and conduct training programs for users (Chapter 14)

The topics in this chapter are design of data usage, action diagrams (which are program specs), screen dialogues, security, recovery, audit controls, and installation planning. They are discussed in this section in the order above, which reflects the amount of work involved and their importance to the application.

The first activity in IE design is to confirm design of the database and determine the optimal data location. Invariably, when the details of processing are mapped to specifications, data usage changes from that originally envisioned. To confirm database design, the data is mapped to application processes in an entity/process (CRUD) matrix and the matrix is reanalyzed. (See Chapter 9 for a more complete discussion of entity/process matrices.) The entity/process matrix (see Figure 10-1) clusters data together based on processes with data creation authority. The subject area databases defined by the clusters are stored in the same database environment.
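As a rough illustration of clustering data by the processes with creation authority, the sketch below groups entities under the process that holds a C (create) entry for them. The matrix cells are hypothetical values for a purchasing application, and `cluster_by_creator` is an invented helper name, not an IE-defined procedure.

```python
from collections import defaultdict

# Hypothetical CRUD cells for a purchasing application. The letter
# assignments are assumptions chosen for illustration only.
crud = {
    "Create & Mail Order": {
        "Purchase Order": "CRUD",
        "PO Item": "CRUD",
        "Vendor-Item": "R",
    },
    "Verify Receipts against Order": {
        "Purchase Order": "RU",
        "PO Item": "RU",
        "Inventory Item": "RU",
    },
    "Identify Items & Vendors": {
        "Vendor-Item": "CRU",
        "Vendor": "CRU",
        "Inventory Item": "R",
    },
}


def cluster_by_creator(matrix):
    """Group entities under the process that has create authority."""
    clusters = defaultdict(set)
    for process, cells in matrix.items():
        for entity, actions in cells.items():
            if "C" in actions:
                clusters[process].add(entity)
    return {process: sorted(entities) for process, entities in clusters.items()}


for process, entities in cluster_by_creator(crud).items():
    print(process, "->", entities)
```

In this toy data, Inventory Item is created by no process in the matrix, so it falls outside every cluster; in practice that would be a prompt to ask which other application owns it.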

The second step of database design is to determine a need to denormalize the data. Recall that normalization is the process of removing anomalies that would cause unwanted data corruption. Denormalizing is the process of designing storage items of data to achieve performance efficiency (see Figure 10-2). Having normalized the data, you know where the anomalies are and can design measures to prevent the problems.
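To make the trade-off concrete, here is a small sketch of a denormalized order-item relation that carries a redundant copy of the customer name, together with the kind of preventive measure the paragraph refers to. The data values and the `rename_customer` helper are assumptions for illustration; a real design would typically enforce this in the DBMS or the application's update modules.

```python
# Normalized relations: the customer name is stored exactly once.
customers = {"C1": {"name": "Acme Supply"}}
order_items = [
    {"order": 1, "item": "A10", "customer": "C1"},
    {"order": 2, "item": "B20", "customer": "C1"},
]

# Denormalized relation: the customer name is copied into every order
# item so that a common retrieval needs no join (faster reads).
order_items_denorm = [
    dict(row, customer_name=customers[row["customer"]]["name"])
    for row in order_items
]


def rename_customer(customer_id, new_name):
    """Update the master record and every redundant copy together.

    Routing all name changes through one function is the measure that
    prevents the update anomaly introduced by denormalizing.
    """
    customers[customer_id]["name"] = new_name
    for row in order_items_denorm:
        if row["customer"] == customer_id:
            row["customer_name"] = new_name


rename_customer("C1", "Acme Supply Co.")
print([row["customer_name"] for row in order_items_denorm])
```

If the master record were changed without touching the copies, the two order items would disagree with the customer relation, which is exactly the anomaly normalization removed.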

The next activity in data analysis is to determine the location of data when choices are present. A series of objective matrices are developed and analyzed. The matrices identify process by location and data by location and transaction volume. These are used to develop potential designs for distribution of data. The application processes and data are both mapped to locations. Cells of the process/location matrix contain responsibility information, identifying locations with major and minor involvement (see Figure 10-3). This information is used to determine which software would also be required to be distributed, if distribution is selected.

Entities (columns): Purchase Order, PO Item, Vendor-Item, Inventory Item, Vendor

Process                              Purchase Order   PO Item
Create & Mail Order                  CRUD             CRUD
Call Vendor & Inquire on Order       RU               RU
Verify Receipts against Order        RU               RU
Send Invoices to Accountant          RD               RD
File Order Copy by Vendor            R                R
Identify Late & Problem Orders       R                R
Identify Items & Vendors
Call Vendor to Verify Avail/Price

Vendor-Item, Inventory Item, and Vendor cells, by row: CRU R R; RU R R; RU R; R R RU; R R CRU; RU RU

FIGURE 10-1 Example of Entity/Process Matrix

Two data/location matrices are developed. The first data/location matrix identifies data function as either update (i.e., add, change, or delete) or retrieval by location (see Figure 10-4a). The second defines options for data in each location (see Figure 10-4b). Together these matrices identify options for distributing data. The options for distributed data are replication, vertical partitioning, subset partitioning, or federation (see Figure 10-5). Replication is the copying of the entire database in two or more locations. Vertical partitioning is the storage of all data for a subset of the tuples (or records) of a database. Subset partitioning is the storage of a partial set of attributes for the entire database. Federation is the storage of different types of data in each location, some of which might be accessible to network users. The selection of distribution type is determined by the usage of data at each location.

Then, a transaction volume matrix is developed to identify volume of transaction traffic by location. Cells of this matrix contain the average number of transactions for each data relation/process per day (see Figure 10-6). In an active application, hourly or peak activity period estimates of volume might be provided. During matrix analysis, the data and processes are clustered to minimize transmission traffic. Then formulae are applied to the information to determine whether the traffic warrants further consideration of distribution.
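The first three distribution options described above can be sketched on a toy relation as follows. The `orders` data and all function names are hypothetical. The functions are named for what they split (whole records versus attributes) because many database texts use 'horizontal' and 'vertical' the other way round from this chapter; the docstrings map each function back to the chapter's own terms.

```python
# Hypothetical customer-order relation used only for illustration.
orders = [
    {"order": 1, "product": "P1", "site": "A", "amount": 100},
    {"order": 2, "product": "P2", "site": "B", "amount": 250},
    {"order": 3, "product": "P1", "site": "A", "amount": 75},
]


def replicate(rows, locations):
    """Replication: every location holds a copy of the entire relation."""
    return {loc: [dict(r) for r in rows] for loc in locations}


def partition_by_records(rows, key):
    """Whole records are divided among locations by a key value
    (the chapter's 'vertical partitioning')."""
    parts = {}
    for r in rows:
        parts.setdefault(r[key], []).append(dict(r))
    return parts


def partition_by_attributes(rows, attrs_by_location):
    """Each location stores some attributes of every record
    (the chapter's 'subset partitioning')."""
    return {
        loc: [{a: r[a] for a in attrs} for r in rows]
        for loc, attrs in attrs_by_location.items()
    }


print(len(replicate(orders, ["A", "B"])["B"]))
print(partition_by_records(orders, "site")["A"])
print(partition_by_attributes(orders, {"A": ["order", "amount"]})["A"])
```

Federation is not shown because it involves different relations at each site rather than a rule applied to one relation.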

Finally, subjective reasons for centralizing or for distributing the application are developed. The subjective arguments ensure that political, organizational, and nonobjective issues are identified and considered. Examples of subjective motivations for centralization/distribution relating to Figures 10-4, 10-5, and 10-6 are in Table 10-1. Recommendations on what, how, and why to distribute (or centralize) data are then developed from the matrices and subjective analysis. The recommendations and reasoning are presented to user and IS managers to accept or modify.

After data are designed, the design of the human interface can begin with a definition of interface requirements. The hierarchy diagram is used to determine the structure of selections needed by the application. A menu structure is a structured diagram translating process alternatives into a hierarchy of options for the automated application (see Figure 10-7). In general, we plan one menu entry for each process hierarchy diagram entry between the top and bottom levels. One level of menus corresponds to one level in the process hierarchy diagram. At the lowest level of the process hierarchy, a process corresponds to either a program or module. Screens at the lowest level are determined by estimating execute units. These functional screens may not be final in menu structure definition because execute unit design is usually a later activity. Once the menu structure is defined, it is given to the human interface designer(s) for use during screen design (Chapter 14).

Unnormalized: Order Number, Order Date, Order Ship Terms, Order Payment Terms, Customer Number, Customer Name, Customer Address, *Item Number, Item Description, Item Quantity, Item Price, Item Extended Price

Relations (Third Normal Form): Order, Customer, Order Item, Inv. Item

Denormalized Design for Order

ORDER: Order Number, Order Date, Order Ship Terms, Order Payment Terms, Customer Number, Customer Name, Customer Address

ORDER ITEM: Order Number, Item Number, Customer Number, Customer Name, Item Description, Item Quantity, Item Price, Item Extended Price

FIGURE 10-2 Example of Denormalized Data for an Order

Functions (rows): Purchasing, Marketing, Customer Service, Sales, Product Development, Research & Dev., Manufacturing

Locations (columns): Location A, Location B, Location C, Location D, Location E

Legend: X = Major Involvement; \ = Minor Involvement

FIGURE 10-3 Example of Process/Location Matrix

Data Usage by Location Matrix

Columns: Subject Data, Location A, Location B, Location C, Location D, Location E. Entries for each row are listed in location order.

Prospects: All-UR; All-UR

Customer: All-UR; All-UR

Customer Orders: All-UR; Subset-own products-UR; All-R; All-R

Customer Order History: All-R; All-R

Manufacturing Plans: Subset-own products-R; Subset-own products-R; Subset-own site-UR; All-UR

Manufacturing Goods in Process: Subset-own products-R; Subset-own products-R; Subset-own site-UR; All-UR

Manufacturing Inventory: Subset-own products-R; Subset-own products-R; All-R; Subset-own site-UR; All-UR

U = Update, R = Retrieve

FIGURE 10-4a Example of Data Matrices by Location

Distribution Alternatives by Location

Columns: Subject Data, Location A, Location B, Location C, Location D, Location E. Entries for each row are listed in location order.

Prospects: Replicate-Central Copy; Replicate

Customer: Replicate-Central Copy; Replicate

Customer Orders: Central Copy-A data; Vertical Partition by Product; Access central copy with delay; Access central copy with delay

Customer Order History: Replicate Central Copy; Replicate or access central copy with delay; Access central copy with delay

Manufacturing Plans: Replicate or access central copies with delay; Replicate or access central copies with delay; Subset-own site with delayed access to D; Subset-own site

Manufacturing Goods in Process: Access D and E Databases; Access D and E Databases; Subset-own site with delayed access to D; Subset-own site

Manufacturing Inventory: Access D and E Databases; Access D and E Databases; Subset-own site with delayed access to D; Subset-own site

FIGURE 10-4b Example of Data Matrices by Location

The structure is then analyzed further to determine the allowable movement between the options on the menu structure. The dialogue flow diagram documents allowable movement between entries on the menu structure diagram (see Figure 10-8). On the diagram, rows correspond to screens and columns correspond to allowable movements. For instance, in the menu structure example (Figure 10-7), Customer Maintenance has four subprocesses. A dialogue flow diagram shows how Customer Maintenance is activated from the main menu (or elsewhere) and the options for movement from that level. From the Customer Maintenance menu, the options are to move to the main menu or to one of the four subprocesses. The dialogue flow diagram is used by the designers in developing program specifications, by the human interface designer(s) in defining screens, and by testers in developing interactive test dialogues.
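One way a tester might put a dialogue flow to work is to hold the allowable movements in a table and check proposed test dialogues against it. The sketch below follows the Customer Maintenance example; the return path from each subprocess back to its menu, and the names `dialogue_flow`, `can_move`, and `is_valid_dialogue`, are assumptions made for illustration.

```python
# Allowable movements, keyed by the screen the user is currently on.
# Returning from a subprocess to its menu is an assumed convention.
dialogue_flow = {
    "Main Menu": {"Customer Maintenance"},
    "Customer Maintenance": {
        "Main Menu",
        "Customer Create",
        "Customer Change",
        "Customer Delete",
        "Customer Inquiry",
    },
    "Customer Create": {"Customer Maintenance"},
    "Customer Change": {"Customer Maintenance"},
    "Customer Delete": {"Customer Maintenance"},
    "Customer Inquiry": {"Customer Maintenance"},
}


def can_move(source, target):
    """Return True if the dialogue flow allows source -> target."""
    return target in dialogue_flow.get(source, set())


def is_valid_dialogue(screens):
    """Check that every consecutive pair of screens is an allowed move."""
    return all(can_move(a, b) for a, b in zip(screens, screens[1:]))


print(is_valid_dialogue(
    ["Main Menu", "Customer Maintenance", "Customer Create",
     "Customer Maintenance", "Main Menu"]
))
print(can_move("Main Menu", "Customer Create"))
```

The second call returns False because, in this table, a subprocess can only be reached through its own menu level.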

Replication of Data: Data are copied in more than one location.

Vertical Data Partitioning: Complete 'records' or tuples of data are stored with different data in more than one location.

Horizontal (or Subset) Data Partitioning: Partial 'records' or tuples of data are stored in more than one location.

Location A: A1, A3, A6, B1, C2, C4, G, H, I

Location B: A2, A4, A5, B2, B3, C1, D, E, F

Data Federation: Different data are completely stored in more than one location. Some data may be accessed by remote sites.

Location A: A, C, D, F; B - local only, E - local only

Location B: Q, R, S, T; U - local only, V - local only

FIGURE 10-5 Data Distribution Alternatives

Next, procedure design begins with analysis of the process hierarchy and process data flow diagrams developed during IE analysis (Chapter 9). Remember, in analysis, we developed one process data flow diagram (PDFD) for each activity. Now each PDFD is converted into an action diagram. An action diagram shows procedural structure and processing details suitable for automated code generation. An action diagram is drawn with different types of bracket structures to show the hierarchy, relationships, and structured code components of all processes.

The first-cut action diagram translates the PDFD into gross procedural structures (see Figure 10-9). Then, using detailed knowledge obtained during the information gathering process, the details of each procedure are added to the diagram to develop program specifications (see Figure 10-10). These program specifications may then be packaged into modules that perform one function. Data entities are added to the diagram at the level they are accessed (see Figure 10-11). Progressively more detail about data usage is provided about data attributes. Arrows are attached to show reading and writing of data (see Figure 10-12).

Subject databases (columns): Prospect, Customer, Customer Order, Customer History, Mftg. Plan, Mftg. WIP, Mftg. Inven.

Location/Function (rows): Customer Service, Sales, Marketing, Customer Service, Sales, Marketing, Manufacturing

Entries for each column are listed from top to bottom.

Prospect: 50 R 20 U; 15 R; 25 R 20 U; 20 R

Customer: 100 R 20 U; 50 R 30 U; 250 R 50 U; 25 R 5 U; 10 R

Customer Order: 250 R 400 U; 150 R 50 U; 10 R; 250 R 400 U; 10 R 100 U; 10 R

Customer History: 50 R; 70 R; 50 R

Mftg. Plan: 250 R; 50 R 5 U; 100 R 15 U

Mftg. WIP: 250 R; 50 R 250 U; 200 R 2,500 U

Mftg. Inven.: 15 R; 1 R; 250 R; 15 R; 500 R 2,000 U; 500 R 25,000 U

Legend: U = Create, Update or Delete; R = Retrieve

FIGURE 10-6 Example of Transaction Volume Matrix

When the details are completely specified, the action diagram is mapped to procedural templates to determine the extent to which reusable modules can be used in the application, and the changes to the action diagrams required to define modules for reuse. A procedural template is a general, fill-in-the-blanks guide for completing a frequently performed process. For instance, error processing and screen processing can be defined as reusable templates (see Figure 10-13). A data template is a partial definition of an ERD or database that is consistent within a user community. For example, the insurance industry has common data requirements for policy holders, third party insurance carriers, and policy information; most companies have similar accounting data needs, and so on. To be a candidate for template definition, a process must do exactly the same actions whenever it is invoked, and data must be consistent across users.

After reusability analysis, the action diagram set is finalized and used to generate code. If the application is specified manually, the action diagrams are given as program specifications to programmers who begin coding. If the application uses a CASE tool, automatic code generation is possible. A code generator is a program that reads specifications and creates code in some target language, such as Cobol or C. If the application uses a code generator, the action diagram contains the symbols and procedural detail specific to the code generation software. If the application uses a 4GL, the action diagram might contain actual code. If manual programming uses a 3GL or lower, the action diagram contains pseudo-code consisting of structured programming constructs.
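The idea of a program that reads a specification and emits code can be shown in miniature. The sketch below walks a nested action-diagram structure and writes bracketed pseudo-code; the tuple representation, the PROCEDURE/PERFORM/END keywords, and the `generate` function are invented for this example and do not reflect the input format of IEF, ADW, or any other CASE tool. The process names come from the chapter's purchasing example.

```python
# An action diagram held as nested (name, children) tuples.
action_diagram = ("Purchasing Application", [
    ("Analyze Business", []),
    ("Create Purchase Order", []),
    ("Monitor Purchase Order", []),
    ("End Purchasing", []),
])


def generate(node, depth=0):
    """Emit indented pseudo-code lines for one action-diagram bracket."""
    name, children = node
    pad = "    " * depth
    if not children:
        # A bracket with no detail yet becomes a single call.
        return [f"{pad}PERFORM {name}"]
    lines = [f"{pad}PROCEDURE {name}"]
    for child in children:
        lines.extend(generate(child, depth + 1))
    lines.append(f"{pad}END {name}")
    return lines


print("\n".join(generate(action_diagram)))
```

As more procedural detail is added to a bracket, it gains children and the same walk produces a nested procedure in place of the single PERFORM line.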

The next activity in IE design is to develop security plans, recovery procedures, and audit con- trols for the application. Each of these designs re- strict the application to performing its activities in prescribed ways. The goal of security plans is to

Definition of Information Engineering Design Terms 399

TABLE 10-1 Example of Subjective Reasons for Centralization and Distribution

General Measure-Argument
    Geographic distribution by function by product makes centralization difficult
    Centralized mainframe in a sixth location is not close to distributed sites, nor interested in serving their needs
    Little product overlap between sites A and B

Location A Measure-Argument
    General Manager in Location A - smallest needs
    GM wants 'what is best' for division
    Little technical expertise in the location; would increase travel expense required to support hardware/software

Location B Measure-Argument
    Customer service needs fast response to fulfill corporate objectives (90% of requests serviced within one phone call, less than three minutes)
    Most application expertise in division is located here
    IS manager, located here, wants the applications and data under his control

Location C Measure-Argument
    Actions mostly independent of other sites
    Delays in retrieval of information could be tolerated

Location D Measure-Argument
    Historically, location controls its own hardware/software
    Hardware/software not currently compatible with A, B, or C

Location E Measure-Argument
    Historically, location controls its own hardware/software
    Historically, software has been successfully developed/bought as joint activity with IS group in Site B

Legend:
    D/C = Strong argument for Distribution/Centralization
    d/c = Weak argument for distribution/centralization


Customer Service
    1. Order Fulfillment
        1. Order Entry
        2. Order Change
        3. Order Delete
        4. Order Inquiry
    2. Inventory Allocation
        1. Create Allocation
        2. Change Allocation
        3. Inquire on Allocation
    3. Customer Maintenance
        1. Customer Create
        2. Customer Change
        3. Customer Delete
        4. Customer Inquiry
    4. Management Reports
        1. Customer Reports
        2. Inventory Reports
        3. Orders by Customer
        4. Volume by Customer
        5. Multifile Inquiry
        6. SQL Inquiry

FIGURE 10-7 Menu Structure Example

protect corporate IT assets against corruption, illegal or unwanted access, damage, or theft. Security plans can address physical plant, data, or application assets, all by restricting access in some way. Physical security deals with access to computers, LAN servers, PCs, disk drives, cables, and other components of the network tying computer devices together. Data security restricts access to and functions against data (e.g., read, write, or read/write). Application security restricts program code from access and modification by unauthorized users. Examples of the results of security precautions are locking of equipment, requirement of user passwords, or assignment of a software librarian for program changes.

Purchasing Application
    Analyze Business
    Create Purchase Order
    Monitor Purchase Order
End Purchasing

FIGURE 10-9 Action Diagram Example

Recovery procedures define the method of restoring prior versions of a database or application software after a problem has destroyed some or all of

FIGURE 10-8 Dialogue Flow Diagram Example (rows shown: Create Order, Change Order, Delete Order, Order Inquiry)

Purchasing
    Analyze Business
    Create Purchase Order
        Do Until All Items Are Identified
            Identify Item and Vendor
        EndDo
        Sort by Vendor, Item
        Do While There Are Items to Be Processed
            IF First-Record
                Set Last-Vendor = Vendor
            ELSE IF Vendor = Last-Vendor
                Get Price
            ELSE
                Create Order
                Mail Order
                File by Vendor
            ENDIF
        ENDDO
    Monitor Purchase Order
End Purchasing

FIGURE 10-10 Action Diagram with Create Purchase Order Process Detail

it. Recovery is from a copy of the item. Backup is the process of making extra copies of data to ensure recoverability. Disasters considered in the plan include user error, hacker change, software failure, DBMS failure, hardware failure, and location failure. Recovery is the process of restoring a previous version of data (or software) from a backup copy to active use following some damage to, or loss of, the previously active copy. The backup/recovery strategy should be designed to provide for the six types of errors above. Several backup options add requirements to program design that need to be accommodated.
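The backup and recovery definitions above can be illustrated with a small file-level sketch. It is only an illustration under stated assumptions: the function names `back_up` and `recover`, the versioned file naming, and the sample file `customer.dat` are all hypothetical, and a real DBMS would supply its own backup facilities.

```python
import shutil
import tempfile
from pathlib import Path


def back_up(active: Path, backup_dir: Path, version: int) -> Path:
    """Backup: make an extra, versioned copy of the active file."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / f"{active.name}.v{version:04d}"
    shutil.copy2(active, target)
    return target


def recover(active: Path, backup_dir: Path) -> Path:
    """Recovery: restore the highest-numbered backup copy to active use."""
    copies = sorted(backup_dir.glob(f"{active.name}.v*"))
    if not copies:
        raise FileNotFoundError(f"no backup copy of {active.name}")
    shutil.copy2(copies[-1], active)
    return copies[-1]


def demo() -> str:
    """Back up a file, damage it, recover it, and return the restored text."""
    with tempfile.TemporaryDirectory() as tmp:
        active = Path(tmp) / "customer.dat"      # hypothetical data file
        backups = Path(tmp) / "backups"
        active.write_text("customer 1001, balance 25.00\n")
        back_up(active, backups, version=1)
        active.write_text("garbage")             # simulated user error
        recover(active, backups)
        return active.read_text()
```

Calling `demo()` returns the original record text, showing that the damaged active copy was replaced from the backup.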

Next, audit controls are designed to prove transaction processing in compliance with legal, fiduciary, or stakeholder responsibilities. Audit controls usually entail the recording of day, time, person, and function for all access and modification to data in the application. In addition, special totals, transaction traces, or other special requirements might be applied to provide process audit controls.
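As a minimal sketch of the audit recording just described, the following keeps one entry per access with its day and time, person, and function. The class names `AuditEntry` and `AuditTrail`, the field names, and the in-memory list are assumptions for illustration; a production audit trail would normally be written to protected, persistent storage.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class AuditEntry:
    """One audit record: day/time, person, function, and data touched."""
    when: datetime
    person: str
    function: str      # e.g. "retrieve", "update"
    data_item: str


class AuditTrail:
    """Append-only list of audit entries (hypothetical, in memory only)."""

    def __init__(self) -> None:
        self._entries: List[AuditEntry] = []

    def record(self, person: str, function: str, data_item: str,
               when: Optional[datetime] = None) -> AuditEntry:
        """Record one access or modification; defaults to the current time."""
        entry = AuditEntry(when or datetime.now(), person, function, data_item)
        self._entries.append(entry)
        return entry

    def by_person(self, person: str) -> List[AuditEntry]:
        """Return every entry recorded for one person, in recording order."""
        return [e for e in self._entries if e.person == person]
```

An auditor could then call `by_person` to trace everything one user did to the application's data.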

Last, hardware installation is planned and implemented, if required for the application. Again, there is no theory or research about hardware installation, but long practice has given us guidelines on the activities and their timing.

INFORMATION ENGINEERING DESIGN

In this section, we discuss each activity in IE design in detail, and relate them to the ABC Video rental application. IE design topics in this section, in order of their occurrence in the application development process, include development of the following:

• data use and distribution analysis
• security, recovery, and audit controls
• action diagrams
• menu structure and dialogue flow
• hardware and software installation and testing plans

Analyze Data Use and Distribution

Guidelines for Data Use and Distribution Analysis

The two activities in this section precede physical database design, which is assumed to be performed by a DBA. First, data usage analysis is performed to confirm the logical database design. Then the potential for distributing data throughout the organization is analyzed. The result is a strategy


Purchasing
    Analyze Business
    Create Purchase Order
        Do Until All Items Are Identified
            Identify Item and Vendor
        EndDo
        Sort by Vendor, Item
        Do While There Are Items to Be Processed
            IF First-Record
                Set Last-Vendor = Vendor
            ELSE IF Vendor = Last-Vendor
                Get Price
            ELSE
                Create Order
                Mail Order
                File by Vendor
            ENDIF
        ENDDO
    Monitor Purchase Order
End Purchasing

Entities shown beside the diagram: New Releases, Vendor, Purchase Order

FIGURE 10-11 Action Diagram with Entities

for data and software location that best fits user needs.

The entity/process (CRUD) matrix from IE analysis is reanalyzed and mapped to the completed action diagram. Each process is identified on the action diagram with its associated data items and the related entity. Recall that the clustering of entities and processes on the matrix is primarily based on which processes have create responsibility for the data. The entities and processes are arranged into a new entity/process matrix which is compared to the one developed during analysis. If the definition of subject area databases does not change, the distribution analysis can begin. If the definition of subject area databases does change, the logical definition of the databases is redone as discussed in Chapter 9.

The second step in data analysis is to determine the potential for data distribution. Distribution analysis uses three matrices as the objective basis for determining whether data should be distributed.

Purchasing
    Analyze Business
    Create Purchase Order                             [New Releases: Vendor Name, Video Name]
        Do Until All Items Are Identified
            Identify Item and Vendor
        EndDo
        Sort by Vendor, Item                          [Vendor ID, Item ID]
        Do While There Are Items to Be Processed      [Vendor: Vendor ID]
            IF First-Record
                Set Last-Vendor = Vendor              [Vendor ID]
            ELSE IF Vendor = Last-Vendor              [Item ID, Item Price]
                Get Price
            ELSE
                Create Order                          [Purchase Order: Vendor ID, Vendor Name, Vendor Address, Order Terms, Item ID, Item Qty, Item Description, Item Price]
                Mail Order
                File by Vendor
            ENDIF
        ENDDO
    Monitor Purchase Order
End Purchasing

FIGURE 10-12 Action Diagram with Data Detail

First, a location/process matrix is developed to identify major and minor performance of processes in the application (see Figure 10-14). This location/process matrix determines which software is needed at each location to support the functions. The information needed to complete the matrix is provided by the users.

Next, a data distribution by location matrix is developed to show creation and retrieval needs by location (see Figure 10-15). This data/location matrix is used to determine the potential age of data required by each location. For instance, retrieval data might be down-loaded from a centralized location each day at the close of business, rather than maintained at the remote sites. Created data must be available for creation and, therefore, up-to-date at the creating sites. The information needed to complete the matrix is provided partly from the entity/process matrix from the first data analysis, and partly by the users.


Call ErrMsg ErrorFieldID

    Using ErrorFieldID from Sender,
        Locate ErrorMessageActions using ErrorFieldID.
    If Highlight, Highlight ErrorField.
    If Blink, Blink ErrorField.
    If ColorChange,
        Get NewColor
        Change ErrorField to NewColor.
    Display ErrorMessage in line 24.
    Get User ErrorResponse.
    Reset ErrorField to NormalColor, LowIntensity, and NonBlink.
    Return ErrorResponse to Sender.

Return ErrorResponse

FIGURE 10-13 Procedure Template for Error Message Processing
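The procedure template in Figure 10-13 could be sketched in a programming language as follows. This is only one possible reading of the template: the dataclass layouts, the parameter names, and the use of callback functions for screen output and user input are assumptions made so the example is self-contained.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class ErrorMessageActions:
    """Display actions looked up by error field ID (layout is assumed)."""
    message: str
    highlight: bool = False
    blink: bool = False
    new_color: Optional[str] = None


@dataclass
class ScreenField:
    """Display state of one screen field; defaults are the 'normal' state."""
    color: str = "normal"
    intensity: str = "low"
    blink: bool = False


def err_msg(
    error_field_id: str,
    actions_table: Dict[str, ErrorMessageActions],
    screen: Dict[str, ScreenField],
    display: Callable[[int, str], None],
    get_response: Callable[[], str],
) -> str:
    """Show the error for one field, get the user's response, reset the field."""
    actions = actions_table[error_field_id]   # Locate ErrorMessageActions
    fld = screen[error_field_id]
    if actions.highlight:
        fld.intensity = "high"
    if actions.blink:
        fld.blink = True
    if actions.new_color is not None:
        fld.color = actions.new_color
    display(24, actions.message)              # Display ErrorMessage in line 24
    response = get_response()                 # Get User ErrorResponse
    # Reset ErrorField to NormalColor, LowIntensity, and NonBlink
    fld.color, fld.intensity, fld.blink = "normal", "low", False
    return response                           # Return ErrorResponse to Sender
```

Passing the display and input routines as parameters keeps the sketch independent of any particular screen-handling package.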

The next matrix shows data usage by location (see Figure 10-16). Recall from above that data can be centralized, vertically or horizontally partitioned, or federated. For instance, a bank branch might create data about customers, but it only accesses information about its own customers on a regular basis. So, for most processing, a vertical partition of the customer database, the branch's customers, could be accessible locally in the branch to speed processing.

Rows (Function): Purchasing, Marketing, Customer Service, Sales, Product Development, Research & Dev., Manufacturing
Columns: Location A, Location B, Location C, Location D, Location E

Legend:
    X = Major Involvement
    \ = Minor Involvement

FIGURE 10-14 Process by Location Matrix Example

The last objective matrix summarizes transaction volume by process by location (from the process/


Columns: Location A, Location B, Location C, Location D, Location E
Entries for each subject data row are listed in column order, left to right, with blank cells omitted.

Prospects: All-UR; All-UR
Customer: All-UR; All-UR
Customer Orders: All-UR; Subset-own products-UR; All-R; All-R
Customer Order History: All-R; All-R; All-R; All-R
Manufacturing Plans: Subset-own products-R; Subset-own products-R; Subset-own site-UR; All-UR
Manufacturing Goods in Process: Subset-own products-R; Subset-own products-R; Subset-own site-UR; All-UR
Manufacturing Inventory: Subset-own products-R; Subset-own products-R; All-R; Subset-own site-UR; All-UR

U = Update, R = Retrieve

FIGURE 10-15 Data Usage by Location Matrix Example

location table) against each subject database from the data analysis. Two daily transaction volume estimates for each process and location are developed (see Figure 10-17). The first estimate is for transactions that create or update the database. The second estimate is for read-only retrieval processing. Also notice that if no database access is performed by a process, no entry is made. This increases the readability of each matrix.

The analysis of this data first identifies the location with the highest total transaction count for each database. The example shows a thick box around each such location (see Figure 10-18). If the application were distributed, with centralization of subject databases in one location, the boxes would identify the most likely location for each database. All other transactions, outside the boxes, represent transmission traffic. When the transmission traffic is a high percentage of the total traffic, say over 40%, different types of replication, federation, and partitioning are tried. To analyze the data, first box the transaction numbers for the site(s) representing 50% or more of the total processing. If there is one site boxed in a column, that identifies a centralized database at the location corresponding to the box. We have two of these in the example (Figure 10-18): the Work in Process and Inventory databases at location E. The initial recommendation would be to centralize this data at E. Even though D's volume is significantly less than E's, the data usage table shows that each site accesses only its own data, so the option to vertically partition data and provide 'home ownership' could be used to support the business needs.
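The boxing rule described above (box the sites with 50% or more of the processing, then treat everything outside the box as transmission traffic) can be sketched as a short calculation. The function names, the site labels, and the sample volumes are illustrative assumptions rather than values taken from the figures.

```python
from typing import Dict, List, Tuple

# Daily volumes for one subject database: site -> (retrievals, updates).
Volumes = Dict[str, Tuple[int, int]]


def site_totals(volumes: Volumes) -> Dict[str, int]:
    """Total transactions (retrievals plus updates) per site."""
    return {site: r + u for site, (r, u) in volumes.items()}


def boxed_sites(volumes: Volumes, threshold: float = 0.5) -> List[str]:
    """Sites whose share of total processing is at or above the threshold."""
    totals = site_totals(volumes)
    grand = sum(totals.values())
    if grand == 0:
        return []
    return [s for s, t in totals.items() if t / grand >= threshold]


def transmission_share(volumes: Volumes, home_site: str) -> float:
    """Fraction of all transactions that originate outside the home site."""
    totals = site_totals(volumes)
    grand = sum(totals.values())
    return (grand - totals[home_site]) / grand


# Hypothetical volumes for a single database used at three sites.
wip = {"B": (250, 0), "D": (50, 250), "E": (200, 2500)}
home = boxed_sites(wip)                      # sites to box for this database
share = transmission_share(wip, "E")         # traffic if centralized at E
needs_alternatives = share > 0.40            # the "say over 40%" rule of thumb
```

With these sample volumes, only site E is boxed and the transmission share is well under 40%, so a centralized database at E would be the starting recommendation.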

The other databases all have access competition from two sites (Figure 10-18). Two locations, A and B, have fairly even usage of the Prospect and Customer, Customer Order, and Customer History data. The options from the Data Usage table show that Replication would be the distributed recommendation since the sites both access all data. Customer History processing differs from the other databases in that it is all read-only and it has a much lower volume than the others. Therefore, it could be centralized at either site with an access delay at the other site for retrievals. This option might be chosen if there are hardware configuration differences that favor centralization.

Locations B and E compete for the Manufacturing Plan data (Figure 10-18). Location B only retrieves the data, while the location E volume of updates is low. The database could either be


Columns: Location A, Location B, Location C, Location D, Location E
Entries for each subject data row are listed in column order, left to right, with blank cells omitted.

Prospects: Replicate or Central Copy; Replicate
Customer: Replicate or Central Copy; Replicate
Customer Orders: Replicate or Central Copy; Horizontal Partition by Product; Access central copy with delay; Access central copy with delay
Customer Order History: Replicate or Central Copy; Replicate or access central copy with delay; Access central copy with delay
Manufacturing Plans: Replicate or access central copies with delay; Replicate or access central copies with delay; Subset-own site; Central Copy or Subset-own site with delayed access to D
Manufacturing Goods in Process: Access D and E Databases; Access D and E Databases; Subset-own site; Subset-own site with delayed access to D
Manufacturing Inventory: Access D and E Databases; Access D and E Databases; Subset-own site; Subset-own site with delayed access to D

FIGURE 10-16 Data Distribution by Location Matrix

centralized at B to provide fast query access, with delayed access by E, or, if politics are involved, the data could be centralized at site E, the owner, with delayed retrieval by B.

The second part of the analysis is to compute the ratio of data retrieval transactions (DR) to data update transactions (DU). If the ratio is greater than one less than the number of locations (L, shown as N in Table 10-2) or nodes in the network, distribution should be considered (see Table 10-2). In the example, the ratio clearly favors centralization of data (Table 10-2). Keep in mind that centralization here means that each database is stored at one location. It does not mean that the databases are all at the same location.
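The no-delay ratio test can be checked numerically. The sketch below uses the traffic expressions the chapter attributes to Martin (1990): centralized traffic of (DR + DU) * (N - 1)/N and fully replicated traffic of DU * (N - 1). The function names and the sample volumes are assumptions for illustration only.

```python
def traffic_centralized(dr: float, du: float, n: int) -> float:
    """Network traffic when each database is stored at one of n locations."""
    return (dr + du) * (n - 1) / n


def traffic_replicated(du: float, n: int) -> float:
    """Network traffic when data is fully replicated: only updates travel."""
    return du * (n - 1)


def favors_distribution(dr: float, du: float, n: int) -> bool:
    """True when the retrieval-to-update ratio exceeds n - 1."""
    if du == 0:
        # With no updates, replicated data generates no traffic at all.
        return dr > 0
    return dr / du > n - 1


# Hypothetical daily volumes for a five-node network.
low_ratio = favors_distribution(dr=1000, du=400, n=5)    # ratio 2.5 versus 4
high_ratio = favors_distribution(dr=5000, du=400, n=5)   # ratio 12.5 versus 4
```

In the first case the ratio is below N - 1, so centralization generates less traffic (1,120 units against 1,600); in the second the ratio is above N - 1 and replication generates less (1,600 against 4,320).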

If a delay can be introduced for retrieval processing, then the ratio changes. It becomes much easier to argue for distribution. Distribution should be considered when retrieval volume is less than the ratio of locations to the delay (D). The delay is for update transactions which are now transmitted in bulk once per period to each other location. In the example, with even a 15-minute delay, the numbers overwhelmingly favor distribution. The rationale for these ratios is given in Table 10-3.

This discussion about distribution is important because it highlights an ethical problem in software engineering. The numbers can be made to argue for distribution regardless of transaction activity. If the transaction ratio of retrievals to updates is large, then the no-delay argument is more likely to favor distribution. If the retrieval to update ratio is less than one, the delay argument is likely to favor centralization.

As an ethical person, you are bound to tell the client about all computations and how the formulae can make either argument.

Last, a subjective list of reasons for and against centralization and distribution is developed for the organization. The exact topic headings for this list are tailored to the company and application environment.

Critical data should be managed centrally
Data is/is not critical to corporation/business unit
Most data can/cannot be stored locally/centrally
Needs/does not need specific DBMS
Requires/does not require larger machine than local sites have
Data ownership is/is not an issue
Data replication needed in one/many locations
Unique data/application in one location
Data affects/does not affect central corporate management
Fast response time important/not important
High availability important/not important
Local staff skilled/unskilled with computers
Application/data security is/is not vital to organization/business unit
Centralized operations is/is not at capacity

Subject Database columns: Prospect, Customer, Customer Order, Customer History, Mftg. Plan, Mftg. WIP, Mftg. Inven.
Entries for each Location/Function row are listed in column order, left to right, with blank cells omitted; retrievals precede updates.

Customer Service: 100 R, 250 R, 5 R, 2 R, 2 R / 20 U, 400 U
Sales: 50 R, 50 R, 150 R, 50 R, 2 R, 2 R, 15 R / 20 U, 30 U, 50 U
Marketing: 15 R, 5 R, 10 R, 50 R, 2 R, 1 R
Customer Service: 250 R, 250 R, 50 R, 250 R, 250 R, 250 R / 50 U, 400 U
Sales: 25 R, 25 R, 10 R, 70 R, 2 R, 2 R, 15 R / 20 U, 5 U, 100 U
Marketing: 20 R, 10 R, 10 R, 50 R, 2 R, 5 R
Manufacturing: 50 R, 50 R, 500 R / 5 U, 250 U, 2,000 U
Manufacturing: 100 R, 200 R, 500 R / 15 U, 2,500 U, 25,000 U

Legend:
    U = Create, Update, or Delete
    R = Retrieve

FIGURE 10-17 Summary Transaction Volume Matrix


Location/Function rows, top to bottom: Customer Service, Sales, Marketing, Customer Service, Sales, Marketing, Manufacturing
Entries for each subject database column are listed top to bottom, with blank cells omitted.

Prospect: 50 R 20 U; 15 R; 25 R 20 U; 20 R
Customer: 100 R 20 U; 50 R 30 U; 250 R 50 U; 25 R; 10 R
Customer Order: 250 R 400 U; 150 R 50 U; 10 R; 250 R 400 U; 10 R 100 U; 10 R
Customer History: 50 R; 70 R; 50 R
Mftg. Plan: 250 R; 50 R 5 U; 100 R 15 U
Mftg. WIP: 250 R; 50 R 250 U; 200 R 2,500 U
Mftg. Inven.: 15 R; 1 R; 250 R; 15 R; 500 R 2,000 U; 500 R 25,000 U

Legend:
    U = Create, Update, or Delete
    R = Retrieve

FIGURE 10-18 Analysis of Summary Transaction Volume Matrix

Down-loading of yesterday's data would/would not work in local sites

Updates with delay would/would not work in this application environment

Partitioning of data would/would not work in supporting this application

Replication of data would/would not work in supporting this application

Data integrity is/is not paramount to the application

Disaster recovery protection is/is not vital to the application

Operators are/are not at remote sites

Each reason is rated as weak or strong justification of its position. The purpose of list creation is to surface and attempt to objectify objections and arguments from each stakeholder viewpoint regarding distribution of data in the application. An easy analysis is to count the capital and small letters of each type, and compare them. A more elaborate analysis might entail giving a weight to each item and developing a weighted ranking of the central/distributed positions. If the results of this analysis support the objective measures and results, a compelling justification for the result can be developed and presented to user management for approval. If the subjective

TABLE 10-2 Distribution Ratio Formulae

The breakeven point for distribution occurs when

    DR/DU > N - 1.

If the transaction ratio is greater than N - 1, distribute data.

An alternative is to allow a time delay for update transactions with all data replicated at all locations in a network. Then only updates generate network traffic. The breakeven point for distribution occurs with this scenario when

    DU < N/TimeDelay or DU * TimeDelay < N

If the number of changes is less than the number of nodes divided by the time delay, distribution is favored.

Legend:
    DR = Number of data retrieval transactions
    DU = Number of data update transactions
    N = Number of network nodes
    D = Total number of data transactions (DR + DU)

Adapted from Martin (1990), p. 360.

analysis contradicts the objective measures, the user manager/champion might have to do some political maneuvering to obtain the desired result. Of course, if the champion is against the recommendation, the numbers in the traffic table still are useful in determining the size and speed of the machine and telecommunications lines required to service the application's data needs.
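The two subjective analyses mentioned above, the simple letter count and the weighted ranking, might be sketched as follows. The rating codes assume that a capital letter marks a strong argument and a small letter a weak one (D or d for distribution, C or c for centralization); the weights of 2 and 1 and the sample ratings are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def tally(codes: Iterable[str]) -> Dict[str, int]:
    """Count strong (capital) and weak (small) ratings of each type."""
    counts = {"D": 0, "d": 0, "C": 0, "c": 0}
    for code in codes:
        if code not in counts:
            raise ValueError(f"unknown rating code: {code!r}")
        counts[code] += 1
    return counts


def weighted_position(codes: Iterable[str], strong: int = 2,
                      weak: int = 1) -> Tuple[int, int]:
    """Return (distribution score, centralization score) using the weights."""
    c = tally(codes)
    distribution = c["D"] * strong + c["d"] * weak
    centralization = c["C"] * strong + c["c"] * weak
    return distribution, centralization


# Hypothetical ratings collected from the stakeholders' list of reasons.
ratings = ["D", "d", "c", "C", "d", "D"]
scores = weighted_position(ratings)
```

Here the distribution score is 6 and the centralization score is 3, so the subjective list leans toward distribution; different weights could be agreed with the stakeholders.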

ABC Video Example Data Use Distribution and Analysis

ABC's one location simplifies the choices for this analysis. Centralization of data and processes is the only possible choice. For the record, a table of transaction volumes is presented in Figure 10-19.

A secondary issue, if not already decided, is hardware selection. ABC could use a multiuser mini-computer or a LAN. This analysis, too, is simple because ABC is a small company without a high volume of processing. A LAN is cheaper, more easily maintained, more easily staffed, and less costly for incremental upgrades. Therefore, a LAN is the choice. Most multiuser mini-computers allow eight units without major expenditures for an additional I/O controller board. Mini-computers tend to have proprietary operating systems and use packages that tie the user to a given vendor. The strength of

TABLE 10-3 Rationale for Distribution Ratios

If T is the number of traffic units per hour (i.e., transactions), and if all data is centralized at one location (not necessarily the same), then the total traffic units per hour is

    T centralized = (DR + DU) * (N - 1)/N

Then, if all data is decentralized (i.e., fully replicated at all user locations), only update transactions generate network traffic, and

    T distributed = DU * (N - 1)

Fully replicated, decentralized data generates less traffic than centralization if

    T centralized > T distributed, or

    (DR + DU) * (N - 1)/N > DU * (N - 1)

This reduces to DR/DU > N - 1. This formula means that when the ratio of retrievals to changes (DR/DU) is greater than N - 1, favor distribution. When the ratio is equal to N - 1, either choice is acceptable from a network point of view. When the ratio is less than N - 1, favor centralization.

If changes can be applied with a delay, the equations change. Then the breakeven point occurs when

    DR < N/TimeDelay

The greater the delay, the more desirable a distributed strategy can be made to appear.

Legend:
    DR = Number of data retrieval transactions
    DU = Number of data update transactions
    N = Number of network nodes
    D = Total number of data transactions (DR + DU)

Adapted from Martin (1990), pp. 360-361.


Subject Database columns: Customer, Video Item, Customer History, Video History, EOD, Archive
Entries for each Location/Function row are listed in column order, left to right, with blank cells omitted; retrievals precede updates.

Location: Dunwoody Village
Rent/Return: 500 R, 500 R, 250 R, 500 R, 500 R / 15 U, 5 U, 400 U, 500 U, 500 U
Video Maintenance: 20 R, 150 R / 5 U, 50 U
Customer Maintenance: 5 R / 5 U
Other: 15,000 U/Once/Mo; 1,000 U; 15,000 U/Once/Mo

FIGURE 10-19 ABC Transaction Volume Matrix

multiuser minis is in their added horsepower that allows them to support applications with a high volume of transactions (in the millions per day). A multiuser mini is not recommended here because, for the money, it would be analogous to buying a new Porsche 911 Targa when a used Hyundai would do just fine. To discuss configuration of the LAN, we move to the next section on hardware and software installation.

Define Security, Recovery, and Audit Controls

Guidelines for Security, Recovery, and Audit Control Planning

The three issues in this section (security, recovery, and controls) are all increasingly important in software engineering. The threat of data compromise from casual, illegal acts, such as viruses, is real and growing. These topics each address a different perspective of data integrity to provide a total solution for a given application. Security is preventive, recovery is curative, and controls prove the other two. Having one set of plans, say for security, without the other two is not sufficient to guard against compromise of data or programs. Trusting individuals' ethical senses to guide them in not hurting your company's applications simply ignores the reality of today's world. Morally, not having planned for attempts to compromise data and programs, you, the SE, are guilty of ethical passivity that implicitly warrants the compromiser's actions. Therefore, design of security, recovery, and controls should become an integral activity of the design of any application.

The major argument against security, recovery, and audit controls is cost, which factors in all decisions about these issues. The constant trade-off is between the probability of an event and the cost of minimizing its probability. With unlimited funds, most computer systems, wherever they are located, can be made reasonably secure. However, most companies do not have, nor do they want to spend, unlimited money on probabilities. The trade-off becomes one of proactive security and prevention versus reactive recovery and audit controls. Audit controls, if developed as part of analysis and design, have a minimal cost. Recoverability has on-going costs of making copies and of off-site storage. Each type of security has a cost associated with it. Keep the cost issues in mind during this discussion, and try to weigh how you might balance the three methods of providing for ABC's application integrity.


Security plans define guidelines for who should have access to what data and for what purpose. Access can be restricted to hardware, software, and data. There are few specific guidelines for limiting access since each application and its context are different. Those guidelines are listed here:

1. Determine the vulnerability of the physical facility to fire. Review combustibility of construction. Determine adjacent, overhead, and underfloor fire hazards. Determine the status of current fire detection devices, alarms, suppression equipment, emergency power switches, extinguishers, sprinklers, and smoke detectors. Determine the extent of fire-related training. If the facility is shared, evaluate the risk of fire from other tenants.

Plan for fire prevention and minimize fire threats by using overhead sprinklers, CO2, or halon. Develop fire drills and fire contingency plans. If no emergency fire plan exists, develop one, review it with the local fire department, and practice the procedures.

2. Consider electrical/power facilities. Review electrical routing and distribution of power. Review the means of measuring voltage and frequency on a steady-state or transient basis. Determine whether operators know how to measure electrical power and can determine both normal and abnormal states. Define electrical and power requirements for the new application hardware and software. Determine power sufficiency for the computing environment envisioned.

Correct any deficiencies before any equipment is delivered. For instance, install an uninterruptible power supply (UPS) if warranted by frequent power fluctuations or other vulnerabilities.

3. Review air-conditioning systems and determine environmental monitoring and control mechanisms. Evaluate the 'housekeeping' functions of the maintenance staff.

Correct any deficiencies before any equipment is delivered. For instance, make sure the maintenance staff cleans stairwells and closets, uses fireproof waste containers, and does not use chemicals near computer equipment.

4. Determine the capability of the facility to withstand natural hazards such as earthquakes, high winds, and storms. Evaluate the facility's water damage protection and the facility's bomb threat reaction procedures.

Design the facility without external windows and with construction to withstand most threats. To minimize bomb and terrorist threats, remove identifying signs, place equipment in rooms without windows, and do not share facilities. To minimize possible storm damage, do not place the facility in a flood zone or on a fault line.

5. Evaluate external perimeter access controls in terms of varied requirements for different times of day, week, and year. Determine controls over incoming and outgoing materials. Evaluate access authorization rules, identification criteria, and physical access controls.

Plan the security system to include perimeter lights, authorization cards, physical security access, etc. as required to minimize the potential from these threats. Establish procedures for accepting, shipping, and disposing of goods and materials. For instance, shred confidential reports before disposal. Only accept goods for which a purchase order is available.

6. Evaluate the reliability and potential damage from everyday use of terminals and remote equipment from unauthorized employees.

Plan physical locking of equipment, backup copies of data, reports, etc. to minimize potential threats. Design remote equipment to minimize the threat of down-loaded data from the central database except by authorized users. Usually this is done by having PCs without any disk drives as terminal devices.

7. Evaluate the potential damage from unauthorized access to data and programs.

Protect programs and data against unauthorized alteration and access.


8. Evaluate the potential damage to the database from unwitting errors of authorized employees.

Design the application to minimize accidental errors and to be fault tolerant (i.e., recovers from any casual errors).

In general, we consider internal and external physical environment, plus adequacy of data and program access controls. Security evaluation is a common enough event in many organizations that checklists of items for security review are available. 3

An example of general topics in such checklists follows:

Physical Environment
    Fire fighting procedures
    Housekeeping and construction
    Emergency exits
    Portable fire extinguisher location and accessibility
    Smoke detectors located above, under, and in middle of floor areas
    Automatic fire suppression system

Electrical Power
    Power adequacy and monitoring
    Inspection, maintenance, safety
    Redundancy and backup
    Uninterruptible power supply
    Personnel training

Environment
    Air-conditioning and humidity control systems
    Lighting
    Monitoring and control
    Housekeeping

Computer Facility Protection
    Building construction and location
    Water damage exposure
    Protection from damage or tampering with building support facilities
    Building aperture protection
    Bomb threat and civil disorder

Physical Access
    Asset vulnerability
    Controls addressing accessibility
        Perimeter
        Building
        Sensitive offices
        Media storage
        Computer area
        Computer terminal equipment
        Computer and telecommunications cable

3 Two IBM-user organizations, GUIDE and SHARE, both have active disaster recovery and security control groups that issue guidelines, checklists, and tutorials on the topic.

An example of a detailed checklist for building access is provided next.

Facility type: Mainframe, LAN, PC, RJE, Remote, Communications

1. Are entrances controlled by
    __ locking devices
    __ guard force
    __ automated card-key system
    __ anti-intrusion devices
    __ sign-in/out logs
    __ photo badge system
    __ closed circuit TV
    __ other ____________

2. Are controls in effect 24 hours per day? If not, why?

3. Are unguarded doors
    __ kept locked (Good)
    __ key-controlled (Better with above)
    __ alarmed (Best with both of above)

4. If guard force, is it
    __ trained (Good)
    __ exercised (Better)
    __ armed

5. Are visitors required to
    __ sign in and out
    __ be escorted
    __ wear distinctive badges
    __ undergo package inspection

6. If building is shared, has security been
    __ discussed (Good)
    __ coordinated (Better)
    __ formalized (Best)

7. Sensitive office areas, media storage, and computer areas
    __ Does access authority for each area require management review?
    Is access controlled by
        __ locking devices
        __ guard force
        __ automated card-key system
        __ anti-intrusion devices
        __ sign-in/out logs
        __ photo badge system
        __ closed circuit TV
        __ other ____________
    __ Are unique badges required?
    __ Do employees challenge unidentified strangers?

8. Control Mechanisms
    __ Do signs designate control/restricted areas?
    If locks are used
        __ is key issuance controlled?
        __ are keys changed periodically?

9. Administration
    __ Does management insist on strict adherence to access procedures?
    Are individuals designated responsibility for access control at various control points
        __ authorizing visitor entry
        __ establishing and maintaining policy, procedures, and authorization lists
        __ compliance auditing
        __ follow-up on violations

The probability of total hardware and software loss is low in a normal environment. In fact, the probability of occurrence of a destructive event is inversely related to the magnitude of the event. That is, the threat from terrorist attack might be minuscule, but the damage from one might be total. Each type of threat should be considered and assigned a current probability of occurrence. High-probability threats are then used to define a plan to minimize that probability. If the company business is vulnerable to bomb threats, for instance, buildings without external glass and without company signs are more anonymous and less vulnerable. Having all facilities locked at all times, with a specific security system for authorizing employees and screening visitors, reduces vulnerability even further.

The major vulnerability is not related to the physical plant in most cases; it is from connections to computer networks. The only guaranteed security against telecommunications invasion is to have all computers as stand-alone or as a closed network with no outside access capability. As soon as any computer, or network, allows external access, it is vulnerable to invasion. There are no exceptions, contrary to what the local press might have you believe. Data and program access security protection reduces the risk of a casual break-in to an application. Monitoring all accesses by date, time, and person further reduces the risk because it enables detection of intruders. Encrypting password files, data files, and program code files further reduces the risks; it also makes authorized user access more complex and takes valuable CPU cycles.

The most common security in an application is to protect against unwanted data and program access. Data access can be limited to an entire physical file, logical records, or even individual data items. Possible functions against data are read only, read/write, or write only. Users and IS developers consider each function and the data being manipulated to define classes of users and their allowable actions. Allowable actions are to create, update, delete, and retrieve data. A hierarchy of access rights is built to identify, by data item, which actions are allowed by which class of users. A scheme for implementing the access restrictions is designed for the application.

Backup and recovery go hand-in-hand to provide correction of errors because of security inadequacies. A backup is an extra copy of some or all of the data and software, made specifically to provide recovery in the event of some disaster. Recovery is the process of restoring a previous version of data or application software to active use following some damage or loss of the previously active copy.

Research by IBM and others has shown that companies go out of business within six months of a disaster when no backup copies of computer data and programs are kept. In providing for major disasters, such as tornadoes, off-site storage, the storing of backup copies at a distant site, is an integral part of guaranteeing recoverability. Off-site storage is usually 200+ miles away from the computer site, far enough to minimize the possibility of the off-site facility also being damaged. Old salt mines and other clean, underground, environmentally stable facilities are frequently used for off-site storage.

The disasters of concern in recovery design are user error, unauthorized change of data, software bugs, DBMS failure, hardware failure, or loss of facility. All these problems compromise the integrity of the data. The most difficult aspect of recovery from the first three errors is error detection. If a data change is wrong but contains legal characters, such as $10,000 instead of $1,000 as a deposit, the only detection will come from audit controls. If a data change is wrong because it contains illegal characters, the application must be programmed to detect the error and allow the user to fix it. Some types of errors, such as alteration of a deposit to a bank account or alteration of a payment to a customer, should also have some special printout or supervisory approval required as part of the application design to assist the user in detecting problems and in monitoring the correction process. DBMS software frequently allows transaction logging, logging of before and after images of database changes, and assisted recovery from the logs for detected errors.

DBMS failure should be detected by the DBMS and the bad transaction should automatically be 'rolled-back' to the original state. If a DBMS does not have a 'commit/roll-back' capability, it should not be used for any critical applications or applications that provide legal, fiduciary, or financial processing compliance. Commit management software monitors the execution of all database actions relating to a user transaction. If the database actions are all successful, the transaction is 'committed' and considered complete. If the database actions are not all successful, the commit manager issues a roll-back request which restores the database to its previous state before the transaction began, and the transaction is aborted. Without commit and roll-back capabilities, partial transactions might compromise database integrity.

Other data and software backup procedures are either full or incremental. A full backup is a copy of the entire database or software library. An incremental backup is a copy of only changed portions of the database or library. A week's worth of backups is maintained and rotated into reuse after, for example, the fifth day. To minimize the time and money allocated to backup, incremental procedures are most common. A full backup is taken once each week with incremental backups taken daily. An active database would be completely backed-up daily with one copy on-site for immediate use in the event of a problem. Regardless of backup strategy, an extra copy of the database is created at least once a week for off-site storage.

The extensiveness of backup (and recoverability) is determined by assessing the risk of not having the data or software for different periods (see Table 10-4). The less the tolerance for loss of access, the more money and more elaborate the design of the backup procedures should be. The severity of lost access time varies, depending on the availability of human experts to do work manually and the criticality of the application. In general, the longer a work area has been automated, the less likely manual procedures can be used to replace an application, and the less time the application can be lost without severe consequences. The less important an application is to the continuance of an organization as an on-going business, the less critical the application is for recovery design. An application for ordering food for a cafeteria, for instance, is not critical if the company is an oil company but is critical if the company is a restaurant.

TABLE 10-4 Backup Design Guidelines for Different Periods of Loss

Length of Loss        Type of Backup

1 Week or longer      Weekly Full with Off-site storage
1 Day                 Above + Daily Incremental/Full
1 Hour                Above + 1 or more types of DBMS Logging
15 Minutes or less    Above + All DBMS Logging Capabilities:
                      Transaction, Pre-Update and Post-Update Logs
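The commit and roll-back behavior described in this section can be sketched with Python's built-in sqlite3 module. This is only an illustration, not the DBMS planned for any application in this chapter: the `account` table, the `transfer` function, and the balances are hypothetical. The point is that the two updates either both take effect or neither does.

```python
import sqlite3


def make_db():
    """Create a small in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def transfer(conn, from_id, to_id, amount):
    """Move `amount` between accounts as one all-or-nothing transaction."""
    try:
        # Used as a context manager, the connection commits if the block
        # finishes normally and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, from_id),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (from_id,)
            ).fetchone()
            if balance < 0:
                # Abort: the debit above is undone by the roll-back.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, to_id),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return {account id: balance} for inspection."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

A failed transfer leaves both balances exactly as they were before the transaction began, which is the integrity guarantee the text requires of a DBMS used for critical applications.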

To define backup requirements, then, you first define the criticality of the application to the organization, and the length of time before lost access becomes intolerable. Based on those estimates, a backup strategy is selected. If the delay until recovery can be a week or more, only weekly full backups with off-site storage are required. If the delay until recovery can be one day or less, then, in addition to weekly backups, daily backups should be done. If the recovery delay can be only an hour, the two previous methods should be supplemented with one or more types of DBMS logging schemes. Finally, if a 15-minute recovery delay is desired, all types of DBMS logging, plus daily and weekly backups should be done.
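The selection rule above can be written as a small lookup. The following is a minimal sketch, assuming the thresholds of Table 10-4; the function name `backup_plan` and the wording of the returned strings are illustrative, and the treatment of tolerances between one day and one week (weekly backups only) is an assumption the text does not spell out.

```python
from datetime import timedelta


def backup_plan(tolerable_outage: timedelta) -> list[str]:
    """Return the cumulative backup measures for a tolerable loss of access."""
    # Every application gets at least the weekly full backup.
    plan = ["weekly full backup with off-site storage"]
    if tolerable_outage <= timedelta(days=1):
        plan.append("daily incremental or full backup")
    if tolerable_outage <= timedelta(hours=1):
        plan.append("one or more types of DBMS logging")
    if tolerable_outage <= timedelta(minutes=15):
        # The strictest tier replaces partial logging with all log types.
        plan[-1] = "all DBMS logging: transaction, pre-update, and post-update logs"
    return plan
```

For example, `backup_plan(timedelta(minutes=15))` returns the weekly backup, the daily backup, and the full set of DBMS logs, which is the combination the last row of Table 10-4 calls for.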

Last, we consider audit controls which provide a record of access and modification, and prove transaction processing for legal, fiduciary responsibility, or stakeholder responsibility reasons. Audit controls allow detection and correction of error conditions for data or processing. As new technologies, greater dependence on ITs, and interrelated systems that are vulnerable to telecommunications attacks all increase, business emphasis on controls also increases. In manual systems of work, control points are easily identified; procedures are observable, errors can be reconstructed, and controls applied by humans. In automated applications, the application is the solution, nothing is directly observable, and complexity of functions makes identification of control points increasingly complex.

A control point is a location (logical or physical) in a procedure (automated or manual) where the possibility of errors exists. Errors might be lack of proper authorization, misrecording of a transaction, illegal access to assets, or differences between actual and recorded data. Control points are identified during design because the entire application's requirements should be known in order to define the most appropriate control points. Controls are specified by designers in the form of requirements for program validation. For instance, controls for the validity of expense checks might be as follows:

1. Only valid, preauthorized checks can be written.

2. Check amounts may not exceed authorized dollar amounts.

3. Checks may not exceed the expense report total amount.
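The three controls listed above translate directly into program validation. The following is a minimal sketch; the `ExpenseCheck` record, its field names, and the set of preauthorized check numbers are hypothetical and stand in for whatever the real application would store.

```python
from dataclasses import dataclass


@dataclass
class ExpenseCheck:
    check_number: str
    amount: float
    authorized_limit: float       # authorized dollar amount for this check
    expense_report_total: float   # total of the related expense report


def control_violations(check, preauthorized_numbers):
    """Return the list of controls the check violates (empty if valid)."""
    violations = []
    # Control 1: only valid, preauthorized checks can be written.
    if check.check_number not in preauthorized_numbers:
        violations.append("check is not preauthorized")
    # Control 2: amount may not exceed the authorized dollar amount.
    if check.amount > check.authorized_limit:
        violations.append("amount exceeds authorized dollar amount")
    # Control 3: amount may not exceed the expense report total.
    if check.amount > check.expense_report_total:
        violations.append("amount exceeds expense report total")
    return violations
```

Returning every violation, instead of stopping at the first, gives the user one complete list to correct.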

Application audit controls address the completeness of data, accuracy of data, authorization of data, and adequacy of the audit trail. Detection of processing errors is either through edit and validation checks in programs, or through processing of redundant data. Examples of controlled redundancy of data include double entry bookkeeping, cross footing totals and numbers, dual departmental custody of replicated critical data, transaction numbering, and primary key verification. Edit and validation rules are designed to identify all logical inconsistencies as early in the process as possible, before they are entered into the database.
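As one illustration of controlled redundancy, cross footing compares totals that were recorded independently against totals recomputed from the detail. The sketch below is an assumption about how such a check might be coded; the function name and the use of whole-number amounts (for example, cents) are illustrative choices.

```python
def cross_foot(detail, recorded_row_totals, recorded_column_totals):
    """Return a list of discrepancies; an empty list means the figures agree.

    `detail` is a list of rows of whole-number amounts. The recorded totals
    are the redundant figures entered or stored separately from the detail.
    """
    problems = []
    for i, (row, recorded) in enumerate(zip(detail, recorded_row_totals)):
        if sum(row) != recorded:
            problems.append(f"row {i}: computed {sum(row)}, recorded {recorded}")
    columns = zip(*detail)  # transpose rows into columns
    for j, (col, recorded) in enumerate(zip(columns, recorded_column_totals)):
        if sum(col) != recorded:
            problems.append(f"column {j}: computed {sum(col)}, recorded {recorded}")
    # The grand total reached across must equal the grand total reached down.
    if sum(recorded_row_totals) != sum(recorded_column_totals):
        problems.append("grand totals disagree")
    return problems
```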

ABC Video Example Security, Backup/Recovery, and Audit Plans

To design ABC's security, we first review the physical plant and recommend changes to the planned computer site to provide security. The six threats are considered, but the byword from Vic in discussing the possibility of changes is "be reasonable." So, if there is a 'reasonable' chance that a problem will occur, we will recommend a reasonable, and low cost, solution to the problem.

Moving from most to least serious, we consider the six types of threats to application security: location failure, hardware failure, DBMS failure, software failure, hacker change, and user error. For each threat, we consider the potential of occurrence for ABC, then devise a plan to minimize the potential damage. All threats and responses are summarized in Figure 10-21.

First, we review the physical plant and relate it to location and hardware failures. ABC Video is located in suburban Atlanta, Georgia, 300 miles from the ocean and 25 miles from the nearest large lake. The company is located in a mall, the Dunwoody Village, a clustering of small shops and offices in open-square buildings containing a plaza in the middle of the square. The company occupies 3200 square feet of 80' x 40' space in the southeast corner of Building A. The adjoining spaces are occupied by Cheap's Drugs and Ra-Dan Hair Salon. A schematic of the space is shown in Figure 10-20.

[Floor plan of the 80' x 40' store. Labels recoverable from the scan: Fire Exit, Front Door, Window Wall, Check-out Desk, Files, the video sections Drama, Sci Fi, Musical, Comedy, Action, Horror, Music, and Current Releases, and the neighboring Cheap's Drugs and Ra-Dan.]

FIGURE 10-20 ABC Current Physical Plant

The northeast corner of the area (abutting Ra-Dan's) contains a 12' x 16' office which contains two desks, one supply closet, and a bathroom. The office has no windows and can be locked, although it is frequently empty and unlocked. The supply closet has double doors which do not currently have a lock.

The clerk's checkout counter is near the customer doors on the south side of the building in the western corner. The counter is an 'L' shape with the entry on the short side. A fire door, equipped with an alarm bar, is located in the northwest corner of the area and opens on a short alley behind the building.

Location failure usually results from violent weather, terrorist attacks, or government takeover. The chance of violent weather is the only potential major problem in the area. Tornadoes occur in the area regularly. The expectation is that there is a 20% chance of tornado damage some time in the next 10 years (see Figure 10-21). Tornadoes also imply strong thunderstorms which are common to the area. The chance of damage from a storm is about 30% within five years to the windows, and about 65% within two years for lightning to cause electrical spikes.

The response to location threats is to provide off-site backup of all information, with the site far enough away that it is unlikely to be affected by the same storm (see Figure 10-21). Vic should investigate the possibility of closing in the window wall in the southeast side of the building to minimize storm damage. He can also install lightning rods on the roof of the building to dissipate lightning when it hits.

The next category of problems relates to the hardware selected for the rental/return application. Vendor-cited reliability is 99 years mean time between failure (MTBF) for individual components. When the components are considered as a whole, the probability of component failure is once in two years (see Figure 10-21). The current plan is to have an extra PC in the office that could be moved to the front desk if needed. A hardware service contract with a local company to provide response within 24 hours is recommended.
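The gap between a 99-year component MTBF and a failure somewhere about once in two years can be illustrated with a simple series model. The sketch below assumes independent components with constant failure rates, so that failure rates add; the text states neither this model nor the number of components, so the component count it implies is only an inference.

```python
def system_mtbf(component_mtbf_years, component_count):
    """MTBF of a system that fails when any one of its components fails.

    Assumes identical, independent components with constant failure rates,
    so the system failure rate is the sum of the component rates.
    """
    return component_mtbf_years / component_count


def implied_component_count(component_mtbf_years, system_mtbf_years):
    """Number of components that would produce the given system MTBF."""
    return component_mtbf_years / system_mtbf_years
```

Under these assumptions, `implied_component_count(99, 2)` gives 49.5, so a network of roughly 50 such components would be expected to see some component fail about once every two years.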

The planned server location is near the bathroom in the northeast corner of the area. The toilet has a history of overflows during wet spring months. Because of the way the office was constructed, the water is confined to a small area but almost always runs into the supply closet and has been as high as one foot. The probability of component failure to file server and/or disks from water due to toilet overflow is 50% in two years. The answer to this problem is simple, but expensive: Build a new area, specifically for the computer, away from the toilet area to reduce this probability to near zero. Ideally, if the windows are closed in, the office could be moved to the front of the building and the old office removed. A new enclosure for the toilet facilities could be added or the toilet could also be rebuilt in the new location with whatever precautions are needed to preclude the spring overruns.

There is another problem with the planned server location. The planned location, the supply closet, has no ventilation. If the closet doors are open, ventilation for the office is sufficient for the planned equipment, but, ideally, the server closet doors should be locked. If the doors were locked, the probability of server failure due to lack of ventilation is 50% in two years. The solutions possible are to build a new area for the server equipment, or to add ventilation to the planned area to reduce this probability to near zero. Both solutions should be presented to Vic for his decision.

Less serious problems stem from the building location. Glass windows that run along 60' of external front wall and the drop ceiling are accessible from neighboring companies. Theft and break-ins are somewhat common in the area, but the probability of a break-in is 50% in 10 years. Most burglars are looking for money, but some might maliciously tamper with the computer equipment. Therefore, the probability of computer damage during a break-in is 60% according to police estimates.
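The two estimates above can be combined to compare this threat with the others. The sketch below is illustrative arithmetic only: it assumes the 60% figure is conditional on a break-in occurring, and the annualized figure further assumes the same independent chance each year, neither of which the text states.

```python
def joint_probability(p_event, p_consequence_given_event):
    """Probability that the event occurs and also causes the consequence."""
    return p_event * p_consequence_given_event


def annualized(p_over_period, years):
    """Constant yearly probability that compounds to p_over_period."""
    return 1 - (1 - p_over_period) ** (1 / years)


# Break-in within 10 years (50%) that also damages the computer (60%).
damage_in_10_years = joint_probability(0.5, 0.6)
# Equivalent yearly break-in probability under the constant-rate assumption.
break_in_per_year = annualized(0.5, 10)
```

With the figures from the text, the chance of computer damage from a break-in over 10 years works out to about 30%, and the yearly chance of a break-in to a little under 7%.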

The recommendations to minimize theft have to address the easy access to the company through windows and ceiling. If the office remains in its current location, a security system with movement sensors in the ceiling and glass-breakage sensors on all windows should be added (whether or not the computer is installed). Long-term, Vic should investigate the possibility of closing-in some or all windows to improve security of the company.

Next, because of the location of the checkout desk at the front of the building, the ability of clerks to monitor approaches to the office is low due to limited visibility. Further, theft of tapes is possible because clerks cannot see down all aisles without moving away from the desk area. For application security, we are concerned with office access; but, as professionals, we can make recommendations that will improve Vic's ability to reduce general theft as well. An easy, but somewhat expensive solution is to move the checkout desk to the center of the floor and assign surveillance duties to clerks. Even if the desk is not moved, mirrors installed in the corners of the room would allow clerks to monitor customers' actions. Both recommendations are made with the understanding that the mirrors should be installed whether or not the desk is moved.

FIGURE 10-21 Security Review Findings and Recommendations

Finding: Location failure. Probability of tornadoes 10% in 10 years. Probability of strong storms causing damage to windows is about 15% within two years. Probability of lightning causing electrical spikes is 15% within two years.
Recommendation: Select off-site storage facility no closer than 200 miles. Investigate closing in the front windows, at least the contiguous 40 feet of windows on the southeast corner. Install lightning rods on the roof.

Finding: Hardware failure. Vendor-cited reliability is 99 years MTBF for each component. The probability of component failure is once in two years for some network component.
Recommendation: Move the extra PC in the office to the front desk if needed. A hardware service contract with a local company to provide response within 24 hours is recommended.

Finding: Hardware failure from external reasons. Planned server location is near bathroom with history of periodic overflows. Probability of component failure to file server and/or disk is 50% in two years.
Recommendation: Build a new area to reduce this probability to near zero.

Finding: Hardware failure from external reasons. Planned server location is a closet in the office area without any ventilation. Probability of server failure is 50% in two years.
Recommendation: Build a new area or add ventilation to the planned area to reduce this probability to near zero.

Finding: Hardware failure from external reasons. Current location has glass windows along 60' of external front wall and a drop ceiling accessible from neighboring companies. Probability of break-in is 30% in 10 years; probability of computer damage during a break-in is 60%.
Recommendation: If the office remains in its current location, add security system with movement sensors in the ceiling and glass-breakage sensors on all windows. Long-term, investigate the possibility of closing-in some or all windows, moving the office to the front of the building (away from plumbing).

Finding: Physical location vulnerabilities. Ability of clerks to monitor approaches to the office is low because of desk location and limited visibility.
Recommendation: Move the clerks' desk to the center of the floor and assign surveillance duties to clerks. Install mirrors in corners of room to allow monitoring of customers' actions.

Finding: DBMS failure. Vendor-stated reliability is two years MTBF. This is one of the best on the market, but each new release is unstable for at least six months.
Recommendation: Do not install latest releases until thoroughly tested using regression test package. Negotiate with vendor for data access software in event of DBMS failure. Include this software access in the vendor contract.

Finding: DBMS failure. Other reasons (e.g., electrical spike). Probability is 100% that electrical surges will occur, since they are common in the summer months. Probability of brownouts with reduced power is 30% in two years.
Recommendation: Install a surge protector on the entire ABC electrical system to accommodate spikes (cost is about $100). Install surge protectors on each individual outlet used by computer equipment to further protect the equipment since whole system protectors do not guarantee integrated chip safety in any devices. Install a limited, inexpensive, UPS to provide emergency power in event of electrical failure and for limited use during brownouts (cost about $1,000).

Finding: Software failure. Application failure due to software defects should be less than once in 15 years after the first three months. During the first three months of operation, the probability of application failure is about 75%; no more than one is expected.
Recommendation: The application is designed for 15-minute recovery of all data and programs. Loss of transactions in process will always occur with any failure; they will have to be reentered. Program problems will be fixed within one business day. Any lost transactions will be reentered free of charge by Software Engineers Unlimited.

Finding: Hacker change. Outside user access to the system should be zero since no telecommunications capabilities are planned. However, the untended server and occasional lack of clerks at the desk area may provide a local hacker enough time to access and modify the system.
Recommendation: Install security precautions listed above: security mirrors, move desk, assign clerks monitoring responsibility. Always lock office door; always lock file server door.

Finding: User error. The use of computer novices as clerks guarantees user error. Probability is 100% within one week of system operation.
Recommendation: Restrict data and process access to those required to perform each job. Design application to withstand any casual error: hitting any key on keyboard, scanning any bar code type, etc. A report of such errors can be created and printed on demand by Vic to allow retraining (or other action) for repeated errors by one user. Application design also includes validation of all fields such that only valid data can be in the database. On-demand reports of new customer and video entries will allow Vic to monitor the typing skills of employees. New-hire orientation and new-hire mentors should be used to stress the importance of data accuracy.

After physical issues are evaluated, we next look at software security and reliability. Vendor-stated reliability for the planned DBMS is two years MTBF. This SQL software is one of the best on the market, but each new release is unstable for at least six months, and those instability figures are not in the MTBF estimates. The company routinely disclaims any responsibility for new release errors and loss of data or processing to using companies. The DBMS does stabilize and is usually reliable after a six-month trial period for each new release. The simple solution to this problem is that unless a feature of a new release is needed, no change from the current stable version should be made. In addition, no software, whether vendor package or customer designed, should be allowed into production use until it is thoroughly tested using the application regression test package that will accompany the system.

A secondary problem with DBMS errors is that, if the DBMS fails, there is no other way to access the data. Part of the contract negotiation should include discussion of data access software for the vendor to provide in the event of DBMS failure. Other companies have successfully received such commitments from this vendor, although it is not volunteered. Such data access software should be included in the vendor contract.

Additional problems that might cause DBMS failure are electrical surges and brownouts due to uneven service in the area. Surges generally occur during the summer months when equipment comes on-line to service air-conditioning in the area. The probability of surges is 100% based on local electrical company history. The probability of brownouts with reduced power is 30% within two years, also using electrical history as the basis for the estimate. Problems from both causes can be minimized by a surge protector on the entire ABC electrical system which shuts down power if a particularly large surge is experienced. In addition, one surge protector for each outlet should be installed to further protect the equipment since whole system protectors do not guarantee integrated chip safety. Finally, a limited, inexpensive, uninterruptible power supply (UPS) should be installed to provide emergency power in the event of electrical failure and for limited use during brownouts to supplement reduced electricity from the local provider.

We consider application software failures next. Failure due to software defects should be less than once in 15 years after the first three months of operational use. During the first three months of operation, the probability of application failure is about 75%; no more than one is expected. The application is designed for 15-minute recovery of all data and programs. Loss of partial transactions will always occur with any failure; they will have to be reentered. Program problems will be fixed within one business day. Any lost transactions will be reentered free of charge by Software Engineers Unlimited (Mary's company).

Outside user access to the system should be zero since no telecommunications capabilities are planned. However, the untended server and occasional lack of clerks at the desk area may provide a local hacker enough time to access and modify the system. If the physical security precautions recommended above are provided, such hacker break-ins would be nearly impossible. Therefore, at a minimum the precautions for security mirrors, assigning clerks monitoring responsibility, and locking the office and file server doors should be implemented (see Figure 10-21).

Finally, the use of computer novices as clerks guarantees user errors. The probability of user errors is 100% within one week of system operation. To prevent any application or DBMS damage from user errors (inadvertent or otherwise), the first line of defense is to restrict what users may do and the data they may access as a way to prevent errors. Each job should be defined and a security access scheme developed to allow access to all processes and data required for the job, and nothing more.

Second, the application should withstand any casual error: hitting any key on keyboard, scanning any bar code type, and so on. If required, a report of such errors can be created and printed on demand by Vic to allow retraining (or other action) for repeated errors by one user. Application design also includes validation of all fields such that only valid data can be in the database. Such checks are not possible for alphanumeric data, however, so on-demand reports of new customer and video entries will allow Vic to monitor the typing skills of employees.

Application training will use computer-based training (CBT) in entering application data. The CBT will use simulated transactions and should minimize the user errors if taken seriously by clerks. New-hire orientation should include discussion of the importance of accuracy of work, especially with the computer. Further, new hires should be assigned a more senior 'mentor' for learning the application after training.

After disaster recovery is planned, application security must be developed. From the recovery plan, we know that each job should be evaluated to determine the data and processing requirements of the position. ABC jobs evaluated include clerks, owner, and accountant. The owner should be allowed to do any functions on the application and system that he desires. However, many owners do not want to become the chief user of the computer. When asked, Vic's reaction is, "Does this mean I can never take a vacation? Do I have to be here in the morning and at night? If so, define a new position that can do most of my functions, just not delete data!" So the position of chief clerk is also considered.

[Hierarchy diagram: Owner at the top, Chief Clerk beneath Owner, and Clerk and Accountant beneath Chief Clerk. Chief clerk has a subset of owner rights. Clerk and accountant each have different subsets of chief clerk rights.]

FIGURE 10-22 ABC Data Security Hierarchy of Access Rights

The owner should be the lead person and still be allowed to perform all functions, access all data, and provide security password changes, and so on (see Figure 10-22). The chief clerk, according to Vic's wishes, has all of those functions except deleting information (see Table 10-5). If there were sensitive data in the system, more discussion of the chief clerk's duties and access rights might take place. The clerks have access rights to rent and return videos, and to create and update customers and videos. Finally, the accountant has limited read-only access to several files.
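The hierarchy just described can be sketched as a lookup of allowed actions by user class and file. This is a hedged illustration, not ABC's actual scheme: the file names, the `is_allowed` and `is_subset` helpers, the clerk's retrieve right, and the particular files given to the accountant are assumptions made for the example.

```python
FILES = ("customer", "video", "open_rentals", "video_history", "customer_history")
ACTIONS = frozenset({"create", "retrieve", "update", "delete"})

# Owner: every action on every file.
OWNER = {f: ACTIONS for f in FILES}
# Chief clerk: everything the owner can do except delete.
CHIEF_CLERK = {f: acts - {"delete"} for f, acts in OWNER.items()}
# Clerk: rent and return videos, create and update customers and videos
# (retrieve is assumed, since an update needs to read the record first).
CLERK = {
    "customer": frozenset({"create", "retrieve", "update"}),
    "video": frozenset({"create", "retrieve", "update"}),
    "open_rentals": frozenset({"create", "retrieve", "update"}),
}
# Accountant: read-only access to a few files (this choice is hypothetical).
ACCOUNTANT = {
    "open_rentals": frozenset({"retrieve"}),
    "customer_history": frozenset({"retrieve"}),
}

RIGHTS = {
    "owner": OWNER,
    "chief_clerk": CHIEF_CLERK,
    "clerk": CLERK,
    "accountant": ACCOUNTANT,
}


def is_allowed(user_class, file, action):
    """True if the user class may perform the action on the file."""
    return action in RIGHTS.get(user_class, {}).get(file, frozenset())


def is_subset(lower, upper):
    """True if every right held by `lower` is also held by `upper`."""
    return all(acts <= upper.get(f, frozenset()) for f, acts in lower.items())
```

The `is_subset` helper can be used to confirm that the rights as coded really form the hierarchy of Figure 10-22, with each class holding a subset of the class above it.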

Backup and recovery are considered next. First we decide the maximum tolerable time loss for a computer outage, then select the backup scheme that best fits the time loss maximum. The rental/return application is critical to ABC's ability to conduct business. Vic knows that when he moves all production work to the computer that the clerks will quickly forget the manual way of conducting business. Also, we know that if the databases are not kept up to date, the system is next to useless because the clerks won't know whether to look at manual or automated files for returns, fees, and so on. Therefore, the maximum outage should be less than 15 minutes with recovery of all fully complete transactions. Even at 15 minutes, if an outage were to occur during a peak time, as many as four transactions could need to be reentered and as many as 15-20 transactions would be queued for entry upon system return to production. Ideally, the system should be functional during all business hours.

The recovery requirements imply the most backup protection possible. From Table 10-4, a 15-minute recovery requirement means the use of weekly full backups with off-site storage, daily backups, and logging for transactions, preupdate data items and postupdate data items. Therefore, these are the backup and recovery requirements.

Requirements: Application and system availability during all store open hours, with no more than 15 minutes of down-time from failures of any type.

Backups: Transaction, preupdate, and postupdate logs

Transaction logs maintained one week until weekly backups are verified. Pre- and postupdate logs maintained for 72 hours.

Daily complete database backups with on-site copy plus off-site storage at owner's home.


TABLE 10-5 ABC User Classes and Access Rights

File/Function Owner Chief Clerk Clerk Accountant

Customer Create X X Retrieve X X Update X X Delete X

Video Create X X Retrieve X X Update X X Delete X X

Open Rentals Create X X Retrieve X X Update X X Delete X

Video History Create Retrieve X X Update X

Customer History Create Retrieve X X Update X X

Startup X X

Shutdown X X

End Of Day Create X X Retrieve X X Delete X X

Initiate End of Month Process X X

Paper copy of transactions maintained for one calendar year in accountant's office.

Weekly complete disk backups with on-site copy plus off-site storage at owner's home and a third copy at

Disaster Prevention Storage 321 Maple Ave. Somewhere, OK (618) 123-1234

X X X X

X X

If ABC's application processed millions of transactions each day, we would do further analysis of the cost of backup and recovery, but here that is not necessary.
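The retention rules in the backup requirements can be written down as data, which makes them easy to check. The following is a minimal sketch only: the dictionary keys and the `may_discard` helper are illustrative names, not part of ABC's design, and one calendar year is approximated as 365 days.

```python
from datetime import timedelta

# Retention periods taken from the backup requirements.
# Keys and helper names are illustrative assumptions.
RETENTION = {
    "transaction_log": timedelta(weeks=1),
    "pre_post_update_log": timedelta(hours=72),
    "paper_transaction_copy": timedelta(days=365),  # approximates one calendar year
}

MAX_OUTAGE = timedelta(minutes=15)  # maximum tolerable down-time


def may_discard(kind: str, age: timedelta, weekly_backup_verified: bool = True) -> bool:
    """Return True when an item of the given kind is past its retention period."""
    if kind == "transaction_log" and not weekly_backup_verified:
        # Transaction logs are kept until the weekly backup is verified.
        return False
    return age > RETENTION[kind]
```

For example, `may_discard("pre_post_update_log", timedelta(hours=73))` is true, while a ten-day-old transaction log is kept as long as the weekly backup has not been verified.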

Finally, we need to decide about audit controls as summarized here:

Data accuracy and completeness: All edit checks possible will be used as data are entered to prevent errors from entering the system. Sight verification by clerks and customers will be used to verify alphanumeric information.

Rental transaction accuracy can be verified by customers' signing for all monetary transactions. In case of discrepancy, transaction logs and historical paper copies of transactions can be consulted.

Data authorization: Security controls will provide sufficient authorization for data processing. Only the owner is authorized to perform any delete functions on customer, video, and open rental data. No delete functions for history records are provided.

User ID, date, and time of user to last change data will be maintained in Customer, Video, and Open Rental databases.

Audit trail: A paper trail of receipts should be maintained by the accountant for each calendar year. This is a sufficient trail since ABC is a cash business without any accruals.

Nonmonetary transactions (e.g., return of on-time tapes) have no paper audit trail. If a question about a tape return arises, the database can be checked to verify the information.

All edit checks possible should be used as data are entered to prevent errors from entering the system. To ensure complete editing, we review the data dictionary to check that all nonalphanumeric fields have edit and validation criteria.

On names, addresses, and other alphanumeric fields, little verification can be performed automatically. What cannot be done automatically should be done manually. Procedures for operators should be developed to document clerical 'sight verification' and customer verification standards. An example of such a procedure that would be part of the user manual is shown as Figure 10-23. Sight verification means that the person entering information into the computer reads the monitor to verify the accuracy of the information he or she entered. The user, then, is responsible for data integrity of items that cannot be computer verified.

These paragraphs would be part of the user procedures:

Customer Maintenance

When customers are being added to the system, the clerk should read back all information as shown on the screen to verify its accuracy, as the computer cannot verify mixed alphabetic and numeric information.

Video Maintenance

When videos are being added to the system, the clerk should compare all information shown on the screen with the original printed information to verify its accuracy, as the computer cannot verify mixed alphabetic and numeric information.

Rent/Return Processing

Users should be encouraged to check the information on the printed rental before they sign it to verify that it is correct.

FIGURE 10-23 User Sight Verification Procedure

Rental transaction accuracy will be verified by customers' signing for all monetary transactions. In case of discrepancy, transaction logs and historical paper copies of transactions can be consulted. If many discrepancies persist (more than one per week), a special history file of transactions can be added to the application to speed the transaction look-up process.

Security controls can be designed to provide sufficient authorization for data processing. The security scheme should be developed to serve two goals: to provide data access and to provide function access to those who need it. To require several layers of security checking for a simple application does not make sense and wastes clerical time. So, once again the KISS (Keep It Simple, Stupid) method of one security access scheme is best. User ID, date, and time of user to last change data will be maintained in Customer, Video, and Open Rental databases. These attributes are added to affected database relations.

To minimize the extent to which damage can be done to data, only ABC's owner should be authorized to perform any delete functions on customer, video, and open rental data. No automated delete functions for history records are provided without circumventing the application completely. Changes to files will always be somewhat traceable because the historical record will reflect activity. If unauthorized file changes are thought to be a problem, Vic can always request a browsing capability for any of the transaction logs to check on problems.
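The two controls just described, stamping each change with user ID, date, and time and restricting deletes to the owner, can be sketched in a few lines. This is only an illustration under assumed names: `Record`, `update`, `delete`, and the role string `"owner"` are hypothetical and do not come from ABC's data dictionary.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Only the owner may delete customer, video, and open rental data (from the text).
DELETE_ROLES = {"owner"}


@dataclass
class Record:
    """A row in the Customer, Video, or Open Rental data (fields are illustrative)."""
    key: str
    data: dict
    last_changed_by: Optional[str] = None
    last_changed_at: Optional[datetime] = None


def update(record: Record, user_id: str, changes: dict) -> None:
    """Apply changes and stamp the record with who changed it and when."""
    record.data.update(changes)
    record.last_changed_by = user_id
    record.last_changed_at = datetime.now()


def delete(table: dict, key: str, role: str) -> None:
    """Remove a record, refusing any role that is not authorized to delete."""
    if role not in DELETE_ROLES:
        raise PermissionError(f"role {role!r} may not delete records")
    del table[key]
```

Keeping the stamp on the record itself, rather than in a separate log, matches the statement that these attributes are added to the affected relations.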

A manual audit trail should be used for ABC to conserve computer resources. All monetary transactions can be reconstructed through a paper trail of receipts maintained by the accountant. The receipt form is a two-ply preprinted form on which all monetary transactions are printed. For rentals, customers sign the form as proof of rental responsibility. Paper records should be maintained for one calendar year in the accountant's office; this is sufficient since ABC is a cash business without any accruals. If a tape audit trail becomes necessary at some time in the future, it can be added to the system easily.

Nonmonetary transactions (e.g., return of on-time tapes) have no paper audit trail. If a question about a tape return arises, the user ID, date, and time of the return will be on the database and can be checked to verify the information.

Develop Action Diagram

Guidelines for Developing an Action Diagram

An action diagram is a diagram that shows procedural structure and processing details for an application. It is built from the process hierarchy and process data flow diagram developed during IE analysis (see Figure 9-45 for ABC's PDFD). The diagram uses only structured programming constructs to convert the PDFD into a hierarchy of processes that can be divided into programs and modules. First we discuss the components of the diagram, then we discuss how to build an action diagram from the process hierarchy and PDFD.

Action diagrams use different bracket structures to depict the code elements in an application. Basic structured programming tenets (iteration, selection, and sequence) are all accommodated with several variations provided. As Figure 10-24 shows, a sequence bracket is a simple bracket. It is optionally identified with a process name and ended with the term ENDPROC to represent a program module consisting of a sequence of instructions.

When a module is designed and detailed in another document or diagram, a rounded rectangle containing the module name is drawn between the brackets (see Figure 10-25). When the module is not yet defined in detail, a rounded rectangle with question marks down the right side is shown. Reusable modules are drawn with a vertical bar to represent reuse.

PROC Process Name
    The sequence of instructions is entered within the sequential brackets.
ENDPROC

FIGURE 10-24 Simple Sequence Bracket Format

[Figure content: a bracket labeled Module Name containing sequential instructions and boxes for an embedded, defined module name; an embedded, undefined module name; and a reused module name. Adapted from Martin (1990), p. 543.]

FIGURE 10-25 Module Designation Format

Selection of modules from the PDFD is shown by a selection bracket (also called a condition bracket) which begins with an IF condition and ends with the term ENDIF (see Figure 10-26a). If the conditional statement has multiple conditions, two other options are allowed. The condition can be stated as an IF statement with one or more ELSE conditions (see Figure 10-26b), or a condition can be stated as a mutually exclusive selection list as in Figure 10-26c; this selection list is eventually translated into an IF statement.
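To make the last point concrete, here is one way a mutually exclusive selection list might look once it is translated into an IF statement. The values of A follow the example in Figure 10-26c; the function name and the action strings are placeholders invented for this sketch.

```python
def select_action(a: int) -> str:
    """Mutually exclusive selection list written as an IF / ELSE IF chain."""
    if a == 1:
        return "action for A=1"
    elif a == 3:
        return "action for A=3"
    elif a == 4:
        return "action for A=4"
    elif a == 5:
        return "action for A=5"
    # No condition matched: nothing in the list is selected.
    return "no action"
```

Because the conditions are mutually exclusive, the order of the branches does not change the result; at most one branch can apply for any value of A.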

Repetition is shown with a double-bracketed figure. The repetition bracket name begins with either DO or DO WHILE + condition (see Figure 10-27). The bracket ends with either an UNTIL + condition (Figure 10-27a), or ENDDO (Figure 10-27b). DO WHILE implies that the condition is checked before the conditional statements are executed, so DO WHILE processing may occur zero times. Conversely, DO UNTIL implies that the condition is checked after the lower statements are executed, so DO UNTIL processes occur at least once.
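The difference between the two repetition forms is easiest to see in code. Python has no built-in DO UNTIL, so the sketch below emulates it with a loop that tests at the bottom; the helper names `do_while` and `do_until` are ours, chosen only to mirror the diagram terms.

```python
def do_while(condition, action):
    """Pre-test loop: the condition is checked before each pass (zero or more passes)."""
    passes = 0
    while condition():
        action()
        passes += 1
    return passes


def do_until(condition, action):
    """Post-test loop: the body runs, then the condition is checked (one or more passes)."""
    passes = 0
    while True:
        action()
        passes += 1
        if condition():
            break
    return passes
```

With a condition that is false from the start, `do_while` performs no passes. With a condition that is already true, `do_until` still performs exactly one pass.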

Miscellaneous items include goto, exit, and concurrency identification. A goto is shown by an arrow leaving one level and pointing to the line for the destination level with a goto statement and destination at the right of the arrow (Figure 10-28a).

An exit is shown as an arrow leaving one level and pointing to the line for the destination level with the word exit at the right of the arrow (Figure 10-28b). Unless an exit destination is named with the exit, exit always means that the calling module is the exit destination. For example, if Rent/Return calls CustomerAdd, the exit from CustomerAdd returns to Rent/Return. Further, if CustomerMaint calls CustomerAdd, the exit from CustomerAdd returns to CustomerMaint. That is, the calling module, regardless of what it is, is the return module.

Processes can be sequential or concurrent. Concurrent processes execute at the same time. There are two types of concurrent processes: independent and dependent. Independent concurrent processes are those which execute at the same time but do not synchronize their process completion. For example, when Process Payment and Compute Change is complete in ABC's application, printing and file updates of several types could all be concurrent. If there is no checking on the success of their completions with subsequent action for any failures, these processes are independent. Independent concurrency is shown on the diagram by an arc which connects the module brackets (Figure 10-28). Dependent concurrent processes are those which must be synchronized to coordinate further application actions. Dependent concurrency is shown on the diagram by an asterisk (or some other special character) on the arc connecting the modules (Figure 10-28d). Dependent concurrent processes require the development of a synchronization module, if not already in the application, to ensure complete, accurate processing.
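The distinction between independent and dependent concurrency can be illustrated with threads. In this hedged sketch, `print_receipt` and `update_files` are stand-in functions invented for the example, and `run_dependent` plays the role of a very small synchronization module; a real application would need its own failure handling.

```python
from concurrent.futures import ThreadPoolExecutor


def print_receipt() -> bool:
    return True  # placeholder for real printing


def update_files() -> bool:
    return True  # placeholder for real file updates


def run_independent(tasks):
    """Start the tasks together; nothing waits for or inspects their outcomes."""
    pool = ThreadPoolExecutor()
    for task in tasks:
        pool.submit(task)
    pool.shutdown(wait=False)


def run_dependent(tasks) -> bool:
    """Start the tasks together, then synchronize: wait for all and check each result."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(task) for task in tasks]
        results = [f.result() for f in futures]  # blocks until each task finishes
    return all(results)
```

The only structural difference is the wait-and-check step, which is the extra work that dependent concurrency adds to the design.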

a. Simple IF Condition

    IF condition
        action sequence of instructions
    ENDIF

b. Multiple IF Conditions

    IF condition
        ...
    else IF condition
        ...
    else IF condition
        ...
    ENDIF

c. Multiple IF Conditions using case logic

    IF condition 1
       condition 2
       condition 3
       ...
       condition n
    ENDIF

    IF A=1
       A=3
       A=4
       A=5
    ENDIF

FIGURE 10-26 Conditional Bracket Design Formats

Now that you know the bracket symbols used to define action diagrams, we move to discuss the steps to developing one. The steps to define an action diagram are to translate processes into levels of action using structured constructs, design modules, perform reusability analysis, decide module timing, add data to the diagram, and optionally, add screens to the diagram.

The first step is to translate processes into levels of action. The first-level diagram is developed from the process hierarchy diagram to identify the major activities being performed by the application. The activities themselves are added to the diagram as they are written on the hierarchy diagram. The structured constructs should identify sequence and any selection or conditional processing relating to the activities. Most often, when the diagram is begun at the activity level, the alternative processes are mutually exclusive. When the diagram starts at the process level (Figure 10-29), any construct might apply. The example shows a mutually exclusive selection from among the three alternatives.

Now we shift to the process data flow diagram (Figure 10-30) to add process details to the action diagram. Remember that the processes on the PDFD must match exactly the processes on the hierarchic decomposition diagram. We use the PDFD to translate the structural relationships between the processes correctly. The structural relationships are on the PDFD and not on the decomposition; they refer to the sequential, conditional, and repetitive relationships between processes.

a. Perform actions zero to n times based on condition.

    DO WHILE condition
        ...
    ENDDO

b. Perform actions one to n times based on condition.

    DO UNTIL condition
        ...
    ENDDO

FIGURE 10-27 Repetition Bracket Design Formats

In developing the second-level action diagram, we first add the processes, in sequence, from the PDFD. Then the brackets are drawn to reflect the sequential, conditional, and repetitive structural relationships. In the example (Figure 10-31), the main processes are Identify Item and Vendor, Sort by Vendor and Item, Get Price, Create Order, and Mail Order. Between these processes, there are two repetitive blocks: one based on New Releases, and the other based on Vendors (see Figure 10-32). We identify the repetitive blocks by looking at the circular loops and the conditions for repeating the process(es). Notice that the Sort is not included in either loop.

a. GOTO bracket format

    GOTO Main Menu

b. Exit bracket format

    Exit to Error Routine
    Exit to Calling Module

c. Concurrent processes bracket format

FIGURE 10-28 Miscellaneous Bracket Design Formats

Process Hierarchy

Analyze Business
Create Purchase Order
    Identify Item & Vendor
    Sort by Vendor, Item
    Get Price
    Create Order
    Mail Order
    File Order by Vendor
Monitor Purchase Order

First-Level Action Diagram

Purchasing Application
    Analyze Business
    Create Purchase Order
    Monitor Purchase Order
END Purchasing Procedure

FIGURE 10-29 Process Hierarchy and First-Level Action Diagram

Next, evaluate each process grouping. Identify Item is alone within its loop. Sort is also alone. The last three processes are together and are analyzed. The processes are sequential but, according to the PDFD, they are not all processed in sequence. If the vendor has not changed from the previous item, we Get Price and Create Order. When the Vendor changes, we File and Mail the order. These statements from the PDFD translate into the IF conditional statement in the action diagram as shown in Figure 10-33.

The diagram is correct in interpreting the PDFD, but it is incomplete as a program specification. First we need to deal with the First Vendor. The First Vendor will not equal Last Vendor, and to file an order for a nonexistent vendor is wrong. Second, think about what an order looks like (Figure 10-34). There are one-time Vendor information and variable lines of Item information. Where the PDFD says Create Order, it really means Add Item to Order. When the Vendor changes and an order is complete, we want to format Vendor information for the new order. Figure 10-35 reflects these details and is ready for the next step. The purpose of this example is to show how a correct PDFD may need elaboration to translate into program specifications.
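The first-vendor and vendor-change handling discussed here is a classic control-break pattern. The sketch below is one possible elaboration, not a transcription of Figure 10-35: it assumes the input is already sorted by vendor, it adds the item to an order on every pass, and it flushes the final order after the loop. Those last two choices are our assumptions, as are the function name and the dictionary layout.

```python
def create_purchase_orders(items):
    """Group (vendor, item) pairs, already sorted by vendor, into one order per vendor.

    Each order is a dict with one-time vendor information and a list of item lines.
    """
    orders = []
    current = None       # order being built
    last_vendor = None   # plays the role of Last-Vendor in the diagram

    for vendor, item in items:
        if current is None or vendor != last_vendor:
            if current is not None:
                orders.append(current)             # stands in for File and Mail Order
            current = {"vendor": vendor, "lines": []}  # Create Vendor Info
            last_vendor = vendor
        current["lines"].append(item)              # Get Price / Add Item to Order

    if current is not None:
        orders.append(current)                     # assumption: flush the final order
    return orders
```

Testing `current is None` covers the First-Record case, so no order is ever filed for a nonexistent vendor.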

Using the action diagram, modules are defined. There are few guidelines on this aspect of Information Engineering. In general, you should try to define modules that perform one well-defined process and nothing else. The guidelines presented in Chapter 8 for module definition can be applied here. For the example in Figure 10-35, the IF ... ELSE IF ... ELSE processing is the module's control flow.

Within the control flow we have stand-alone processes that conveniently define modules. Figure 10-36 shows the module names, each enclosed in its own rounded rectangular box to indicate that there are more details for each module. The submodules are each further diagrammed or, if fully documented in a data dictionary, refer to the dictionary entry in the module box.

For Create Purchase Order processing, then, we have a main module and submodules for Create Vendor Info, Get Price, Create Order Item, File Order, and Mail Order. Notice that Create Vendor Info is used twice.

Next, the action diagram modules are compared to templates already in use to determine whether reuse of existing modules is possible. As reusable modules are identified, the process details are removed from the action diagram and replaced with a call statement. The called module name should indicate whether the reused module is customized for this application or not. The conventional way to identify customized reused modules is by a prefix or suffix on the name. For example, a date compare routine might be used to determine lateness. If not modified, the name of the routine might be DateCompare. If customized, the name of the routine might be RentDateCompare or LateReturnDateCompare. In the example in Figure 10-36, Sort uses a utility program, a special class of reusable module. The Sort statement is removed from the diagram and replaced with a call statement (Figure 10-37). No other modules in this example are general enough for reuse.
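The naming convention can be illustrated with a generic routine and a customized wrapper around it. This is a sketch under assumptions: the Python names are snake_case renderings of DateCompare and LateReturnDateCompare, and the choice to return a day count is ours, since the text does not say what the routine returns.

```python
from datetime import date


def date_compare(first: date, second: date) -> int:
    """Generic reusable routine: days from second to first (negative if first is earlier)."""
    return (first - second).days


def late_return_date_compare(due: date, returned: date) -> int:
    """Customized for rentals: number of days late, never negative."""
    return max(0, date_compare(returned, due))
```

The prefix on the second name signals that it carries rental-specific behavior, while the unprefixed routine stays general enough to reuse elsewhere.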

[Figure content: process data flow diagram with the labels Vendor; More = No; Vendor = Last-Vendor; Vendor ≠ Last-Vendor; Mail Order; Open Orders.]

FIGURE 10-30 Sample Process Data Flow Diagram

Purchasing
    Analyze Business
    Create Purchase Order
        Identify Item and Vendor
        Sort by Vendor, Item
        Get Price
        Create Order
        Mail Order
        File by Vendor
    Monitor Purchase Order
END Purchasing Procedure

FIGURE 10-31 Second-Level Action Diagram

When reusability analysis is complete, the action diagram should show the mainline logic of the application with modules for the processes and subprocesses. At this point, timing of processes is decided and added to the diagram. Recall that processes can be sequential or concurrent, and that concurrent processes can be either independent or dependent. Frequently, user requirements will identify required concurrency. If no user requirements identify concurrent operations, a design decision to offer or not offer concurrency is made by the SEs. Concurrency is expensive and adds a level of maintenance complexity to the application that the user might not want.

Purchasing
    Analyze Business
    Create Purchase Order
        Do Until all items are identified
            Identify Item and Vendor
        EndDo
        Sort by Vendor, Item
        Do While there are Items to be processed
            Get Price
            Create Order
            Mail Order
            File by Vendor
        ENDDO
    Monitor Purchase Order
END Purchasing Procedure

FIGURE 10-32 Repetitive Blocks on Second-Level Action Diagram

Purchasing
    Analyze Business
    Create Purchase Order
        Do Until all items are identified
            Identify Item and Vendor
        EndDo
        Sort by Vendor, Item
        Do While there are Items to be processed
            IF Vendor = Last-Vendor
                Get Price
                Create Order
            ELSE
                Mail Order
                File by Vendor
                Set Last-Vendor = Vendor
            ENDIF
        ENDDO
    Monitor Purchase Order
END Purchasing Procedure

FIGURE 10-33 Conditional Statements on Second-Level Action Diagram

Optional concurrency is determined by evaluating module interrelationships again. Only groups of sequential modules are evaluated at first. Then the groups themselves are evaluated for possible concurrency. In Figure 10-36, two groups of two or more modules are present. The first is Get Price with Create Order Item. The second group is File Order, Mail Order, and Create Vendor Information on Order. Working backward, we ask if the modules are dependent on each other. Could we create an order item without knowing the price? In this case, the answer is no, we must know the price. Therefore, the modules are dependent and cannot be concurrent. In the second group, we might perform File and Mail Order at the same time, if success of the file operation is not an issue. Create Vendor cannot be done until the last order is fully processed. To decide on concurrency, we need to know the details of error handling. In this case, we find that errors are checked and handled in the module in which they can occur. If a fatal error occurs, the application does no other processing on this order. This process definition implies sequence to the processes. If the processes were concurrent and a fatal error occurred, some undesired processing would occur. Therefore, in this example, concurrency is not an option.
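The error-handling rule that forces sequence here can be shown in a few lines. In this sketch, `FatalError` and `process_order` are hypothetical names; each step is assumed to detect its own errors and to raise `FatalError` when it cannot continue.

```python
class FatalError(Exception):
    """Raised by a module when it cannot continue with the current order."""


def process_order(order, steps):
    """Run the steps in sequence; stop at the first fatal error.

    Returns the names of the steps that completed.
    """
    completed = []
    for step in steps:
        try:
            step(order)
        except FatalError:
            break  # no further processing on this order
        completed.append(step.__name__)
    return completed
```

If the steps ran concurrently instead, a later step could already be under way when an earlier one failed, which is the undesired processing the text warns about.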


ABC Video, Inc.
123 Dunwoody Village
Dunwoody, GA 30392

Purchase Order                                   1/11/94

TO: Paramount Video Entertainment
    1947 Ave. of Americas
    New York, NY 10021

Terms: Net 30 Days

Item     Qty   Description              Price
019421   50    Aladdin                  14.95
019427   10    A Few Good Men           14.95
019497   1     Mon Amour C'est Soir      5.95

FIGURE 10-34 Order Example

Next, the entities and data elements used by the processes are added to the diagram(s). By the time this action is complete, every attribute of every relation must, at least, have been identified for creation and deletion (Figure 10-37). Any attributes not included in the processing should be reconsidered for elimination from the application. These process definitions should include attributes added to the relations as a result of design activities.
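A check for attributes that no process touches is simple to automate once the relations and the process-to-data links are recorded. The sketch below is illustrative only: the data layout and the function name are assumptions, and it tests only whether an attribute is used at all, not whether both creation and deletion are covered.

```python
def unused_attributes(relations, process_usage):
    """Return, per relation, the attributes that no process uses.

    relations:      {relation name: [attribute names]}
    process_usage:  {process name: set of (relation name, attribute name) pairs}
    """
    used = set()
    for pairs in process_usage.values():
        used.update(pairs)
    report = {}
    for relation, attributes in relations.items():
        missing = [a for a in attributes if (relation, a) not in used]
        if missing:
            report[relation] = missing
    return report
```

Any attribute that appears in the report is a candidate for elimination from the application, or a sign that a process definition is incomplete.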

If the action diagrams are developed manually, screen identifiers can be added to the diagram with entities and attributes linked to screens (see Figure 10-38). The diagram then links data sources and destinations to both processes and screens. This type of diagram does manually what linkages in a CASE tool automate.

ABC Video Example Action Diagram

The steps to developing the action diagram are to develop the levels of action using structured constructs, perform reusability analysis, design modules, decide module timing, add data to the diagram, and optionally, add screens to the diagram (refer to p. 434). Only the first-level action diagram includes all of the processes. The lower-level diagrams consider Rent/Return processing and Video Maintenance only. The other processes are left as an exercise.


Purchasing
    Analyze Business
    Create Purchase Order
        Do Until all items are identified
            Identify Item and Vendor
        EndDo
        Sort by Vendor, Item
        Do While there are Items to be processed
            IF First-Record
                Set Last-Vendor = Vendor
            ELSE IF Vendor = Last-Vendor
                Get Price
                Create Order
            ELSE
                Mail Order
                File by Vendor
                Set Last-Vendor = Vendor
            ENDIF
        ENDDO
    Monitor Purchase Order
END Purchasing Procedure

FIGURE 10-35 Order Format Details on Action Diagram

The first-level action diagram is based on the process hierarchy (Figure 10-39). First we draw the general bracket and add the module names, indicating the structural relationships between the modules by the bracket type (Figure 10-40). In the ABC diagram, the processes are all mutually exclusive.

Then, using the PDFD as reference (Figure 10-41), we develop the next level of procedural detail. The subprocess names are added to the diagram as shown in the PDFD (and process hierarchy). For each subprocess, the structural brackets indicating modular control are added.

The subprocesses for Video Maintenance are for create, retrieval, update, and delete processing. These processes are all mutually exclusive, so the diagram is simple (Figure 10-42). At the lowest level, we identify modules that refer to the dictionary for process details.

Rent/Return has all of the complexity in the application. Each cluster of modules is discussed separately. First, Get Request is always executed whenever Rent/Return is invoked (Figure 10-43).

Purchasing
    Analyze Business
    Create Purchase Order
        Do Until all items are identified
            Identify Item and Vendor
        EndDo
        (Sort by Vendor, Item)
        Do While there are Items to be processed
            IF First-Record
                Set Last-Vendor = Vendor
                (Create Vendor Info)
            ELSE IF Vendor = Last-Vendor
                (Get Price)
                (Create Order)
            ELSE
                (Mail Order)
                (File Order)
                Set Last-Vendor = Vendor
                (Create Vendor Info)
            ENDIF
        ENDDO
    Monitor Purchase Order
END Purchasing Procedure

FIGURE 10-36 Module Boxes on Action Diagram


Create Purchase Order
    Do Until all items are identified
        (Identify Item and Vendor)
    EndDo
    (Sort by Vendor, Item)
    Do While there are Items to be processed
        IF First-Record
            Set Last-Vendor = Vendor
            (Create Vendor Info)
        ELSE IF Vendor = Last-Vendor
            (Get Price)
            Create Order
        ELSE
            (Mail Order)
            (File Order)
            Set Last-Vendor = Vendor
            (Create Vendor Info)
        ENDIF
    ENDDO
END Create Purchase Order Procedure

[Data labels linked to the processes: New Releases; Vendor Name; Video Name; Vendor ID; Item ID; Vendor Address; Order Terms; Item Qty; Item Description; Item Price; Purchase Order.]

FIGURE 10-37 Data Addition to High-Level Action Diagram

Then the conditional statement for determining the type of request is added (Figure 10-43). The two options are If Customer and If Video ID, and each has its own processes.

Next, Open Rentals are read and displayed until all Open Rentals for this customer are in memory (Figure 10-44). The Open Rental loop is a simple Do While process.
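The Open Rental loop might be coded as follows. This is a sketch, not ABC's actual module: the record layout with a `customer_id` field and the function name are assumptions, and the display step is left out.

```python
def load_open_rentals(customer_id, open_rentals_file):
    """Do While there are Open Rentals for this customer: read and collect each one."""
    rentals = []
    rows = iter(r for r in open_rentals_file if r["customer_id"] == customer_id)
    row = next(rows, None)
    while row is not None:      # condition tested before each pass
        rentals.append(row)     # display of the row is omitted in this sketch
        row = next(rows, None)
    return rentals
```

Because the test comes first, a customer with no open rentals simply produces an empty table and the loop body never runs.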

Then video returns are processed using a repetition with a conditional structure (Figure 10-45). Late fees are checked in a repetitive loop for all Open Rentals (Figure 10-46). New rental Video IDs are entered for all new rentals (Figure 10-47). Process Payment and Make Change is a stand-alone module. Then, for all open and new rentals, the Open Rentals file is updated; for all of today's returns, history is updated; and if payment is made or a user requests, a receipt is printed (Figure 10-48). The consolidated action diagram is shown in Figure 10-49.

Next, evaluate the diagram to identify program modules. As in the example above, we have naturally identified modules as part of process definition. For instance, Get Valid Customer is a small, self-contained module that does one thing only. The module uses a Customer ID to access the Customer relation. If the entry is present, the credit is checked. The name, address, and credit status are returned. The remaining modules, which we originally defined as business processes doing one thing, should each be reviewed to ensure that they are, in fact, single purpose. This is left as a class activity.
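A single-purpose module such as Get Valid Customer could look like this. The sketch assumes the Customer relation is available as a dictionary keyed by Customer ID and that credit status is stored as a boolean; both are simplifications made for the example.

```python
from typing import Optional


def get_valid_customer(customer_id: str, customers: dict) -> Optional[dict]:
    """Look up a customer and return name, address, and credit status, or None if absent."""
    entry = customers.get(customer_id)
    if entry is None:
        return None
    return {
        "name": entry["name"],
        "address": entry["address"],
        "credit_ok": entry["credit_ok"],  # how credit is checked is an assumption
    }
```

The module does one thing only: it neither prompts the clerk nor updates anything, which keeps it usable from both Rent/Return and Customer Maintenance.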

Do Until all items are identified
    Identify Item and Vendor
EndDo
(Sort by Vendor, Item)
Do While there are Items to be processed
    IF First-Record
        Set Last-Vendor = Vendor
        (Create Vendor Info)
    ELSE IF Vendor = Last-Vendor
        (Get Price)
        (Create Order)
    ELSE
        (Mail Order)
        (File Order)
        Set Last-Vendor = Vendor
        (Create Vendor Info)
    ENDIF
ENDDO
END Create Purchase Order Procedure

[Data labels linked to the processes: New Releases; Vendor Name; Video Name; Vendor ID; Item ID; Vendor Address; Order Terms; Item Qty; Item Description; Item Price; Purchase Order. Screen identifiers: ScrIdItem; ScrCreOrder.]

FIGURE 10-38 Optional Screen Processing on Action Diagram

In addition, we can now resolve the issue held over from analysis about whether to keep separate or consolidate Get Open Rentals, Add Return Date, and Check for Late Fees. Individually, each of these processes is singular (i.e., does one thing). If they are consolidated, they would remain singular but be placed within the same repetition loop. The issue here, then, is which method is easier to program and implement in the intended language, and which provides the better user interface. We need to visualize the user interface and memory processing for each alternative.

[Figure content: process hierarchy with the processes Get Customer ID; Get Valid Customer; Get Valid Video; Get Return ID; Add Return Date; Get Open Rentals; Check Late Fees; Process Payment & Make Change; Create Open Rental; Update Open Rental; Update/Create History; Print Receipt.]

FIGURE 10-39 ABC Video Process Hierarchy Diagram

If the modules are kept separate, all Open Rentals are read first and displayed. Then the clerk can be prompted for new videos or for returns. If we prompt for returns every time, many wasted entries to deny return processing will be made. If we prompt for either new or return Video IDs, we need a method of knowing which is entered. Assuming we figure that out, we then get all returns and enter today's date for returned videos. Then all entries on the screen are scanned to determine new late fees.

If the modules are consolidated, as each Open Rental is read, Late Fees are computed for tapes with return dates and no late fees (see Figure 10-50). There are two options for this process. Either we assume there are no more returns or the clerk must respond to each Open Rental. With the first option, the clerk would have a selectable option for more return processing. When chosen, each return Video ID is entered and Late Fees are computed for that video.

Notice that both alternatives have problems. The separation alternative has a problem in dealing with returns, and there will be a slight delay for Late Fee processing. The consolidation option actually modifies the processes from the PDFD somewhat for Late Fee processing.

Data storage for a rental in memory is the same for both alternatives. We need a location for customer information, a table for open rentals, a table for new rentals, and locations for payment information. We will have three iterations through the table for Open Rentals in the separate alternative, and one iteration (two if returns are present) in the consolidated alternative.
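The in-memory storage listed above maps naturally onto a small record type. The class and field names below are illustrative assumptions; the text specifies only the four kinds of storage, not their layout.

```python
from dataclasses import dataclass, field


@dataclass
class RentalSession:
    """In-memory working storage for one rental (field names are illustrative)."""
    customer: dict = field(default_factory=dict)      # customer information
    open_rentals: list = field(default_factory=list)  # table for open rentals
    new_rentals: list = field(default_factory=list)   # table for new rentals
    payment: dict = field(default_factory=dict)       # payment information
```

Using `default_factory` gives every session its own empty tables, so one customer's rentals can never leak into the next transaction.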

Rent/Return
Customer Maintenance
Video Maintenance
Periodic Processing

FIGURE 10-40 ABC First-Level Action Diagram

The alternatives are approximately the same in implementation complexity, although three iterations are more likely to contain bugs than one. The human interface design is the same for both alternatives. The difference in the human interfaces is the speed and timing for data to appear on the Open Rentals lines. In this case the consolidated alternative is slightly faster. The difference in memory processing is the number of iterations through Open Rental data. Again, the consolidated alternative is preferred somewhat because it is less likely to contain bugs. With no overwhelming evidence for or against either alternative, this amounts to a judgment call. We will choose the consolidated alternative to minimize the probability of errors and the number of iterations through the data. The action diagram, reflecting consolidated open rental processing, is in Figure 10-50.
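The consolidated alternative amounts to computing late fees inside the same loop that reads Open Rentals. The sketch below shows the idea under stated assumptions: the record fields, the function name, and the flat daily rate are all hypothetical, since the text does not give ABC's fee rule.

```python
from datetime import date

DAILY_LATE_FEE = 1.00  # hypothetical rate; the text does not give one


def read_open_rentals_with_fees(rows):
    """Single pass: collect each open rental and compute any missing late fee as it is read."""
    table = []
    for row in rows:
        rental = dict(row)  # copy so the source rows are left unchanged
        returned = rental.get("return_date")
        if returned is not None and rental.get("late_fee") is None:
            days_late = max(0, (returned - rental["due_date"]).days)
            rental["late_fee"] = days_late * DAILY_LATE_FEE
        table.append(rental)
    return table
```

Rentals without a return date pass through untouched, which corresponds to the rule that fees are computed only for tapes with return dates and no late fees.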

The next activity is reusability analysis. ABC has no library of reusable modules to consider since it currently has no computer processing. The types of modules the consultants are likely to have might be relevant to error processing or to screen interactions. For our purposes, we assume no reusable modules.

To assess module timing, we analyze the module clusters. The only modules that could be concurrent are those in the last cluster to update files and print a receipt. Before deciding concurrency, we must decide the details of history processing that were deferred from analysis. We have two types of history files: Customer and Video. Customer History is a separate file that contains the Customer ID and all Video IDs rented by that customer. No counts, dates, or copy information are anticipated. This description complies with the case requirements in Chapter 2.

Video History contains Video ID, Copy ID, Year, Month, Number of Rentals, and Days of Rental for each entry. This data description also complies with the case requirements in Chapter 2. The issue to be decided is whether Video History is maintained during on-line processing, or the current month's activity is kept with Copy information. If the second alternative is chosen, we need a monthly process to update the Video History and reinitialize the counts in the Copy relation. If the first alternative is chosen, we have two more alternatives. First, we might need both update and create processing because, for any one copy, we would not know in advance whether it has a historical entry. This alternative requires bug-prone processing that is more complex than keeping counts in the current Copy relation. Second, we could create an empty entry for every tape at the beginning of every month. This alternative is not attractive because it generates many empty history records. Both of these alternatives would require history to be on-line. Keeping current counts with Copy relations does not require history to be on-line. The final argument for keeping the counts in Copy information is that, to maintain the status of a given tape, Copy information must be updated upon video return anyway. As long as the tuple is being read, updating it with count information requires adding lines of code rather than a new module. From this discussion, it should be clear that keeping current counts in the Copy relation is the preferred alternative. We document this and the other changes in the Data Dictionary.
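The preferred alternative can be sketched in a few lines of Python. This is a hedged illustration, not the ABC implementation: the field names `month_rentals` and `month_days`, the functions `record_return` and `month_end_rollover`, and the choice to skip copies with no activity in the month are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    video_id: str
    copy_id: int
    status: str = "out"
    month_rentals: int = 0  # hypothetical current-month counter
    month_days: int = 0     # hypothetical current-month counter


@dataclass(frozen=True)
class VideoHistory:
    video_id: str
    copy_id: int
    year: int
    month: int
    number_of_rentals: int
    days_of_rental: int


def record_return(copy, days_out):
    """On-line step: the Copy tuple is updated on return anyway."""
    copy.status = "in"
    copy.month_rentals += 1
    copy.month_days += days_out


def month_end_rollover(copies, year, month):
    """Monthly batch step: write history, then reinitialize counts."""
    history = []
    for copy in copies:
        if copy.month_rentals > 0:  # assumption: no empty history rows
            history.append(VideoHistory(copy.video_id, copy.copy_id, year,
                                        month, copy.month_rentals,
                                        copy.month_days))
        copy.month_rentals = 0
        copy.month_days = 0
    return history
```

Because the counters live on the Copy tuple, the on-line path never touches the history file; only the monthly process does.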

Now we can discuss module timing for the last group of modules. In this group we create and/or update Open Rentals, update Copy, and Print Receipt. Recall from analysis that Vic does not want file update success to be known to the customers. The receipt should be printed regardless of updating success. This implies that printing could be concurrent with the file processes. The file updates cannot be concurrent because they will all be on the same device. Since there is already contention for the file among the users, it is unlikely that we would want to increase contention by having the updates concurrent. If printing is the only concurrent process, it is not worth the cost to provide concurrency. Therefore, the processes will be made sequential for production operation. Figure 10-50 is not changed at this point.

FIGURE 10-41 ABC Video Process Dependency Diagram
[Recoverable labels: Until No More Open Rentals; Until No More Returns; Video, Copy; Copy]

The entities and data attributes are added to the diagram next to show input and output processing. Two entities, EOD and Rental Archive, are still undefined, having been deferred in analysis. These are left as an exercise. The entities referenced in Rental/Return processing (Customer, Open Rental, Video, Copy, Customer History, and EOD) are all shown in Figure 10-51. When an action diagram arrow is from an entity to a process, it means that the entire tuple is accessed. The final action is to add screens to the action diagram, but they are not yet defined, so this activity will be left as a future exercise.


Rent/Return
Customer Maintenance
Video Maintenance
    Create Video
    Update Video
    Delete Video
    Query Video
Periodic Processing

FIGURE 10-42 ABC Video Maintenance Second-Level Action Diagram

Define Menu Structure and Dialogue Flow

Guidelines for Defining the Menu Structure and Dialogue Flow

The interface structure includes design of a menu structure and design of dialogue flow within the menu structure. Both designs are based on the PDFD and process hierarchy diagram developed during IE analysis.

First, the menu structure is developed. Recall that the menu structure is a structured diagram translating process alternatives into a hierarchy of options for the automated application. The task hierarchy is analyzed to define the individual processing screens required to perform whole activities, and to identify the other processes and activities in the hierarchy which must be selected to get to the processing screens.

Let's walk through the development of the sample menu structure shown in Figure 10-7. The related process hierarchy diagram is shown as Figure 10-52 with the individual processing screens, selection alternatives, and hierarchy levels identified. For each level in the hierarchy, we identify a level of menu processing. Using simple bracket structures to translate from the top to the bottom of the hierarchy, we first define the options for the first level menu (see Figure 10-53). Next, the menu options for the first process level of the hierarchy are shown in Figure 10-54. Finally, the remaining detailed processes are added to the diagram (see Figure 10-55).
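The level-by-level translation can also be expressed as a small recursive routine. The Python sketch below is an illustration under stated assumptions: the nested-dictionary representation and the function name `build_menu` are hypothetical, and only part of the sample hierarchy (the Customer Service branch from the sample menu structure) is filled in.

```python
def build_menu(hierarchy, indent=0):
    """Turn a process hierarchy into numbered menu lines, one menu level per
    hierarchy level. `hierarchy` maps an option name to its sub-hierarchy."""
    lines = []
    for number, (name, children) in enumerate(hierarchy.items(), start=1):
        lines.append(" " * indent + f"{number}. {name}")
        # Each lower level of the hierarchy becomes a lower-level menu.
        lines.extend(build_menu(children, indent + 4))
    return lines


# Partial sample hierarchy; an empty dict marks a processing screen.
HIERARCHY = {
    "Customer Service": {
        "Order Fulfillment": {
            "Create Order": {},
            "Change Order": {},
            "Delete Order": {},
            "Order Inquiry": {},
        },
        "Inventory Allocation": {},
        "Customer Maintenance": {},
    },
    "Sales": {},
    "Marketing": {},
}

if __name__ == "__main__":
    print("Main Menu")
    print("\n".join(build_menu(HIERARCHY, indent=4)))
```

Entries with no children are the individual processing screens; every other entry is a selection that must be made to reach one.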

Rent/Return Procedure
    Get Request
    IF Customer Phone
        Get Valid Customer
    ELSE IF Video ID
        Set ID, IDType to Video
        GOTO Open Rentals
    ENDIF
END Rent/Return Procedure

FIGURE 10-43 Request Processing Action Diagram Constructs

Rent/Return Procedure
    DO While More Open Rentals for this Customer
        Get Open Rentals (using ID, IDType)
        IF First
            Set ID, IDType to Customer
            Get Valid Customer
        ENDIF
    ENDDO
END Rent/Return Procedure

FIGURE 10-44 Open Rental Action Diagram Constructs

If, for any reason, the hierarchy or lower-level processes are in doubt, review the proposed menu structure with the users before proceeding. If the process hierarchy diagram is accepted as correctly mirroring the desired functions in the application, proceed to the next step, defining the movements between menu items.

Rent/Return Procedure
    IF Returns
        DO Until no more returns
            Get Return ID
            Add Return Date
        ENDDO
    ENDIF
END Rent/Return Procedure

FIGURE 10-45 Video Returns Action Diagram Constructs

Traditionally, applications were constrained to moving top-to-bottom-to-top with no deviation. Anyone who uses such an interface for long knows it is irritating to wait for some menu that is unwanted and to enter choices purely for system design reasons. The decisions should relate to application requirements as much as possible. For instance, security access control requirements can be partially met by restricting movement to functions as part of dialogue flow. The decisions about legal movement should be made by the users based on recommendations by the designers, although frequently dialogue flow decisions are made by the SEs. In general, if the users are functional experts, an open design that allows free movement should be used. If users are novices or not computer literate, a more restrictive design should be used to minimize their potential confusion.

Rent/Return Procedure
    DO Until No more Open Rentals
        Check for Late Fees
    ENDDO
END Rent/Return Procedure

FIGURE 10-46 Late Fee Action Diagram Constructs

Figure 10-56 shows types of arrows used to depict movement between levels of a menu structure. In a small diagram, with fewer than ten screens, only single-headed arrows are used, and at least two arrows are drawn for each entry: one entering and one leaving (Figure 10-56a). In a large diagram, with over ten screens, the triple-headed arrows can be added to the diagrams to depict call-return processing (Figures 10-56b and 10-56c).

Rent/Return Procedure
    DO Until no more new video rentals
        Get Valid Video
    ENDDO
END Rent/Return Procedure

FIGURE 10-47 New Rentals Action Diagram Constructs

An example of restricted screen movement that might be designed for novice users is shown in Figure 10-57a. In the diagram, all movement is to or from a menu. The diagram in Figure 10-57b shows that any level of upper menu might be reached from the lower levels. This speeds processing through menus and is preferred to the design shown in Figure 10-57a, which only allows a process to return to the menu level from which it was activated. Restrictive dialogue flow (Figure 10-57a) is the type of design that is most likely to waste user time and become annoying.
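The difference between the two designs comes down to which upward moves are legal from a given screen. A minimal Python sketch follows, assuming the dialogue flow is stored as a child-to-parent map; the `PARENT` table, the function names, and the `restrictive` flag are hypothetical names chosen for the example.

```python
# Hypothetical child -> parent map for one branch of the sample menu.
PARENT = {
    "Customer Service": "Main Menu",
    "Order Fulfillment": "Customer Service",
    "Create Order": "Order Fulfillment",
}


def ancestors(screen):
    """Menus above `screen`, nearest first."""
    chain = []
    while screen in PARENT:
        screen = PARENT[screen]
        chain.append(screen)
    return chain


def legal_returns(screen, restrictive):
    """Restrictive flow: only the activating menu. Less restrictive: any upper menu."""
    chain = ancestors(screen)
    return chain[:1] if restrictive else chain
```

Under the restrictive rule a user leaving Create Order must pass back through every intermediate menu; under the less restrictive rule the user can jump straight to any upper-level menu.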

Rent/Return Procedure
    Process Payment and Make Change
    DO Until all Rentals in memory are processed
        IF Return Date = Today's Date
            Update History
        ELSE IF Return-Date NOT = spaces
            Update Open Rental
        ELSE IF Return-Date = spaces
            Create Open Rental
        ENDIF
    ENDDO
    IF Payment > zero or Receipt Requested
        Print Receipt
    ENDIF
END Rent/Return Procedure

FIGURE 10-48 Payments, File Update and Printing Action Diagram Constructs

Experts and frequent users usually are provided more alternatives for interscreen movement because they become proficient with the application. Unrestricted screen movement is desirable for these users. An example of unrestricted movement in screen design is shown in Figure 10-57c. In the example, the user begins at the main menu and may move down the hierarchy in the same manner as a novice, or may move directly to a process screen, at the user's option. Unrestricted movement requires the design and implementation of a command language or sophisticated menu selection structure that is consistent with the basic novice menu selections, but adds the expert mode.

Unrestricted movement can be costly and error-prone, which are the main reasons why it is not prevalent. The added cost is due to the increased access control structure that must accompany an open movement design. The added errors are from a need to provide a specific location on the screen for entry of the expert's direct screen requests. Each request must be checked for access control and legality, plus the current context (i.e., screen and memory information) might need to be saved for return processing.
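The extra work behind each direct request can be seen in a short Python sketch. It is a simplified illustration, not a prescribed design: the class name `DialogueController`, its methods, the status strings, and the per-user-class access table are all assumptions made for the example.

```python
class DialogueController:
    """Handles an expert's direct screen requests (hypothetical design)."""

    def __init__(self, screens, access):
        self.screens = set(screens)  # every legal target screen
        self.access = access         # user class -> set of permitted screens
        self.current = "Main Menu"
        self.saved = []              # stack of (screen, memory) contexts

    def go_to(self, user_class, target, memory=None):
        # Legality check: the requested screen must exist.
        if target not in self.screens:
            return "unknown screen"
        # Access control check for this class of user.
        if target not in self.access.get(user_class, set()):
            return "access denied"
        # Save the current context so the user can return to it later.
        self.saved.append((self.current, memory))
        self.current = target
        return "ok"

    def go_back(self):
        """Restore the most recently saved context, if any."""
        if not self.saved:
            return None
        self.current, memory = self.saved.pop()
        return memory
```

A purely hierarchic design needs none of this: the only legal moves are built into the menus, so no request has to be validated and no context has to be stacked.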

Rent/Return Procedure
    Get Request
    IF Customer Phone
        Get Valid Customer
    ELSE IF Video ID
        Set ID, IDType to Video
        GOTO Open Rentals
    ENDIF
    DO Until NO More Open Rentals for this Customer
        Get Open Rentals (using ID, IDType)
        IF First
            Set ID, IDType to Customer
            Get Valid Customer
        ENDIF
    ENDDO
    IF Returns
        DO Until no more returns
            Get Return ID
            Add Return Date
        ENDDO
    ENDIF
    DO Until No more Open Rentals
        Check for Late Fees
    ENDDO
    DO Until no more new video rentals
        Get Valid Video
    ENDDO
    Process Payment and Make Change
    DO Until all Rentals in memory are processed
        IF Return Date = Today's Date
            Update History
        ELSE IF Return-Date NOT = spaces
            Update Open Rental
        ELSE IF Return-Date = spaces
            Create Open Rental
        ENDIF
    ENDDO
    IF Payment > zero or Receipt Requested
        Print Receipt
    ENDIF
END Rent/Return Procedure

FIGURE 10-49 ABC Consolidated Action Diagram

Rent/Return Procedure
    Get Request
    IF Customer ID
        Get Valid Customer
    ELSE IF Video ID
        Set ID, IDType to Video
        GOTO Open Rentals
    ENDIF
    DO Until NO More Open Rentals for this Customer
        Get Open Rentals (using ID, IDType)
        IF First
            Set ID, IDType to Customer
            Get Valid Customer
        ENDIF
        IF Returned
            Get Return ID
            Add Return Date
        ENDIF
        Check for Late Fees
    ENDDO
    DO Until no more new video rentals
        Get Valid Video
    ENDDO
    Process Payment and Make Change
    DO Until all Rentals in memory are processed
        IF Return Date = Today's Date
            Update History
        ELSE IF Return-Date NOT = spaces
            Update Open Rental
        ELSE IF Return-Date = spaces
            Create Open Rental
        ENDIF
    ENDDO
    IF Payment > zero or Receipt Requested
        Print Receipt
    ENDIF
END Rent/Return Procedure

FIGURE 10-50 ABC Action Diagram with Consolidated Open Rental Processing

Upon completion, the menu structure and dialogue flow diagrams are given to the human interface designers to use in developing the screen interface (see Chapter 14). The dialogue flow diagram is also used by designers in developing program specifications. Before we move on, note that even though the menu structure is identified, the human interface may or may not be structured exactly as defined in the menu structure diagram. The human interface designers use the menu structure information to understand the dependencies and relationships between business functions, entities, and processes; they may alter the structure to fit the actual human interface technique used. If a traditional menu interface is designed, it could follow the menu structure diagram.

ABC Video Example Menu Structure and Dialogue Flow

The menu structure is derived from the process hierarchy diagram in Figure 10-58 (reprint of Figure 9-26). First, the activities from the decomposition form the main menu options (see Figure 10-59). The processes are used to develop submenu options. Then, the lowest level of processing completes the simple structure (Figure 10-60).

Notice that all Rent/Return processing is expressed in the first menu option even though it has many subprocesses in the hierarchy diagram. Unlike the other processes, rental/return does not have individual menus and screens for each subprocess. Rather, rental/return requires a complex, multifunction screen with data from several relations and processing that varies by portion of the screen. The subprocesses for rental/return, then, describe actions on portions of the screen. You cannot tell from the decomposition diagram that rental/return has this requirement; rather, you know from application requirements (and experience) what type of screen(s) are needed. An incorrect rendering of the menu structure, such as the one in Figure 10-61, would look odd and should make you feel uncomfortable about its correctness.

Second, notice that we do not indicate access rights for any of the processing options on the diagram. The security access definition is superimposed on the menu structure by the interface designers to double-check the design thinking of the process designers. If there is an inconsistency, the two groups reconcile the problems.

Next we develop a dialogue flow diagram from the menu structure diagram. The rows of the dialogue flow diagram correspond to the entries in the menu structure (Figure 10-62). Rows are entered by level of the hierarchy by convention.

FIGURE 10-51 ABC Action Diagram with Data Entities and Attributes
[The action diagram of Figure 10-50, annotated with the entities Customer, Open Rental, Video, Copy, Customer History, and EOD and the attributes each step accesses.]

We need to decide how much flexibility to give users, keeping in mind the security access requirements and the users' computer and functional skills. Users are mostly novices with little computer experience. The average job tenure is less than six months. Data and function access for clerks are unrestricted for customer, video, and open rentals add, change, and retrieve functions. Other options are more restricted in terms of which user class can perform each function.

First we define the options. We could define flexible movement between those options only, and restrict movement to other options through the hierarchy. Top-down hierarchic access is possible. We could allow hierarchic access combined with flexible 'expert' mode movement throughout the hierarchy, constrained by access restrictions.

FIGURE 10-52 Example of Process Hierarchy Diagram
[Recoverable labels: Special Products Div.; Prospect Maintenance; Change Allocation; Delete Allocation; Inquire on Allocation; Change Customer; Delete Customer; Inquire on Customer]

For each option, ask the following questions. Does Vic have a preference? Which best fits the user profile? Which is the cleanest implementation, least likely to cause testing and user problems?

Vic, in this case, has no preference. Having never used computers, he has no background that allows him to make a decision. He says, "Do whatever is best for us. I let that up to you. But I would like to see whatever you decide before it is final." This statement implies interface prototyping, which should always be done to allow users to see the screens while they are easily changed.

Most of Vic's employees work there for 1½ years and have little or no computer experience. Therefore, screen processing that is least confusing to new users should be preferred. Usually, novices prefer hierarchic menus, provided the number of levels does not become a source of confusion. Also, the simplest implementation is always preferred; that is, the hierarchic menu option.

Main Menu
    1. Customer Service
    2. Sales
    3. Marketing

FIGURE 10-53 First-Level Menu Structure

Based on the answers to the questions, we should design a restrictive, hierarchic flow. As Figure 10-63 shows, this design is simple and easy to understand. The dialogue flow and screens should be prototyped and reviewed with Vic at the earliest possible time to check that he does not want an expert mode of operation.

You might question whether the movement from rent/return to customer add and video add should be on the dialogue flow diagram. This is a reasonable concern since the process of rent/return does allow adding of both customers and videos within its process. The issue is resolved by local custom. In general, given the option, such flexibility should be shown on the diagram for clarity and completeness. Sometimes, local convention or a specific CASE tool requirement does not allow such completeness.


Plan Hardware and Software Installation and Testing

Guidelines for Hardware/Software Installation Plan

The guidelines for hardware and software installation planning are developed from practice and identify what work is required, environmental planning issues, responsibility for the work, timing of materials and labor, and scheduling of tasks.

Installation requirements should always be defined as far in advance of the needs as possible and documented in a hardware installation plan. Installation planning tasks are:

1. Define required work
    Define hardware/software/network configuration
    Assess physical environment needs
    Identify all items to be obtained
    Order all equipment, software, and services
    Define installation and testing tasks
2. Assign responsibility for each task
3. Create a schedule of work

Main Menu
    1. Customer Service
        1. Order Fulfillment
        2. Inventory Allocation
        3. Customer Maintenance
    2. Sales
        1. Order Fulfillment
        2. Inventory Inquiry
        3. Customer Maintenance
        4. Prospect Maintenance
    3. Marketing
        1. Query Order
        2. Query Manufacturing Plans
        3. Query Goods in Process
        4. Query Inventory
        5. Query Customers

FIGURE 10-54 Second-Level Menu Structure

Main Menu
    1. Customer Service
        1. Order Fulfillment
            1. Create Order
            2. Change Order
            3. Delete Order
            4. Order Inquiry
        2. Inventory Allocation
            1. Create Allocation
            2. Change Allocation
            3. Delete Allocation
            4. Allocation Inquiry
        3. Customer Maintenance
            1. Create Customer
            2. Change Customer
            3. Delete Customer
            4. Customer Inquiry
    2. Sales
        1. Order Fulfillment
            1. Order Create
            2. Order Change
            3. Order Delete
            4. Order Inquiry
        2. Inventory Inquiry
        3. Customer Maintenance
            1. Create Customer
            2. Change Customer
            3. Delete Customer
            4. Customer Inquiry
        4. Prospect Maintenance
            1. Create Prospect
            2. Change Prospect
            3. Delete Prospect
            4. Prospect Inquiry
    3. Marketing
        1. Query Order
        2. Query Manufacturing Plans
        3. Query Goods in Process
        4. Query Customers
        5. Query Prospects

FIGURE 10-55 Final Menu Structure

If the SE team has no experience with configuring installations, their work definition should always be checked by someone who has experience. In general, you define the complete hardware, software, and network configuration needed, match the application configuration requirements to the current installation, get approval for all incremental expenditures, order all equipment and software, and install and test all equipment and software. In a mainframe environment, this task is simplified because the first step, configuration definition, can be abbreviated and done with help from an operations support group. The operations support group also would install and test hardware and install software.

When the configuration is defined, it is matched to the current installation to determine what items need to be purchased. In new installations, the physical installation environment is as important as the equipment. Building, cooling, heating, humidity control, ventilation, electrical cable, and communications cable needs should all be assessed. If you have no experience performing these analyses, hire someone who does. Do not guess. You only do the client a disservice, and chances of making a costly mistake are high.

Once needed items are identified, they should be ordered with delivery dates requested on the orders. The delivery dates should conform to the expected installation schedule, which is discussed below. The goal is to have all equipment and parts when they are needed and not before. For capital expenditures, this delays the expense until it is needed. Planning for large capital expenditures should be done with the client and accountant to stagger charges that might be a financial burden.

As items to be installed are identified and ordered, responsibility for installation and testing should be identified. The alternatives for who should do hardware and software installation are varied. Choices include consultants, unions, contractors, subcontractors, or current personnel. In many cases, there are three types of installations being made: software, hardware, and the network, and each has its own installation responsibility.

Software should be installed by system programmers in an operations support group in a mainframe installation, and by the software builders for a PC installation. Contracts, whether formal or informal, should state what work is to be done, timing of work, penalties for failure to meet the time requirements, and price. Other items such as number of hours and dates of access to the site might also be included.

Row = Screen; Column = Movement
a. Screen movement to directed arrow screen.
b. Screen movement to one of several possible screens.
c. Movement is down the arrow with return to calling screen.
d. Movement is down the arrow with further selection at called screen, no necessary return.

FIGURE 10-56 Dialogue Flow Movement Alternatives

FIGURE 10-57a Example of Restrictive Screen Movement
[Recoverable rows: Order Fulfillment; Create Order; Change Order; Delete Order; Order Inquiry]

FIGURE 10-57b Example of Less Restrictive Screen Movement
[Recoverable rows: Order Inquiry]

Hardware, in a mainframe environment, is managed, ordered, and installed through an operations department. You, as an SE needing equipment, must know what you need, but must trust the operations department to obtain, install, and test the equipment. Most PC computer equipment is simplified enough that special assistance is not usually required. If desired, you can usually negotiate with a hardware vendor to burn in equipment and set it up for a small fee. Burn-in means to configure the hardware and run it for some period of time, usually 24-72 hours. If there are faulty chips in the machine, 90% of the time they fail during the burn-in period.

At least two terminals or PCs should be configured during installation of network cable for testing the cable. For LAN installation, hire a consultant if you've never done this before. The consultant helps you

• define what is to be done
• define required equipment (e.g., cabling, connectors, etc.)
• get permits from the government and building owners
• obtain zoning variances
• identify and hire subcontractors
• supervise and guarantee the work.

FIGURE 10-57c Example of Less Restrictive Screen Movement
[Recoverable rows: Main Menu; Customer Service; Sales; Marketing; Order Fulfillment; Create Order; Order Inquiry]

Get Valid Customer
Get Valid Video
Get Return ID
Add Return Date
Get Open Rentals
Check Late Fees
Create Open Rental
Update Open Rental
Update/Create History
Print Receipt

FIGURE 10-58 ABC Process Hierarchy

Main Menu
    1. Rental/Return
    2. Customer Maintenance
    3. Video Maintenance
    4. Other

FIGURE 10-59 ABC First-Level Menu Hierarchy

1. Rent/Return Processing
2. Customer Maintenance
    1. Create Customer
    2. Change Customer
    3. Delete Customer
    4. Customer Inquiry
3. Video Maintenance
    1. Create Video
    2. Change Video
    3. Delete Video
    4. Video Inquiry
4. Other
    1. End of Day
    2. Startup
    3. Shutdown
    4. End of Month
        1. Update Customer History
        2. Update Video History
    5. Query

FIGURE 10-60 ABC Menu Structure

As the user's representative, you can prepare the installation for the work to be done. Mark walls where all wires should be, using colored dots. For instance, you can use blue dots for phone lines, red dots for LAN cable, and green dots for electrical outlets. Number all outlets for identification of wires at the server end. Colored tape shows where cable runs should be placed in false ceilings and walls. Configure one PC, with the network operating system installed, in the location of the file server. As cabling is complete, move the second PC to each wired location, start up the network, and send messages. Make sure the location is as expected and that the wiring works. Test all wires because they will be wrong. Make sure all wiring is correct before the electrical contractor is paid and leaves.

The important issue is to make a choice of who will do what work long before the work is needed, and plan for what is to be done. Use a lawyer to write all contracts using information provided by you, as the client's representative, and the client.

Timing of installations can be crucial to implementation success. When different types of work are needed, such as air-conditioning and electrical cabling, the work should be sequenced in order of need and so that the contractors are not in each other's way. For instance, a typical sequence might be building frame, building shell, false floor/ceiling framing, electrical wiring, plumbing, air-conditioning, communications cabling, false floor/ceiling finishing, finishing walls, painting, and decorating. Any sequences of work should be checked with the people actually performing the work to guarantee that they agree to the work and schedule.

In general, you want testing of all equipment to end, and the equipment to be available, by the beginning of design at the latest. This implies that all previous analysis work is manual. If CASE is to be used, the latest possible date for equipment and software availability is the beginning of project work.

Cabling is needed before equipment. Equipment is needed before software. Software is needed before application use. Some minimal slack time should be left as a cushion between dates in case there is a problem with the installation or the item being installed. Leave as big a cushion between installation and usage as possible, with the major constraint being payment strains on a small company.
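The ordering rule lends itself to a simple backward pass from the date of first use. The Python sketch below is illustrative only: the function name `latest_dates`, the task durations, the one-week cushion, and the calendar year are hypothetical values chosen for the example, not figures from the ABC plan.

```python
from datetime import date, timedelta


def latest_dates(use_date, tasks, slack_days):
    """Work backward from application use.

    `tasks` is ordered first-to-last as (name, duration_days); each task
    must finish `slack_days` before the next one (or use) starts.
    Returns {name: (latest_start, latest_finish)}.
    """
    schedule = {}
    next_start = use_date
    for name, duration in reversed(tasks):
        finish = next_start - timedelta(days=slack_days)  # cushion
        start = finish - timedelta(days=duration)
        schedule[name] = (start, finish)
        next_start = start
    return schedule


# Hypothetical durations in days; only the ordering comes from the text.
TASKS = [("Cabling", 10), ("Equipment", 5), ("Software", 5)]

if __name__ == "__main__":
    for name, (start, finish) in latest_dates(date(2024, 8, 1), TASKS, 7).items():
        print(f"{name}: start by {start}, finish by {finish}")
```

Widening `slack_days` pushes every start date earlier, which is the trade-off the text describes: a larger cushion means equipment is paid for sooner.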

ABC Video Example Hardware/Software Installation Plan

Main Menu
    1. Rental/Return
        1. Get Request
        2. Get Valid Customer
        3. Get Open Rental
        4. Get Return ID
        5. Add Return Date
        6. Check Late Fees
        7. Get Valid Video
        8. Process Payment and Make Change
        9. Create Open Rental
        10. Update Open Rental
        11. Create/Update Customer History
        12. Update Item
        13. Print Receipt
    2. Customer Maintenance
        1. Create Customer
        2. Retrieve Customer
        3. Update Customer
        4. Delete Customer
    3. Video Maintenance
        1. Create Video
        2. Retrieve Video
        3. Update Video
        4. Delete Video
    4. Other
        1. End of Day
        2. Startup
        3. Shutdown
        4. End of Month
        5. Query

FIGURE 10-61 Incorrect Rental/Return Menu Structure

Rent/Return Processing
Customer Maintenance
    Create Customer
    Change Customer
    Delete Customer
    Customer Inquiry
Video Maintenance
    Create Video
    Change Video
    Delete Video
    Video Inquiry
Other
    End of Day
    Startup
    Shutdown
    End of Month
        Update Customer History
        Update Video History
    Query

FIGURE 10-62 ABC Dialogue Flow Diagram Menu Structure Entries

Main Menu-ABC R/R
Rent/Return Processing
Customer Maintenance
    Create Customer
    Change Customer
    Delete Customer
    Customer Inquiry
Video Maintenance
    Create Video
    Change Video
    Delete Video
    Video Inquiry
Other
    Update Customer History
    Update Video History
    Query

Unrestricted access within a function (subject to access rights) except for 'other' processes. All 'other' processes are invoked from and returned to the menu.

FIGURE 10-63 ABC Dialogue Flow Diagram

For ABC, a local area network is to be used. A file server with one laser printer, three impact printers, and five PCs are planned. The LAN will be a Novell ethernet with SQL-compatible DBMS software, Carbon Copy, Word Perfect, Lotus, Norton Utilities, Fastback, and Symantec Virus software. The goal is for all hardware to last at least five years if no other business functions are added to the system. The configuration details are shown in Figures 10-64 and 10-65. There should be adequate capacity to add accounting and order processing software if needed. The current average daily rentals of 600 is expected to double in five years. The current number of customers is 450, and is expected to be 1,000 in five years.
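For capacity planning it can help to restate the five-year targets as yearly growth rates. The short Python sketch below does that arithmetic under an assumption the text does not make, namely that growth compounds at a steady annual rate; the function name `implied_annual_growth` is hypothetical.

```python
def implied_annual_growth(current, future, years):
    """Steady compound rate that takes `current` to `future` in `years`."""
    return (future / current) ** (1 / years) - 1


# Figures from the ABC case: rentals double (600 to 1,200 per day) and
# customers grow from 450 to 1,000, both over five years.
rentals_growth = implied_annual_growth(600, 1200, 5)
customers_growth = implied_annual_growth(450, 1000, 5)

if __name__ == "__main__":
    print(f"Daily rentals: about {rentals_growth:.1%} per year")
    print(f"Customers:     about {customers_growth:.1%} per year")
```

Under that assumption, daily rentals grow by roughly 15 percent a year and the customer file by roughly 17 percent a year, which gives a rough yardstick for sizing disk and throughput with headroom.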

To develop a plan, assume that the current date is January 1, and that the application installation is scheduled for August 1. Design has just begun. The PCs and laser printer were installed five months ago for availability during planning, feasibility, and analysis. The currently installed software includes a CASE tool on two machines, Word Perfect, Norton Utilities, Fastback, the SQL DBMS, and SAM Virus software. The remainder of the software and hardware must be ordered, installed, and tested as part of this plan.

FIGURE 10-64 ABC Configuration Schematic
[Recoverable labels: PCs; Bar Code Readers; File Server; Modem; Tape Backup]

First we determine what we need. A comparison of currently installed items to the list of required items shows the following items need to be planned:

Network cable and connectors
File Server
Novell Software
Network Interface Cards (NICs, i.e., ethernet boards)
Impact printers
Bar Code Reader and Imprinter
Carbon Copy (network version)
Word Perfect (network version)
Norton Utilities (network version)
Fastback
SQL DBMS (network version)
SAM (network version)
Lotus (network version)

[Schematic labels, Figure 10-64: Bar Code Readers, File Server, Modem, Tape Backup]

Automated Tool Support for Data-Oriented Design 453

Hardware Characteristics:

File server
    12 Mb Memory
    800 Mb Disk
    Super 486, SCSI Channel
    Color monitor

1 Laser printer
    8 Page/Minute

3 Impact printers for two-part forms (or 4 cheap lasers with tear-apart forms)

5 PCs
    2 Mb Memory
    1.4 Mb Floppy disk for startup
    No hard disk
    Local printer (see above)

1 2400 Baud Modem for long distance troubleshooting

1 Streaming tape backup
    100 Mb/Minute

FIGURE 10-65 ABC Hardware and Software Details

Everything should be ordered as soon as possible to ensure availability. Equipment and software ordering is the first item on the plan.

The group has installed network software before but not the cable, so they obtain approval from Vic to engage another consultant, Max Levine, from their company to perform that work. Max has been installing mainframe and PC networks for over 20 years and knows everything about their installations and problems. He immediately takes over the network planning tasks. He first obtains a rough idea of the planned locations for equipment, computes cable requirements, and orders cable and connectors. Then, for the plan, he adds tasks for mapping specific cable locations for the installers, for installing and testing the file server, and for installing and testing the cable (see Table 10-6).

At the same time, Mary and Sam work at planning the remaining tasks. Each software package must be installed and tested. These tasks are planned for Sam and one junior person. The tests for all but the SQL package are to use the tool and verify that it works. For the SQL package, Sam and a DBA will install a small, multiuser application to test that the single and multiuser functions are working as expected. Of all the software being used, it is the one with which they are least familiar, so they use the installation test as a means of gaining more experience.

All tasks relating to new equipment and software are scheduled to take place during a six-week period in January and February. This allows several months of cushion for any problems to be resolved; it also allows disruptive installations (e.g., cable) to be scheduled around peak hours and days. The schedule does not show elapsed time, but other work is taking place besides the installations. For instance, design work is progressing at the same time. As the application is implemented and the users have need for the equipment, the PCs and printers are moved to their permanent locations. This occurs in late spring for data conversion. The last stand-alone PCs are scheduled to be added to the network in late July, long before the application implementation date of August 15.

AUTOMATED TOOL SUPPORT FOR DATA-ORIENTED DESIGN

Many CASE tools support aspects of data-oriented design (see Table 10-7). Two specifically support IE as discussed in this chapter. The IE CASE tools are Information Engineering Workbench (IEW, see footnote 4) by Knowledgeware, Inc., and Information Engineering Facility (IEF) by Texas Instruments, Inc. Both products receive high marks of approval and satisfaction from the user communities. Because of their cost, both products are used mostly by large companies. The products offer enterprise analysis in addition to application analysis, design, and construction (i.e., coding). Both IEF and IEW work on PCs, networks, and mainframes.

A typical IEF installation could include a mainframe version with the centralized repository. Users check out portions of a repository to work with on a PC. Then, when the work is complete and checked on the PC, it is merged with the mainframe repository for official storage. When the merge takes place, the checked-out items are revalidated for consistency with all mainframe repository definitions. Both products offer automatic SQL schema generation for data. IEF offers automatic code generation for Cobol with embedded SQL, and can interface to generators for other languages.

IEW and IEF differ in important ways. IEW is more flexible in that it does not require the completion of any matrices or diagrams. However, to take advantage of the interdiagram evaluation software that assesses completeness and syntactic consistency, all matrices and diagrams are required during a given phase. This means that you might not have the diagrams or analyses from planning, but you still can create levels of ERDs within the analysis tool. Similarly, you might not have the analysis tool, so action diagrams can be created directly within the design tool. IEF's strength is that its rigorous adherence to Information Engineering has led to substantive intelligence checking within the software. Both tools easily manage and sort large matrices that result from several of the analyses.

4 IEW for an OS/2 environment is called the Advanced Development Workbench (ADW).

TABLE 10-6 Installation Plan Items

Due Date  Responsible                   Item
1/10      Mary/Sam                      Order equipment and software
1/15      ML                            Order cable and connectors
2/1       ML                            Plan cable, printer, PC, server locations
2/1       Sam, Jr. Pgmr.                Install and test file server and one PC
2/5       Sam, Jr. Pgmr.                Install and test impact printers
                                        Install and test bar code reader and printer
                                        Install and test Carbon Copy (network version)
                                        Install and test Word Perfect (network version?)
                                        Install and test Norton Utilities (network version)
                                        Install and test Fastback
                                        Install and test Lotus (network version)
                                        Install and test SAM (network version)
2/5       DBA, Sam                      Install and test SQL DBMS (network version)
2/10      ML, Union Contractor          Install and test cable
2/15      DBA, Sam                      Install test application and verify SQL DBMS
5/15      Sam, Vic's LAN Administrator  Move 2 PCs, bar code reader, and 3 printers to permanent locations and test
7/30      LAN Administrator             Move remaining three PCs to permanent locations and test
8/30      Mary, Sam                     Remove CASE tools from PCs, remove single-user software from PCs and file server

The tools have different weaknesses. IEW is primarily a PC-based product that can be unstable when used for large projects. IEW also provides DFDs, not PDFDs, and is not a pure data methodology tool. A strength of IEW is that Knowledgeware was an IBM partner in its repository definition; as a result, IEW is compatible with AD/Cycle software from IBM.

IEF's strength is also its biggest weakness. IEF requires completion of every table, matrix, and diagram at this time (see footnote 5). The level of intelligent checking that can be performed is higher than with most other CASE products, but the requirement to complete every table, and so on, does not make sense for all projects. TI has recognized the severity of this shortcoming and is increasing the flexibility of the product without compromising its capabilities. The mainframe version of IEF uses DB2 for repository management and can generate C, Cobol, DB2, SQL, and other languages' code.

5 As of 1993.

TABLE 10-7 Automated Tool Support for Data-Oriented Methodologies

Product: Analyst/Designer Toolkit
Company: Yourdon, Inc., New York, NY
Technique: Entity-Relationship Diagram (ERD)

Product: Bachman
Company: Bachman Info Systems, Cambridge, MA
Technique: Bachman ERD; Bachman IDMS Schema; Bachman DB2 Relational Schema and Physical Diagram

Product: CorVision
Company: Cortex Corp., Waltham, MA
Technique: Action Diagram; Dataview; ERD; Menu Designer

Product: Deft
Company: Deft, Ontario, Canada
Technique: ERD; Form/Report Painters; Jackson Structured Design (JSD)-Initial Model

Product: Design/1
Company: Arthur Andersen, Inc., Chicago, IL
Technique: ERD

Product: ER-Designer
Company: Chen & Assoc., Baton Rouge, LA
Technique: ERD; Normalization; Schema Generation

Product: IEF
Company: Texas Instruments, Dallas, TX
Technique: Action Diagram; Code Generation; Data Structure Diagram; Dialog Flow Diagram; Entity Hierarchy; ERD; Process Data Flow Diagram; Process Hierarchy; Screen Painter

Product: IEW, ADW (PS/2 Version)
Company: Knowledgeware, Atlanta, GA
Technique: Action Diagram; Code Generation; Database Diagram; ERD; Normalization; Schema Generation; Screen Layout

Product: System Engineer
Company: LBMS, Houston, TX
Technique: ERD; DFD; Menu Dialog; Transaction Dialog; Entity Life History; Module Sequence; DB2, ADABAS, IDMS, Oracle Table Diagram

Product: Teamwork
Company: CADRE Tech. Inc., Providence, RI
Technique: Control Flow; Code Generation; ERD; Process Activation Table; Program Design Tools; Testing Software

Product: VS Designer
Company: Visual Software Inc., Santa Clara, CA
Technique: Process Flow Diagram; Action Diagram

SUMMARY

Data-oriented methods assume that, since data are stable and processes are not, data should be the main focus of activities. First, design focuses on the usage of data to develop a strategy for distributing or centralizing applications. Several matrices summarize process responsibility, data usage, type of data used, transaction volumes, and subjective reasons for centralizing or distributing data.

Next, processes from a process hierarchy diagram are restructured into action diagrams in design. The details of process interrelationships are identified from the PDFD and placed on the action diagram. Each process is fully defined either in a diagram or in the data dictionary. Process details are grouped into modules and compared to existing modules to determine module reusability. Modules are analyzed from a different perspective to reflect concurrency opportunities or requirements on the action diagram. Entities are added to the diagram and related to processes. Lines connect individual processes to attributes to complete the action diagram specification of each application module. For manually drawn diagrams, an optional activity is to identify screens and link them to attributes and processes, to give a complete pictorial representation of the on-line portion of the application.

Data-oriented design focuses on the needs for security, recovery, and audit controls, relating each topic to the data and processes in the application.

The menu structure and dialogue flow for the application are defined next. The menu structure is constructed from the process hierarchy diagram to link activities, processes, and subprocesses for menu design. The structure can be used to facilitate interface designers' application understanding. The dialogue flow documents the flexibility or restrictiveness of the interface by defining the allowable movements from each menu level (from the menu structure) to other levels of menus and processing.

Finally, installation plans for all hardware and software are developed. A list of tasks is defined, responsibilities are assigned, and due dates are allocated to the tasks.

There are two fully functional CASE tools that support data-oriented methodology as discussed in this chapter, IEW and IEF. They are popular in companies that use data-oriented methods.

REFERENCES

Date, C. J., An Introduction to Database Systems, Vol. 1, 5th edition. Reading, MA: Addison-Wesley, 1990.

Finkelstein, Clive, An Introduction to Information Engineering: From Strategic Planning to Information Systems. Reading, MA: Addison-Wesley, 1989.

Knowledgeware, Inc., Information Engineering Workbench™/Analysis Workstation, ESP Release 4.0. Atlanta, GA: Knowledgeware, Inc., 1987.

Loucopoulos, Pericles, and Roberto Zicari, Conceptual Modeling, Databases and CASE: An Integrated View of IS Development. NY: John Wiley & Sons, 1992.

Martin, James, Information Engineering, Vol. 3: Design and Construction. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1990.

Martin, James, and Carma McClure, Diagramming Techniques for Analysts and Programmers. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1985.

Texas Instruments, A Guide to Information Engineering Using the IEF. Dallas, TX: Texas Instruments, 1988.

KEY TERMS

action diagram
application security
audit control
backup
burn-in
candidate for template
code generator
computer verification
concurrent processes
condition bracket
control point
controlled redundancy
data distribution by location matrix
data security
data usage by location matrix
denormalization
dependent concurrent processes
dialogue flow diagram
DRlDc>N -1
DR <NID
federation
full backup
hardware installation plan
horizontal data partitioning
incremental backup
independent concurrent processes
menu structure
normalization
off-site storage
physical security
procedural template
process/location matrix
recovery
recovery procedures
repetition bracket
replication
security plan
selection bracket
sequence bracket
sight verification
subset partitioning
structural relationships
transaction volume matrix
vertical partitioning

EXERCISES

1. Analyze Figures 10-8 to 10-11 and Table 10-1. Develop and present a recommendation for centralization or distribution. Define all recommended data and software locations. Explain your reasoning for each choice.

2. Complete the action diagram for miscellaneous processing. Define the contents of the EOD File.

3. Go visit a local small business such as a video store, restaurant, or supermarket. Assess their security and physical layout. Develop a list of recommendations you would make if installing a computer system for this company. Present your findings to the class and the reasons for your recommendations.

STUDY QUESTIONS

1. Define the following terms:
   action diagram, code generator, control point, controlled redundancy, recovery, repetition bracket, replication, security, transaction volume matrix, vertical data partitioning

2. What are structured programming tenets and why are they important in IE design?

3. What is the purpose of an action diagram?

4. Discuss this assertion: "Normalization to the third normal form and higher is always desirable for a physical database."

5. Define the four types of database distribution.

6. Describe how security, recovery, and audit controls complement each other.

7. There are six types of disasters considered in recovery planning. What are they and what data/application problems do they cause?

8. What are common methods of securing data against unwanted access?

9. What is the purpose of off-site storage? How off-site should off-site storage be?

10. What are the trade-offs in security and recovery design? Why not build a fortress to secure everything?

11. Discuss the differences between full and incremental backup.

12. What features of computers make audit controls difficult?

13. How is a menu structure diagram constructed? What is its purpose?

14. How can dialogue flow diagrams be used to partially provide for access control?

15. What are the structural relationships on an action diagram? Where do they come from?

16. List the steps in developing an action diagram.

17. For what types of applications does concurrency analysis become important?

18. What is reusability analysis? Why is it important?

19. Why, when developing an action diagram, must the processes sometimes change from what is on the PDFD?

20. Describe the matrices and formulae used to determine centralization or distribution of data. In the absence of subjective reasoning, would the matrices and formulae lead to a rational decision? Why or why not?

21. Why is an installation plan important? How can installation be used as a teaching exercise for junior people?

22. What aspects of physical environment should be considered in an installation plan for new equipment?

23. Describe the diagram interrelationships for data and processes from enterprise analysis to analysis to design.

EXTRA-CREDIT QUESTION

1. Analyze the Advanced Office System (AOS) case in the Appendix. Develop all of the distribution matrices and subjective reasoning for/against distribution. Develop recommendations and explain your reasoning for each choice.

CHAPTER 11

OBJECT-ORIENTED ANALYSIS

INTRODUCTION

In this chapter, we reanalyze the requirements for ABC Video's rental processing application using an object-oriented approach. This approach requires the definition of many new terms and a fundamentally different way of thinking about applications and their components. Keep in mind that object orientation is very much an immature methodology class that is still evolving.

Several distinct schools of thought have emerged on how best to represent object thinking. Since they discuss the same topics, the schools have considerable conceptual overlap. The first school is object orientation that uses many graphical forms paralleling those of other methodologies. Authors using this approach are Coad and Yourdon and Rumbaugh et al. (see References at the end of the chapter). The second school of object orientation is tabular, using mainly tables to list and define objects and their parts. This approach is used by Booch and Berrard. The graphical methodologies lack the reasoning processes of Booch's approach, while the tabular method is not easily communicated because of the extensive detail generated. Therefore, the Booch and Coad and Yourdon approaches are both modified and integrated throughout this discussion. Since few people dispute the need for analytical rigor and graphical richness, this type of object methodology is preferable to either one or the other approach used singly.

CONCEPTUAL FOUNDATIONS OF OBJECT-ORIENTED ANALYSIS

Two key concepts define object orientation: encapsulation and inheritance. Encapsulation is a property of programs that describes the complete integration of data with legal processes relating to the data. In addition, encapsulated objects have public and private selves (see Figure 11-1). The public part of an object defines what data are available in the object and the allowable actions of the object. The private part of an object defines local, object-only data and the specific procedures each action takes.
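As an illustration only, the Customer class/object of Figure 11-1 might be sketched in Python as follows. The attribute and process names follow the figure; the leading-underscore convention for private parts, the `_validate_phone` helper, the `_history` log, and the sample values are assumptions made for this sketch, not part of the ABC design.

```python
class Customer:
    """Encapsulated object: data together with the legal processes on that data."""

    def __init__(self, name, phone):
        # Public part: the data the object makes available (the Add process).
        self.customer_name = name
        self.customer_phone = phone
        # Private part: local, object-only data (hypothetical change log).
        self._history = ["add"]

    def update(self, phone):
        """Public part: an allowable action of the object."""
        self._validate_phone(phone)
        self.customer_phone = phone
        self._history.append("update")

    def _validate_phone(self, phone):
        # Private part: the specific procedure behind the public action.
        if not phone.replace("-", "").isdigit():
            raise ValueError("phone must contain only digits and hyphens")


customer = Customer("Pat Jones", "555-0100")
customer.update("555-0199")
print(customer.customer_phone)
```

Callers see only the attributes and `update`; how validation is performed can change without affecting them.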

The second major property of object orientation is inheritance. Inheritance is a property that allows the generic description of objects which are then reused by related objects. Objects are grouped into classes that are defined as like objects that have exactly the same properties, attributes, and processes. Object classes are arranged in hierarchies of relationships. Within a hierarchy, objects at lower levels inherit the data and processes of the superior classes. Hierarchies can also be linked to form lattice-like networks of hierarchies of objects.

[Figure 11-1 contents: Class/Object: Customer (object name); Attributes: CustomerPhone, CustomerName; Processes: Add, Update, Delete; regions labeled Public Parts and Private Parts]

FIGURE 11-1 Encapsulated Object: Public and Private Parts

An example of an object class is employees (see Figure 11-2). Each employee has a name, address, social security number, and so forth. Some employees are also managers. Managers are a subclass of the employee class. By subclass, we mean that managers have the same properties as employees (because they are employees), and that they also have properties that only managers have. Managers might have an additional subclass of managers who are on a management committee. The management committee subclass is said to have multiple inheritance because it inherits the properties, attributes, and processes of employees and managers as well as having its own.
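The employee, manager, and management committee chain described above can be sketched in Python as follows. This is a minimal sketch: the `department` and `committee_role` attributes, the `describe` process, and the sample values are hypothetical, chosen only to show lower-level classes reusing what higher-level classes define.

```python
class Employee:
    def __init__(self, name, address):
        self.name = name
        self.address = address

    def describe(self):
        return f"{self.name}, {self.address}"


class Manager(Employee):
    """Subclass: everything an Employee has, plus manager-only properties."""

    def __init__(self, name, address, department):
        super().__init__(name, address)
        self.department = department  # hypothetical manager-only attribute


class ManagementCommitteeMember(Manager):
    """Inherits from Manager and, through it, from Employee."""

    def __init__(self, name, address, department, committee_role):
        super().__init__(name, address, department)
        self.committee_role = committee_role  # hypothetical attribute


member = ManagementCommitteeMember("Lee Chan", "12 Elm St", "Finance", "chair")
print(member.describe())             # process inherited from Employee
print(isinstance(member, Employee))  # a committee member is also an employee
```

Nothing about names or addresses is redefined in the subclasses; they obtain it through the hierarchy.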

Object orientation is an approach to thinking about problems that, when properly applied, represents a substantive improvement in the resulting analysis, design, and code modules. For 30 years, we have known that the key goal of software engineering is to manage the complexity of the problems we automate. We have also known that the best way to manage complexity is to decompose the larger problems into intellectually manageable, small tasks, that hide their internal workings from other modules, and that are coupled only by communicating messages (see footnote 1). These are the goals of analysis and design that lead to well-structured and well-formulated programs and modules. Object orientation, when properly applied, appears to come closer to automatically resulting in these desirable outcomes than other ways of thinking.

Thinking in objects requires a paradigm shift. A paradigm is a generally agreed upon way of thinking about a situation. In the process methods we concentrate on functional thinking, or the steps taken to perform some procedure. In data methods, we concentrate on entity thinking, or the data objects and their interrelationships that dictate much processing. Entity thinking is a difference in degree rather than a difference in kind, a foreground/background shift. We move from processes that change data to emphasizing data that require processing (see Figure 11-3).

1 See the works of C. A. R. Hoare, David Parnas, Niklaus Wirth, and Edsger Dijkstra. In particular, the discussions are summarized in the following references: Hoare, C. A. R., "The Emperor's Old Clothes," and Dijkstra, Edsger, "The Humble Programmer," both in ACM Turing Award Lectures, NY: ACM Press and Addison-Wesley, 1987; Parnas, David, "A Technique for Software Module Specification with Examples," Communications of the ACM, Vol. 15, #5, May 1972, pp. 330-336; Parnas, David, "On the Criteria to be Used in Decomposing Systems into Modules," Communications of the ACM, Vol. 15, #12, December 1972, pp. 1053-1058; and Wirth, Niklaus, "Program Development by Stepwise Refinement," Communications of the ACM, Vol. 14, #4, April 1971, pp. 221-227.

[Figure 11-2 labels: Employee, Manager, Intramural Baseball Team, Management Committee]

FIGURE 11-2 Example Object Class Hierarchy

[Figure 11-3 panels: Process Methodologies (Function = Group of Activities Describing Business Processes); Data Methodologies (Entity = Class of Business Thing which the Application tracks); Entity-Relationship Diagram (Entity = Bus. Entity, Relationship = Bus. Constraint)]

FIGURE 11-3 Process and Data Methodologies as Flip Sides of the Same Paradigm

In object thinking, we can identify data and processes somewhat independently, but they are married early on and must be thought of together, forever after, to reason properly about their behavior and contents. The paradigm shift to object thinking is from thinking of data and processes as separate to thinking of data and processes as one.

Several times in this discussion, we have used the phrase "when properly applied." Object orientation is no different from any other methodology in that it requires consistency and correct reasoning to result in the desirable properties described. When improperly applied, object orientation results in a badly designed application that might actually be less efficient than the same application designed poorly using some other methodology.

DEFINITION OF OBJECT-ORIENTED TERMS

Object orientation is based on the notion of objects which encapsulate both data and processes on that data. An object is an entity from the real world whose processes and attributes (that is, the data) are modeled in a computerized application.

Processes are variously called functions, actions, services, programs, methods, properties, or modules; these terms may or may not have the same meaning to the people using them. For that reason, we stick to the term process to mean the transformational program language code that acts on its object data.

An abstract data type (ADT) is the name used in some languages (e.g., C) for the new, user-defined data type that encapsulates definitions of object data plus legal processes for that data. In this text, we use the terms encapsulated object, object, and abstract data type interchangeably.

The major analysis activities focus on defining objects, classes, and processes. Class/objects are the lowest level of logical design entity. Class/objects define a set of items which share the same attributes and processes, and manage the instances of the collection. The class defines the attributes and processes; the objects are the instances of the class definition.

There are different types of class-object relationships. First, classes can occur without having any real data associated with them. Classes whose instances are other classes are called meta-classes. For instance, we might define a class Customer with subclasses for CashCustomer and CreditCustomer. The class Customer is a meta-class; the subclasses are class/objects which manage the data of Customer.
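One way to approximate the Customer example in Python is an abstract base class that holds no customer data and cannot be instantiated, with the subclasses managing the data. This is only an approximation of the idea in the text: Python's own `metaclass` feature is a different mechanism, and the `payment_terms` process, the `credit_limit` attribute, and the sample values are hypothetical.

```python
from abc import ABC, abstractmethod


class Customer(ABC):
    """Holds no customer data itself; only its subclasses are instantiated."""

    @abstractmethod
    def payment_terms(self):
        """Each subclass supplies its own version of this process."""


class CashCustomer(Customer):
    def __init__(self, name):
        self.name = name

    def payment_terms(self):
        return "pay at time of rental"


class CreditCustomer(Customer):
    def __init__(self, name, credit_limit):
        self.name = name
        self.credit_limit = credit_limit

    def payment_terms(self):
        return f"bill later, limit {self.credit_limit}"


customers = [CashCustomer("Pat Jones"), CreditCustomer("Lee Chan", 50)]
for c in customers:
    print(c.name, "-", c.payment_terms())
```

Attempting `Customer()` directly raises `TypeError`, which mirrors the statement that the class occurs without real data of its own.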

Classes can be composed of class/objects to describe a composition relationship of whole and part. A whole class defines the composed object type. The part class defines all the components of the whole class. For instance, a car, as a whole class, contains parts that include motor, wheels, doors, seats, and so on.
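The car example of a whole class and its part classes can be sketched as composition in Python. This is a minimal sketch covering only two of the parts named in the text; the `horsepower` and `position` attributes and the `part_count` process are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Motor:
    horsepower: int  # hypothetical attribute


@dataclass
class Wheel:
    position: str  # hypothetical attribute


@dataclass
class Car:
    """Whole class: composed of instances of its part classes."""
    motor: Motor
    wheels: List[Wheel] = field(default_factory=list)

    def part_count(self):
        return 1 + len(self.wheels)


positions = ("front-left", "front-right", "rear-left", "rear-right")
car = Car(Motor(120), [Wheel(p) for p in positions])
print(car.part_count())
```

Unlike a subclass, a part is not a kind of car; the whole holds references to its parts rather than inheriting from them.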

Classes can also be defined to allow specialized versions of an item. The meta-class is called a generalization class, or gen class for short. The subclasses are called specialization, or spec, classes. A generalization class defines a group of similar objects. For instance, vehicle is a generalization of car. The specialization class is a subclass that reflects an is-a relationship, defining a more detailed description of the gen class. For instance, a car, truck, or tank are all specializations of the general class vehicle. These could be further specialized themselves. For instance, car could have specializations by type of car: full-size, mid-size, or economy.

Each type of class and its subclasses form a hierarchic, lattice-like arrangement of relationships. Through the relationships, the lower-level classes inherit the data and processes of the related higher-level classes. Thus, if we were to refer to an economyCar object, we would have information and processing for vehicles, cars, and economy cars all available.

Messages are the only legal means of communications between encapsulated objects. Messages are clear in their intention but not clear in their implementation, which is completely determined by the language (see message types in Figure 11-4). For instance, at the moment Ada does not implement message communication. In this text, a message is the unit of communication between two objects. Messages contain an addressee (that is, the object providing the process, also called a service object), and some identification of the requested process.
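A message as defined here (an addressee plus an identification of the requested process, optionally with arguments) can be modeled explicitly. The sketch below is an assumption about one possible representation, not how any particular language implements messages; the `Message` type, the `send` function, and the `RentalOrder` class are hypothetical, while `compute_total` and its two arguments echo the binary message in Figure 11-4.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Message:
    addressee: object      # the object providing the process
    service: str           # identification of the requested process
    arguments: Tuple = ()  # empty for a unary message


def send(message):
    """Deliver a message: find the requested process on the addressee and run it."""
    process = getattr(message.addressee, message.service)
    return process(*message.arguments)


class RentalOrder:
    def compute_total(self, past_due_fees, current_fees):
        return past_due_fees + current_fees


order = RentalOrder()
total = send(Message(order, "compute_total", (2.0, 6.5)))
print(total)
```

The sender names only the addressee and the service; how the total is computed stays inside the addressee.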

A major difference between object orientation and other methodologies is the shifting of responsibility for defining the data type of legal processes from supplier (or called) objects to client (or calling) objects. This shift, along with the notions of inheritance and dynamic binding, supports the use of polymorphism, which is the ability to have the same process take different forms when associated with different objects. Dynamic binding is a language property that selects actual modules to execute during application operation. The concept is completely described in Chapter 12.

A supplier object is one that performs a requested process. A client object is one that requests a process from a supplier. For instance, I might need to have a date translated from month-day-year format to year-month-day format. As a client object, I request the translation of the supplier object and pass it the date to translate. If the language supports polymorphism, I also pass the data type of the date to be translated.
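The date translation example can be sketched with one supplier class and one client class. The class names, the method names, and the hyphen-separated string format for dates are assumptions made for illustration.

```python
class DateTranslator:
    """Supplier object: performs the requested process."""

    def to_year_month_day(self, month_day_year):
        month, day, year = month_day_year.split("-")
        return f"{year}-{month}-{day}"


class RentalClerk:
    """Client object: requests the process and passes the date to translate."""

    def __init__(self, supplier):
        self.supplier = supplier

    def file_date(self, month_day_year):
        return self.supplier.to_year_month_day(month_day_year)


clerk = RentalClerk(DateTranslator())
print(clerk.file_date("08-15-1993"))  # prints 1993-08-15
```

The client knows what to ask for and what to pass; only the supplier knows how the translation is done.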

Unary Message: Addressee, Service Identifier
    Example: Customer : Create

Binary Message: Addressee, Service Identifier, Arguments
    Example: ComputeTotal PastDueFees, CurrentFees

Keyword Message: Addressee, Service Identifier, Keyword Expression(s)

FIGURE 11-4 Example of Message Types

An example of polymorphism is, for instance, a process to perform comparison of two items to identify the 'larger' of the two. One object might be alphabetic, requiring a logical comparison; another object might be decimal numeric, requiring a numerical comparison; a third object might be an array, requiring numerical array comparisons. This polymorphic object has three implementations of its process to compare and determine the larger of two items. The client object requests a specific comparison process, here either alpha, numeric, or array.
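The 'larger' comparison can be sketched with Python's `functools.singledispatch`, which selects an implementation at run time from the type of the first argument. This is one possible mechanism, not the one the text prescribes. The rule used for arrays (compare the sums of the elements) is an assumption, since the text does not define how arrays are compared.

```python
from functools import singledispatch


@singledispatch
def larger(first, second):
    """One process name; the form it takes depends on the data type."""
    raise TypeError(f"no comparison defined for {type(first).__name__}")


@larger.register(str)
def _larger_alpha(first, second):
    # Alphabetic items: comparison by collating sequence.
    return max(first, second)


@larger.register(int)
@larger.register(float)
def _larger_numeric(first, second):
    # Numeric items: numerical comparison.
    return max(first, second)


@larger.register(list)
def _larger_array(first, second):
    # Arrays: compare element sums (an assumed rule for this sketch).
    return first if sum(first) >= sum(second) else second


print(larger("apple", "pear"))
print(larger(3, 7))
print(larger([1, 2], [5]))
```

The caller writes `larger(...)` in every case; the binding to the alphabetic, numeric, or array implementation happens while the program runs.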

To summarize the terms, objects are encapsulations of data and processes that have both public and private parts. Objects can communicate via messages which differ by language. Objects are arranged into classes of similar objects, and can belong to more than one class. By the property of inheritance, an object exhibits the attributes and provides the services of the classes of which it is a part. Polymorphism is a desirable property of objects but requires a client-server view of objects along with dynamic binding capabilities.

Keyword message example (Figure 11-4): Field=DateIn, DataType=Integer

OBJECT-ORIENTED ANALYSIS ACTIVITIES

The documentation for object-oriented analysis (see footnote 2) includes a series of tables and graphics (Figure 11-5). The tables are lists that document individual components of the analysis: objects, processes (and their assignment to objects), attributes, and classes. The graphics show relationships between objects and object classes, state transitions of intraobject changes in the application, and time-ordering interobject-event processing. Each documentation representation is elaborated by tracing the object-oriented analysis of ABC Video's rental processing system.

2 The analysis documentation builds primarily on the work of Booch [1983, 1991] and Berrard [1985]. The Class diagrams, subject summary, gen-spec and whole-part diagrams are all from Coad and Yourdon, 2nd ed. [1990].


Summary Paragraph: Provides a brief summary of all major functions to be performed.

Tables/Lists

Object List: Contains potential objects (nouns) from the paragraph. Each entry is evaluated to determine that it is an object, to classify it as solution space or problem space related, and to assign it a unique, formal name.

Process List: Contains potential processes (verbs) from the paragraph. Each is evaluated to determine that it is a process, to classify it as solution space or problem space related, and to assign it a unique, formal name. All solution space class/objects are tentatively related to processes and the relationships are evaluated.

Object-Attribute List: Contains field name attributes with each object they describe. Each class/object's entries are normalized and other class/objects are created as needed.

Process-Attribute List: Contains formulae, constraints on processing, and state/status changes for each process as required; some processes have no attributes.

Diagrams

Object Relationship Diagram: Identifies objects with connecting lines showing different types of interobject relationships.

Class Hierarchy Diagram: Shows objects arranged in one or more lattice hierarchies to link shared data/processes and to depict inheritance of those data/processes.

Generalization/Specialization Structure Diagrams: Depicts objects which express is-a relationships. This diagram is optional.

Whole/Part Structure Diagrams: Depicts objects which are compositions for which the whole class is composed of one or more of the part subclasses. This diagram is optional.

Subject Summary Diagram: The highest level of independent classes or class/objects in each leg of a hierarchy are promoted to subjects for inclusion in this diagram which provides a summary of the classes in the application. This diagram is optional.

State Transition Diagram: Contains system states (i.e., statuses) and the events (process outcomes) that cause those states to exist.

FIGURE 11-5 Summary of Object-Oriented Analysis Documentation

Develop Summary Paragraph

Rules for Summary Paragraph

The first, and most important, step of object-oriented analysis is to develop a single summary paragraph describing the problem. The purpose of the paragraph is to focus your attention on the most concrete, yet high-level description of the problem. Hidden within a good summary are the main class/objects and the main processes to be provided by the application. In a large application, development will be iterative with a series of more detailed summary paragraphs developed to elaborate the individual sentences from a summary. In a smaller problem, like ABC Video's, we only need one level of summary.

The guidelines for writing the paragraph are as follows:

1. Write only declarative sentences of the form:

   Noun-Verb
   Noun-Verb-Object
   Verb-Object

2. For ease of quality assurance, write each sentence on its own line.

3. Review the paragraph carefully to ensure:

   • All desired functions are represented.
   • All major information and processes are identified.
   • All sentences are at the same level of abstraction, detail, and importance.

These are guidelines because the development of the paragraph is an individual activity performed by the SE with the user, and specific to each application. It is one result of interviews and other data collections that take place before and during analysis. Object orientation assumes that you have the requirements for the application in hand and understand what the application is supposed to do (see footnote 3). There are no graphical representations for paragraph information.

ABC Video Example Paragraph

Refer back to Chapter 2 for the description of ABC Video's rental processing requirements. The initial paragraph reads:

Customers select one to n videos for rental. Customer phone number is entered to retrieve customer data and create an order. Bar code IDs for each tape are entered and video information from inventory is displayed. The video inventory file is updated (decrease the count of available copies by one). When all tape IDs are entered, the system computes the total. Money is collected and the amount is entered into the system. Change is computed and displayed. The rental is created, printed, and stored. The customer signs the rental form, takes the tape(s), and leaves. To return a tape, the video Bar Code ID is entered into the system. The rental is displayed and the tape is marked with the date of return. If past-due amounts are owed, they can be paid at this time; or the clerk can select an option which updates the rental with the return date and calculates past-due fees. Any outstanding video rentals are displayed with the amount due on each tape and a total amount due. The past-due amount must be reduced to zero when new tapes are taken out. For new customers, the customer information is entered into the system and added to the customers. For new videos, the video information is entered into the system and added to inventory.

3 Lorenz [1993] recommends the development of 'use cases' which track all variations of each transaction through its processing. This is, in essence, what you do in interviews with users during a normal data collection activity.

1. Customers select one to n videos for rental.
2. Customer phone number is entered to retrieve customer data and create an order.
3. Bar code IDs for each tape are entered and video information from inventory is displayed.
4. The video inventory file is updated (decrease the count of available copies by one).
5. When all tape IDs are entered, the system computes the total.
6. Money is collected and the amount is entered into the system.
7. Change is computed and displayed.
8. The rental is created, printed, and stored.
9. The customer signs the order form, takes the tape(s), and leaves.
10. To return a tape, the video Bar Code ID is entered into the system.
11. The rental is displayed and the tape is marked with the date of return.
12. If past-due amounts are owed, they can be paid at this time; or the clerk can select an option which updates the rental with the return date and calculates past-due fees.
13. Any outstanding video rentals are displayed with the amount due on each tape and a total amount due.
14. The past-due amount must be reduced to zero when new tapes are taken out.
15. For new customers, the customer information is entered into the system and added to the customers.
16. For new videos, the video information is entered into the system and added to inventory.

FIGURE 11-6 Initial Paragraph in Numbered Sentence Format

The paragraph is reformatted as a numbered list of sentences (see Figure 11-6). This numbered sentence format is recommended because it simplifies discussion, quality assurance, and reviews.

Once the paragraph is drafted, you examine each sentence carefully to make sure all the pertinent information is present and clearly stated. In this paragraph, there is confusion about a 'new order' in sentence 2 and an 'outstanding video rental' in sentence 13. You ask yourself, What do we mean by an 'order'? If you do not know, you may need to ask the client what he means by an order.

Vic wants an order to have information that is linked to video information whenever customers have any videos out on rent, that is, they are an 'active' customer. An order should contain information about all current rentals, dates returned, and late fees. Any other fees owed, for instance, penalties assessed for late payment, should also be present until they are paid. In other words, Vic uses the word order to describe what we have termed a rental. This confusion is cleared up immediately because different words for the same items always cause confusion. Vic does not mind changing the term order to rental. He uses the term order because he thinks his business is similar to order-entry processing, which he managed in an old job. The major difference between these two activities is that Vic has a cash business, whereas order-entry applications are usually used in accrual accounting businesses that link to accounts receivable accounting. Vic is correct; there is similarity between rentals and order processing, but the term rental fits this particular business and will be used.

To be consistent in the use of terms, we modify sentence 2 to read:

2. Customer phone number is entered to retrieve customer data either to create a rental or to retrieve active rentals.

This change also implies a status for rentals of 'active' or 'inactive' which we will need to further clarify.

The term video information from inventory in sentence 3 should be more specific. Knowing the actual fields to be displayed will be helpful in the class analysis and in attribute definition. Upon further conversation with Vic, you change the information to read:

3. Bar code IDs for each tape are entered.
3a. Video name and rental price from inventory are displayed.

The next unclear issue is: When is money collected for new rentals? Can a customer rent a video, pay past-due fees, and pay for the current video rental upon its return? Again, we go back to Vic, the client, and ask him what he wants.

Vic says, "I would like as little bureaucracy as possible in this system. Since 80% of videos are returned on time, I want new rentals paid in advance, when they are rented. About 90% of my customers return their videos through a slot in the door during nonworking hours. Any videos that have late fees are checked in, and a note of past-due fees must be made.

"For legal reasons, I must be able to prove how past-due fees are derived. To meet this obligation, the past-due fee amount, rental date and return date must all be maintained.

"Also, I do not want to encourage 'deadbeats' who do not pay for their rentals, so I insist that any outstanding fees be paid before any new rentals."

With the above information supplied by Vic, we evaluate the sentences dealing with payments. Although they remain somewhat ambiguous, they would be sufficient if we chose not to change them. The information is clearer if sentences 13 and 14 are moved between sentences 2 and 3 and are renumbered 2a and 2b for the present.

One remaining ambiguity might be computations for the 'total' and 'change.' If the computations are understood, they are not required in the paragraph. We do not need the computations for the paragraph, but we will need them soon. So, if the computations are not understood, you again go back to Vic and ask how the computations are performed.

Vic: "There are two basic totals: one for settling past-due fees and one for the current rental. They may be computed together as the rental total equal to the sum of all past-due items, fees, taxes, and current rentals. Change is computed as the rental-total less amount paid."

Vic's definition of the rental-total raises a new question about the paying of late fees and sentence 2b. If past-due fees must be settled before any current rentals are allowed, how can you add the information together to create the rental-total?

Old #   New #   Sentence

2.      2.      Customer phone number is entered to retrieve customer data either to create a rental or to retrieve an active rental.

2a.     3.      Any outstanding video rentals are displayed with the amount due on each tape and a total amount due.

2b.     Note    The past-due amount must be reduced to zero when new rentals are made.

3.      4.      Bar code IDs for each tape are entered.

3a.     5.      Video name and rental price from inventory are displayed.

5.      6.      When all tape IDs are entered, the system computes the total (= Σ past-due fees + Σ other fees + Σ current video rental fees).

6.      7.      Money is collected and the amount is entered into the system.

7.      8.      Change is computed (= amount entered - order-total) and displayed.

-       9.      If the change amount is negative, that is, the customer did not pay for all fees, the clerk asks for more money.

-       10.     If the customer gives the clerk more money, return to step 7, else, when the clerk presses an order complete key, the system 'pays-off' the fees on a first-in-first-paid order until the amount entered is used up. The rental is redisplayed. Past-due items 'paid-off' are marked paid and the status of the current video rentals are either paid or due.

-       11.     If the amount entered paid for one or more current rentals, they are updated as paid and the videos are given to the customer; else when the clerk presses the rental complete key again, the current rentals not paid for are removed and placed back in stock.

4.      12.     When the clerk presses a rental complete key (to be defined by the system), this order is complete and the video inventory file is updated (decrease the count of available copies by one).

8.      13.     The rental is stored and printed.

FIGURE 11-7 Partially Renumbered Paragraph

"Oh," says Vic, "I meant that the clerk should not give the customer the video tapes until all of the past-due fees plus current rental fees are paid. They can still process the current rentals on the computer at the same time. Remember, my motto is no bureaucracy."

This new information does change at least the order of sentences 2 through 8 (see Figure 11-7). At the end of the paragraph, add the following so the information is not lost.

2b. NOTE: The amount paid less change must be equal to the rental-total or the clerk should politely refuse to give the customer the current tapes.
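The arithmetic behind the total, the change, and the NOTE can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the three fee lists follow the total formula in Figure 11-7, and change is taken as amount entered less rental total, as in the figure.

```python
from decimal import Decimal


def rental_total(past_due_fees, other_fees, current_rental_fees):
    """Rental total = sum of past-due fees + other fees + current rental fees."""
    zero = Decimal("0")
    return (sum(past_due_fees, zero)
            + sum(other_fees, zero)
            + sum(current_rental_fees, zero))


def change_due(amount_entered, total):
    """Change = amount entered - rental total; a negative value means money is still owed."""
    return amount_entered - total


def may_release_tapes(amount_entered, total):
    """NOTE 2b: the amount paid less change must equal the rental total."""
    # Assumption: change handed back to the customer is never negative.
    change_given = max(change_due(amount_entered, total), Decimal("0"))
    return amount_entered - change_given == total


total = rental_total([Decimal("3.00")], [Decimal("1.50")],
                     [Decimal("2.50"), Decimal("2.50")])
print(total, change_due(Decimal("10.00"), total))
```

Decimal is used rather than float so that currency amounts compare exactly.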

The new sentences 9, 10, and 11 add needed information to our understanding of the problem, but now they are at a different level of detail from the other sentences. They constitute processing that accompanies change. So, to keep the level of abstraction consistent, they should be removed from this paragraph and kept for use during the next iteration of change processing. To indicate that other steps are needed to process change, modify sentence 8 to read:

8. Change is computed (= amount-entered - rental-total), displayed, and further processed by the clerk as required.
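The deferred change processing (new sentences 9 through 11 of Figure 11-7) pays off fees in first-in-first-paid order until the amount entered is used up. A minimal sketch of that rule follows. The function name and the data layout are hypothetical, and it assumes a fee is marked paid only when it is covered in full, with payment stopping at the first fee that cannot be covered; the text does not say how partial payments are handled.

```python
from decimal import Decimal


def pay_off(fees, amount_entered):
    """Apply a payment to fees in first-in-first-paid order.

    fees: list of (label, amount) tuples, oldest first.
    Returns (paid_labels, unpaid_labels, amount_left_over).
    """
    remaining = amount_entered
    paid, unpaid = [], []
    for index, (label, amount) in enumerate(fees):
        if amount > remaining:
            # Money is used up: this fee and every later one stay unpaid.
            unpaid = [name for name, _ in fees[index:]]
            break
        remaining -= amount
        paid.append(label)
    return paid, unpaid, remaining


fees = [("late fee 1", Decimal("2.00")),
        ("late fee 2", Decimal("3.00")),
        ("current rental A", Decimal("2.50"))]
print(pay_off(fees, Decimal("6.00")))
```

In this example the two past-due items are paid and the current rental is not, which corresponds to the case in sentence 11 where an unpaid current rental goes back in stock.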

At the moment, the final paragraph for ABC Video's rental processing system should read like the one in Figure 11-8. All major functions, data entities, information sources, and destinations are identified. All sentences are at the same level of abstraction, detail, and importance.

Identify Objects of Interest

Rules for Identifying Class/Objects

The next step is to identify and analyze all of the class/objects of interest. The items are called class/objects because they identify a collection (class) of like instances (objects). The rules are summarized here:

1. Underline all nouns in the summary paragraph.

2. List the underlined nouns on a separate sheet of paper, using the exact same sequence and spelling as in the paragraph.

3. Evaluate each noun to make sure it is an object. (Common errors are to include attributes, objects that are not of interest to the solution of this problem, or physical objects we do not keep information about.)

4. Determine whether the object is in the solution space (must be present both to describe the problem and to develop a solution) or the problem space (must be present to describe the problem).

5. Name each unique object in the solution space. Ignore the objects in the problem space. Use the convention '=name' to identify duplicates of already named objects and to show that you know it is a duplicate.

The mechanics of the identification are to underline the nouns in the paragraph. Once the underlining is done, make a list of the nouns on a separate sheet of paper. When making the list, keep the nouns in exactly the same sequence as they occurred in the paragraph and use exactly the same spelling as occurred in the paragraph!

Next, evaluate each noun to make sure it is an object. Use criteria similar to those for identifying entities in the data methodology: people, places, events, applications, organizations, or other abstractions about which the application must keep information or for which processing is required. If the items in the list fit any of these criteria and pass the other tests, keep them on the list.

To rent tapes,

1. Customers select one to n videos for rental.
2. Customer phone number is entered to retrieve customer data either to create a rental or to retrieve an active rental.
3. Any outstanding video rentals are displayed with the amount due on each tape and a total amount due.
4. Bar code IDs for each tape are entered.
5. Video name and rental price from inventory are displayed.
6. When all tape IDs are entered, the system computes the total (= Σ past-due fees + Σ other fees + Σ current video rental fees).
7. Money is collected and the amount is entered into the system.
8. Change is computed (= amount entered - order-total), displayed, and further processed by the clerk as required.
9. When the clerk presses an 'order-complete' option key (to be defined by the system), this rental is complete and the video inventory file is updated (decrease the count of available copies by one).
10. The rental is stored and printed.
11. The customer signs the order form, takes the tape, and leaves.

To return a tape,

12. The video bar code ID is entered into the system.
13. The rental is displayed and the tape is marked with the date of return.
14. If past-due amounts are owed, they can be paid at this time; or the clerk can select the 'order-complete' option which updates the rental with the return date and calculates past-due fees.

To add a customer:

15. Enter customer information.
16. Create customer.

To add a new video:

17. Enter video information.
18. Create video inventory.

NOTE: The entire amount owed must be paid before any rentals are allowed. That is, the amount paid less change must be equal to the rental total or the clerk should politely refuse to give the customer the current tapes.

FIGURE 11-8 Final Paragraph for ABC Order Processing

Object Name
Attributes ...
Processes ...

FIGURE 11-9 Class/Object Diagram Format

There are no hard and fast rules for this process, only heuristics or rules of thumb. Ask yourself the following sets of questions. Does the noun identify something from the real world you want to store information about? If so, keep going. If not, it is not an object in this system, so cross it off.

Does the noun identify something that takes on values itself, for instance, a social security number, balance, or rental total? If so, these are attributes (or fields) describing an object. Cross them off this list and put them on a list of attributes somewhere. If not, then keep going.

Does this name uniquely identify a set of things with the same attributes? If so, keep going. If not, if it identifies one unique thing, it may still be an object but you should look for commonalities and combine with some other class/object.
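The three questions above form a small decision procedure. The sketch below is one hypothetical way to write it down; the enum and function names are illustrative only, and the yes/no answers still have to come from the analyst's judgment, since these are heuristics rather than hard rules.

```python
from enum import Enum


class Disposition(Enum):
    CROSS_OFF = "not an object in this system"
    ATTRIBUTE = "attribute; move it to the attribute list"
    CLASS_OBJECT = "keep as a candidate class/object"
    COMBINE = "possible object; look for a class/object to combine it with"


def triage_noun(stores_information, takes_on_value, names_a_set):
    """Apply the three questions in the order the text asks them."""
    if not stores_information:   # nothing from the real world we keep information about
        return Disposition.CROSS_OFF
    if takes_on_value:           # e.g., a balance or a rental total
        return Disposition.ATTRIBUTE
    if names_a_set:              # a set of things sharing the same attributes
        return Disposition.CLASS_OBJECT
    return Disposition.COMBINE   # one unique thing: look for commonalities


# 'customer' names a set of like things we keep information about.
print(triage_noun(True, False, True))
# 'rental total' takes on a value itself, so it is an attribute.
print(triage_noun(True, True, False))
```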

Once you have crossed off all nonobjects in this application, you are ready for the next analysis on objects: Determine if it is in the problem space or in the solution space. The problem space includes objects that are required to describe the problem but are not required to describe the solution. For instance, you might need to know something about IRS reporting requirements to properly define the length of time you need to keep an accounting file of transactions. But the IRS does not factor into the solution, nor do you keep any information about the IRS in the application. In this example, the IRS would be a problem space object.

Class/object example (Customer):

Class/Object: Customer
Attributes: CustomerPhone, CustomerName, CustomerAddress, CustomerCreditRating
Processes: Add, Query, Update, CheckCredit, Delete

The solution space includes objects that are required both to describe the problem and to develop a solution. In ABC Video, 'customer' is necessary to both the problem definition and to the automated application solution. So, it is in the solution space.

When you are done evaluating all entries in the list, the solution space objects are given a class/object name by which they are known for the life of the application. During this step, we eliminate duplicates of each object. By convention, the name in the list is entered as either ObjectName or =ObjectName. The format ObjectName identifies a unique class/object. The format =ObjectName identifies a synonym of a class/object. The =ObjectName entry assures quality assurance reviewers that you have accounted for all objects and have considered every entry on the list.
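The naming convention is mechanical once each solution space noun has been matched to a class/object name. The following sketch assumes that matching has already been done by hand; the function name is hypothetical, and the sample pairs use names from the ABC Video example developed later in this section.

```python
def label_objects(entries):
    """Mark repeats of an already named class/object with a leading '='.

    entries: list of (noun, class_object_name) pairs in paragraph order.
    Returns a list of (noun, label) pairs in the same order.
    """
    seen = set()
    labelled = []
    for noun, name in entries:
        if name in seen:
            labelled.append((noun, "=" + name))   # synonym or duplicate
        else:
            seen.add(name)
            labelled.append((noun, name))         # first, unique occurrence
    return labelled


entries = [
    ("Customers", "Customer"),
    ("videos", "VideoInventory"),
    ("rental", "Rental"),
    ("active rental", "Rental"),
    ("outstanding video rentals", "VideoOnRental"),
    ("tape", "VideoOnRental"),
]
for noun, label in label_objects(entries):
    print(noun, "->", label)
```

Because every entry keeps its position and receives a label, a reviewer can check the output against the noun list line by line.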

Finally, a class/object diagram is begun. A class/object is a collection of like things in a class; the objects are the individual instances of the things in the class. Class/objects are drawn as a rounded vertical rectangle with a shadow rectangle. The class/object is divided into three parts to depict the name, attributes, and processes (see Figure 11-9). The three areas identify public information relating to the class/object. Eventually other details are added for private information during design. Now, let us return to ABC's application to develop the object list.


ABC Video Example Object List

First, we underline the nouns from the paragraph (see Figure 11-10). Objects represent people, organizations, events, applications, or other abstractions from the real world about which we need to keep information. These are all identified by nouns. The underlined nouns represent all of the potential objects from the paragraph. If the paragraph is complete, this action should result in the identification of all major objects relating to the application.

Next, list the objects exactly as they are spelled and ordered in the paragraph. The first-cut object list is shown in Figure 11-11. The dispositions for each object are discussed here.

The first analysis is to eliminate attributes from the list. In the first-cut object list, attributes are crossed out and their respective objects are listed. Attributes change value for each related object instance. To identify an attribute, we ask, Can this name take on a value? If the answer is yes, it is an attribute. Attributes are set aside for use in a future step.

Figure 11-11 shows Rental attributes including AmountDue, TotalAmountDue, RentalTotal, Amount, and Change. Attributes of Videos on Rentals include RentalPrice, ReturnDate, and PastDueFees. Video attributes include BarCodeId and VideoName. Finally, PhoneNumber is an attribute of Customer.

Next, we evaluate remaining nouns to determine if they are objects. The nouns that are clearly objects are the following:

customers
videos
rental (4 times)
tape (4 times)
money
clerk (3 times)
video inventory file
rental form
system

The objects in the above list do not take on values of their own. They are material and distinct, and they are of interest to the application. Therefore, they are objects.

(Underlined nouns are shown between underscores.)

To rent tapes,

1. _Customers_ select one to n _videos_ for rental.
2. _Customer phone number_ is entered to retrieve _customer data_ either to create a _rental_ or to retrieve an _active rental_.
3. Any _outstanding video rentals_ are displayed with the amount due on each _tape_ and a _total amount due_.
4. _Bar code IDs_ for each _tape_ are entered.
5. _Video name_ and _rental price_ from inventory are displayed.
6. When all _tape IDs_ are entered, the _system_ computes the _rental total_ (= Σ past-due fees + Σ other fees + Σ current video rental fees).
7. _Money_ is collected and the _amount_ is entered into the _system_.
8. _Change_ is computed (= amount entered - order-total), displayed, and further processed by the _clerk_ as required.
9. When the _clerk_ presses a _'rental-complete' option key_ (to be defined by the system), this _rental_ is complete and the _video inventory file_ is updated (decrease the count of available copies by one).
10. The _rental_ is stored and printed.
11. The _customer_ signs the _rental form_, takes the _tape_, and leaves.

To return a tape,

12. The _video bar code ID_ is entered into the _system_.
13. The _rental_ is displayed and the _tape_ is marked with the _date of return_.
14. If _past-due amounts_ are owed, _they_ can be paid at this time, or the _clerk_ can select the _'rental-complete'_ option which updates the _rental_ with the _return date_ and calculates _past-due fees_.

For new customers,

15. Enter _customer information_.
16. Create _customer_.

For new videos,

17. Enter _video information_.
18. Create _video_.

FIGURE 11-10 Underlined Nouns

Noun from Paragraph                   Disposition

Customers                             Object
videos                                Object
* Customer phone number               Attribute of Customer, Rental
customer data                         Object
rental                                Object
active rental                         Object
outstanding video rentals             Object
tape                                  Object
* total amount due                    Attribute of Rental
* Bar code IDs                        Attribute of Video, VideoOnRental (VOR)
tape                                  Object
* Video name                          Attribute of Video
* rental price                        Attribute of Video, VOR
* tape IDs                            Attribute of Video, VOR
system                                Object
* rental total                        Attribute of Rental
Money                                 Object
* amount                              Attribute of Rental
* system                              What we are creating
* Change                              Attribute of Rental
clerk                                 Object
clerk                                 Object
* 'rental complete' option key        Event trigger
rental                                Object
video inventory file                  Object
rental                                Object
customer                              Object
rental form                           Object
tape                                  Object
* video bar code ID                   Attribute of Video, VOR
* system                              What we are creating
rental                                Object
tape                                  Object
* date of return                      Attribute of Video on Rental
* past-due amounts                    Attribute of Rental, VOR
* they (meaning past-due amount)      Attribute of Rental
clerk                                 Object
* 'rental complete'                   Event trigger
rental                                Object
* return date                         Attribute of Video on Rental
* past-due fees                       Attribute of Video on Rental
* customer information                All attributes of customer
customer                              Object
* video information                   All attributes of video
Video                                 Object

* Entry is crossed out on the list.

FIGURE 11-11 Initial Object List for ABC Rental Processing

At this point we are not concerned that there are duplicates on this list, or that we will not keep automated information about all entries on this list. The less obvious, remaining entries we need to evaluate are:

active rental
outstanding video rentals
'rental complete' option key (2 times)
customer information
video information

'Active' is an adjective describing a state of a rental. As soon as we say describing we know this is an attribute of some sort. The allowable states most probably are 'active' and 'inactive,' in which case this is the status of a rental, an attribute. We may want to reevaluate what an active/inactive rental is to make sure this is correct. Active, in the sense used here, appears to mean open rental with rentals, based on the paragraph. Then inactive would imply no rentals outstanding. If this status were to remain in the application, it would be appropriate to change the wording to be more precise to open/closed rental. At some point, the analysis should be reviewed with Vic. So, for the active rental issue, for instance, we might ask Vic the following:

We have talked about active rentals. Does active really mean an open rental? If not, what other kinds of rentals are there? If yes, do we need to keep that status separate or is it implicit? For instance, is an open rental any for which a rental is not returned or is returned with late fees owed?

The next action on active rentals is based on the answers to these questions. Vic decides that active does mean open rentals and that a specific status is not required as long as he has access to open rental information.

Outstanding video rentals is also an adjectival description of videos on a rental that appears to be a status. Other statuses of videos on rentals that we might identify so far are combinations of:

outstanding/returned
on-time/late
paid/not paid

We note these for the attribute list and eliminate them from further discussion here.

Last is the rental complete option key. This is a noun phrase describing an implementation detail: a key on the keyboard to be pressed to indicate the end of rental processing. It is not an object because it has no attributes, and we do not keep data about it in the application. It is an event trigger that will initiate some processing, but it does not enter into this level of analysis so it is eliminated from the object list.

Last are customer information and video information. These two items are similar in that they both reference a collection of attributes describing two entities. As such we could either list their attributes (then omit them from the list because they are attributes) or call them objects. We opt for calling them 'collections of attributes' and eliminating them from the object list.

Now we return to the objects we did find to decide if they are in the problem space or the solution space. Problem space objects are required to describe the task domain but not to develop an automated solution. Solution space objects are required to describe both the task domain and the automated solution. Once problem space objects are identified, they drop out of the remaining analysis. We decide which space each object describes (see Figure 11-12).

The last stages are to name each object with a unique name by which it will be known in the system and to eliminate duplicate names for the same object. When we find a duplicate, we indicate the name by an equal sign ('=') appended to the front of the name to signify that the name already appeared once.

During this exercise, we have two options for dealing with repeating information and relationship objects which describe one-to-many relationships. We can define them for later normalization or we can define them as fully as possible now. We opt for more completeness now because it usually means less reworking later. For example, a rental has one or more related videos. We could define both of these as 'rental,' or we could define Rental and VideoOnRental separately. We opt for the normalized form because it results in a more complete analysis. This results in four class/objects: Customer, Rental, VideoOnRental, and VideoInventory.

Object (Space): Justification

Customers: Need automated customer information.

Video: Need automated video information.

Rental: Need automated rental information.

Tape (3 S, 1 P): Three references are tape information to be maintained in the system. One reference is to the tape taken home by customers; this reference is in the problem domain.

Money: Real money is outside of the system. We are concerned with the amount which is data entered into the system and related to rental.

Clerk: We do not keep statistics or other information on clerks in the system.

Video Inventory File: Need automated video information.

Rental Form: Just a different media than 'rental' ... not relevant by itself to the solution.

System: This is irrelevant because 'system' is what we are building.

FIGURE 11-12 Object Space Justification

Columns: Noun from Paragraph, Solution or Problem Space, Object Name

Noun from Paragraph: Customers, videos, rental, active rental, outstanding video rentals, tape, rental, video inventory file, rental, tape, rental, customer, video

Object Name: Customer, VideoInventory, Rental, =Rental, VideoOnRental, =VideoOnRental, =VideoOnRental, =Rental, =VideoInventory, =Rental, =VideoOnRental, =Rental, =Customer, =VideoInventory

FIGURE 11-13 Object List for ABC Rental Processing

Figure 11-13 shows the class/objects from this analysis in their final form (for this step). Notice the objects are still in order by their sequence in the paragraph, all have a space designation, and all solution space objects are named.

Finally, we depict class/objects from this list. We switch from the term object to the term class/object to acknowledge both the shared attributes and processes and the instantiation of them. ABC has four class/objects corresponding to Customer, VideoOnRental, Rental, and VideoInventory. The four class/objects are depicted in Figure 11-14 for further elaboration in future steps. Information that we know at this point is also in the diagram.
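For readers who think in code, the four class/objects can be sketched as Python dataclasses. This is a hedged illustration, not part of the methodology: the attribute names follow Figure 11-14, the field types are assumptions, and holding the videos on a rental as a list inside Rental is one assumed way to express the one-to-many relationship between Rental and VideoOnRental. Processes are omitted because they have not been identified yet.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from typing import List, Optional


@dataclass
class Customer:
    customer_phone: str


@dataclass
class VideoInventory:
    bar_code_id: str
    video_name: str
    rental_price: Decimal
    video_count_of_copies: int


@dataclass
class VideoOnRental:
    customer_phone: str
    bar_code_id: str
    return_date: Optional[date] = None        # None until the tape comes back
    late_fees_due: Decimal = Decimal("0")


@dataclass
class Rental:
    customer_phone: str
    # One rental has one or more related videos (normalized form).
    videos: List[VideoOnRental] = field(default_factory=list)
    total_amt_due: Decimal = Decimal("0")
    total_amt_paid: Decimal = Decimal("0")
    change: Decimal = Decimal("0")


rental = Rental(customer_phone="555-0100")
rental.videos.append(VideoOnRental("555-0100", "B001"))
print(rental)
```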

Identify Processes

Rules for Identifying Processes

The next step is to identify processes. The rules for identifying processes are summarized as follows:

1. Circle all verbs in the summary paragraph.

2. List the circled verbs on a separate sheet of paper, using the exact same sequence and spelling as in the paragraph.

3. Evaluate each verb to make sure it is a process. (Common errors are to include status, physical actions, or comments.)

4. Determine whether the process is in the solution space or the problem space.

5. Name each unique process in the solution space. Ignore those processes in the problem space. Use the convention '=name' to identify duplicates of already named processes and show that you know it is a duplicate.

6. Assign objects to verbs if the object is transformed by the process or if the object data is read by the process.

7. Evaluate the object assignments:

   • If there is only one object assigned to a process, continue.

   • If all objects are read-only, continue.

   • For processes with more than one object transformation, evaluate the transformation process:

     If all processes are exactly the same, and all data types acted on are exactly the same, then mark the process for creation of a reusable module.

     If all processes are exactly the same, but all data types are not the same, mark the process for polymorphic module creation.

     If all processes are not exactly the same, redevelop the paragraph to more specifically define the processing.

Class/Object      Attributes
Customer          CustomerPhone
VideoOnRental     CustomerPhone, BarCodeId, ReturnDate, LateFeesDue
VideoInventory    VideoName, RentalPrice, VideoCountOfCopies, BarCodeId
Rental            CustomerPhone, BarCodeId, ReturnDate, LateFeesDue, TotalAmtDue, TotalAmtPaid, Change

FIGURE 11-14 ABC Class/Objects

Processes are actions described by verbs. We identify the verbs in the summary paragraph, circling them to distinguish them from the nouns. Once the circling is done, make a list of the verbs on a separate sheet of paper. When making the list, keep the verbs in exactly the same sequence and use exactly the same spelling as occurred in the paragraph!

Then, evaluate each verb to make sure it is a process. Ask yourself if the verb is a process that the application must provide. If yes, keep going; if not, cross the verb off. For instance, if the paragraph said "The clerk enters the customer's phone number into the system," the clerk has been removed as a problem space object. But the verb enters, as applied to the customer's phone number, is required data entry to begin the rental entry process. So, enters remains in the system. If we had included the terms To rent a tape or To return a tape in the list, these are summary descriptions of entire procedures and the verbs rent and return would be excluded as nonprocesses.

After the first evaluation, review each verb again to determine if it is in the solution space or the problem space. The meanings of solution and problem space are the same as for class/objects. Problem space means the process is required to define the problem but not the automated solution. Solution space processes are required both to define the problem and to define the solution.

Next, review each verb carefully and give it a meaningful name. Try to define meaningful process names that indicate both the process and the class/object on which it acts. So, for enter a customer phone number, the process name might be enterCustPhone.

For any processes that use the same verb descriptor, or that you think are exactly the same, mark with an asterisk for further evaluation in the design phase.

Include an asterisk on processes that work on objects with different data types. Name them with the same verb, appending a unique identifier for each instance. These unique names make recognizing these processes in the next step easier. One possible naming convention⁴ is to describe the situation, such as enterTapeIdRental, enterTapeIdReturn, and enterTapeIdRenew. The idea is to assign names that you can live with for the entire life of the object and its processes. In design, if these processes are all defined as the same, we simply truncate the names to enterTapeId.

The last step in identifying processes is to assign class/objects to operations. List each object with all processes that use or transform it. When this identi- fication is done, reevaluate all processes with more than one object assignment.

The three questions you ask in this evaluation are summarized in Figure 11-15. First, ask if only one object is actually transformed by this process. If the answer is yes, go to the next process to be evaluated. If the answer is no, then continue with the evaluation.

Next, for the objects being transformed, does the exact same processing occur to each object? That is, are the data types and the process steps identical? If the answers to these questions are all yes, no further analysis is required. You have identified a candidate for development as a reusable module. If the answer is no, then you must identify the specific differences with the next set of questions.

Third, are the data types different or identical? Are the processes different or identical? If the data types are different and the process is the same, these process-object combinations are candidates for polymorphic module creation and should be noted with an asterisk. If the processes are different, then you must refine your paragraph to define the specific processes for each object, and redo this part of the analysis from the beginning.

When you have evaluated all of the multiobject processes and resolved any inconsistencies, you are ready to perform the next step. Next, we identify the processes for ABC Video's rental application.

4 A convention is a locally agreed upon way to do some activity.


1. Is only one object actually transformed by this process?

If yes, this process is complete.

If no, continue.

2. Does the exact same processing occur for each object? This means the same steps and the same transformations.

If no, go to step 3.

If yes, are all object data types the same?

If yes, this process is complete; create one reusable module for this process.

If no, mark for polymorphic module creation.

3. Redefine the sentence(s) to identify the specific processing of each object. Then, reevaluate the processes beginning at step 1.

FIGURE 11-15 Multiobject Process Evaluation
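The three questions of Figure 11-15 form a small decision procedure, which can be sketched in code. The following is a minimal sketch, not part of the method itself: the function name `evaluate_process`, the `Outcome` labels, and the two inputs (a list of the data types of the transformed objects and a flag saying whether the steps are identical) are assumptions introduced for illustration.

```python
from enum import Enum


class Outcome(Enum):
    COMPLETE = "complete"               # only one object is transformed
    REUSABLE = "reusable module"        # same steps, same data types
    POLYMORPHIC = "polymorphic module"  # same steps, different data types
    REDEFINE = "redefine sentence(s)"   # steps differ; rewrite and start over


def evaluate_process(transformed_types, same_processing):
    """Apply the questions of Figure 11-15 to one process.

    transformed_types: data type of each object the process transforms.
    same_processing:   True if identical steps apply to every object.
    """
    # Question 1: is only one object actually transformed?
    if len(transformed_types) <= 1:
        return Outcome.COMPLETE
    # Question 2: does the exact same processing occur for each object?
    if not same_processing:
        # Step 3: the paragraph must be redefined and reevaluated.
        return Outcome.REDEFINE
    # Same processing: are all object data types the same?
    if len(set(transformed_types)) == 1:
        return Outcome.REUSABLE
    return Outcome.POLYMORPHIC
```

The analyst still decides by hand whether the processing is "exactly the same"; the sketch only records the consequence of that judgment.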

ABC Video Example Process List

The steps we follow here are to circle the verbs, evaluate them as processes of interest, define solution and problem space processes, assign class/objects to processes, and evaluate those object assignments (refer to the summary list on p. 473).

The first step is to return to the paragraph and circle the verbs. Analyze each verb to ensure that it is a process. For instance, if you include in your list the terms 'To rent tapes' and 'To return a tape,' the verbs 'to rent' and 'to return' are omitted from the list because they are identifying the entire process, but are not processes in the system. All verbs in the paragraph are processes. Figure 11-16 shows the verbs circled in the final paragraph.

1. Customer [...] one to n videos for rental.

4. Bar code IDs for each [...]

5. Video name and rental price from inventory are displayed.

6. When all tape IDs are entered, the system [...] the rental total (= Σ past-due fees + Σ other fees + Σ current video rental fees).

9. When the clerk [...] 'rental-complete' option key (to be defined by the system), this rental [...] complete and the video inventory file [...] updated (decrease the count of available copies by one).

10. The rental [...]

[...] the rental form, [...]

To return a tape,

12. The video bar code ID [...] into the system.

13. The rental [...] and the [...] with the date of return.

To add a customer:

15. [...] customer information.

16. [...] customer.

To add a new video:

17. [...] video information.

18. [...] video inventory.

FIGURE 11-16 Paragraph with Verbs Circled for ABC Rental Processing ([...] marks text lost in the scan)

Next, list verbs and identify their space. Remember, problem space identifies processes needed to describe the problem but not the solution; solution space processes are needed to describe both the problem and the solution. Figure 11-17 identifies the space of each process, listing a reason for exclusion of problem space items. The problem space processes all refer to physical actions which are not tracked by the application. The verb is complete is the only nonprocess in the list. Is complete refers to a rental status in the procedure which signals different processing. This status is an attribute of the process that we will deal with in the next step.

Next we name solution space processes, eliminating duplicates. Figure 11-18 shows the list of solution processes with names. The duplicate actions are EnterBarCode, DisplayRental, DisplayVideoOnRental, RetrieveRental, RetrieveVideoOnRental, and WriteRental.
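The '=name' convention for duplicates is mechanical enough to sketch in code. The helper below is hypothetical (the name `mark_duplicates` is ours, not the method's): it walks the named process list in paragraph order and prefixes every repeat of an already named process with '='.

```python
def mark_duplicates(process_names):
    """Return the list with each repeat of an earlier name prefixed by '='.

    The order of the input (the paragraph sequence) is preserved.
    """
    seen = set()
    marked = []
    for name in process_names:
        if name in seen:
            marked.append("=" + name)  # known duplicate of a named process
        else:
            seen.add(name)
            marked.append(name)        # first occurrence keeps its name
    return marked


# Usage with a few process names taken from Figure 11-18:
names = ["ComputeRentalTotal", "EnterPayAmt", "PrintRental",
         "ComputeRentalTotal", "EnterPayAmt"]
print(mark_duplicates(names))
```

The sketch only catches duplicates that already carry the same name; deciding that two differently worded verbs describe the same process remains an analysis judgment.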

Several actions deserve further comment. Sentence 5 for tape rental says, 'Video name and rental price from inventory are displayed.' This sentence implies that name and prices are retrieved from inventory, so the sentence should be modified to reflect this action. Sentence 13 for tape return is similar in saying 'The rental is displayed ....' The rental cannot be displayed until it is retrieved. The word 'tape' in the same sentence is ambiguous. Does this refer to the VideoOnRental or to VideoInventory? In fact, both are affected by this action. The VideoOnRental is updated with the return date and the VideoInventory is updated to add one to a count of available tapes (the opposite of the action in sentence 9). The sentence should be rewritten to reflect these differences. The new sentence now reads:

13. The rental, related video(s) on the rental, and video(s) in inventory are retrieved and displayed. The return date is added to the video(s) on the rental. One is added to the count of available tapes in inventory. Inventory is updated.

Verb from Paragraph   Disposition
select                P-Customer physical action-delete
is entered            P-process (could be more meaningful if called, e.g., read-from-terminal)
to retrieve           S-process
to create             S-process
to retrieve           S-process
are displayed         S-process
are entered           S-process
are displayed         S-process
are entered           status-attribute
computes              S-process
is collected          P-Clerk physical action-delete
is entered            S-process
is computed           S-process
displayed             S-process
processed             P-Clerk physical action-delete
presses               P-Clerk physical action-delete
is complete           status-attribute
is updated            S-process
is stored             S-process
printed               S-process
signs                 P-Customer physical action-delete
takes                 P-Customer physical action-delete
leaves                P-Customer physical action-delete
is entered            S-process
is displayed          S-process
is marked             S-process
are owed              Rental status-attribute
can be paid           P-optional physical action-delete
can select            P-Clerk physical action-delete
updates               S-process
calculates            S-process
enter                 S-process
create                S-process
enter                 S-process
create                S-process

FIGURE 11-17 Process Dispositions for ABC Rental Processing

A similar ambiguity is present in sentence 14, which states that 'amounts ... owed ... can be paid.' This process, can be paid, refers to sentences 6-8 in the tape rental process. Because these processes are present, we do not need to change the paragraph, but we must reference those sentences so the actions are clear. Sentence 14 now reads:

14. If past-due amounts can be paid at this time (repeat sentences 6-8 above); else the past-due fees are calculated and the rental is updated.

This new sentence omits the extraneous information previously present. Both the object list and the process list are reevaluated to reflect these changes. The verbs in sentences 6-8 are also reviewed to ensure identical processing and are added in the proper sequence to the process list. The old verbs are replaced with 'are calculated' and 'is updated.' We verify that the nouns from sentences 6-8 and 14 are accounted for in the object list.

The last step is to review the sentences once more, using the object list as reference to assign objects to processes. Figure 11-19 shows the result of this activity. The rule for performing this activity is that any object that is read or acted on by this process is identified.
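One way to hold the result of this step is a table from process name to the class/objects it reads or acts on. The sketch below is illustrative only: the dictionary layout and the helper names `multiobject_processes` and `processes_using` are assumptions, while the rows and action codes are copied from Figure 11-19.

```python
# Action codes, following the legend of Figure 11-19.
ACTIONS = {"R": "read", "W": "write", "DE": "data entry",
           "D": "display", "P": "process in memory", "PR": "print"}

# A few rows from Figure 11-19: process -> [(class/object, action code)].
assignments = {
    "EnterCustPhone": [("Customer", "DE")],
    "RetrieveRentalVOR": [("Rental", "R"), ("VideoOnRental", "R")],
    "ComputeLateFees": [("Rental", "P"), ("VideoOnRental", "P")],
    "CreateCustomer": [("Customer", "W")],
}


def multiobject_processes(table):
    """Processes assigned to more than one class/object (to be reanalyzed)."""
    return sorted(
        name for name, pairs in table.items()
        if len({obj for obj, _action in pairs}) > 1
    )


def processes_using(table, class_object):
    """All processes that read or act on the given class/object."""
    return sorted(
        name for name, pairs in table.items()
        if any(obj == class_object for obj, _action in pairs)
    )


print(multiobject_processes(assignments))
print(processes_using(assignments, "Customer"))
```

Listing the multiobject processes this way gives the worklist for the reanalysis that follows.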

All processes relating to multiple objects are reanalyzed to determine if they are the same processes. RetrieveRentalVOR is identified in the figure as requiring two actions which we discuss here. The processes dealing with Rental and VOR take information that is separate and process it as if it were integrated. The Rental information identifies the customer and the VOR describes a video. There is one Rental per transaction and one VOR per video. The question then becomes one of definition: Is it necessary to maintain this Rental, or can it be added to each VOR and eliminated?

As in the other methodologies, the Rental information and the Customer information are essentially duplicates. If the company operates on a cash basis and simply needs to know videos outstanding for a customer, then we do not need Rental. If the company operates on an accrual basis and needs to be able to exactly reconstruct individual transactions, then we need Rental. Video rental is a cash basis business; therefore, we do not need Rental but we do need to carry its information in VOR.

Verb from Paragraph   Space   Process Name          Object Assignment
is entered            S       EnterCustPhone
to retrieve           S       ReadCust
to create             S       CreateRental
to retrieve           S       RetrieveRentalVOR
are displayed         S       DisplayRentalVOR
are entered           S       EnterBarCode
are retrieved         S       RetrieveInventory
are displayed         S       DisplayInventory
computes              S       ComputeRentalTotal
is entered            S       EnterPayAmt
is computed           S       ComputeChange
displayed             S       DisplayChange
is updated            S       UpdateInventory
is stored             S       WriteRental
printed               S       PrintRental

FIGURE 11-18 Named Process List for ABC Video

Next, we consider Vic's potential need to differentiate between rentals for a customer or to maintain information beyond the rental's life. Once again, the software engineers return to Vic to find the answer.

Vic: "I have customers sign a copy of a rental and I keep those. I use them to resolve disputes, to find errors, and to provide accounting records. I don't care how you identify rentals because I don't have a need, at the moment, for any analysis. I would like to add trend analysis in the future."

From this discussion, we know there is no business requirement to separate the two objects. A side issue to the decision is whether separation or joining of the objects impacts processing time. For ABC, there is no process time impact. If there were an impact, we would probably opt for the faster solution. We could choose consolidation of VOR and Rental to simplify processing. In this case, Rental would be removed from the list and declared in the object list as =VOR. Another option is to leave it as it is. A third option is to think about Rental as Transaction since attributes, such as TotalAmountDue, apply to a specific grouping of videos for a customer at a point in time. There is no 'right' answer to this question, and we do not have enough information to make a final decision, although transaction sounds like an idea we will need in design. For now, we will change the name of Rental to TempTrans to reflect this thinking and will revisit the need for this class/object again during design. There are no other multiobject processes. The final process list is Figure 11-20.

Verb from Paragraph   Space   Process Name            Object Assignment
is entered            S       EnterBarCode
is retrieved          S       RetrieveRentalVOR
is displayed          S       DisplayRentalVOR
is added              S       AddRetDateVOR
is added              S       Add1toVInv
is updated            S       UpdateInventory
can be paid           S       =ComputeRentalTotal
                              =EnterPayAmt
                              =ComputeChange
                              =DisplayChange
are calculated        S       ComputeLateFees
is updated            S       WriteRentalVOR
enter                 S       EnterCustomer
create                S       CreateCustomer
enter                 S       EnterVideoInventory
create                S       CreateVideoInventory

FIGURE 11-18 Named Process List for ABC Video (Continued)

Define Attributes of Objects

Rules for Defining Object Attributes

An attribute is a named field or property that describes a class/object or a process. Each object is a collection of attributes which take on values. A set of specific attribute values describes an object or instance. Each object is identified by a primary key which is a unique set of values comprised of one or more attributes. A primary key in object-orientation may not actually be used to identify stored objects; physical addresses are most often used.

To define the attributes of an object, we identify all of the information about objects. First, attributes that were set aside during object definition are now assigned to a class/object. All items from the original object list that we deleted because they were attributes are now listed with the class/objects they describe.

The original description of the project is rechecked to identify any adjectives or adjectival phrases describing nouns that are now objects in the solution space. In our case, we reread Chapter 2's description of the case and rewrite any attributes identified there that are missing from the object list. These attributes are added to the list.

Next, evaluate the rewritten paragraph to find any data requirements underlying what is stated in the paragraph but not already known. For instance, a status is implied in the statement 'Retrieve all open rentals.' The adjective 'open' implies a status of open/closed. Any qualified class/objects should be evaluated to determine if the qualification is identifying an attribute. When evaluating the paragraph, ask what information is needed to perform, document, or track each action taken. When you identify new information, create attributes for each piece of information.

Actions are (R)ead, (W)rite, Data Entry (DE), (D)isplay, (P)rocess in memory, (PR)int

Verb from Paragraph   Space   Process Name          Object Assignment-Action
is entered            S       EnterCustPhone        Customer (DE)
to retrieve           S       ReadCust              Customer
to create             S       CreateRental          Rental (R)
to retrieve           S       RetrieveRentalVOR     Rental (R), VideoOnRental (VOR, R)
are displayed         S       DisplayRentalVOR      Rental, VOR (D)

NOTE: RetrieveRentalVOR requires two different actions because the primary keys and read processes are different. We are keeping these together for now for simplicity. All processes marked ...RentalVOR fit this requirement.

FIGURE 11-19 Class/Object Assignments to Processes for ABC Video Processing

Next, normalize each set of attributes to third normal form (3NF).⁵ For any newly normalized sets of objects, any process-object encapsulations should be reexamined to determine that they encompass both the original object and new objects resulting from the normalization process.

5 Recall that normalization includes the following: 1NF-Removal of repeating groups of information; 2NF-Removal of partial key dependencies; 3NF-Removal of nonkey dependencies. If you have problems with this activity, refer to Chapter 9 to refresh yourself on this activity.

When all attributes are listed with an object, identify a primary key identifier. A primary key provides a unique identification for the object and is composed of one or more attributes. Compare objects to determine if any have identical primary keys. If the answer is yes, consolidate the objects, or change the object with the incorrect primary key. Now, let us walk through attribute identification for ABC.
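The comparison of primary keys can be sketched as a grouping step. This is a minimal sketch under stated assumptions: the helper name `shared_primary_keys` is ours, and the key assignments in the usage example are hypothetical values chosen for illustration, not keys confirmed by the ABC analysis.

```python
from collections import defaultdict


def shared_primary_keys(primary_keys):
    """Group class/objects by primary key and return groups of two or more.

    primary_keys maps an object name to the attribute names in its key.
    Attribute order within a key does not matter.
    """
    groups = defaultdict(list)
    for obj, key in primary_keys.items():
        groups[frozenset(key)].append(obj)
    return [sorted(objs) for objs in groups.values() if len(objs) > 1]


# Hypothetical keys, for illustration only.
keys = {
    "Customer": ("CustomerPhone",),
    "Rental": ("CustomerPhone",),
    "VideoOnRental": ("CustomerPhone", "BarCodeId"),
}
print(shared_primary_keys(keys))
```

Each group returned is a candidate for consolidation, or a sign that one of the objects carries the wrong key; choosing between the two is left to the analyst.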

ABC Video Example Object Attribute List

All items from the original object list that we deleted because they were attributes are first listed with the class/objects they describe. We refer to Figure 11-14 to find those items. A partial list of the attributes from our paragraph is shown in Figure 11-21.

Verb from Paragraph   Space   Process Name            Object Assignment-Action
is updated            S       UpdateInventory         VideoInventory (P)
is stored             S       WriteRental             Rental, VOR (W)
printed               S       PrintRental             Rental, VOR (PR)
is entered            S       EnterBarCode            VOR (DE)
is retrieved          S       RetrieveRentalVOR       Rental (R), VOR (R)
is displayed          S       DisplayRentalVOR        Rental (D), VOR (D)
is added              S       AddRetDateVOR           VOR (P)
is added              S       Add1toVInv              VideoInventory (P)
is updated            S       UpdateInventory         VideoInventory (W)
can be paid           S       =ComputeRentalTotal
                              =EnterPayAmt
                              =ComputeChange
                              =DisplayChange
are calculated        S       ComputeLateFees         Rental (P), VOR (P)
is updated            S       WriteRentalVOR          Rental (W), VOR (W)
enter                 S       EnterCustomer           Customer (DE)
create                S       CreateCustomer          Customer (W)
enter                 S       EnterVideoInventory     VideoInventory (DE)
create                S       CreateVideoInventory    VideoInventory (W)

FIGURE 11-19 Class/Object Assignments to Processes for ABC Video Processing (Continued)

Next, we review the Chapter 2 description of the case and rewrite any attributes identified there that are missing from the object list. These attributes are added to the list as shown in Figure 11-22.

Next, we reconsider our paragraph to find any hidden attributes that are implied by other information such as statuses. We have open and closed rentals, but we might not require a specific attribute for the status. We know a rental is open when it has a RentalDate without a ReturnDate, or when it has late fees owing. We can check those attributes in lieu of carrying a specific RentalStatus attribute. Keeping this attribute requires a judgment call. If junior people are doing the programming, a RentalStatus attribute is simpler. If senior people are doing the programming, either method is acceptable. As a matter of choice, we will carry the RentalStatus to make sure that future maintenance programmers can also easily understand the processing.

Figure 11-23 shows the initial attribute list for each object. We evaluate each, in turn, to determine its completeness and primary key.

Customer⁶ appears complete in its information required to perform rental processing. VideoOnRental is considered next. We know we need a Customer Phone to tie rentals to customers and a Video ID to tie rentals to inventory. From Chapter 2, we also need rental and return dates. The question is how much fee information we need. Vic supplies the information that he needs to know that regular fees, late fees, or other fees have been paid and the amount of the fee. Therefore, we add those attributes to the list and it also appears to be complete.

The VideoInventory is not normalized. While we are normalizing, we can also evaluate the impact of Vic's nebulous desire for promotions on inventory objects. Refer to Figure 11-23's list of the fields and definitions relating to videos in inventory. Repeating information is indented. Primary keys of each part of the information are underlined. The 3NF result of normalization is four relations (see Figure 11-24): VideoInventory, BarCodeVideo, VideoPromo, and PromoType.

6 Note that if Rental had been retained, it would have had the same primary key as Order and would have been eliminated in this step rather than the earlier one.

Actions are (R)ead, (W)rite, Data Entry (DE), (D)isplay, (P)rocess in memory, (PR)int

Verb from Paragraph   Space   Process Name            Object Assignment-Action
is entered            S       EnterCustPhone          Customer, Data entry (DE)
to retrieve           S       ReadCust                Customer
to create             S       CreateRental            TempTrans (R)
to retrieve           S       RetrieveRentalVOR       TempTrans (R), VideoOnRental (VOR, R)
are displayed         S       DisplayRentalVOR        TempTrans (D)
are entered           S       EnterBarCode            TempTrans (DE)
are retrieved         S       RetrieveInventory       VideoInventory (R)
are displayed         S       DisplayInventory        VideoInventory (D)
computes              S       ComputeTempTransTotal   TempTrans (Process)
is entered            S       EnterPayAmt             TempTrans (DE)
is computed           S       ComputeChange           TempTrans (P)
displayed             S       DisplayChange           TempTrans (D)
is updated            S       UpdateInventory         VideoInventory (P)
is stored             S       WriteVOR                VOR (W)

FIGURE 11-20 ABC Final Process List

The distinct definition of VideoPromo means we can omit it after this analysis because promotions are a future requirement. The separation of BarCodeVideo from VideoInventory means we need to reevaluate the object and process lists to define related changes. Since VideoInventory and BarCodeVideo are always accessed together, we can just add BarCodeVideo to the lists anytime VideoInventory is present. We may want to consolidate the two objects later in the design, for convenience of processing, if we can accommodate repeating information.

The final object attribute list is shown in Figure 11-25 and omits the VideoPromo and PromoType objects as discussed above. The attribute list shows the class/objects with their attributes. The process-object figure is corrected to reflect the new BarCodeVideo class/object. The objects are all 3NF and appear complete for ABC rental processing.

Verb from Paragraph   Space   Process Name             Object Assignment-Action
printed               S       PrintTempTrans           TempTrans (PR)
is entered            S       EnterBarCode             TempTrans (DE)
is retrieved          S       RetrieveVOR              TempTrans, VOR (R), VideoInventory (R)
is displayed          S       DisplayTempTrans         TempTrans (D)
is added              S       AddRetDateTempTransVOR   TempTrans (P), VOR (P)
is added              S       Add1toVInv               VideoInventory (P)
is updated            S       UpdateInventory          VideoInventory (W)
can be paid           S       =ComputeTempTransTotal
                              =EnterPayAmt
                              =ComputeChange
                              =DisplayChange
are calculated        S       ComputeLateFees          TempTrans (P), VOR (P)
is updated            S       WriteVOR                 TempTrans, VOR (W)
enter                 S       EnterCustomer            Customer (DE)
create                S       CreateCustomer           Customer (W)
enter                 S       EnterVideoInventory      VideoInventory (DE)
create                S       CreateVideoInventory     VideoInventory (W)

FIGURE 11-20 ABC Final Process List (Continued)

Define Attributes of Processes

Rules for Defining Process Attributes

Attributes of processes define formulae, constraints, or status processing performed by or on processes in the application being developed. In particular, process attributes define:

• how the process is performed in the system (that is, formulae performed by the process, for example, the formula computing change for a video rental)

• status changes resulting from the process execution (for example, a customer changes from an overdue status to a current status when late fees are paid)

• constraints on the process (that is, prerequisite, postrequisite, time, structure, control, and inference limitations; for example, a prerequisite of video rental is that all late fees must be paid).

The steps to define process attributes are similar to those for object attributes.

1. Assign attributes which were set aside during object or process definition to a class/object.

2. Review the original problem description and any notes from data collection to find attributes.

3. Review the summary paragraph to find implied attributes, such as statuses a process can take.

We use the original description of the problem and the paragraph to determine process attributes.

Object Name       Attribute Name
Customer          CustomerPhone
TempTrans         CustomerPhone, BarCodeId, ReturnDate, LateFeesDue, TotalAmtDue, TotalAmtPaid, Change
VideoOnRental     CustomerPhone, BarCodeId, ReturnDate, LateFeesDue
VideoInventory    VideoName, RentalPrice, VideoCountOfCopies, BarCodeId

FIGURE 11-21 A Partial List of Attributes from the Paragraph

Status attributes identify state changes due to a process's successful completion. The status attributes will, during design, be assigned to a class/object. The purpose of identifying them with processes is that they are more obvious and less likely to get lost.

The constraints are identified to ensure that the procedural code generated during design includes the constraints. The formulae are included as process attributes because they provide some of the logic detail that is also included in the process design.

One inadvertent consequence of process attribute identification can be the definition of artificial constraints on processes. For instance, in the ABC Video rental process, we know that customers must return and pay for prior rentals before taking out new rentals. But consider this situation:

A customer has several tapes on loan. The customer returns all but one video and wants to rent two others. The customer could pay for all past rentals, the new rentals, and late fees up to the current date for the tape still on loan.

Or the customer could pay for all past rentals and the new rentals. The remaining tape, because it is not returned, is left unchanged.

Both of these solutions might be acceptable, but the first places the prerequisite that 'all rental fees be up-to-date' on the customers. This requirement is slightly different from 'all late fees must be paid before new rentals.' The difference is in how late fees are defined; that is, do customers incur late fees as soon as the current date is past the due date, or only when a video is returned and it has been kept out past the expected return date? In keeping with Vic's edict of the least bureaucracy placed on the customer, the latter definition would be preferred, and he verifies this preference. With this discussion, let us turn to defining the attributes for ABC Video.
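The difference between the two late-fee definitions is easy to see when each is written as a rule. The sketch below is an illustration only: the function name `days_late`, its parameters, and the flag `accrue_daily` are assumptions, and how days late turn into a fee amount is deliberately left out because the case does not state it here.

```python
from datetime import date
from typing import Optional


def days_late(due: date, returned: Optional[date], today: date,
              accrue_daily: bool) -> int:
    """Days counted as late for one video under the two definitions.

    accrue_daily=True:  lateness starts as soon as today passes the due
                        date, whether or not the video has been returned.
    accrue_daily=False: lateness is incurred only when the video is
                        returned after its due date (the definition Vic
                        prefers).
    """
    if returned is not None:
        # Once returned, both definitions agree.
        return max((returned - due).days, 0)
    if accrue_daily:
        return max((today - due).days, 0)
    return 0  # still on loan: nothing is owed yet under the second rule
```

Under the second definition the tape still on loan contributes nothing to the prerequisite check, so the customer in the example can rent the two new videos.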

Object Name       Attribute Name
Customer          CustomerPhone, CustomerLastName, CustomerFirstName, CustomerAddress,
                  CustomerCity, CustomerState, CustomerZip, CustomerCreditCardType,
                  CustomerCCNumber, CustomerCCExpDate, CreditRating, CustEnrollDate
TempTrans         CustomerPhone, BarCodeId, ReturnDate, LateFeesDue, TotalAmtDue,
                  TotalAmtPaid, Change
VideoOnRental     CustomerPhone, BarCodeId, ReturnDate, LateFeesDue
VideoInventory    VideoName, RentalPrice, VideoCountOfCopies, BarCodeId, TypeVideo,
                  Vendor, DateReceived

FIGURE 11-22 Additional Attributes from Chapter 2

Object Name       Attribute Name (Primary key is underlined in the original figure; repeating information is indented.)
Customer          CustomerPhone, CustomerLastName, CustomerFirstName, CustomerAddress,
                  CustomerCity, CustomerState, CustomerZip, CustomerCreditCardType,
                  CustomerCCNumber, CustomerCCExpDate, CreditRating, CustEnrollDate
TempTrans         CustomerPhone, TotalAmtDue, TotalAmtPaid, Change
                     BarCodeId, RentalDate, FeesPaid, ReturnDate, LateFeesDue,
                     LateFeesPaid, FeesDue, FeesPaid
VideoOnRental     CustomerPhone, BarCodeId, RentalDate, FeesPaid, ReturnDate,
                  LateFeesDue, LateFeesPaid, FeesDue, FeesPaid
VideoInventory    VideoName, RentalPrice, VideoReleaseDate, VideoCountOfCopies,
                  TypeVideo, Vendor, DateReceived
                     PromotionType, PromoOnDate, PromoOffDate, PromoPrice
                     BarCodeId, BarCodeRentalCount, BarCodeRentalDays

FIGURE 11-23 Initial Object Attribute List for ABC Rental Processing

ABC Video Example Process Attribute List

First, we list all processes down the left margin of a page (see Figure 11-26). Then, we examine each process to determine whether it is constrained in any way. To identify constraints we return to the original description of the problem and the final paragraph to determine processing formulae, constraints, and statuses.

The obvious process attributes are the formulae used to compute rental total and to compute change. Each of these is entered in the table (see Figure 11-26). To ensure proper payment processing, a postrequisite that Change be greater than or equal to zero is defined. If this postrequisite is not met, payment processing is performed again.
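The two formulae and the postrequisite can be sketched as follows. The rental total is the sum of past-due fees, other fees, and current video rental fees, as stated in the rental paragraph; change is assumed here to be the amount paid minus the total due. The function names and the modeling of "performed again" as re-entry of a new payment amount are assumptions for illustration.

```python
from decimal import Decimal


def compute_rental_total(past_due_fees, other_fees, rental_fees):
    """Total due = sum of past-due fees + other fees + current rental fees."""
    zero = Decimal("0")
    return sum(past_due_fees, zero) + sum(other_fees, zero) + sum(rental_fees, zero)


def compute_change(total_due, amount_paid):
    """Change = amount paid - total due (assumed formula)."""
    return amount_paid - total_due


def take_payment(total_due, offered_amounts):
    """Repeat payment entry until the postrequisite Change >= 0 holds.

    offered_amounts stands in for successive data entries by the clerk.
    """
    for amount in offered_amounts:
        change = compute_change(total_due, amount)
        if change >= 0:          # postrequisite met: payment is complete
            return amount, change
    raise ValueError("no offered amount covers the total due")


total = compute_rental_total([Decimal("2.00")], [],
                             [Decimal("3.50"), Decimal("3.50")])
paid, change = take_payment(total, [Decimal("5.00"), Decimal("10.00")])
```

In the usage lines, the first entry of 5.00 fails the postrequisite against a total of 9.00, so payment is entered again; the second entry of 10.00 succeeds and leaves 1.00 in change.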

The first entry in the table for RetrieveRentalVOR is a prerequisite that the Customer information must be retrieved and a Rental able to be developed. If this process is not successful, it is due to a new customer and the EnterCustomer process is initiated.

Several status attributes which were set aside during process identification are defined here. Two statuses were identified for knowing when all video data entry is complete and when all transaction processing is complete. Both of these prerequisite statuses are listed with related processes in Figure 11-26. Notice that for the constrained processes, we listed the type of constraint and the details of processing relating to the constraint. Also, notice that many processes have no specific attributes.

Unnormalized Form
   VideoName, RentalPrice, VideoReleaseDate, VideoCountOfCopies, TypeVideo, Vendor, DateReceived
      PromotionType, PromoOnDate, PromoOffDate, PromoPrice
      BarCodeId, BarCodeRentalCount, BarCodeRentalDays

1NF
   VideoName, RentalPrice, VideoReleaseDate, VideoCountOfCopies, TypeVideo, Vendor, DateReceived
   VideoName, PromotionType, PromoOnDate, PromoOffDate, PromoPrice
   VideoName, BarCodeId, BarCodeRentalCount, BarCodeRentalDays

2NF
   VideoName, PromotionType, PromoOnDate
   PromotionType, PromoOnDate, PromoOffDate, PromoPrice

3NF Disposition
   VideoInventory   VideoName, RentalPrice, VideoReleaseDate, VideoCountOfCopies, TypeVideo, Vendor, DateReceived
   VideoPromo       VideoName, PromotionType, PromoOnDate
   PromoType        PromotionType, PromoOnDate, PromoOffDate, PromoPrice
   BarCodeVideo     VideoName, BarCodeId, BarCodeRentalCount, BarCodeRentalDays

FIGURE 11-24 Normalization of ABC Inventory Information
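To make the 3NF result concrete, the four relations of Figure 11-24 can be sketched as record types. This is an illustration under assumptions: the field types, the snake_case spellings, and the key noted in each comment are guesses, and the helper `bar_codes_for` and its sample data are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class VideoInventory:      # assumed key: video_name
    video_name: str
    rental_price: Decimal
    video_release_date: date
    video_count_of_copies: int
    type_video: str
    vendor: str
    date_received: date


@dataclass(frozen=True)
class VideoPromo:          # assumed key: all three attributes
    video_name: str
    promotion_type: str
    promo_on_date: date


@dataclass(frozen=True)
class PromoType:           # assumed key: promotion_type + promo_on_date
    promotion_type: str
    promo_on_date: date
    promo_off_date: date
    promo_price: Decimal


@dataclass(frozen=True)
class BarCodeVideo:        # assumed key: video_name + bar_code_id
    video_name: str
    bar_code_id: str
    bar_code_rental_count: int
    bar_code_rental_days: int


def bar_codes_for(video_name, copies):
    """Bar codes of all physical copies of one title (a join on video_name)."""
    return [c.bar_code_id for c in copies if c.video_name == video_name]


# Hypothetical sample data.
copies = [BarCodeVideo("Video A", "B001", 0, 0),
          BarCodeVideo("Video A", "B002", 3, 5),
          BarCodeVideo("Video B", "B003", 1, 2)]
print(bar_codes_for("Video A", copies))
```

Because BarCodeVideo repeats only the video name, per-title data such as rental price is stored once in VideoInventory, which is the point of removing the repeating group.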

Perform Class Analysis

Rules for Analyzing Classes

This step is conceptually one of the more difficult steps in object-oriented analysis. It is also crucial to defining the class relationships properly. You have already learned to define entities, relationships, and class hierarchies in Information Engineering, so many of the ideas are not new. What is new is the notion that not just data is inherited: Both data and processes are inherited and considered in this analysis.

The goal is to define classes of class/objects and their relationships. A class defines the attributes and processes that are shared by one or more class/objects. All objects are members of at least one class. When multiple objects share attributes, or share processes, we extract the attributes and processes in common, and create a superset class. The important issue is to ensure that the class does, in fact, relate in exactly the same way to all of the multiple class/objects. The class has no objects of its own; it is simply identifying shared data and processes.

Classes are similarly evaluated for commonly shared attributes and processes to create layers of classes. The notation for such a relationship is similar to that of an entity-relationship diagram with directed arrows indicating the direction of the relationship and small numbers indicating the cardinality (i.e., number) of the relationship (see Figure 11-27). Recall that cardinality can be one-to-one (1:1), one-to-many (1:m), or many-to-many (m:n).

To instantiate means to define the values of a specific occurrence of an object. (Keep in mind that processes are the same for all instances.) For example, the class/object Customer has one instance object for each customer. At the analysis level, an instance is analogous to a tuple in a relation or a record in a file. In an order entry example, illustrated in Figure 11-28, Customer class has no specific data; it is an abstract class. The Cust class/object instantiates, that is, defines the data values for the customer class. The Order class/object inherits the data and processes in the Customer class.

Inheritance relationships identify shared data and processes. The object at the arrow-headed end shares or inherits from the other object. Inheritance relationships identify hierarchical networks of relationships.
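A short Python sketch may make the distinction between an abstract class, instantiation, and inheritance concrete. It follows the order entry example loosely; the method names `key` and `describe` and the sample data values are hypothetical and chosen only for illustration.

```python
from abc import ABC, abstractmethod


class Customer(ABC):
    """Abstract class: identifies shared processes and has no objects of its own."""

    @abstractmethod
    def key(self):
        """Each class/object supplies its own identifying value."""

    def describe(self):
        # Shared process, inherited unchanged by every subclass.
        return f"{type(self).__name__}({self.key()})"


class Cust(Customer):
    """Class/object whose instances carry the data values for one customer."""

    def __init__(self, phone, last_name):
        self.phone = phone
        self.last_name = last_name

    def key(self):
        return self.phone


class Order(Customer):
    """Class/object that inherits the Customer processes, as in Figure 11-28."""

    def __init__(self, order_number):
        self.order_number = order_number

    def key(self):
        return self.order_number


# Instantiation: one instance object per customer, each with its own values.
smith = Cust("555-0100", "Smith")
jones = Cust("555-0199", "Jones")
```

Both instances hold different data but share the single `describe` process defined in the abstract class, and attempting to create an object of `Customer` itself raises an error because the class is abstract.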

Booch [1991] also recommends the design of classes for class/objects whose data or processes are used by another class/object. For instance, an order uses information about inventory items. Therefore, another class would be created for shared inventory information (see Figure 11-29). This notation is the same as for general classes.

A fifth type of class, a meta-class, can also be defined, but is usually developed during design. The

Object Name: Attribute Name (Primary key is underlined. Repeating information is indented.)

Customer: CustomerPhone CustomerLastName CustomerFirstName CustomerAddress CustomerCity CustomerState CustomerZip CustomerCreditCardType CustomerCCNumber CustomerCCExpDate CreditRating CustEnrollDate

TempTrans: CustomerPhone TotalAmtDue TotalAmtPaid Change

TempTransDetail: CustomerPhone BarCodeId RentalDate FeesPaid ReturnDate LateFeesDue LateFeesPaid FeesDue FeesPaid

Object-Oriented Analysis Activities 487

meta-class relationship defines a class whose instances are themselves classes. For instance, customers contain CustomerName which defines a subclass 'character string,' which defines a subclass 'character.' Customer is a meta-class representing its character string contents. In general, all classes and class/objects from analysis are meta-classes that are elaborated during design.
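The phrase "a class whose instances are themselves classes" has a direct counterpart in Python, where the built-in `type` plays that role. The sketch below is only an analogy to the analysis-level idea in the text: the `Registry` meta-class and the `Video` class are hypothetical names introduced for illustration.

```python
class Customer:
    """An ordinary class; its instances are customer objects."""

    def __init__(self, name):
        self.name = name  # a character string, itself an instance of str


class Registry(type):
    """Hypothetical meta-class: its instances are classes, which it records."""

    classes = []

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        mcls.classes.append(name)  # note each class created through Registry
        return cls


class Video(metaclass=Registry):
    """A class that is itself an instance of the Registry meta-class."""
```

Here `Customer` is an instance of `type`, and `Video` is an instance of `Registry`, so both meta-classes have classes as their instances.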

Coad and Yourdon [1990] recommend looking for classes by evaluating each class/object for special cases and creating generalization classes for specialization class/objects. For example, cash and credit customers might be specialized class/objects of the general class customer (see Figure 11-30). Coad and Yourdon customize their notation for generalization-specialization relationships, although

VideoOnRental: CustomerPhone BarCodeId RentalDate FeesPaid ReturnDate LateFeesDue LateFeesPaid FeesDue FeesPaid

VideoInventory: VideoName RentalPrice VideoReleaseDate VideoCountOfCopies TypeVideo Vendor Date Received

BarCodeVideo: VideoName BarCodeId BarCodeRentalCount BarCodeRentalDays

FIGURE 11-25 Final Object Attribute List for ABC Rental Processing


Process

EnterCustPhone
CreateTempTrans
RetrieveRentalVOR
DisplayTempTransVOR
EnterBarCode
RetrieveInventory
DisplayInventory
ComputeRentalTotal
EnterPayAmt
ComputeChange
DisplayChange
UpdateInventory
WriteRental
PrintTempTrans
EnterBarCode
RetrieveRentalVOR
DisplayTempTransVOR
AddDateToVOR
UpdateInventory
ComputeTempTransTotal
EnterPayAmt
ComputeChange
DisplayChange
WriteRental
ComputeLateFees
EnterCustomer
CreateCustomer
EnterVideoInventory
CreateVideoInventory

Attribute

Prerequisite: CreateTempTrans process must be successful to continue rental processing. If not successful, go to EnterCustomer process.
Status: Bar code entry finished.
Postrequisite: All rentals are entered.
Formula = Σ LateFeesDue + Σ VideoPrice by CustomerPhone
Formula = TotalAmountDue - TotalAmtPd by CustomerPhone
Postrequisite: Change must be ≥ zero to successfully complete this process. If change < zero, repeat payment process.
Prerequisite: TotalAmountDue = zero, and processing is complete.
Status: Bar code entry finished.
Formula = Σ LateFees by CustomerPhone

FIGURE 11-26 Process Attribute List for ABC Rental Processing


Line Type and Relationship

Uses
Instantiates, same data type
Instantiates, different data type
Inherits, same data type
Inherits, different data type
Meta-Class

Cardinality and Necessary Relationship

Required
01 Optional
0m Optional
1m Required

FIGURE 11-27 Relationship Types and Cardinality for Object Class Diagram

it is not necessary to do so unless using their CASE tool. Figure 11-30 shows two alternative generalization-specialization notations.
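In code, a generalization-specialization relationship is commonly expressed as subclassing. The following sketch assumes the cash and credit customer example from the text; the attribute `card_number` and the method `payment_method` are hypothetical additions used only to show how a specialization extends the general class.

```python
class Customer:
    """Generalization class: attributes and processes common to all customers."""

    def __init__(self, phone):
        self.phone = phone

    def payment_method(self):
        raise NotImplementedError("supplied by each specialization")


class CashCustomer(Customer):
    """Specialization: no data beyond what the general class defines."""

    def payment_method(self):
        return "cash"


class CreditCustomer(Customer):
    """Specialization: adds the data needed only for credit customers."""

    def __init__(self, phone, card_number):
        super().__init__(phone)         # reuse the general attributes
        self.card_number = card_number  # hypothetical specialized attribute

    def payment_method(self):
        return "credit"
```

Each specialization is still a `Customer`, so any process written for the general class also accepts cash and credit customers.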

Coad and Yourdon also recommend that classes be created to express whole-part relationships. For example, in manufacturing, finished goods are assemblies of other goods; the whole class might be for the finished product, while the part class/objects define each component (see Figure 11-31). Again, Figure 11-31 shows two notations, a customized version of whole-part as expressed by Coad and Yourdon, and the more general notation used in manual drawings and other CASE tools.
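A whole-part relationship is usually coded as composition rather than inheritance: the whole holds references to its parts. The sketch below is a hedged illustration using the toaster parts named in Figure 11-31; the class names, the `unit_cost` attribute, and the cost figures are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Part:
    """Part class/object: one component of a finished product."""
    name: str
    unit_cost: int  # illustrative value in cents (assumption)


@dataclass
class FinishedProduct:
    """Whole class: an assembly made up of part objects."""
    name: str
    parts: list = field(default_factory=list)

    def add_part(self, part):
        self.parts.append(part)

    def total_cost(self):
        # A process of the whole that works across all of its parts.
        return sum(part.unit_cost for part in self.parts)


toaster = FinishedProduct("Toaster")
toaster.add_part(Part("Elevator", 250))
toaster.add_part(Part("Chassis", 400))
```

The parts do not inherit anything from the whole; the whole simply aggregates them and can compute results over them.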

To summarize, we specifically evaluate five types of relationships. First, we look for shared attributes and processes across class/objects to define inheritance classes. Then we evaluate the class/objects for specialization and for component part relationships. Next, class/objects which use the attributes or processes of another class/object are identified to create a class for the common class/object items. Finally, we define meta-classes as abstract classes whose instances are themselves classes.

To create less cluttered diagrams, elevate the highest independent class or class/object on each diagram to define subjects. A subject is the most abstract class represented in an application. The purpose of subjects is to provide a summary identifier that represents the cluster of subordinate relationships which inherit from the class (see Figure 11-32).

Finally, we reevaluate and, as necessary, redefine the process-object assignments, class definitions, and class/object definitions. We reevaluate to ensure that all definitions accurately reflect the application requirements, and are 'clean,' that is, all processes relate to all data with which they should be encapsulated.


[Diagram: Customer Class at the top, linked to the Cust Class/Object and the Order Class/Object below; cardinality labels 1 and 0m]

FIGURE 11-28 Order Entry Example of Customer Class

ABC Video Example Class Analysis

The class diagram for ABC rental processing is fairly simple (see Figure 11-33). First we draw the object classes: Customer, VideoOnRental, VideoInventory, BarCodeVideo, and TempTrans.

Next, we evaluate the relationships between them. Referring back to the attribute list, we see that VideoOnRental (VOR) contains information from Customer, BarCodedVideo, and VideoInventory. The question is, Is this an inheritance relationship or a using relationship? In other words, are the data and processes also shared by VOR or does it simply use the data? The answer is found in the process descriptions. For all three class/objects, if the object does not exist while rental processing is going on, the rental class/object is supposed to be able to add new customers and new videos. Therefore, the processes for adding and reading the information from all three class/objects are shared and should be inherited. If VOR simply used the data, the using relationship would have been more appropriate. BarCodeVideo, VideoInventory, and Customer are drawn as classes because they will not actually store data. They manage the shared processes.

Next, we consider the relationship of VideoOnRental (VOR) to TempTrans. There is considerable overlap since VOR gets all new objects from TempTrans, and TempTrans gets all information about open rentals from VOR. In this example, neither can inherit the processes of the other. Since they both use each other's data, they have reciprocal using relationships which are expressed in the diagram (see Figure 11-33).

Then, we create new class/objects for attributes and processes not shared or inherited by VOR (see Figure 11-34).

Next, we consider the relationship between BarCodedVideo and VideoInventory. VideoInventory defines the characteristics of a group of inventory items. For instance, there will be one object with the value Terminator 2 in the VideoName, but there might be many BarCodedVideo objects which refer to that name. That is, there are many copies of the movie, each with its own bar code. Therefore, the characteristics of VideoInventory appear to be inherited by BarCodedVideo.

[Diagram: Customer Class, Customer C/O, Inventory, Order]

FIGURE 11-29 Example of Using Class

Next, we ask if the processes of VideoInventory also apply to BarCodeVideo. For instance, when we add a BarCodeVideo, do we need to know or do processing for VideoInventory? One attribute of VideoInventory, a count of the number of videos in stock, is created and updated every time VOR is created or used during rental processing. Therefore, a class for VideoInventory that includes the attribute(s) and processes that are shared is required. Now we have two classes dealing with VideoInventory and one class/object that will contain the data. The diagram reflecting these final data and processing requirements is shown in Figure 11-34.


[Diagram in two notations, one adapted from Coad and Yourdon (1990) and one in traditional notation: Customer (generalization) with Credit Customer and Cash Customer (specializations)]

FIGURE 11-30 Example of Generalization-Specialization Classes

Draw State-Transition Diagram

Rules for Drawing a State-Transition Diagram

A state-transition diagram defines allowable changes for data objects. Specifically, for each change of data content for an object, we identify the initial state, the event that causes the change, the process by which the change occurs, and the resulting state. A state is a set of values an object can have while a transition is an event causing a change to the set of values.

There are two subtly different types of state-transition diagrams known as the Mealy model and the Moore model. The Mealy model defines all state changes and associates each with an action; it is used in this text. The Moore model defines all actions and associates each with a state. Theoretically, both models lead to the same definitions, but they take different perspectives. For novices, the Mealy model is simpler because it is easier to identify and verify state changes than it is to identify and verify that all actions are present.
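The difference in perspective can be sketched with two small lookup tables. The example is hypothetical: the order states (`entered`, `filled`, `shipped`), the events, and the action names are invented for illustration. In the Mealy-style table each state change carries its action, while in the Moore-style tables the action is attached to the state that is entered.

```python
# Mealy style: each (state, event) pair maps to (next state, action).
MEALY = {
    ("entered", "fill"): ("filled", "pick items"),
    ("filled", "ship"): ("shipped", "print invoice"),
}


def mealy_step(state, event):
    """Return the next state and the action tied to this state change."""
    return MEALY[(state, event)]


# Moore style: transitions name only the next state; each state owns an action.
MOORE_NEXT = {
    ("entered", "fill"): "filled",
    ("filled", "ship"): "shipped",
}
MOORE_ACTION = {
    "entered": "record order",
    "filled": "pick items",
    "shipped": "print invoice",
}


def moore_step(state, event):
    """Return the next state and the action associated with that state."""
    next_state = MOORE_NEXT[(state, event)]
    return next_state, MOORE_ACTION[next_state]
```

For these tables both functions give the same answers, which mirrors the point that the two models lead to the same definitions from different starting points.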

The icons used in drawing a state-transition diagram are shown in Figure 11-35 as a circle and directed line. The rules for developing a state-transition diagram are as follows:

1. Draw one diagram for each object/class and each class.

2. Identify the possible states the class/object can take.

3. Draw circles on a diagram labeling each with a state.

4. Connect the states to show transition from one state to another. Use directed arrow lines to show the direction of state change (i.e., from ... to ... ). Each state should lead to one or a small number of other states.

5. Label the transition lines to identify the events that initiate the change. Write the event names above the lines.

6. Label the lines with the processes that manage the event. Write the process names under the lines.

7. Examine the diagram. If there are any recursive state changes, reanalyze that part of the diagram in more detail to remove the recursion or to specifically label the state and its processes as recursive.

8. Walk through the diagram with other team members until it is complete and accurate.

[Diagram in two notations, one adapted from Coad and Yourdon (1990) and one in traditional notation: Whole: Toaster; Part: Elevator; Part: Chassis]

FIGURE 11-31 Example of Whole-Part Class

[Diagram: Product Composition subject]

FIGURE 11-32 Example of Subject Diagram

The circle identifies the states of the object. Directed lines signify transitions and lead to the resulting state. The event causing the transition is written on top of the directed line. The process that changes the state is written under the directed line. The names of states should be unique, but the names of events and actions need not be unique if they, in fact, relate to more than one state. Events can spawn more than one process. Conversely, object states can require more than one event to be changed. If many events are required to initiate a state change, they are shown with separate lines leading to the state. If any of several events can initiate a state change, the lines converge into one line entering the state. Each class and class/object in an application has a state-transition diagram developed for it.

[Diagram: classes Customer, VideoInventory, and BarCodeVideo, each with processes Add and Read, together with TempTrans and VideoOnRental]

FIGURE 11-33 Class Diagram for VideoOnRental

[Diagram: Customer, VideoInventory, and BarCodeVideo, each with processes Add and Read; Cust, VInv, and BCVideo, each with processes Change and Delete; TempTrans; VideoOnRental]

FIGURE 11-34 Class Diagram for ABC

State-transition diagrams are optional representations in object orientation. They are useful for diagramming the behavior of systems with

• multiple message types
• complex processes
• synchronization requirements.

Different diagrams, such as fence diagrams,7 are often substituted for state-transition diagrams when there are fewer than 20 states.

[Icons: circles are used for class/object states; directed arrows are used for transitions]

FIGURE 11-35 Icons Used in State-Transition Diagrams

7 See Martin and McClure, 1985, for a further discussion of different substitute representations.


ABC Video Example of State-Transition Diagram

The steps to developing a state-transition diagram are to draw circles for each state that an object can take. Then connect the circles with lines showing which states lead to which next-states. Label the lines with the event triggering the change on top and the associated process from the application under the line. Rental VOR objects are the most complex in the ABC Video rental processing task, so they are discussed here. Development of state transition diagrams for the other objects is left as a student exercise.

In its simplest form, a rental is either open or closed (that is, no rental). So, the first iteration of the state-transition diagram will begin with those two states. The high-level diagram is in Figure 11-36. For each path between these two states, we ask ourselves the question, What causes the change? First, what causes the change from no rental to an open rental? Open rentals are created when a customer requests a rental; this is the event for the line from no rental to open rental. The process accompanying this event is to create an open rental.

Second, what causes the change from open rental to no rental? Return of the video(s) and payment of late fees can cause an open rental to be closed. There are two events in this statement, so now we ask ourselves about the events' timing. Are all returns and payments performed at the same time? If not, what is different about them? From the description of the rental process, we know that returns can be made without any payment taking place. So, we separate these events.

Consider returns first. When a video return takes place, what process is performed? The answer is that we update the rental with the return date. The rental does not change from open to closed when a return is performed, however; so, we draw a recursive line from open rental to open rental and mark it with the event and process. This recursive line identifies a need for another level of detail on rental states because each state should have its own circle for clarity of expression.

Finally, we evaluate the other event, payment of late fees. This event causes a rental to become closed. The directed line connects open rental to no/closed rental, the event is pay late fees, and the process is close rental.

We already know we have to create another level of detail to this diagram to be more specific about return date processing, but we also want to evaluate this diagram to see what else is required. Does this diagram account for all rental states? The answer is no. It does not account for situations when late fees are owed (in other words, if there is already an open rental), and it does not account for updates for fees paid. So, we redraw the diagram to include these states.

[Diagram labels: Customer Request, Return, Pay Late Fees, Close Order]

FIGURE 11-36 High-Level State-Transition Diagram for ABC Rental Processing
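The high-level diagram can also be written down as a transition table, which makes it easy to check for recursive state changes of the kind rule 7 asks about. This is a minimal sketch: the dictionary layout and the function names are assumptions, while the states, events, and processes are those named in the discussion above.

```python
# (current state, event) -> (resulting state, process), per the high-level diagram.
TRANSITIONS = {
    ("no rental", "customer request"): ("open rental", "create open rental"),
    ("open rental", "return"): ("open rental", "update return date"),
    ("open rental", "pay late fees"): ("no rental", "close rental"),
}


def run(events, state="no rental"):
    """Apply events in order; return the final state and the processes performed."""
    processes = []
    for event in events:
        if (state, event) not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not allowed in state {state!r}")
        state, process = TRANSITIONS[(state, event)]
        processes.append(process)
    return state, processes


def recursive_transitions():
    """List (state, event) pairs whose resulting state equals the starting state."""
    return [key for key, (next_state, _) in TRANSITIONS.items() if key[0] == next_state]
```

Running the table over a request, a return, and a late-fee payment ends in the no rental state, and `recursive_transitions()` flags the return event as the part of the diagram that needs another level of detail.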


FIGURE 11-37 State-Transition Diagram for ABC Rental Processing

In the revised diagram (see Figure 11-37), we continue the thought process we used to draw the first diagram while accounting for the details we omitted from the first diagram. Now, we try to identify the states through which a rental proceeds from its opening to its closing. The states are:

• open
• temporary, new rental in memory, until fees paid
• unpaid, returned VOR may have late fees
• paid, returned VOR
• closed rental with return date and all fees paid

Next we draw the lines showing how each of these states comes to exist. Notice that a customer request triggers a search of open rental and will result in either the temporary rental status or the add-on rental status, depending on whether or not a rental for this client exists. The remaining events are return-Rental and all fees paid.

AUTOMATED SUPPORT TOOLS FOR OBJECT-ORIENTED ANALYSIS

Object orientation is less than five years old in its use in business. Yet the number and variety of support tools and environments available attest to its growing popularity and legitimacy. The tools presented here represent both partial and complete support for one or another method of developing object views of the world (see Table 11-1). Many tools include code generation capabilities which automatically generate C++ or other object-oriented code objects from the logical definitions provided in object analysis and design.

SUMMARY

Object orientation is a methodology that alternates evaluation between objects and processes to develop


TABLE 11-1 Automated Support Tools for Object-Oriented Analysis

Product

DSEE, HP /Softbench

Excelerator

Object View

Object Vision

OOA Tool

ProMod

Software Backplane Cohesion

SW Thru Pictures

Teamwork

Telon

Visible Analyst

vs Designer

Company

Apollo/Hewlett-Packard

Index Tech. Cambridge, MA

Knowledge Ware Atlanta, GA

Borland International

Object International, Inc. Austin, TX

Promod, Inc. Lake Forest, CA

Atherton Technology / Digital Equipment Corporation Maynard, MA

Interactive Dev. Env. San Francisco, CA

CADRE Tech. Inc. Providence, RI

Pansophic Systems, Inc. Lisle, IL

Visible Systems Corp. Newton, MA

Visual Software Inc Santa Clara, CA

Technique

Integrated CASE Product Supporting OO Analysis

State-Transition Diagram Matrix Graph (RTS)

Application Prototyping Software Using 4GL or SQL Code

Visual Application Development System

Coad's Tool Supporting Object Analysis Using Coad & Yourdon Graphics

Control Flow Diagram State-Transition Diagram Module Networks Function Networks

Integrated CASE Product Supporting OO Analysis

Control Flow State-Transition Diagram

DFD Control Flow State-Transition Diagram Process Activation Table

State-Transition Diagram

Booch Diagram Visual RD Diagram Ward-Mellor Diagram

a complete view of an application. Objects are entities to be automated. They are encapsulated with processes which operate on them or which read them.

Encapsulated class/objects may be identified for creation of reusable, normal, or polymorphic modules. Reusable modules perform the same action on the same data type class/objects, but are called by more than one class/object. Normal modules perform one action on data from one object. Polymorphic modules perform one action on data from many objects of differing data types. Object-process capsules are evaluated to determine their interrelationships. The interrelationships usually describe a hierarchic network of relationships for which the lower-level capsules inherit both the data and processes of the higher capsules. Encapsulated class/objects with multiple relationships have multiple inheritance from related higher capsules.
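A polymorphic module, one action applied to objects of differing data types, can be sketched with Python's standard `functools.singledispatch`. The `display` function and the two small data classes are hypothetical stand-ins for ABC Video objects and show only one of several ways such a module might be realized.

```python
from dataclasses import dataclass
from functools import singledispatch


@dataclass
class Customer:
    phone: str
    last_name: str


@dataclass
class VideoInventory:
    video_name: str
    rental_price: int  # illustrative value in cents (assumption)


@singledispatch
def display(obj):
    """One action, 'display', whose behavior depends on the object's type."""
    raise TypeError(f"display is not defined for {type(obj).__name__}")


@display.register
def _(obj: Customer):
    return f"{obj.last_name} ({obj.phone})"


@display.register
def _(obj: VideoInventory):
    return f"{obj.video_name}: {obj.rental_price} cents"
```

Callers invoke the same `display` name for either kind of object, and the version matching the argument's type is selected at run time.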

The declarative steps performed to develop an object analysis include identification of class/objects, identification of processes, class and hierarchy definition, definition of attributes of operations, definition of interobject messages, and class/object state-transition definition. The procedural evaluations within each step consist of questions to be answered and actions to be taken based on the answers to the questions.

REFERENCES

Berrard, E. V., An Object Oriented Design Handbook for Ada Software. Frederick, MD: EVB Software Engineering, Inc., 1985.

Booch, G., Object Oriented Design with Applications. Redwood City, CA: Benjamin/Cummings Publishing Company, Inc., 1991.

Coad, P., and E. Yourdon, Object-Oriented Analysis. Englewood Cliffs, NJ: Prentice-Hall, 1990.

Coad, P., and E. Yourdon, Object-Oriented Design. Englewood Cliffs, NJ: Prentice-Hall, 1991.

Graham, Ian, Object-Oriented Methods. Reading, MA: Addison-Wesley, 1991.

Taylor, David, Object Orientation and Information Systems: Planning and Implementation. NY: John Wiley & Sons, 1992.

KEY TERMS

abstract data type (ADT)
attribute
class
class hierarchy
class/object
client object
encapsulation
generalization class
inheritance
instance
instantiate
Mealy model
message
meta-class
Moore model
multiple inheritance
object
object-oriented analysis
part class
polymorphism
primary key
private part (of a class/object)
problem space
process attribute
public part (of a class/object)
solution space
specialization class
state
state-transition diagram
subject class
superset class
supplier object
transition
user object
whole class

EXERCISES

1. Complete the state-transition diagrams of the ABC Video rental processing application. Walk through the diagrams in class and discuss the difficulties and alternatives you found in developing the state-transition diagrams.

2. Perform an object-oriented analysis on the Eagle Rock Golf League in the appendix. Develop all lists, tables, diagrams, and pictures required to document the requirements of the problem.

3. Split the class into three teams. Have each team develop a second-level analysis of ABC CustomerOnVideo maintenance using object-oriented analysis. Compare the resulting views of the application.

4. Debate this assertion: Object orientation is more likely than process or data methodologies to lead to well-defined modules which automatically deal with problem complexity by hiding information, being single-purpose, and having minimal coupling.

STUDY QUESTIONS

1. Define the following terms: class, class/object, encapsulation, inheritance, meta-class, multiple inheritance, object.

2. Describe the sequence of events during analysis.

3. Compare the differences between the major forms of documentation in structured analysis and object-oriented analysis.

4. Compare the differences between the major forms of documentation in information engineering and object-oriented analysis.

5. Why is the summary paragraph in object-oriented analysis so important?


6. Compare and contrast the definitions of objects, processes, and encapsulated objects.

7. List the documents and graphics created in object-oriented analysis and describe how they are related to each other.

8. What are the decisions you must make in object-oriented definition of object hierarchies? Why are they important?

9. What rules in object-oriented analysis simplify quality control and review?

10. How do you determine that the allocation of objects to processes is correct? What are the questions asked, and why are they important?

11. What is polymorphism? What is its importance in object orientation?

12. What is the purpose of a state-transition diagram?

13. Describe the development of a state-transition diagram.

14. What is the relationship between a state-transition diagram and objects, processes, or encapsulated objects?

15. What is the purpose of a graphical class diagram?

EXTRA-CREDIT QUESTIONS

1. What are the rules for identifying objects? Can you think of others that might be useful?

2. The steps that use nouns and verbs to identify objects and processes, respectively, have been criticized as too simplistic. Can you think of a different approach to identifying objects and processes, perhaps borrowing from another methodology, that improves on the process?

502 CHAPTER 12 Object-Oriented Design

[Diagram: four subdomains, Hardware, Data, Human Interface, and Software]

FIGURE 12-1 Object-Oriented Subdomains

is the entire application as currently defined. As the prototype is examined, further details of operation are explicated for incorporation in the next iteration of the prototype. Following the format of previous chapters, we first define terms used in the OOD process, then move on to developing guidelines for each step and an example of the step and thinking processes for ABC Video's rental application.

DEFINITION OF OBJECT-ORIENTED DESIGN TERMS

The seven steps to performing an object-oriented design are:

1. Allocate objects to four subdomains: human, hardware, software, and data.

2. Develop time-event diagrams for each set of cooperating processes and their objects.

3. Determine service objects to be used.

4. Develop Booch diagrams.

5. Define message communications.

6. Develop process diagram.

7. Develop package (i.e., module) specifications and prototype the application.

In this section, we define the terms used in these steps, again integrating and extending the work of Booch with that of Coad & Yourdon. Keep in mind that while the terms are fairly well-defined, the manner and order of implementing the steps is not. The documentation created by these steps is summarized in Table 12-1.

In the first step, problem domain objects are assigned to one of the human, hardware, software, or data subdomains. The human subdomain defines human-computer interaction in the form of dialogues, inputs, outputs, and screen formats. A dialogue is interactive communication that takes place between the user and the application, usually via a terminal, to accomplish some work. A dialogue defines actions of users and actions of the application and hardware. Inputs (i.e., data entry), outputs (e.g., reports), and screens are the three modes of communication used for a dialogue. The task being performed is usually a transaction relating to a business event (e.g., sale of goods), but could also relate to application-generated events, such as sensor readings in process control or a data request in a query application. A screen design alone is a static definition of field formats, while the dialogue is the series of interactions that takes place via those screens.

The hardware subdomain defines object assignment to physical processors or firmware.2 The hardware interface is significant as we develop applications using more firmware, mainframes augmented by local intelligent devices, and distributed processing. To support these types of processing, allocation of tasks to hardware must explicitly be part of the methodology.

2 Firmware refers to software that is permanently on a programmable chip and that processes significantly faster than memory-resident software program code.

TABLE 12-1 Object-Oriented Design Documentation

Tables

Process Assignment to Object Table: Contains all solution space objects and, for each, the processes that act on the object

Subdomain Allocation Table: List of processes and subdomain assignments

Message Table: Contains, for each process, the calling object, the called object, the input message contents, the output message contents, and the object to which control is returned

Diagrams

Subdomain Allocation Diagram: Optional graphical depiction of process-subdomain assignments

Time-Ordered Event Diagram: Depicts required sequencing of processes

Booch Package Diagram: Depicts objects and message flow for the entire application. Lower-level Booch diagrams, one per processor, are created to show objects and processes with message flow.

Process Diagram: Shows hardware configuration and process assignment to processors

The software subdomain defines service control and problem-domain objects. Service control objects, also known as utility objects, manage application operations. Depending on the complexity of the application, synchronizing, scheduling, or multitasking services to control object/process work might be required. Problem-domain objects are the class/objects and objects (hereafter, both are referred to as objects) defined during analysis and describing the application functions.

The last subdomain relates to data, which are the actual instances of the objects in the solution set. During the data design, data are normalized and redesigned to accommodate operational efficiencies. Depending on the 'purity' of the object implementation, the physical data storage may or may not implement encapsulated data and processes in the database. The most common variation of data storage is a template definition that uses physical address pointers to reference the physical data store for data and processes. The template is analogous to the File and Working-Storage Sections of a COBOL program, but includes a process template as well as a data template.

The second step for all processes, regardless of their subdomain assignment, is to develop time-event diagrams. Time events are the business, system, or application occurrences that cause processes to be activated. Time-event diagrams show sequences, concurrency, and nesting of processes across objects. The time-event diagram, then, shows the relationships between processes that are triggered by related events or have constraints on processing time. Process relationships are either sequential or concurrent, determining the types of service objects required in the application. Processes that are not concurrent are sequential and related only by data or parameters passed between the processes. Concurrent processes operate at the same time and can be dependent or independent. Dependent concurrent processes require synchronization of some sort.

Above, we defined service control objects as managers of application operations. The third OOD step is to determine which service objects are needed to control the application. There are three broad categories of service objects: synchronizing, scheduling, and multitasking.

Synchronizing is the coordination of simultaneous events. Synchronizing objects provide a rendezvous for two or more processes to come together after concurrent operation (see Figure 12-2).


[Figure 12-2: Concurrent Process 1 and Concurrent Process 2 both lead into a single Synchronizing Process.]

FIGURE 12-2 Diagram of Synchronizing Object Functions
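
A rendezvous of this kind can be sketched with a barrier from Python's standard threading module. This is a minimal illustration rather than a prescribed design: the barrier stands in for the synchronizing object, and the function and process names are hypothetical.

```python
import threading

results = []
lock = threading.Lock()
rendezvous = threading.Barrier(2)  # synchronizing object for two processes


def concurrent_process(name):
    """Do independent work, then wait at the rendezvous for the partner."""
    with lock:
        results.append(f"{name} done")
    rendezvous.wait()  # blocks until both processes have arrived
    with lock:
        results.append(f"{name} past rendezvous")


threads = [
    threading.Thread(target=concurrent_process, args=(name,))
    for name in ("process 1", "process 2")
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Neither process continues past the barrier until both have finished their concurrent work, so both "done" entries are always recorded before either "past rendezvous" entry.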

Scheduling is the process of assigning execution times to a list of processes. Scheduling objects can be for sequential, concurrent-asynchronous (i.e., independent), or concurrent-synchronous (i.e., dependent) processes. In the terminology of COBOL, scheduling objects are analogous to a mainline routine (see Figure 12-3), but the scheduler performs many functions beyond those of a COBOL mainline.

Multitasking is the simultaneous execution of sets of processes (see Figure 12-4). Each set of concurrent processes is called a thread of control. These threads are initiated by the scheduling objects and controlled by multitasking objects. Multitasking objects track and control the execution of multiple threads of control and can be in both the problem domain and the service control domain. These three types of service control objects provide the structure within which problem domain objects execute.

[Figure 12-3 lists the functions of a scheduling object: Get object, Get memory location, Store object, Enqueue object, Dequeue object, Set time, Check time, Stop time.]

FIGURE 12-3 Scheduling Object Functions

Service object definition is based on time-event diagram analysis. If all objects are sequential and used one at a time, then only scheduling objects are required. If concurrent processing takes place, synchronizing and scheduling objects are used. If many users are supported concurrently, multitasking objects are added to the other types.
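
The selection rule for service objects is simple enough to write down directly. The sketch below is one reading of the rule, with a hypothetical function name and parameters; it treats support for many concurrent users as implying concurrent processing.

```python
def required_service_objects(concurrent, many_users):
    """Return the service object types suggested by time-event analysis.

    concurrent: True if any processes run at the same time.
    many_users: True if many users are supported concurrently.
    """
    needed = {"scheduling"}  # always required, even for sequential work
    if concurrent or many_users:
        needed.add("synchronizing")
    if many_users:
        needed.add("multitasking")  # added to the other two types
    return sorted(needed)


print(required_service_objects(concurrent=False, many_users=False))
print(required_service_objects(concurrent=True, many_users=False))
print(required_service_objects(concurrent=True, many_users=True))
```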

After service objects are identified, the next step is to begin to develop a Booch diagram. A Booch diagram depicts all objects and their processes in the application, including both service and problem domain objects. First, a draft diagram is created. Then, several message passing schemes are evaluated. After a message passing scheme is identified, message contents are defined.

The basic graphical forms used are rectangles and ovals (see Figure 12-5). Vertical rectangles signify a whole package. A package in OOD is a set of modules relating to an object that might be modularized for execution. Service packages are single purpose and do not usually have subparts that are visible to the rest of the application. Service objects have no visible data, that is, no oval identifying a data part to the object. Problem-domain packages have data identifiers for objects and processes. The object in the oval and the process names are each in their own horizontal rectangle (see Figure 12-5). In Figure 12-5, the lines connecting modules show allowable paths for messages.

Next, messages are defined. A message is the only legal means of communication between encapsulated objects. Messages are clear in their intention, but not in their implementation, which is completely determined by the language. For instance, at the moment, Ada does not implement message communication. In this text, a message is the unit of communication between two objects. Messages contain an addressee (that is, the object providing the process, also called a service object), and some identification of the requested process.


[Figure 12-4: a Multitask Manager tracks transactions in three states: Scheduled (Trans1, Trans2, Trans4, ..., Transn), Waiting (Trans3, Trans5, Trans7), and Active (Trans6). The CPU-active task is Trans6, which performs Compute, Execute, Write, ...]

FIGURE 12-4 Multitasking Management of Multiple Threads

Messages may be unary, binary, or keyword (see Figure 12-6). Unary messages contain only an addressee and service identifier. Binary messages contain addressee, service identifier, and two arguments (that is, variable object names or addresses upon which the service is performed). Keyword messages contain addressee, service identifier, and one or more keywords, each with an argument to show optional process selection. Message definitions probably will expand as languages capable of expressing and processing object-oriented designs develop.

[Figure 12-5: Booch diagram of a simple inquiry process, including a Control Object.]

FIGURE 12-5 Sample Booch Diagram-Simple Inquiry Process

Unary Message (Addressee, Service Identifier):
    Customer Create

Binary Message (Addressee, Service Identifier, Arguments):
    ComputeTotal PastDueFees, CurrentFees

Keyword Message (Addressee, Service Identifier, Keyword Expression(s)):
    DateTranslate Field=DateIn, DataType=Integer

FIGURE 12-6 Example of Message Types
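
The three message forms can be represented by one small data structure. The sketch below is an assumption about how a design team might record messages, not a feature of any particular language; the class name and the addressees TempTrans and DateUtility are hypothetical, while the service identifiers and arguments follow Figure 12-6.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    """A unit of communication between two objects."""
    addressee: str                  # object providing the process
    service: str                    # identification of the requested process
    arguments: tuple = ()           # exactly two for a binary message
    keywords: dict = field(default_factory=dict)  # keyword -> argument

    @property
    def kind(self):
        """Classify the message as unary, binary, or keyword."""
        if self.keywords:
            return "keyword"
        if len(self.arguments) == 2:
            return "binary"
        if not self.arguments:
            return "unary"
        raise ValueError("a binary message carries exactly two arguments")


unary = Message("Customer", "Create")
binary = Message("TempTrans", "ComputeTotal", ("PastDueFees", "CurrentFees"))
keyword = Message("DateUtility", "DateTranslate",
                  keywords={"Field": "DateIn", "DataType": "Integer"})
print(unary.kind, binary.kind, keyword.kind)
```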

The next step is to develop a process diagram that defines the hardware environment and shows process assignments to hardware. The first activity is to draw a hardware configuration showing processors (shadowed boxes in Figure 12-7) and devices (plain boxes in Figure 12-7). Lines connecting processors identify allowable message paths. At this summary level, multiple messages may travel each path.

When the process diagram is complete, the Booch diagram is divided and redrawn for each processor in the configuration. These subdiagrams show the extent of replication in the application and may identify new service object needs to control interprocessor communications. The message list is reexamined to ensure that all interprocessor messages are accommodated and complete. For multiprocessor applications, the timing of processes is reverified to ensure correct definition.

Using the information from the problem domain analysis and the OOD diagrams describing object interrelationships and timing, the next step is to develop package specifications and prototype the application. These are not the last steps in the design, only the last steps in an iteration of the design process, which may have several iterations. As a result of prototype development, other service objects might be recognized as needed. Iterating requires review of all design steps and redoing analysis as required to support development of a complete application prototype for each iteration.

Package specifications define the public interface for both data and processes for each object, and define the private implementations and language to be used. The public interface is that part of the data and process definitions visible to all objects in the application. The private interface describes the physical data structure and actual functions (i.e., data manipulations, calculations, or control processes) to be coded for the application. Multiple implementations of the same function that operate on different data types might be required. A function that has one name but multiple implementations is called polymorphic. Polymorphism is the ability to have the same process, using one public name, take different forms when associated with different objects.

One item in a package specification is a definition of the language to be used. Process timing (i.e., sequential or concurrent) and a need for polymorphism determine the type of implementation language required. Some languages are more constraining than others. To understand these language differences, binding and client/server relationships should be understood. Binding is the process of integrating the code of communicating objects. Binding of objects to operations may be static (fixed at compile time), pseudo-dynamic (parameter driven and decided at the beginning of a session), or dynamic (decided for each object while the system is executing, that is, at run time).

[Figure 12-7: network of processors PC001 - Store Sales, PC002 - Manager, PC003 - Gas Tank Monitoring, PC004 - Gas Sales, PC005 - File Server, and PC006 - Gateway, with devices Printer, Gas Pump Sensors, and Gas Tank Sensors; one message path is labeled "Query, Store Sales".]

FIGURE 12-7 Process Diagram Example of Convenience Store/Gas Station Network

A major difference between object orientation and other methodologies is the shifting of responsibility for defining the data type of legal processes from server (or called) objects to client (or calling) objects. A server object is one that performs a requested process. A client object is one that requests a process from a supplier. For instance, you might need to translate a date from month-day-year format to year-month-day format. As a client object, you request the translation of the supplier object and pass it the date to be translated. If the language supports dynamic binding, you also pass the data type of the date (for example, binary string or packed matrix). This shift, to client/server logic, plus the notions of inheritance and dynamic binding, support the use of polymorphism.

Let's return to the idea of binding and work our way through these ideas and how they work together. In most business applications, we think of processes as always operating on the same type of data. For example, items on an order have an order quantity (for example, 2), quantity type (for example, each or dozen), and price (for example, $1.20) that is expected to match the quantity type. To compute the line item total, we multiply quantity times price for a given quantity type. But what if the quantity type is not known beforehand and the formula must change based on the type? Then, we have three choices. First, we could write many routines that are all resident in the compiled code as static binding requires. This is the most common COBOL solution.

Second, we could write many routines that use information passed to the computation procedure to identify which routine to use for the session (for instance, only dozens will be processed in one session). This is called pseudo-dynamic binding (e.g., in Ada at the moment).

Third, we can write many routines and pass the quantity type to the computing object in the request message to dynamically bind and select the routine it needs to compute the total (as in Assembler, C++, or Smalltalk). Dynamic binding is done on-the-fly at run time. When the computation is complete, the quantity type code no longer is kept in the computer's memory.
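
The three choices can be imitated in a few lines of Python. Because Python itself binds names at run time, the sketch only mimics when the choice of routine is made; it is not a demonstration of compile-time binding. All function names are hypothetical, prices are held in cents, and the formula for a dozen (twelve times the unit price) is an assumption made for the example.

```python
def total_each(quantity, price_cents):
    """Line item total when the quantity type is 'each'."""
    return quantity * price_cents


def total_dozen(quantity, price_cents):
    """Line item total when the quantity type is 'dozen' (assumed formula)."""
    return quantity * 12 * price_cents


ROUTINES = {"each": total_each, "dozen": total_dozen}


def line_total_static(quantity, price_cents, quantity_type):
    """Static style: every routine is resident and the branch is hard-coded."""
    if quantity_type == "each":
        return total_each(quantity, price_cents)
    if quantity_type == "dozen":
        return total_dozen(quantity, price_cents)
    raise ValueError(f"unknown quantity type: {quantity_type}")


def open_session(quantity_type):
    """Pseudo-dynamic style: the routine is chosen once per session."""
    routine = ROUTINES[quantity_type]

    def line_total(quantity, price_cents):
        return routine(quantity, price_cents)

    return line_total


def line_total_dynamic(message):
    """Dynamic style: each request message names its own quantity type."""
    routine = ROUTINES[message["quantity_type"]]
    return routine(message["quantity"], message["price_cents"])


dozens_only = open_session("dozen")
print(line_total_static(2, 120, "each"))
print(dozens_only(2, 120))
print(line_total_dynamic(
    {"quantity": 2, "price_cents": 120, "quantity_type": "dozen"}))
```

In the dynamic version, nothing about the quantity type is retained after the call returns, which matches the description of the type code being discarded once the computation is complete.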

Binding time is a function of the language used and the application's requirements. If the application is batch, single-thread, and sequential, there is no need for any but static binding. If the application is anything else (multithread, concurrent, real-time), dynamic binding is desirable, but many languages only support pseudo-dynamic binding. Then, the application requirements, in the form of business needs for response time or process time, should drive the language selection decision.

We no longer assume that a called object can do only one thing in only one way; instead, a called object can do only one thing but it can do it in many ways. This ability to do one thing many ways is polymorphism. Polymorphic processes take different forms when associated with different objects, but a process always takes the same form with a given object. Client-object message requests contain both the process and the form of the process. The polymorphic process then loads its correct process code to service the request via the dynamic binding mechanism of the implementation language. An example of pseudocode for polymorphic pairwise item comparison is shown in Figure 12-8.
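
A runnable counterpart to the pseudocode of Figure 12-8 can be sketched with single dispatch from Python's standard functools module. This is one possible illustration, not the text's method: here the dispatcher inspects the type of the first argument rather than receiving the form in the message, matrices are represented as plain lists, and both forms use one convention (1 when the items are equal, 0 otherwise), which is an assumption of the sketch.

```python
from functools import singledispatch


@singledispatch
def pairwise_compare(a, b):
    """One public name; the form used depends on the type of the items."""
    raise TypeError(f"no comparison defined for {type(a).__name__}")


@pairwise_compare.register(int)
@pairwise_compare.register(float)
def _compare_numbers(a, b):
    """Form for two numbers: 1 if equal, 0 otherwise."""
    return 1 if a == b else 0


@pairwise_compare.register(list)
def _compare_matrices(a, b):
    """Form for two matrices (lists): 1 if every entry matches, else 0."""
    if len(a) != len(b):
        return 0
    for left, right in zip(a, b):
        if left != right:
            return 0
    return 1


print(pairwise_compare(3, 3))
print(pairwise_compare([1, 2, 3], [1, 2, 4]))
```

The client calls `pairwise_compare` in both cases; which body runs is decided at run time for each call.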

Pairwise Compare - Two Numbers

    If A = B
        return-code = 1
    else
        return-code = 0.
    Return return-code.

Pairwise Compare - Two Matrices

    Set sub = 1
    Set return-code = 0.
    Perform compare
        varying sub by 1
        until sub = last-entry.
    Return return-code.

    Compare.
        If A(sub) not = B(sub)
            return-code = 1.
    Compare-exit. Exit.

FIGURE 12-8 An Example of Polymorphic Descriptions for a Comparison Process

This discussion summarized the major terms, diagrams, and procedural steps in object-oriented design. Next, we discuss the steps of OOD in detail, including allocation of objects to the subdomains, developing time-event diagrams, determining service objects, developing Booch diagrams, developing process diagrams, and developing module specifications. Prototyping is beyond the scope of this text.

OBJECT-ORIENTED DESIGN ACTIVITIES

In ABC's rental application, we are using off-the-shelf software in an off-the-shelf hardware environment. In the environment, the operating system, network, database, and form of human interface are all given. Because of our choices (PCs, MS-DOS, Novell Netware, and a SQL DBMS), the application does not easily lend itself to object-oriented design that assumes none of the services and functions provided in our target environment. Because of the differences, we will discuss ABC at two levels: one for SQL DBMS which becomes unobject-like, and one for a Unix/C++ environment that stays object-like. First, we follow ABC through the process of design keeping in mind that the off-the-shelf software will be used. Think of this design as object-based, that is, based on object thinking, but decidedly not object-oriented in implementation. Object-based design is what is practiced by most novice object-designers, and is what most CASE tools being retrofitted for object orientation will be. In the chapter appendix, we present a second design for a Unix/C++ environment that is completely object-oriented. Without both discussions, the view of object orientation that you would get is not complete, and some of the discussions would be inaccurately stated for object-oriented design.

Allocate Objects to Four Subdomains

Heuristics for Allocating Objects to Human, Hardware, Software, and Data Subdomains

The first step is to allocate the problem domain processes to one of the subdomains: hardware, software, data, and human interface. Each process and the data it requires from its object³ are examined to determine whether they are best implemented as part of the human interface, hardware, software, or data subdomains. There is no particular order to the allocation process. It is recommended to allocate the software domain last, because it is the default for all processes not allocated elsewhere. Since these implementation alternatives are usually not broken apart by other methodologies, and since hardware is usually completely ignored, the consideration of these subdomains and explicit allocation of objects to them provides useful detail that is explicitly documented for maintenance. Also, since hardware options are becoming more numerous and common (e.g., automated teller machines have local intelligence and some of the application code for deposit and withdrawal processing), this mechanism accommodates hardware and firmware in design decisions.

We will discuss data first, because current guidelines demonstrate some of the shortcomings of current OOD writing. Booch suggests that standard database activities should be assumed to be under the control of the data domain, including create, retrieve, update, and delete processes (i.e., CRUD). All other data manipulations or computations are allocated 'somewhere else.' Coad & Yourdon, and most authors published after 1992, assume the use of a DBMS and usually an object-oriented one that includes the properties of persistence, inheritance, and abstract object-oriented data definition. Some authors assume use of an SQL-compatible database with an equally unobject-like language, recommending that the data functions should be separated from the application which will maintain its object-like properties for all non-data operations.

³ Superset objects, class/objects, and objects are all assumed in the use of the term object.

Keep in mind that this is an inexact process that is highly dependent on the implementation language and the implementation environment. For example, if we were using Smalltalk, in which everything is an object, separation of data access and manipulation is usually more efficient than keeping the functions all together. Conversely, if an OODBMS, such as Gemstone, were used, the DBMS object performs the physical CRUD actions and the application objects usually control the logical CRUD functions that are grouped by object. The key idea is that judgment on allocation of functions is required and needs to be done with knowledge of the entire implementation environment.

If the application needs to use a non-OODBMS, then it is necessary to evaluate whether data integrity, security, and access controls can be adequately maintained without using the DBMS language. If the application can both perform the functions faster and provide for integrity and so forth, then there should be a real analysis of where the functions should be. The application requirements for execution and response time may force use of a programming language when constraints are tight, and default to use of the DBMS language when there are no constraints.

Table 12-2 summarizes this discussion, showing that allocation of physical and logical read, write, and delete actions and the control over security, integrity, and access should be tied to constraints and the type of database environment used. If no DBMS is used, the alternatives are to allocate DBMS functions to each object, to design data control objects that perform DBMS functions, or to design a polymorphic reusable object that performs all DBMS functions.

TABLE 12-2 Heuristics for Data Allocation Processes

Type of database: OO, OO, Non-OO, Non-OO, None
Functional or response time constraints: Y, N, Y, N

Allocation                                                        OO, Y
Allocate CRUD to DBMS                                             Phys.
Allocate CRUD to Object                                           Log. or generic
Allocate security, integrity checking, access control to DBMS
Allocate security, integrity checking, access control to Object
  or create generic objects                                       *All

Entries for the remaining four columns (cell positions not recoverable from the scan): All; *Phys., Phys., *Log.; *Phys., *Log., All, Log.; *All; All, *All, All

Legend:
Phys.   Physical functions (read, write)
Log.    Logical functions (edit)
*       Requires analysis and judgment
All     Physical and logical
Y       Yes
N       No

We said before that DBMSs illustrate the problem of all authors in object-oriented design. For the most part, OO authors do not work in commercial business and do not build commercial applications; they work in defense-related businesses and build real-time, embedded applications which function as part of some larger system. For instance, defense applications might include building a guidance system for a missile, a monitoring system for airplane radar, or a reporting system on the Hubble telescope. These applications all have no persistent data; rather, they work on sensor data and pass on the information they filter for processing or feedback by other systems.

The problem with applying embedded-system thinking to persistent object problems is that there is little overlap in designing for temporary and persistent data. Persistent data and, in particular, DBMS-stored persistent data, have entirely different thinking processes that the computer-scientist authors of most object-oriented methods do not recognize. Because of this lack of recognition, these heuristics on object allocation are more crude than those of, say, process methods which have been tried for the last 20 years.

A similar problem occurs in the hardware domain. Object-oriented authors most often are designing state-of-the-art hardware as part of their application design including customized operating systems and software. Most business applications use off-the-shelf hardware that is generalized in function and has many user features. The only custom development in most business applications is the application software itself. So, the design problem with hardware is opposite that of DBMSs. For hardware, the methodology authors do more detailed levels of development than is necessary in most business applications. You will see this problem again when we discuss service object definition.

Now let's consider allocation of functions to the other subdomains. The human interface is exactly what you think it is: the interactions with people, usually through a terminal device, that provide the essential inputs and outputs of the application. The human interface is discussed poorly in the OOD books that do exist (including all of those in the references of this chapter) because of the traditional lack of human users in object-oriented applications. Because of this lack, human interfaces are discussed in Chapter 14 as one of the 'forgotten activities' of systems analysis and design.

In general, the activities that provide human interface control, such as screen interactions, are recommended to be relegated to the human component of the application. Again, there are no compelling reasons for blindly making this decision; therefore, it is subject to analysis. Activities that can be grouped across objects, such as line control, error message display, and screen reads and writes, can all be abstracted out of the individual objects and placed in reusable, generic objects. The actual editing of data from screens should remain with the original object unless there are sufficient similarities across screens and data items to warrant abstracting them out as well, or unless the functions will be assigned to human interface hardware. To perform this abstraction requires listing all the detailed, primitive actions required of screen interactions for each object, identifying which actions are performed automatically by the DBMS or other application software and removing them from the list, and re-evaluating the remaining items to determine whether or not there are commonalities across objects.
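
The abstraction procedure just described amounts to a few set operations. The sketch below assumes the primitive actions have already been listed per object; the action names, the assumption that the DBMS handles screen writes, and the function name are all hypothetical.

```python
def abstract_screen_actions(actions_by_object, handled_elsewhere):
    """Split primitive screen actions into shared and object-specific sets.

    actions_by_object: dict mapping object name -> set of primitive actions.
    handled_elsewhere: actions performed automatically by the DBMS or other
        application software, which are removed from every list first.
    """
    remaining = {
        name: actions - handled_elsewhere
        for name, actions in actions_by_object.items()
    }
    # Actions common to every object are candidates for a generic object.
    shared = set.intersection(*remaining.values())
    specific = {name: actions - shared for name, actions in remaining.items()}
    return shared, specific


actions = {
    "Customer": {"read screen", "write screen",
                 "display error message", "edit phone"},
    "VideoInventory": {"read screen", "write screen",
                       "display error message", "edit video id"},
}
shared, specific = abstract_screen_actions(actions, {"write screen"})
print(sorted(shared))
print(specific)
```

The shared set is what would move into a reusable, generic object; the editing actions stay with their original objects.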

This primitive level of detail may be deferred automatically when you relegate all screen interactions to the human interface. This deferral allows you to build the interface during prototyping even though you may not know all of the primitives during the first iterations. In other words, allocating screen interactions to the human interface is a means of deferring detailed design decisions until initial prototype development.

The more distributed the devices and processors, the more likely that processing might be allocated to firmware embedded in otherwise unintelligent devices. For instance, automatic teller machines include some intelligence for editing magnetic strip information from the cards used for withdrawal and deposit of funds from banks. They can, for instance, tell what type of card, such as Visa, is being used, and whether or not the personal ID number (PIN) is a valid combination of digits. They cannot tell whether or not the PIN matches the card number entered because that requires access to a database that is not stored locally. In addition, specific hardware functions, such as accepting a deposit envelope, are functions that would be allocated to hardware.
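
The split between what a teller machine can decide locally and what must go to a remote database can be sketched as follows. The function names and the four-digit PIN length are assumptions made for illustration; the only card rule used is that Visa card numbers begin with the digit 4.

```python
def card_type(card_number):
    """Local check: identify the card type from the number itself."""
    return "Visa" if card_number.startswith("4") else "unknown"


def pin_is_well_formed(pin):
    """Local check: is the PIN a valid combination of digits?

    A length of four digits is assumed here for illustration.
    """
    return len(pin) == 4 and pin.isascii() and pin.isdigit()


def local_checks(card_number, pin):
    """Run the checks that firmware can perform without a database."""
    return {
        "card_type": card_type(card_number),
        "pin_well_formed": pin_is_well_formed(pin),
        # Matching the PIN to the card needs data that is not stored
        # locally, so the firmware leaves this question unanswered.
        "pin_matches_card": None,
    }


print(local_checks("4000123412341234", "0427"))
```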

Allocation of processes to hardware/firmware is determined by the need for fast response time, minimum communication delay, and minimum processing time. Whenever any of these three constraints is present in an application's functional specification, hardware process allocation should be investigated. Some authors recommend that allocation to hardware can include functions to be performed by the resident operating system. When there is access to these functions and they can be used as generics, this is a useful, time-saving idea. So, for instance, in systems such as Unix and Smalltalk, where the environment, operating system, and application are essentially inseparable, thinking of operating systems and hardware as one simplifies design thinking.

Finally, we have allocation of processes to software. This allocation assumes that all problem-domain processes not already allocated elsewhere will be implemented in software in the software domain. This allocation includes remaining service and problem domain objects after the other allocations are complete. Now, let us turn to ABC Video to see what allocation means in this application.

ABC Video Example of Subdomain Allocation

ABC's rental application will be an interactive, multithread set of processes which will service up to six threads of control, with growth to some higher number. Therefore, the concurrent processing requirements of the application should be considered when allocating processes to subdomains to ensure that timing requirements will be met.

To refresh your memory, we had decided to use an SQL-compatible database to implement the application. We can interface the SQL language with other languages, but, as is typical of most DBMS software, all data accesses must go through the DBMS. This implies that the create, retrieval, update, and delete (CRUD) functions will all be allocated to the data subdomain as discussed above.


By doing this allocation, we explicitly are deciding what is and is not object-oriented. SQL is not object-oriented. Therefore, any functions performed in SQL are not object-oriented. The design can proceed in an object manner until the primitive level is reached, then the design is completed in SQL.

If we look at the output from the analysis where we allocated objects to processes, we can identify all those processes relating to these functions. Each object has simple CRUD functions as well as a need for CRUD functions on a user-view of the database that incorporates Customer, Inventory, and VideoOnRental. Eventually, for SQL implementation, we will collapse the superset objects back with the class/objects and will control the use of add and read functions by logic in the SQL DBMS application code. Any access control on superset objects is controlled by the DBMS.

Figure 11-20 processes are listed in Table 12-3 with their subdomain allocations. First, consider the data subdomain. From Table 12-2 we know that we can allocate the data functions based on application requirements. We are using a non-object DBMS and have no constraints on processing. Part of the attraction of the fourth generation database is its ease of use; therefore, anything that can be allocated to the DBMS should be. As Table 12-3 shows, all CRUD functions are allocated to the data function. Similarly, printing, which interfaces with external devices, is allocated to hardware. Print control is allocated to hardware because in a LAN, spooling and printing are network operating system functions that are not under application control.

All data entry functions are allocated to the human interface for design and control. Remaining processes are allocated to the software subdomain.
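
The allocation rules for ABC can be summarized in a short sketch. It is only an illustration of the rules stated above: the set of data processes follows the Data column of Table 12-3, while selecting printing and data entry processes by the name prefixes Print and Enter is a convenience assumed here, not part of the method. Software is tested last because it is the default.

```python
# Processes marked in the Data column of Table 12-3.
DATA_PROCESSES = {
    "ReadCust", "RetrieveVOR", "RetrieveInventory", "UpdateInventory",
    "WriteVOR", "CreateCustomer", "CreateVideoInventory",
}


def allocate(process_name):
    """Return the subdomain for one ABC process."""
    if process_name in DATA_PROCESSES:
        return "data"             # CRUD goes through the SQL DBMS
    if process_name.startswith("Print"):
        return "hardware"         # spooling and printing belong to the LAN
    if process_name.startswith("Enter"):
        return "human interface"  # data entry functions
    return "software"             # default for everything else


for name in ("EnterCustPhone", "ReadCust", "CreateTempTrans",
             "ComputeTempTransTotal", "PrintTempTrans"):
    print(name, "->", allocate(name))
```

Note that CreateTempTrans falls to software: TempTrans is not marked as a data process in the table, so a rule keyed on the word Create alone would misallocate it.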

Draw Time-Order Event Diagram

Rules for Drawing a Time-Event Diagram

A time-event diagram graphically depicts the timing constraints and events that trigger related objects, showing sequences of processing, concurrent processes, and nesting of processes across objects. Time-ordered event diagrams show neither flow of control nor if-then-else logic. These diagrams show what can happen in time, including required timing. The time-order event diagram becomes the basis for decisions about concurrent processes and is helpful in identifying service-object needs of the application.

The diagram is a two-dimensional graphic with objects listed down the left axis and time, broken into segments corresponding to events in the application, along the horizontal axis. For processes that might run concurrently, multiple lists of the objects are shown. Synchronization of concurrent events is shown by the divergent lines returning to one event at some point (see Figure 12-9).

Two formats for time-event diagrams are used. One shows deviations from an otherwise horizontal line with events and critical times demarcated by vertical bars (see Figure 12-10). The other format shows rising steps to mark events and critical time slots within the main object (see Figure 12-11). If one diagram per transaction is created, the rising step method is preferred because it is easy to see the points of change. If one diagram per application is drawn, the information can be presented more compactly with the horizontal line method.

[Figure 12-9: Rewrite old VideoOnRental, Write new VideoOnRental, Print TempTrans, and Rewrite BarCodeVideo shown as potential concurrent processes.]

FIGURE 12-9 Potentially Concurrent Processes

TABLE 12-3 Process Subdomain Assignments

                                  Subdomain
Process Name                      Data    Hardware    Process    Human
EnterCustPhone
ReadCust                          X
CreateTempTrans
RetrieveVOR                       X
DisplayTempTrans
EnterBarCode
RetrieveInventory                 X
ComputeTempTransTotal
EnterPayAmt
ComputeChange
DisplayChange
UpdateInventory                   X
WriteVOR                          X
PrintTempTrans
EnterBarCode
RetrieveVOR                       X
DisplayTempTrans
AddRetDateTempTrans VOR
Add1toVInv
UpdateInventory                   X
ComputeLateFees
WriteVOR                          X
EnterCustomer
CreateCustomer                    X
EnterVideoInventory
CreateVideoInventory              X

Diagram segments are defined as event-driven or clock-driven. For time-constrained segments of the diagram, the allowable maximum time is labeled along the horizontal axis (see Figure 12-12). For event-driven segments, the event is identified on the horizontal axis. Actual drawing requires knowledge of the problem domain requirements for processing.

The steps to creating a time-event diagram are:

1. Define all allowable transactions in the application.

2. Define the processing steps for each transaction.

3. For each transaction, design a time-event diagram reflecting the dependence or independence of processing steps.

[Figure 12-10: objects (Object 1, Object 2, Object 4) listed down the left axis against a horizontal Time / Events axis, with events and critical times marked by vertical bars.]

FIGURE 12-10 Horizontal Time-Event Diagram

[Figure 12-11: objects (Object 1, Object 2, Object 4) against a Time / Events axis, with rising steps marking events E4, E5, E10, and E11 and time constraints of 15 ms and 10 ms. En = Event identifier.]

FIGURE 12-11 Rising Step Time-Event Diagram

[Figure 12-12: objects Customer, Order History, Purchase, and Inventory against a Time axis. Process labels include Retrieve Cust, Display Cust, Create Cust, Retrieve Order History, Display Order History, Get Purchase Items, and Retrieve Inventory, with time constraints of 10 ms and 15 ms marked on time-constrained segments.]

FIGURE 12-12 Diagram Segments Identified as Time-Driven or Event-Driven

ABC Video Example of a Time-Order Event Diagram

For ABC, Table 12-4 shows the transactions allowed in the application. The transactions should have no surprises by this stage of design, and should be closely related to the processes defined for each object. Some objects, such as TempTrans, have processes that relate to more than one transaction, while other objects each have processes that reflect one transaction, such as for Customer.

Of the transactions shown, we will discuss two that are representative of the others: video inventory additions and rental processing.

First, we describe what happens for a VideoInventory addition. This step requires detailed knowledge of the specific processing to be performed. This knowledge comes from user interviews, study of current procedures, and so on. Subprocess details should be based on the process-object assignment list (Figure 11-20). If there are discrepancies between the use of objects here and the list, the list should be revised to reflect this more detailed level of thought. The steps to adding inventory are:

1. Enter a new VideoId and remaining information for a particular film.

2. When the NumberOfCopies is entered, add the new video information to VideoInventory. Begin prompting for BarCodeId until the number of bar codes is equal to NumberOfCopies.

3. As each BarCodeId is entered, add the new BarCodeVideo entry to the database.

4. When the number of BarCodeIds entered is equal to NumberOfCopies, signal completion of the transaction to the clerk and end processing.
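The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the InventoryDatabase class, the add_video_inventory function, and the sample film data are hypothetical stand-ins, and dictionaries take the place of the real VideoInventory and BarCodeVideo stores.

```python
from dataclasses import dataclass, field


@dataclass
class InventoryDatabase:
    """Hypothetical in-memory stand-in for the VideoInventory and BarCodeVideo stores."""
    video_inventory: dict = field(default_factory=dict)   # VideoId -> film information
    bar_code_video: dict = field(default_factory=dict)    # BarCodeId -> VideoId


def add_video_inventory(db, video_id, film_info, number_of_copies, bar_code_ids):
    """Run the inventory-addition steps; return True when the transaction completes."""
    # Steps 1-2: once NumberOfCopies is known, add the film to VideoInventory.
    db.video_inventory[video_id] = {**film_info, "NumberOfCopies": number_of_copies}

    # Steps 2-3: accept BarCodeIds until the count equals NumberOfCopies,
    # adding one BarCodeVideo entry per bar code.
    entered = 0
    for bar_code_id in bar_code_ids:
        if entered == number_of_copies:
            break
        db.bar_code_video[bar_code_id] = video_id
        entered += 1

    # Step 4: signal completion only when every copy has a bar code.
    return entered == number_of_copies


db = InventoryDatabase()
done = add_video_inventory(db, "V100", {"Title": "Example Film"}, 2, ["B1", "B2", "B3"])
print(done, sorted(db.bar_code_video))
```

Note that the VideoInventory work finishes before any BarCodeVideo entry is written, and only the VideoId is passed between the two, which is the independence the time-event diagram is meant to expose.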

Figure 12-13 shows the time-event diagram for the processing steps about video inventory creation. Notice that two objects are involved: VideoInventory and BarCodeVideo. Even though VideoInventory is begun first, its processing is completed before BarCodeVideo processing takes place. The processes are related in that the VideoId is passed to the BarCodeVideo process, but they are otherwise independent. There is no necessary concurrency within the transaction.

TABLE 12-4 ABC Transaction List

Object             Transactions
Customer           Create, Retrieve, Update, Delete
VideoInventory     Create, Retrieve, Update, Delete
BarCodeVideo       Create, Retrieve, Update, Delete
VideoOnRental      Rental without Returns, Rental with Returns, Returns without Rental, Returns with Rental
Video History      Create, Retrieve
Customer History   Create, Retrieve
EndOfDay           Create, Retrieve, Delete

The rental transaction shows that several processes might be concurrent. First the steps to completion of a rental process are:

1. Get the entry and determine its type (either CustomerPhone or VideoId).

2. If the entry is CustomerId, get all relevant customer information (e.g., name, address, and so on).

3. If the entry is VideoId, get the corresponding VideoOnRental and place it in memory. Use CustomerId to get all relevant customer information (e.g., name, address, and so on).

4. Get all current outstanding rentals (i.e., either unpaid late fees or unreturned rentals).

5. Compute LateFees on returned tapes.

6. Compute TotalAmountDue.

7. Display all information.

8. Process returns and redo steps 5-7 until no more returns.

9. Get VideoIds of new rentals until end of transaction is signaled. For each, get VideoInventory and BarCodeVideo information; format and display the relevant information; recompute and display TotalAmountDue.

10. At transaction end, process payment and make change until TotalAmountDue equals zero.

11. Write new VideoOnRental entries; update and rewrite old VideoOnRental entries; print TempTrans; update and rewrite BarCodeVideo as required; end transaction.
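The arithmetic in steps 5, 6, and 10 can be sketched as follows. This is a minimal illustration, not ABC's actual fee policy: the function names, the daily late fee, and the sample amounts are all assumptions, and amounts are kept in whole cents to avoid rounding problems.

```python
def compute_late_fee(days_late, daily_late_fee):
    """Step 5: late fee for one returned tape (the daily rate is a hypothetical parameter)."""
    return max(days_late, 0) * daily_late_fee


def total_amount_due(late_fees, rental_charges):
    """Step 6: TotalAmountDue is the sum of late fees and new rental charges."""
    return sum(late_fees) + sum(rental_charges)


def process_payments(total_due, payments):
    """Step 10: apply payments until TotalAmountDue equals zero; return the change."""
    for payment in payments:
        if payment >= total_due:
            return payment - total_due      # amount due is now zero
        total_due -= payment
    raise ValueError(f"transaction cannot end: {total_due} still due")


# Hypothetical amounts in cents: one tape two days late, one on time, two new rentals.
late_fees = [compute_late_fee(2, 100), compute_late_fee(0, 100)]
total = total_amount_due(late_fees, [300, 300])
change = process_payments(total, [500, 500])
print(total, change)
```

The payment loop ends only when the amount due reaches zero, which mirrors the wording of step 10; a real implementation would prompt the clerk again rather than raise an error.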

The first event, data entry, results in one of two possible processes being invoked. These are shown with dotted lines on the diagram to show that only one is running at a time. If the VideoId is entered, then we have a choice to either nest getting the customer or transfer control. If we transfer control, the video information must have been stored in memory for the first VideoOnRental to avoid passing unnecessary data. If we do not transfer control, and nest retrieval of customer information, then the customer information is unnecessarily passed through the retrieval process for VideoOnRental. The best object-oriented decision would be to transfer control to maximize information hiding here, but we can treat these accesses as one if the DBMS supports a user view that links the relevant information. An SQL DBMS does provide user views, and we select that option. (Make sure you read the appendix for true object-oriented design of this information. It is significantly different.) Once VideoOnRental is accessed, then, the related information from VideoInventory, BarCodeVideo, and Customer is all present automatically (see Figure 12-14).

Eventually, we loop through getting all current outstanding rentals from VideoOnRental. This iteration can be programmed to run until a return code indicates that no more videos on rental are present. This return code, then, becomes the event to trigger the next step of the process.

FIGURE 12-13 Time-Event Diagram for Inventory Creation Transaction (objects: VideoInventory, BarCodeVideo; processes: Get VideoInfo, Create VideoInv., Get BarCodeId, Create BarCodeInv.)

Control is passed to compute LateFees on returned tapes, which requires that a count of the number of VideoOnRentals in memory be maintained and passed to control this process. Processing late fees until this count is reached triggers the next step, computing TotalAmountDue. This is a one-time event at this point, and its completion leads to display of all current customer and rental information on the user screen.

At this point, if there are new rentals, the BarCodeIds are entered. This triggers obtaining BarCodeInventory and VideoInventory information. To simplify memory processing, we have a choice similar to that above for customer and VideoOnRental in step 3. In this case, the decision is between treating BarCodeVideo and VideoInventory as separate and independent, as nested, or as the same. In order to treat them the same, we must be accessing a user view that contains the relevant information. Again, SQL allows user views, and we use the user view that collapses this activity from two to one. As each video's information is displayed, the TotalAmountDue is recomputed and redisplayed.

Upon receiving the trigger that the rentals, or returns, are complete, payment processing takes place and continues until TotalAmountDue equals zero. At that time, all of the VideoOnRentals, BarCodeVideo locations, and video history counts (for returns) are updated. These are once again assumed to be in the same object as a result of having user view capabilities in SQL.
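To make the user-view idea concrete, the sketch below defines a view that joins BarCodeVideo to VideoInventory so that one retrieval returns both. It is an illustration only: Python's built-in sqlite3 module stands in for ABC's SQL DBMS, and the view name, column names, and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE VideoInventory (VideoId TEXT PRIMARY KEY, Title TEXT);
    CREATE TABLE BarCodeVideo (BarCodeId TEXT PRIMARY KEY, VideoId TEXT);
    INSERT INTO VideoInventory VALUES ('V100', 'Example Film');
    INSERT INTO BarCodeVideo VALUES ('B1', 'V100');

    -- The view joins the two tables so one access returns both sets of data.
    CREATE VIEW UserView2 AS
        SELECT b.BarCodeId, b.VideoId, v.Title
        FROM BarCodeVideo AS b
        JOIN VideoInventory AS v ON v.VideoId = b.VideoId;
""")

# One database access replaces two separate, potentially concurrent retrievals.
row = conn.execute(
    "SELECT BarCodeId, VideoId, Title FROM UserView2 WHERE BarCodeId = ?", ("B1",)
).fetchone()
print(row)
```

Because the join is hidden inside the view, the calling process sees a single object, which is why the two retrievals collapse into one event on the time-event diagram.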

FIGURE 12-14 Time-Event Diagram for ABC Video Rental Transaction (objects: TempTrans, Customer, VideoOnRental, VINV, BCVideo; horizontal axis: time/events, ending at End Trans)

Legend:
1 - Get Entry
2 - ReadCust or Read VideoOnRental
3 - Create TempTrans, ReadCust
4 - Retrieve all related VOR, Read VideoInv. and Read BarCodeVideo
5 - Compute Late Fees
6 - Compute Total Amount Due
7 - Display TempTrans
8 - Process Returns (includes return to steps 4, 6, 7)
9 - Get new rentals, Read VideoInv. and Read BarCodeVideo
10 - Format and display new rentals, update Total Amt Due
11 - Process Payment (includes EnterPayAmount, Compute Change, Display Change)
12 - Print TempTrans, Rewrite old VORs, Update BarCodeVideos
13 - Write new VORs, Rewrite BarCodeVideos
14 - End Trans

Determine Service Objects

Guidelines for Determining Service Objects

Service objects perform background scheduling, synchronizing, and multitasking control for the application. The activities performed by some service objects are analogous to those of an operating system in a mainframe environment, which provides job management, task management, memory management, I/O management, and data management. For that reason we will digress a minute to discuss these operating system functions, relating them to service objects.4

4 This discussion is necessarily short. For further information see Per Brinch Hansen, The Architecture of Concurrent Programs, Englewood Cliffs, NJ: Prentice-Hall, Inc., 1977.

Job management routines initiate processing for individual applications. In multitasking applications, that means that the first scheduling tasks are loaded and turned over to the task management routines for execution. In mainframes, there are multiple jobs, sometimes as many as 50, executing concurrently. The job management routines keep track of all jobs active in the system.

The task manager monitors and tracks individual steps within a multistep set of sequential processes (i.e., a job). Task management is similar to monitoring done for multiple threads of control for concurrent processes. The work of job and task manager routines is similar and includes:

• Load, schedule, execute
• End, abort
• Get/set process attributes
• Create/terminate process
• Wait for time
• Wait/signal event
• Get/set process attributes for jobs, files, or system data

Multiple-thread management requires both job and task management. Think of individual transactions as analogous to jobs to be managed, and of individual steps to completing a transaction as tasks, or processes in OOD terminology. The job management, or transaction, routines manage whole transactions, and task management routines manage the atomic processes that perform the transaction.

Monitoring of individual processes (or transactions) and sequences of processes, one per thread, is accomplished either by stacks or by queues, depending on the operating system software. The stack commands are push, to add something to the stack, and pop, to take something off the stack. The queueing commands are enqueue and dequeue, to add and delete items, respectively. The stack (or queue) items, in multithread control, include the name of the task, its current execution status (i.e., running, idle, or waiting), and the address of the next command to be executed. One set of stacks is managed for each transaction, and one set is managed for each process. Stacks operate on a last-in, first-out principle, while queues are first-in, first-out.
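The difference between the two disciplines is easy to see in a short Python sketch. The TaskEntry record and its sample values are hypothetical; they simply hold the three items named above (task name, execution status, and next command address). A Python list serves as the stack and collections.deque as the queue.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class TaskEntry:
    name: str
    status: str          # "running", "idle", or "waiting"
    next_address: int    # address of the next command to be executed


stack = []                                               # last-in, first-out
stack.append(TaskEntry("GetEntry", "waiting", 100))      # push
stack.append(TaskEntry("ReadCust", "running", 200))      # push
top = stack.pop()                                        # pop returns the newest entry

queue = deque()                                          # first-in, first-out
queue.append(TaskEntry("GetEntry", "waiting", 100))      # enqueue
queue.append(TaskEntry("ReadCust", "running", 200))      # enqueue
first = queue.popleft()                                  # dequeue returns the oldest entry

print(top.name, first.name)
```

With the same two entries added in the same order, the stack hands back ReadCust while the queue hands back GetEntry.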

Similarly, the I/O manager and data managers act together to perform physical inputting and outputting of information to central processing unit (CPU) memory. The I/O manager interacts with terminals, printers, and other devices that are moving information physically into and out of the computer. The data manager interacts with secondary storage devices, such as disks. The activities performed by these managers include file manipulation and device management. The key activities include:

File Manipulation:

• Create/delete file
• Open/close
• Read, write, reposition
• Get/set file attributes

Device Management:

• Request/release
• Read, write, reposition
• Get/set device attributes

These tasks are usually provided in primitive form by the operating system and in a more abstract form by a DBMS. The more sophisticated the software environment, like a DBMS, the more likely the services are provided by the environment.

Finally, memory management keeps track of the location of each item in random access memory (RAM). Recall that all data and programs must be memory-resident to be executed. In dynamic applications in which modules and data are being moved into and out of memory constantly, memory management is a crucial function. The main functions provided by the memory manager include:

• Allocate/delete memory (can be dynamic or static)
• Track used and free memory location by task
• Track used and free memory within each task's allocation
• Garbage collection (identify and erase or write-over unused objects)

All operating system management is accomplished by cooperating processes that use event-driven interrupts to provide services in the system. Interrupts at the operating system level are called supervisor calls (SVCs). The implementation of SVCs differs across operating systems.5

5 For a more complete treatment of this information, see any operating systems text. Some good ones include A. J. van de Goor, Computer Architecture and Design, Reading, MA: Addison-Wesley Publishing Company, 1989; Anthony P. Sayers, Operating Systems Survey, NY: Auerbach, 1971; J. Peterson and A. Silberschatz, Operating System Concepts, Reading, MA: Addison-Wesley Publishing Company, 1983.

Now, let's relate this operating system information to applications. All of these functions are required for the three types of control provided by service objects. If you are working in a Unix or Smalltalk environment that has already been used for application development, many of these functions should already be available for reuse. If you have to write your own, you need to test and retest these functions very thoroughly to ensure proper application functioning. In any case, you need to decide which of the service object functions are needed and provide them for your application.

The steps to identifying the service objects are:

1. Examine the event diagram and identify each process as sequential or concurrent, and, if concurrent, as independent or cooperating.

2. Define the service needs for loading the object, processing the object, synchronizing the process to others, and sending any messages the object might generate.

3. Compare this list to one specific to the target operating environment that identifies reusable service objects that can be used by this application.

4. Enter the name, language, and any other information needed to identify the reusable object. For all service objects, make sure that the class, object, event, and/or process using the service object are identified.

5. When all reusable objects have been identified, the remaining service objects included in the remaining tasks are divided among the four subdomains as appropriate for module specification.

In general, all applications need scheduling objects (see Table 12-5). The need for synchronization and multitasking is determined by the time-event diagram and by whether or not the objects are concurrent and multiuser. Table 12-5 shows that concurrent, single-user processes need synchronization, while concurrent and multiuser objects need synchronizing and multitasking services. Multiuser, sequential processes, like ABC, require both scheduling and multitasking services.

TABLE 12-5 Decision Table for Service Object Type Requirements

Problem Domain Object Characteristics: Sequential, Concurrent, Multiuser

Service Objects Required: Scheduling, Synchronization, Multitasking
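The rules stated in the text can be written as a small decision function. This is a sketch of our reading of those rules, not a transcription of the table itself; the function name and the two boolean parameters are assumptions.

```python
def required_service_objects(concurrent, multiuser):
    """Return the service object types an application needs.

    Scheduling is always required; synchronization is added for concurrent
    processes, and multitasking is added for multiuser applications.
    """
    services = ["Scheduling"]
    if concurrent:
        services.append("Synchronization")
    if multiuser:
        services.append("Multitasking")
    return services


# ABC is multiuser but sequential.
print(required_service_objects(concurrent=False, multiuser=True))
```

For ABC, the function returns scheduling and multitasking, matching the statement that multiuser, sequential processes require both.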

ABC Video Example of Service Objects

First, we examine the time-event diagram to identify each related process as sequential or concurrent, and independent or cooperating.

There are three possible sets of concurrent processes within one rental transaction, shown on Figure 12-15 as circled and numbered sets. The other processes are sequential. Our decision on concurrency, then, is based on the implementation environment. Let's say that SQL supports multithread but not multitasked processing; therefore, we need to decide the sequential ordering of the processes and how the processes will be performed in SQL.

Next, for each process, define the service needs for loading the object, processing the object, synchronizing the process to others, and sending any messages the object might generate. SQL supports user views. By creating user views to link VideoInventory to BarCodeVideo, and VideoOnRental to Customer, VideoInventory, and BarCodeVideo, the opportunity for most concurrency disappears in one database access that retrieves all the related information.6

6 See Chapter 12 appendix discussion of ABC in which the service object discussion results in a different outcome.

FIGURE 12-15 Potential Concurrent Sets of Processes (objects: TempTrans, Customer, VideoOnRental, VINV, BCVideo)

Even though we have removed concurrent object processing from the diagram, we still have both transaction level and process level service object requirements. Transactions and processes all need scheduling, including processes that load, store in memory, initiate, terminate, monitor events, and possibly provide message communications between objects.

This list is compared to our target operating environment: SQL on a PC LAN running Novell NetWare™. The services are all provided transparently by the operating environment and need not be developed in primitive form for ABC's application. Even though the target environment is not object-oriented, the need for service objects disappears because these are all services provided in the operational environment.

The next step is to examine a current library of reusable objects for use as problem domain processes. Since ABC's environment is new, there is no reusable library; therefore, any modules would need specification and development.

Develop Booch Diagram

Guidelines for Developing Booch Diagram

Booch diagrams, also called module structure diagrams, provide a graphical summary of the class and object information in the entire application. The icons for drawing the diagram are shown in Figure 12-16, with service objects in vertical rectangles with no other detail beyond their name, and problem domain objects in vertical rectangles with smaller ovals to identify the object and horizontal rectangles to identify the individual processes. One diagram connecting the domains as required is drawn; then one Booch diagram for each subdomain (or for the whole project if it is small) is developed.

FIGURE 12-16 Booch Diagram Icons (package; service object with service object name)

The steps to drawing a Booch diagram are:

1. Draw the Booch icons (see Figure 12-16) relating to service and problem domain objects.

2. Evaluate and choose a scheme for connecting the objects via messages.

3. Draw lines between objects to signify the legal message connections.

4. Define message processing scheme.

Service objects selected for controlling application operations are arranged by personal preference, but can be grouped by function performed: scheduling, synchronizing, and multitasking within subdomain. The service objects described in the previous section are shown with subdomain grouping in Figure 12-17.

Problem-domain objects are obtained from the process-object assignment list developed during analysis. This table is now reversed with the information arranged by object for this diagram. During the reversal process, a reevaluation of process-object assignment should be made to ensure that the processes are associated correctly with their necessary objects. Subdomain groups may be maintained on the diagram, which means that we may have new superset objects to define the split between objects for subdomain processing.

FIGURE 12-17 Service Objects by Subdomain

Subdomains: Hardware / Operating System; Data; Human; Software (Problem Domain)

Service objects and functions shown:
• Define Device; Execute I/Os
• Define Physical Data Stores; Open/Close; Provide Data Access (Get/Put)
• Define Logical Data; Access Control; Application Presentation
• Multiuser Management
• Hardware Management: Open/Close, Start/Stop, Get/Put
• Memory Management: Define, Get/Put, Garbage Collection
• Security/Access Control
• Application: Start/Stop, Allocate Memory
• Session: Start/Stop, Allocate Memory
• Transaction: Start/Stop, Allocate Memory, Manage Tasks
• Define Device; Format Screens; Get Data Entries; Edit for Numeric/Alpha Entry
• Process Security/Access Management
• Load/Release/Monitor Processes; Provide Message Communication between Objects
• Application Objects

Processes that are candidates for generic, reusable object development should be marked consistently in some way, for instance by bold or italic print to identify them visually. A quick glance at the diagram gives the viewer a sense of the extent to which reusable objects and processes are being leveraged in the application.

After the icons are drawn, they are moved around to evaluate different message passing schemes. There is no one right way to do message passing, but there are definitely some methods that are better than others. We will walk through a reasoning process for message passing definition in the ABC Video example. In general, the goals of messages are to:

1. Accomplish the application's tasks.
2. Pass minimal information and pass only to objects requiring information.
3. Minimize the potential for bottlenecks.
4. Maximize the potential for application throughput.

FIGURE 12-18 Sample Configurations of Object Message Passing (including distributed message control)

The evaluation of alternatives is to determine the best throughput scheme of message passing without creating bottlenecks, while accomplishing the first two goals. Booch suggests a 3 x 5 approach to this evaluation in which, rather than drawing the diagram icons on paper, the information for each object is written on a 3" x 5" card. The cards are arranged spatially in different configurations on a large piece of paper with lines drawn to signify the required interobject message communications. When a configuration is identified that might be useful, it is annotated for further analysis. Figure 12-18 shows two different configurations for a simple application. You can see how, if you have 20 or 30 objects, the 3" x 5" method simplifies evaluation of message passing schemes.

All further alternative configurations are evaluated to determine message traffic. Message traffic is the number and direction of messages in the system. Overall, the goal is to minimize the number of messages passed for any single transaction, while not overloading any single object with message traffic related work.7 The minimum number of messages is n-1, where n is the number of packages needing to communicate in the application. That is, once initiated, each package must communicate with at least one other package. The centralized message control scheme shown in Figure 12-18 shows an example of n-1 messages. The arrangement with the best message traffic configuration is selected for prototype development, and the design process continues.
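Message traffic for a candidate configuration can be tallied mechanically once the links are written down. The sketch below compares two hypothetical five-package layouts: a centralized one, in which one package exchanges a message with each of the others, and a chain, used here as one simple form of distributed control. The package names and the message_traffic function are assumptions for illustration.

```python
from collections import Counter


def message_traffic(links):
    """Return (total messages, messages handled per package) for one transaction.

    `links` is a list of (sender, receiver) pairs, one per message.
    """
    load = Counter()
    for sender, receiver in links:
        load[sender] += 1
        load[receiver] += 1
    return len(links), load


packages = ["A", "B", "C", "D", "E"]           # hypothetical package names

# Centralized control: package A exchanges one message with every other package.
centralized = [("A", other) for other in packages[1:]]

# Distributed control (one possibility): each package messages the next in a chain.
distributed = list(zip(packages, packages[1:]))

for name, links in (("centralized", centralized), ("distributed", distributed)):
    total, load = message_traffic(links)
    print(name, total, max(load.values()))
```

Both layouts use n-1 = 4 messages, but the centralized hub handles all four while no package in the chain handles more than two, which is the trade-off between message count and bottleneck risk that the evaluation is meant to surface.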

ABC Booch Diagram

Before we can develop a Booch diagram, we need to digress and redefine some application needs to fit the SQL environment.8 The drawing of packages normally assumes no consolidation of functions or data via user views, but we have collapsed our processing to take advantage of SQL features. Therefore, Table 12-6 shows the effects of user views on data domain processes: the 11 data processes are now eight consolidated processes. The remaining subdomains are not affected by the data changes.

First, we will draw the packages based on what we now know to be the design of the application (see Table 12-6). There are four data packages: Customer, VideoInventory, UserView1, which includes VideoOnRental, VideoInventory, BarCodeVideo, and Customer, and UserView2, which includes VideoInventory and BarCodeVideo (see Figure 12-19). The related processes for those data objects are placed in horizontal rectangles in their respective packages.

There is one scheduling service object (which we may not need because of the environment) that includes initiation and termination of the application, user sessions, and transactions. There is an interface service object to provide all display and input from personal computers (see Figure 12-19). The hardware service object contains only one process for printing TempTrans. Finally, the TempTrans object contains the data and problem domain processes that are the core of rental processing.

7 This would cause a bottleneck.

8 Don't forget to read the Chapter 12 Appendix for a complete discussion of object-oriented design using an object-oriented development environment.

Next, we try different configurations of the objects to develop a message passing scheme that will provide necessary processing and information to called objects, while minimizing the communications overhead in the application. Figure 12-20 shows one reasonable message passing scheme that follows the logic of processing. The scheduling object passes control to the interface object, which has some choices. The interface object could pass, for instance, a CustomerPhone to either TempTrans or Customer to initiate rental processing. If the pass is to Customer, it could return and pass the customer information to TempTrans, or Customer could continue and initiate TempTrans directly. You see how the options can build and get complex. We will opt for a fairly traditional scheme in which the Interface will pass any rental transaction data to TempTrans, which will determine what to do with it. This decision is reflected by the line connecting HumanInterface with TempTrans.

TempTrans then initiates one of three data retrievals: Customer, UserView1, or UserView2. The data is returned and TempTrans continues processing. This method of passing provides the most information hiding between objects, but could result in a bottleneck within TempTrans, which is controlling all of the interobject communication for the problem (e.g., software), hardware, and data subdomains. This is a potential problem that would be checked during prototype development.

The HumanInterface object also communicates directly with Customer and VideoInventory for create processing, which does not require TempTrans. All completed transactions, regardless of type, return to the Scheduling object to terminate the transaction.

TABLE 12-6 Consolidated Process Subdomain Assignments for Oracle

(Subdomain columns: Data, Hardware, Process, Human; an X marks a Data subdomain assignment.)

Process Name              Data
EnterCustPhone
ReadCust                  X
CreateTempTrans
RetrieveVOR               X (includes VideoInventory, BarCodeVideo, and Customer)
DisplayTempTrans
EnterBarCode
RetrieveBarCodeVideo      X (includes VideoInventory)
DisplayInventory
ComputeTempTransTotal
EnterPayAmt
ComputeChange
DisplayChange
WriteVOR                  X
PrintTempTrans
EnterBarCode
DisplayTempTrans
AddRetDateTempTransVOR
Add1toVInv
Rewrite VOR data          X
ComputeLateFees
Write VOR data            X
EnterCustomer
CreateCustomer            X
EnterVideoInventory
CreateVideoInventory      X

FIGURE 12-19 ABC Rental Booch Diagram Objects (packages: Customer, VideoInventory, TempTrans, UserView1, UserView2, Schedule Object, Human Interface, Hardware Service; Human Interface processes: Enter CustPhone, Enter BarCode, Enter PayAmount, Enter CustInfo, Enter VideoInventory, Display TempTrans, Display Inventory, Display Change)

Define Message Communications

Rules for Defining Messages

The next step after the Booch diagram is to actually define message contents to provide interobject interfaces for the application. A table is created to document the specific requirements of each message (see Table 12-7). The objects that act as clients are listed in the Calling Object column; service objects are in the Called Object column. This information should come from the Booch diagram coupled with the Process table generated during analysis that identifies objects with the processes that act on them. The Input Message column describes the data that is sent as part of the calling object message to be processed. The output message is the result data that is sent on (or returned) by the called object after processing. The columns Action Type and Return Object are optional. The action type describes the process to be performed in terms of CRUD or other processing. The return object provides continuity of processing logic when the called object does not return directly to the calling object.

FIGURE 12-20 ABC Booch Diagram Message Passing Scheme

TABLE 12-7 Message List Contents

Header            Contents
Calling object    Identifies the client.
Called object     Identifies the server.
Input message     Identifies the process to be performed and any input parameter data needed to perform the process, for instance, the data type for polymorphic processes.
Output message    Defines the output to be passed, if any.
Action type       Defines the process as Read, Read/Write, Write, Display, or Print.
Return to         Identifies either the object to which the result is returned or a nested object for further processing, if any.

For each process-object pair defined in the Process Definition List, we will have one input message to initiate processing and, if needed, an output message which reports the results of processing. The message list contains one column for each of the types of information shown in Table 12-7. The steps to creating the message list are:

1. Make a table with headings as listed in Table 12-7.

2. Refer to the list of all object-process combinations. The objects from that list are listed in the 'Called object' column. The processes from the process list are placed in the 'Input message' column.

3. Next, decide both the 'Calling object' and other 'Input message' entries.

These two definitions seem to go together because as we define the input message, we know the information required to perform the process. Once we know the information to perform the process, we decide which object has that information to pass it on. This step determines much of the logical process flow from one encapsulated object-process to another. The logical process flow defines the sequence of processing in the application.

4. Define the 'Output messages' by determining what type of information is required next from each process as it completes. For data entry type processes, frequently the output message is only an acknowledgement of processing (ACK = successful, NACK = unsuccessful). For some processes, no response is required.

5. Complete the 'Action type' column.

The action type summarizes the type of processing for designers to determine possible implementation consolidation of activities, or to decide on further allocation of processing to hardware, software, or firmware.

6. Define the return object column.

This column usually refers to the calling object which is ordinarily the object to which control returns, but some nested subprocess might take place. When subprocessing occurs, the return object column identifies the next object entered to help other software engineers understand the logic flow.

Completeness and correctness review of the message list is done to ensure that each process-object pair has an associated message in the table and that the calling/return objects are correct.
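The completeness and correctness review lends itself to a mechanical check. The sketch below is one way to do it in Python; the MessageEntry field names, the two helper functions, and the sample entries (modeled loosely on ABC rental processing) are assumptions. The input message is split into a process name and its parameter data so that each entry can be matched to a process-object pair.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MessageEntry:
    """One row of the message list (field names are hypothetical)."""
    calling_object: str
    called_object: str
    process: str                      # process named by the input message
    input_data: str                   # parameter data sent with the input message
    output_message: Optional[str]     # None when no response is required
    action_type: str                  # Read, Read/Write, Write, Display, or Print
    return_object: str


def missing_messages(process_object_pairs, message_list):
    """Return the (object, process) pairs that have no message in the list."""
    covered = {(m.called_object, m.process) for m in message_list}
    return [pair for pair in process_object_pairs if pair not in covered]


def unknown_objects(message_list, known_objects):
    """Return calling/return objects that are not among the known diagram objects."""
    named = {m.calling_object for m in message_list} | {m.return_object for m in message_list}
    return sorted(named - set(known_objects))


messages = [
    MessageEntry("TempTrans", "Customer", "ReadCust", "Data access key",
                 "Customer Info", "Read", "TempTrans"),
    MessageEntry("TempTrans", "Hardware Services", "PrintTempTrans", "TempTrans",
                 None, "Print", "Schedule"),
]
pairs = [("Customer", "ReadCust"), ("UserView1", "RetrieveVOR"),
         ("Hardware Services", "PrintTempTrans")]

print(missing_messages(pairs, messages))
print(unknown_objects(messages, ["TempTrans", "Customer", "Hardware Services"]))
```

In this sample, the check reports that the RetrieveVOR process on UserView1 still lacks a message and that Schedule is named as a return object without being listed among the known objects.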

ABC Video Example of Message List

First, we make a table with the above headings. Then, referring to the process list that we used to draw the Booch diagram, we list all object-process combinations. The objects from that list are listed in the 'Called object' column. Make sure that all process-object pairs have one entry in the table.

Next, we decide both the 'Calling object' from the Booch diagram and the 'Input message' for each entry (Table 12-8 shows the completed list). Then the 'Output message' is completed for each entry. As the output message is complete, we complete each line with the 'Action' and 'Return Object' definitions.

Table 12-8 shows the message list for ABC's application. It reflects the consolidated data objects, the messages decided during the development of the Booch diagram, and the details of information that must be provided for each object-process. Notice that many processes are called from within an object itself. This localizing of processing is desirable to simplify interobject communication and ensure information hiding, but it also can encourage development of nonobject-oriented designs. Make sure that each message contains all, and only, the information required to perform the process. Make sure that each message returns only the information required by the client object.

TABLE 12-8 Message List for ABC Video Rental Processing

Calling Object | Called Object | Input Message | Output Message | Action Type | Return Object
Human Interface | Customer | Customer Information | CustomerPhone | Create | Human Interface
Human Interface | Video Inventory | Video Information | VideoId, # BarCode Videos Created | Create | Human Interface
Schedule | Schedule | Application Id | Queue Address | Execute Init Appl | Schedule
Schedule | Schedule | UserId | Memory Address or Logoff | Execute Init Session | Schedule
Schedule | Schedule | Session Id, Menu Selection for Rental | None or Quit Session | Execute Init Session | Human Interface
Human Interface | Human Interface | No data (Initiate Request) | Trans Request Data Memory Address | Enter Request | TempTrans
Human Interface | TempTrans | Trans Request data | Data access key | Create TempTrans | UserView1 or Customer
TempTrans | Customer | Data access key | Customer Info | Read | TempTrans
TempTrans | UserView1 | Data access key | Customer, VideoOnRental, Video Inventory, BarCode Video | Read | TempTrans
Customer or UserView1 | TempTrans | TempTrans Info | TempTrans | Format | TempTrans
TempTrans | TempTrans | Memory Location, VideoOnRental, Rent/Return Date | Ack | Compute Late Fees | TempTrans
TempTrans | TempTrans | Memory location (Amounts Due) and End of rentals/returns when present | Ack | Compute Total Amount Due | TempTrans
TempTrans | Human Interface | TempTrans Info and End of rentals/returns when present | | Display | Human Interface
Human Interface | Human Interface | No data (Execute Request) | Prompt BarCode or End of Rentals/Returns | Prompt | TempTrans
Human Interface | TempTrans | BarCode (Rental) or End of rental | None | Format | UserView2 or TempTrans
Human Interface | TempTrans | BarCode (Return) or End of return | None | Format | TempTrans
TempTrans | UserView2 | BarCode (new rental) | Video Inventory, BarCode Video | Read | TempTrans
UserView2 | TempTrans | TempTrans Info | TempTrans | Format TempTrans | Human Interface
Human Interface | Human Interface | End of Rentals/Returns | Payment Amount | Data Entry | TempTrans
Human Interface | TempTrans | Payment Amount | Change or Payment Due | Compute Change | Human Interface
TempTrans | Human Interface | Change or Payment Due | End of Trans | Display | TempTrans
Human Interface | TempTrans | End of Trans | None | Change BarCode Status | UserView1
TempTrans | UserView1 | Video on Rental Information | Ack | Rewrite | TempTrans
TempTrans | UserView1 | Video on Rental Information | Ack | Write | TempTrans
TempTrans | Hardware Services | TempTrans | None | Print | Schedule
Hardware Services | Schedule | Trans Id | | Terminate Trans | Schedule
Schedule | Schedule | Session Id | | Terminate Session | Schedule
Schedule | Schedule | ApplId | | Terminate Appl. | System

Develop Process Diagram

Guidelines for Developing the Process Diagram

A process diagram depicts the hardware configuration and the allocation of processes to processor platforms in a distributed environment. There are two types of icons used in the diagram: processor and device. A processor is any intelligent device that performs data, presentation (i.e., monitor display), or application work. A device is any dumb device that is part of the hardware configuration supporting application work. Processors are shown on the diagram as a shadowed cube; devices are shown as transparent cubes (see Figure 12-21). This diagram is a crude equivalent of a system flowchart used before process methods were developed. It is crude because devices and processors are all treated as the same; the only immediate visual knowledge the user gets is the configuration size and the extent to which intelligent processors are used.

The methodology assumes that hardware configuration decisions are not part of the SE task and that the hardware decisions are known. Similarly, there are no guidelines for allocating processes to processors. This is an artifact of the development of OO in a defense environment in which the application developers are working from specifications developed by government employees in another city. In the absence of guidelines from the methodology, we can borrow the distribution decision techniques from information engineering and apply them to this decision. In any case, the processes are listed in small print next to the processor in which they will operate.

One shadowed cube is drawn for each processor. Individual processes are allocated to each processor. Lines are drawn to show communications capabilities between the processes, not between the processors (i.e., the processors are assumed to be networked whether or not the application processes communicate). Only one line per set of processors is drawn, since the details of messages are documented elsewhere. The lines have directional pointers only to show one-way communication.

FIGURE 12-21 Process Diagram Icons (processor, communication link, terminal device)

FIGURE 12-22 First-Cut ABC Process Diagram (file server: all processing; personal computer; impact printer)

Next, for each processor, draw the terminals, printers, disk drives, and other peripheral devices that are attached to it. If there is more than one disk drive in the configuration, a list of the classes, class/objects, and objects is made near each drive that will contain data used by the application.

Finally, the diagram is compared to the message list to ensure that all messages are accommodated in the diagram and accurately depict communications between processes. The Booch diagram or the message list can also be used to validate the accuracy and completeness of processes allocated to processors, and of the data allocated to storage devices.
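The comparison of the diagram to the message list can be sketched in C++. The sketch makes several assumptions that are not in the methodology: each process is allocated to exactly one processor, a line on the diagram is stored as a directed processor pair (a two-way line is stored as two pairs), and messages between processes on the same processor need no line. The names `ProcessDiagram`, `Message`, and `unsupportedMessages` are hypothetical.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Assumed representation of a process diagram.
struct ProcessDiagram {
    std::map<std::string, std::string> processorOf;        // process -> processor
    std::set<std::pair<std::string, std::string>> links;   // directed processor pairs
};

// The part of a message list entry needed for this check.
struct Message {
    std::string fromProcess;
    std::string toProcess;
};

// Returns the messages the diagram cannot carry: either a process is not
// allocated to any processor, or the two processors have no line between
// them in the required direction.
std::vector<Message> unsupportedMessages(const ProcessDiagram& diagram,
                                         const std::vector<Message>& messages)
{
    std::vector<Message> unsupported;
    for (const auto& m : messages) {
        const auto from = diagram.processorOf.find(m.fromProcess);
        const auto to = diagram.processorOf.find(m.toProcess);
        if (from == diagram.processorOf.end() ||
            to == diagram.processorOf.end()) {
            unsupported.push_back(m);   // unallocated process
            continue;
        }
        if (from->second == to->second) {
            continue;                   // same processor, no line needed
        }
        if (diagram.links.count(std::make_pair(from->second, to->second)) == 0) {
            unsupported.push_back(m);   // no line in this direction
        }
    }
    return unsupported;
}
```

The same structure could be reused to validate data allocation by mapping classes and objects to storage devices, but that is left out here to keep the sketch short.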

ABC Video Process Diagram

The simplest form of ABC's process diagram shows the file server as the processor and the PCs and printers as terminal devices (see Figure 12-22). This allocation of work is a problem because it does not take advantage of PC intelligence and, therefore, is suboptimal in terms of benefits to be gained from using PCs. Having said this, the allocation is constrained by the software environment. If SQL supports multilocation processing, then the comment stands. If SQL does not support multilocation processing, then the figure is complete. As it is currently, SQL does not support multilocation processing although it does support distribution of databases.

An alternative process distribution is shown in Figure 12-23. Even with SQL, we could distribute editing, the hardware management functions, payment and change processing, and printing of the rental copy to the local PCs. This is a more complex application because the multiple sites now require synchronization and intraprocessor scheduling in order to coordinate their work, but, if bottlenecks show up in a prototype of the first-cut process distribution, this is a likely candidate for the second iteration of design and prototyping. As it is, we select the simple design because it is significantly easier to implement and maintain, having no synchronization overhead. If it works and is robust to additional users, the first prototype will be completed and placed into production.

FIGURE 12-23 Alternative ABC Process Diagram (file server: all DBMS processing, all transaction/thread management, all rent/return processing except editing, payment, and printing; personal computer: printer hardware functions, edit all fields, process payment and make change, print rental copy; impact printer)

Develop Package Specifications and Prototype

Guidelines for Package Specifications and Prototyping

At this point in the design, the functions to be performed are translated into package specifications for translation into program code. A package is an encapsulated definition that contains both data and process specifications that define an execute unit. The data might be defined in the form of a class, class/object, or object, with specific attributes and identification. There may be one or more processes in a package; they result in individual module specifications and are independently executed under the control of service objects.

Packages have both public and private parts which are specified. The public package part identifies the data and processes to the application without any indication of how they are physically implemented. The private package part defines the physical implementation. If there are polymorphic definitions of a function, each version of the function is defined separately, and the control mechanism for interpreting the message and activating the appropriate function is defined. Service objects should be used for this interpretation and activation if at all possible.
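One way to picture the public and private package parts, and a service object that activates the appropriate version of a polymorphic function, is the following C++ sketch. It is an illustration only: the late-fee example, the class names, the video-type keys, and the fee rates are all invented for the sketch and are not taken from ABC's design.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Public package part: what the application sees, with no indication of
// how the process is physically implemented.
class LateFeePolicy {
public:
    virtual ~LateFeePolicy() = default;
    virtual double lateFee(int daysLate) const = 0;
};

// Private package parts: two separately defined versions of the function.
// The rates are made up for illustration.
class NewReleasePolicy : public LateFeePolicy {
public:
    double lateFee(int daysLate) const override {
        return daysLate > 0 ? daysLate * 2.0 : 0.0;
    }
};

class CatalogPolicy : public LateFeePolicy {
public:
    double lateFee(int daysLate) const override {
        return daysLate > 0 ? daysLate * 1.0 : 0.0;
    }
};

// Service object: interprets the request and activates the matching version.
class LateFeeService {
public:
    void registerPolicy(const std::string& videoType,
                        std::unique_ptr<LateFeePolicy> policy) {
        policies_[videoType] = std::move(policy);
    }
    // Returns 0.0 when no version is registered for the video type.
    double compute(const std::string& videoType, int daysLate) const {
        const auto it = policies_.find(videoType);
        return it == policies_.end() ? 0.0 : it->second->lateFee(daysLate);
    }
private:
    std::map<std::string, std::unique_ptr<LateFeePolicy>> policies_;
};
```

Callers depend only on `LateFeePolicy` and `LateFeeService`; adding a third version of the function would not change any client code, which is the point of keeping the physical implementation in the private part.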

The steps to package specification are:

1. Review the diagram/list set.

2. Redraw a subset of Booch diagrams, one per processor in the process diagram, to depict objects and processes by processor.

3. Document packages.

4. Design physical database if not already designed.

5. Develop pseudocode specifications for all processes and message handling routines.

ABC Video package specifications are not created for this step as it is beyond the scope of this text.

WHAT WE KNOW AND DON'T KNOW FROM OOA AND OOD

Object orientation, based on the contents of tables and diagrams, provides a detailed, reasonably complete view of an application. Exceptions to this view are human interface design and specific attention to database, input, and output design. Object-oriented design is distinguished by three characteristics: its level of detail, its accommodation of all potential environments, and its need for an object-oriented implementation environment to obtain the payoff from the exercise.

The extensive detail generated in object-oriented design leads directly to module specification which should be straightforward since the definition of process details, the class/object data, constraints, and message communications are all completely defined.

Object orientation, as seen by the exercise in the chapter, can accommodate even nonobject-oriented environments. The benefit of OOD's ability to accommodate any application environment is that, for on-line, object application environments, the methodology does lead to information hiding, minimal coupling, and maximal cohesion by virtue of the thinking processes. Object orientation requires good understanding of operating system concepts, object thinking, and interactions between services and applications. The design process, as the chapter appendix shows, requires iteration and prototyping to get required levels of detail and to ensure efficient processing of message traffic. Most important, object thinking IS NOT the same as entity thinking or as process and data methodology thinking. Object orientation requires a paradigm shift to be done correctly.

Object orientation is not very object-oriented in an SQL implementation environment. The choice of SQL changes the entire design from what it would be in an object environment to be object-based. Like COBOL, the methodology can be made to do anything. Is this the best use of OOD? Not in my opinion. Unless an application is at least on-line and will be in an object-oriented environment, the work required for object-oriented design is not worth the effort. Especially with a fourth-generation DBMS, like SQL, the undesign that must be done wastes tremendous time and could result in a worse design than use of some other methodology. While this compromise is acceptable for a small, on-line application such as ABC, it would not be acceptable for applications with real-time or more complex processing requirements. Much of the effort to develop an object-oriented design is wasted when the implementation environment is not object-oriented. Therefore, the choice of methodology should be driven by the expected implementation environment.

AUTOMATED SUPPORT TOOLS FOR OBJECT-ORIENTED DESIGN

There are a vast number of object-oriented CASE tools that have all come on the market in the last few years. Some are more complete in life cycle coverage than others. Some environments, such as 001 Tool Suite, cover most of a development life cycle, in this case, from analysis through code generation. Some tools, such as ObjectView, are more object-based than object-oriented. Some, like Software Through Pictures, try to shield the user from code altogether by sophisticated graphics that generate objects for that environment. Their existence attests to the object revolution that is beginning to be felt in business organizations.

TABLE 12-9 Automated Support Tools for Object-Oriented Design

Product | Company | Technique
001 Tool Suite | Hamilton Technologies, Inc. | Full life cycle multiuser OOA, OOD, and code generation tool for C or Ada
Actor | Symantec, Cupertino, CA | OOD environment for client/server applications. Links to C and SQL databases.
Aide-De-Camp | Software Maintenance and Development Systems, Concord, MA | Configuration management software with support for OO languages
BOCS | Berard Software Engineering, Inc. | Berard object and class specification
C/Spot/Run | Procase, Corp., Santa Clara, CA | Interactive, GUI environment for C language development on Sun, HP, and Apollo hardware
Design/1XO, Design/IDEF, Design/OA | Meta Software Corp. | Data and behavior modeling expressed in OO C-language tool
DSEE, HP/Softbench | Apollo/Hewlett-Packard, Palo Alto, CA | Integrated CASE product supporting OO analysis
Excelerator | Index Tech., Cambridge, MA | State-transition diagram, matrix graph (RTS)
IPSYS OOA/RD Tool Suite | IPSYS Software | Shlaer-Mellor OOA and Recursive Design
ObjectView | KnowledgeWare, Atlanta, GA | Application prototyping software using 4GL or SQL code
ObjectVision | Borland International, Scotts Valley, CA | Visual application development system

TABLE 12-9 Automated Support Tools for Object-Oriented Design (Continued)

Product | Company | Technique
ObjectMaker | Mark V Systems | Full life cycle structured analysis using Ward-Mellor extensions tool with code generation for Ada, C, and C++
OMTool, OMT/SQL | GE Advanced Concepts Center | OOA and OOD with schema compilation compatible with Oracle, Ingres, and Sybase
ProMod | Promod, Inc., Lake Forest, CA | Control flow diagram, state-transition diagram, module networks, function networks
Smalltalk/V | Digitalk, Los Angeles, CA | 32-bit Smalltalk for OS/2 hardware
Software Backplane Cohesion | Atherton Technology/Digital Equipment Corporation, Maynard, MA | Integrated CASE product supporting OO analysis
Software Thru Pictures | Interactive Dev. Env., San Francisco, CA | Control flow, state-transition diagram
Teamwork | CADRE Tech. Inc., Providence, RI | DFD, control flow, state-transition diagram, process activation table
Telon | Pansophic Systems, Inc., Lisle, IL | State-transition diagram, code generation
Tree4C, Tree4Fortran, Tree4Pascal, TreeSoft | 1 Software Engineering, Camarillo, CA | Program code reengineering products for Sun hardware
Visible Analyst | Visible Systems Corp., Newton, MA | State-transition diagram
VS Designer | Visual Software Inc., Santa Clara, CA | Booch diagram

SUMMARY

Object-oriented design (OOD) requires detailed development of all required functionality in the operating system and how it interacts with an application. In this chapter we developed the seven steps to object-oriented design, linking them to the tables developed during object-oriented analysis. First, the objects are allocated to four subdomains: human, hardware, software, and data. The split of processing into these four areas accommodates the use of, for instance, firmware, distributed computing, DBMSs, and intelligent interfaces in what would otherwise be a monolithic development of an application.

The second step of OOD is the development of time-event diagrams for all processes and all objects. The purpose of a time-event diagram is to allow the analysts to identify independent, sequential, concurrent independent, and concurrent dependent processes. Usually, several alternative ways of looking at the timing of processes emerge from this analysis, one of which is selected for development.

Once the types of process are defined, their service object needs are identified. Service objects closely parallel operations performed by an operating system (OS). OSs have five main functions to manage: memory, job, task, I/O, and secondary storage. The memory, I/O, and secondary storage management functions are directly translatable into object thinking. Job management functions are analogous to those performed at the control level for an entire application and/or user. Job management is more appropriately called session, or user, management in object terms. Similarly, tasks are individual steps of a job and are analogous to transaction-related modules when thinking in objects. Therefore, the term used here for task functions is transaction management. Each type of management function requires its own type of processing and the processes selected are particular to the application and implementation environment.

The fourth step of OOD is to develop a Booch diagram to summarize the objects (both application and service) and their interactions. Booch recommends a 3" x 5" approach in which each object and its processes are shown as a package on a 3" x 5" index card. The set of cards is moved into different configurations and message connections are drawn. The purpose of this exercise is to choose a message-passing scheme that minimizes the potential for bottlenecks and that provides information hiding and minimal coupling. The final configuration selected is documented for the application.

The message connections decided during design of the Booch diagram are elaborated in the next step, which is to define message communications. Each called object and its calling object, input message, output message, action type, and return object are identified.

At a higher level of abstraction, the next step is to develop a process diagram that shows the distribution of functionality and equipment for the application being developed. A process diagram depicts processors, for example, computers, and devices, that is, limited-intelligence equipment such as a disk drive. All equipment and their interconnections are identified. Multiprocessor interconnections show allowable message movement throughout a network, while the device connections show hardware configuration. The functions performed at each processor in a multiprocessor configuration are also on the diagram.


The last step of OOD is to develop package, or module, specifications for programming. The information from the various tables and graphics is rearranged to show the relevant information for each particular module. Also, details of each module's logic, if not already documented in a dictionary, are defined in the package specifications.

OOD CASE tools come in several varieties: object-oriented life-cycle development, object-oriented design without code support, object-oriented coding without design support, or object-based thinking through adaptation of existing methods.

REFERENCES

Booch, Grady, Software Engineering with Ada, second ed. Menlo Park, CA: Benjamin/Cummings Publishing Co., Inc., 1987.

Booch, Grady, Object Oriented Design with Applications. Redwood City, CA: Benjamin/Cummings Publishing Co., Inc., 1991.

Coad, Peter, and Edward Yourdon, Object-Oriented Analysis, second ed. Englewood Cliffs, NJ: Prentice-Hall, 1990.

Coad, Peter, and Edward Yourdon, Object-Oriented Design. Englewood Cliffs, NJ: Prentice-Hall, 1991.

Graham, Ian, Object-Oriented Methods. Reading, MA: Addison-Wesley Publishing Co., 1992.

Lafore, Robert, Object-Oriented Programming in Turbo C++. Emeryville, CA: The Waite Group Press, 1991.

Peterson, J., and A. Silberschatz, Operating System Concepts. Reading, MA: Addison-Wesley Publishing Co., 1983.

Rumbaugh, James, Michael Blaha, William Premerlani, Frederick Eddy, and William Lorensen, Object-Oriented Modeling and Design. Englewood Cliffs, NJ: Prentice-Hall, 1991.

KEY TERMS

3" x 5" approach
binary message
binding
Booch diagram
client object
concurrent processes
data subdomain
device
dialogue
dynamic binding
hardware subdomain
human subdomain
keyword message
logical process flow
message
message traffic
module
module structure diagram
multitasking
multitasking objects
object-based
package specification
package
polymorphism
private interface
private package part
problem-domain objects
process diagram
processor
pseudo-dynamic binding
public interface
public package part
round-trip gestalt
scheduling
scheduling objects
server object
service objects
software subdomain
static binding
supervisor call (SVC)
synchronizing
synchronizing objects
thread of control
time events
time-event diagram
unary message
utility objects

EXERCISES

1. Continue with the exercise begun in Chapter 11. Design the application for Eagle Rock Golf League.

2. Design all Customer processing for ABC's application. Why is it different from that of VideoInventory? If we add multiple members to a household, how does that change the design?

3. Compare the SQL and C++ designs for ABC rental processing. If there are bottlenecks in processing for the two designs, where are they likely to be? How might they be removed? Which design gives you better control over the computer and its resources?

STUDY QUESTIONS

1. Define the following terms: message, object, polymorphism, problem domain, round-trip gestalt, service objects, synchronizing, thread of control, time-event diagram.

2. Define the four subdomains and the type of objects found in each.

3. What benefits accrue from the allocation of processes to hardware, software, database, and human subdomains?

4. Why are service objects needed? When are they needed and when not?

5. What is multitasking? Why is it important in application design?

6. What is the purpose of a Booch diagram?

7. List and compare three types of message formats.

8. What is the purpose of a process diagram?

9. Describe client/server computing and how it relates to object orientation.

10. What is binding? What types of binding are possible? How do you know what type is used in an application you are developing?

11. Describe an example of polymorphism.

12. What are some of the problems associated with allocation of processes to subdomains?

13. What does the configuration If on a time-event diagram mean?

14. Describe how to interpret a time-event diagram.

15. Describe how operating systems relate to service objects.

16. Describe the kinds of activities managed by the task manager.

17. What are the control levels in object orientation that are analogous to job and task management in an operating system? Distinguish between them and the tasks they manage.

18. What is memory management and why is it necessary?

19. List the steps to defining service objects. Describe some of the problems related to this activity.

20. What is the purpose of a Booch diagram?

21. Describe the steps to developing a Booch diagram. What information is shown on the diagram?

22. What is a package? What are its contents on a Booch diagram? What are its contents in a working application?

23. Booch recommends the use of 3" x 5" cards to create and 'play' with the Booch diagram contents. What is the playing for? Why are 3" x 5" cards helpful to that process?

24. List three design goals of messages. Create an example of message passing in an object-oriented application. Describe different types of messages to illustrate good and poor message designs.

25. What information is placed in the message table to document message traffic in an application?

26. Why is message definition a difficult activity?

27. Describe the icons used in a process diagram and their purpose.

28. How many Booch and process diagrams are drawn for an application?

29. Describe the validation processes used throughout an object-oriented design process. Why is each validation step where it is in the process and what is the purpose of each validation?

30. Discuss the statement: "There is no such thing as a one-shot object-oriented design."

31. What information is provided for package specification documentation? How do you decide what is public and what is private information to an object?

32. What is the role of prototyping in object orientation?

* EXTRA-CREDIT QUESTIONS

1. Research queue or stack management. Write a two-page paper to describe the functions of that type of management. Then, design the object-oriented class/objects and processing routines that would accomplish these functions.

2. Booch discusses primitive processes in detail and names several different types of primitive processes. Research these types of processes and discuss their importance to object-oriented design. How important is it to have a name for each type of thing in a design?

APPENDIX: UNIX/C++ DESIGN OF ABC VIDEO

Although the Chapter 12 presentation of ABC Video's design began as object-oriented, it ended as a hybrid: part-object and part-not, because of the implementation environment. This appendix is the same design with a discussion of the decisions and alternatives from a purely object-oriented perspective. Chapter 12 presented a consistent discussion of the implementation throughout the text and showed what happens when you deobjectify the application to fit a particular language environment. This appendix, then, gives you a basis for contrasting what would happen if you designed a purely object-oriented application. Each stage of the process is presented with enough comment for you to see the differences between the hybrid and object designs. Package specifications and a prototype are still beyond the scope of this discussion, but we present a partial package specification so you can contrast the levels of detail for OOD to the other methodologies.

A few terminology differences exist in the Unix/C++ environment, and we start with them. Class structure in C++ is similar to the discussion in the chapter. Data in C++ is defined by structures. A structure that contains both data and functions is called a class. Classes were defined in the chapter as having public and private parts. In C++, classes have public, private, and protected parts. The public part is that part accessible by the rest of the system. The private part is not directly accessible by any other classes. These two definitions have not changed from the chapter. A protected part specifies what is accessible by member processes in its own class or in any class derived from its own. A derived class is one that inherits from one or more base classes (inheriting from more than one is multiple inheritance) and is made up of its own, and its inherited, data and functions. Thus, in C++, the manner of implementing inheritance is to declare derived classes and to use the protected part of a class to identify what those derived classes may use directly.
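The three access levels can be seen in a small C++ sketch. The class names `Customer` and `RentingCustomer` and their members are invented for illustration and are not ABC's actual class definitions.

```cpp
#include <string>
#include <utility>

class Customer {
public:
    // Public part: accessible by the rest of the system.
    explicit Customer(std::string phone) : phone_(std::move(phone)) {}
    const std::string& phone() const { return phone_; }

protected:
    // Protected part: accessible by members of Customer and of any class
    // derived from Customer, but not by the rest of the system.
    void addCharge(double amount) { balance_ += amount; }
    double balance() const { return balance_; }

private:
    // Private part: accessible only by members of Customer itself.
    std::string phone_;
    double balance_ = 0.0;
};

// A derived class: made up of its own and its inherited data and functions.
class RentingCustomer : public Customer {
public:
    using Customer::Customer;                     // reuse the base constructor
    void rent(double fee) { addCharge(fee); }     // allowed: addCharge is protected
    double amountDue() const { return balance(); }
};
```

Code outside these classes can call `phone()`, `rent()`, and `amountDue()`, but a direct call to `addCharge()` or a reference to `balance_` from outside would be rejected by the compiler.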

The term process refers to functions in C++. Functions can be part of a class (i.e., a member) and restricted in use, or they can be stand-alone entities that are independent of a class. At least one independent function, main(), is required to initiate processing of a program or application. Many functions are provided in a library of reusable functions that are link-edited to compiled code for execution. We will not spend much effort on functions since they are most evident at the code level.


Individual language operators are analogous to those of other languages. Polymorphism is provided through function and operator overloading, which is resolved at compile time, and through virtual functions, which are the method used to provide run-time binding for polymorphic functions. Other function types beyond the typical ones associated with classes include friend functions, which are not members of a class but are granted access to its private data, and static functions, which operate at the class level rather than at the object (i.e., instance) level. Borland's Turbo C++ provides an entire set of classes with functions and inheritance as the basis for developing applications. The 'container' classes, for instance, include several types of arrays, associations, hash tables, lists, stacks, and queues. The container classes are important because they provide a means for implementing service objects. Next, we discuss the object-oriented design (OOD) activities.
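A short C++ sketch can show virtual, static, and friend functions side by side. The `Video` and `NewRelease` classes, the `discount` function, and the rate values are assumptions made for illustration only.

```cpp
#include <string>

class Video {
public:
    explicit Video(double dailyRate) : dailyRate_(dailyRate) { ++count_; }
    virtual ~Video() = default;

    // Virtual function: the version that runs is chosen at run time
    // from the actual type of the object.
    virtual std::string label() const { return "video"; }

    // Static function: works at the class level, not on one instance.
    static int count() { return count_; }

    // Friend function: not a member, but granted access to private data.
    friend void discount(Video& video, double factor);

    double dailyRate() const { return dailyRate_; }

private:
    double dailyRate_;
    static int count_;   // number of Video objects constructed so far
};

int Video::count_ = 0;

void discount(Video& video, double factor) { video.dailyRate_ *= factor; }

class NewRelease : public Video {
public:
    using Video::Video;
    std::string label() const override { return "new release"; }
};

// Takes the base class, yet calls the derived version when given one.
std::string describe(const Video& video) { return video.label(); }
```

Passing a `NewRelease` to `describe` yields "new release" even though the parameter type is `Video`; that is the run-time binding the text refers to.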

Allocate Objects to Subdomains

In object-oriented analysis (OOA), we defined classes, class/objects, and superset classes needed to properly define all of the interrelationships among objects in the application. This diagram and the table matching processes to their objects are the basis for this activity. The allocation in Table 12-3 is unchanged here (see Table 12-A1).

In allocating the data handling functions to the data subdomain in C++, we commit to designing generics to handle all files. This means that we need a new object for DB actions. Also, there will be no collapsing of data objects as in SQL. Object-access control will be implemented as a superset of functions to mirror the object relationships. To implement the generics, a fixed message type that accommodates all of the processing for all of the data objects is required. Such a message's minimal contents are: From-Object, To-Object, Action, Object, Return-code, Physical-Location-Key, Length-of-Data, and Data.
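The fixed message type might be sketched in C++ as follows. The field list follows the minimal contents named above; the field types, the `Action` values, and the `makeRequest` helper are assumptions. In particular, Length-of-Data is derived from the data buffer here rather than stored, which is one design choice among several.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical set of actions the generic data object understands.
enum class Action { Locate, Get, Rewrite, Write, Delete };

// One fixed message layout shared by all data objects.
struct DataMessage {
    std::string fromObject;              // From-Object
    std::string toObject;                // To-Object
    Action action;                       // Action
    std::string object;                  // Object the action applies to
    int returnCode;                      // Return-code (0 assumed to mean success)
    std::string physicalLocationKey;     // Physical-Location-Key
    std::vector<unsigned char> data;     // Data

    // Length-of-Data, derived so it cannot disagree with the buffer.
    std::size_t lengthOfData() const { return data.size(); }
};

// Builds a request with every field set and a zero return code.
DataMessage makeRequest(std::string from, std::string to, Action action,
                        std::string object, std::string key,
                        std::vector<unsigned char> payload = {})
{
    DataMessage m;
    m.fromObject = std::move(from);
    m.toObject = std::move(to);
    m.action = action;
    m.object = std::move(object);
    m.returnCode = 0;
    m.physicalLocationKey = std::move(key);
    m.data = std::move(payload);
    return m;
}
```

Because every data object receives the same layout, one generic routine can log, queue, or route any request without knowing which class it is destined for.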

While the subdomain allocations do not change, the handling of them does. Once functions are allocated to a DBMS, all developers need to know are the allowable interactions. Those interactions must be defined and designed manually when no DBMS is used. A partial list of functions required includes:

Locate Data (transform key to physical location)

Get Data (may include a prechange write to a log for recovery)

Rewrite (may include a postchange write to a log for recovery)

Write (may include a postchange write to a log for recovery)

Delete (may include a postchange write to a log for recovery)

Space Management

Queue Management (including service requests and service responses)

Backout Management

Commit Management

Lock Management

Access Control Management

Error processing for such problems as data not found, out of space, hardware error, or unsuccessful read, write, rewrite, or delete.

These functions can be defined and incorporated into documentation at subdomain allocation time or during service object definition.
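To give a feel for what designing these functions manually involves, here is a minimal in-memory C++ sketch of four of them with a recovery log. Everything in it is assumed for illustration: the `DataStore` class, the `ReturnCode` values, the text format of the log, and the use of a map in place of physical storage. Space, queue, backout, commit, lock, and access control management are omitted.

```cpp
#include <map>
#include <string>
#include <vector>

enum class ReturnCode { Ok, NotFound, Duplicate };

class DataStore {
public:
    // Get Data: optionally logs the prechange image for recovery.
    ReturnCode get(const std::string& key, std::string& out,
                   bool forUpdate = false) {
        const auto it = records_.find(key);
        if (it == records_.end()) return ReturnCode::NotFound;
        out = it->second;
        if (forUpdate) log_.push_back("before " + key + "=" + it->second);
        return ReturnCode::Ok;
    }

    // Write: adds a new record and logs the postchange image.
    ReturnCode write(const std::string& key, const std::string& record) {
        if (records_.count(key) != 0) return ReturnCode::Duplicate;
        records_[key] = record;
        log_.push_back("after " + key + "=" + record);
        return ReturnCode::Ok;
    }

    // Rewrite: replaces an existing record and logs the postchange image.
    ReturnCode rewrite(const std::string& key, const std::string& record) {
        const auto it = records_.find(key);
        if (it == records_.end()) return ReturnCode::NotFound;
        it->second = record;
        log_.push_back("after " + key + "=" + record);
        return ReturnCode::Ok;
    }

    // Delete: removes a record and logs the deletion.
    ReturnCode remove(const std::string& key) {
        if (records_.erase(key) == 0) return ReturnCode::NotFound;
        log_.push_back("deleted " + key);
        return ReturnCode::Ok;
    }

    const std::vector<std::string>& log() const { return log_; }

private:
    std::map<std::string, std::string> records_;  // stands in for physical storage
    std::vector<std::string> log_;                // stands in for the recovery log
};
```

Even this reduced version shows why the text treats the decision as significant: each function needs its own error returns and its own logging rule, all of which a DBMS would otherwise supply.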

The human interface definition is also going to be different. In the main text of this chapter we designed the system for a 4GL, in which a screen is painted and the programmer only needs to know the fields, their format, and desired characteristics. The 4GL software manages all of the formatting and setting of field attributes. In a lower level language, such as C++, screen format, line, starting position, length, field attributes (e.g., blink, reverse video, or color), and field contents are all managed by the programmer and, therefore, require design.
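As a rough illustration of what the programmer must now design, a screen field could be described in C++ as below. The `ScreenField` structure, the attribute constants, the 24-line by 80-column default screen size, and the 1-based line and column numbering are assumptions made for the sketch.

```cpp
#include <string>

// Hypothetical attribute flags that can be combined with bitwise OR.
constexpr unsigned kNormal = 0u;
constexpr unsigned kBlink = 1u;
constexpr unsigned kReverseVideo = 2u;

// Everything a 4GL would otherwise manage for one field.
struct ScreenField {
    std::string name;
    int line;             // 1-based screen line
    int startColumn;      // 1-based starting position
    int length;           // number of character positions
    unsigned attributes;  // combination of the flags above
    std::string contents;
};

// True when the field lies entirely inside the screen.
bool fitsOnScreen(const ScreenField& f, int screenLines = 24,
                  int screenColumns = 80)
{
    return f.line >= 1 && f.line <= screenLines &&
           f.startColumn >= 1 && f.length >= 1 &&
           f.startColumn + f.length - 1 <= screenColumns;
}

// True when two fields share at least one position on the same line.
bool overlap(const ScreenField& a, const ScreenField& b)
{
    if (a.line != b.line) return false;
    const int aEnd = a.startColumn + a.length - 1;
    const int bEnd = b.startColumn + b.length - 1;
    return a.startColumn <= bEnd && b.startColumn <= aEnd;
}
```

Checks such as these are the kind of detail a screen painter performs silently and that becomes explicit design work in a lower level language.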

Another choice we make is to have full-screen, line-at-a-time, field-at-a-time, or character-at-a-time interactions. Selection of input method is application specific. In ABC's case, we decide that the method that will slow down users the least during peak periods is best. Since actual data entry is limited to CustomerPhone, VideoBarCode, and money amounts, for rental processing, and since rental processing is the most used function, we choose field-at-a-time entry. If the application had thousands of users and millions of transactions each day, we might have field-level entry for rent/return processing and screen entry for customer and video maintenance, because they are more data-entry intensive activities. Whichever input 'chunking' method is chosen, we must intercept start and stop characters from the keyboard and bar code reader to synchronize processing between the input devices and the computer.

TABLE 12-A1 Process Subdomain Assignments

Process Name | Data | Hardware | Process | Human
EnterCustPhone | | | |
ReadCust | X | | |
CreateTempTrans | | | |
RetrieveVOR | X | | |
DisplayTempTrans | | | |
EnterBarCode | | | |
RetrieveInventory | X | | |
DisplayInventory | | | |
ComputeTempTransTotal | | | |
EnterPayAmt | | | |
ComputeChange | | | |
DisplayChange | | | |
UpdateInventory | X | | |
WriteVOR | X | | |
PrintTempTrans | | | |
EnterBarCode | | | |
RetrieveVOR | X | | |
DisplayTempTrans | | | |
AddRetDateTempTrans VOR | | | |
Add1toVInv | | | |
UpdateInventory | X | | |
ComputeLateFees | | | |
WriteVOR | X | | |
EnterCustomer | | | |
CreateCustomer | X | | |
EnterVideoInventory | | | |
CreateVideoInventory | X | | |

With field-level input, we could choose field-level interactions, having local, PC-based intelligence simulating a 4GL that checks alphabetic/numeric contents and beeps on errors. This greatly complicates the application and is decided against. At some future date, if the number of users begins to tax the file server, we could revisit this decision to speed processing by off-loading work from the server.

Draw a Time-Event Diagram

The time-event diagram also does not change and is presented here as Figure 12-A1. Now we will pay more attention to the potential for concurrency, because we must be able to prove the processing, and that implies monitoring the success of all write, rewrite, and print actions.

The choices for concurrent processing all relate to data I/O, and the consequences of deciding for concurrency must be considered. Consider the consequences if we opt for read/write concurrency. First, at the hardware level, the affected databases must be on separate buses (on a PC) or channels (on a mainframe) to ensure that the processes are not contending for the same hardware disk access time. Second, management and synchronizing modules are required to reunite multiple processes within a thread and to verify processing. This implies a need for queues for each process and for each thread. For each process we need process ID, thread ID, and return code. For each thread, we need all concurrent processes' IDs and return codes from processing. Side effects of potential errors must also be considered. For instance, if WriteVideoOnRental, RewriteVideoOnRental, PrintReceipt, and WriteHistory objects are all active at the same time, we need to decide acceptable combinations of successful/unsuccessful processing and the actions taken for each possible combination.
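The bookkeeping described above might be sketched as follows. This is an illustration only: the names ProcessResult, ThreadOutcome, and reunite are hypothetical, a return code of 0 is assumed to mean success, and the all-or-nothing policy is just one of the acceptable-combination rules the designers would have to choose.

```cpp
#include <cassert>
#include <vector>

// One queue entry per concurrent process: the three items the text asks for.
struct ProcessResult {
    int processId;
    int threadId;
    int returnCode;  // assumption: 0 means success, nonzero means failure
};

// Hypothetical outcomes for a thread once its processes have reported.
enum class ThreadOutcome { Commit, BackOut };

// Assumed policy for illustration: a thread commits only when every one of
// its concurrent processes succeeded; any other combination is backed out.
ThreadOutcome reunite(const std::vector<ProcessResult>& results, int threadId) {
    bool sawAny = false;
    for (const ProcessResult& r : results) {
        if (r.threadId != threadId) continue;  // entry belongs to another thread
        sawAny = true;
        if (r.returnCode != 0) return ThreadOutcome::BackOut;
    }
    // A thread with no reported processes is treated as not verified.
    return sawAny ? ThreadOutcome::Commit : ThreadOutcome::BackOut;
}
```

A real design would replace the single policy with a table of acceptable success/failure combinations, for example allowing a failed PrintReceipt to be retried without backing out the database writes.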

Concurrency decisions should be based on business constraints and needs for processing or response time. There should be some attempt to compute how long a transaction will take and to determine response time. For example, ABC rental transactions have an approximate processing time of 8.6 seconds (8566 ms; see Table 12-A2) during nonpeak time and about 11 seconds during peak processing times. From this table, which the SEs generate, we see that input and output from the terminal account for 8.1 seconds of the total and actual internal processing is about 506 ms, or slightly over one-half second. If the internal time were over two seconds, we would opt for concurrency to minimize the internal strain on processing. With roughly a half-second of internal processing time, we can continue thinking of sequential processing as we did with the SQL solution. The differences in using SQL versus an object-oriented language are not yet apparent. The major difference so far has been the level of detail of the reasoning process to make concurrency and data-related decisions. This level of detail is similarly lower for the other OOD reasoning processes as well.
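The arithmetic behind these figures can be checked with a short sketch that sums the rows of Table 12-A2. The type and function names are hypothetical. Two points are inferences rather than statements in the text: the 10 ms Print step is counted as output (which is what makes the subtotals come to 506 and 4060), and the ".33" queue factor is applied as one-third, truncated, which reproduces the table's 2855 ms.

```cpp
#include <cassert>
#include <vector>

// One row of the estimate; all times are in milliseconds.
struct Step { int input; int internal; int output; };

struct Estimate { int input; int internal; int output; int nonpeak; int peak; };

// Sums the three columns and adds the peak-time queue allowance.
Estimate summarize(const std::vector<Step>& steps) {
    Estimate e{0, 0, 0, 0, 0};
    for (const Step& s : steps) {
        e.input += s.input;
        e.internal += s.internal;
        e.output += s.output;
    }
    e.nonpeak = e.input + e.internal + e.output;
    // Assumption: the ".33" queue factor is one-third of the nonpeak time.
    e.peak = e.nonpeak + e.nonpeak / 3;
    return e;
}

// Rows of Table 12-A2, in table order.
std::vector<Step> abcRentReturnSteps() {
    return {
        {1000, 0, 0},  // Get
        {0, 96, 0},    // Read (average 3)
        {0, 30, 0},    // Compute late fees
        {0, 10, 0},    // Compute amount due
        {0, 0, 3000},  // Display (20 lines at 150 ms/line)
        {1000, 0, 0},  // Get Returns
        {0, 96, 0},    // Retrieve VOR (average 3)
        {0, 30, 0},    // Compute late fees and amount due
        {0, 0, 450},   // Display 3 lines
        {1000, 0, 0},  // Get Rental
        {0, 96, 0},    // Retrieve 3 DBs
        {0, 10, 0},    // Compute amount due
        {0, 0, 300},   // Display rental line, amount due line
        {1000, 0, 0},  // Process payment, enter amount
        {0, 10, 0},    // Compute change
        {0, 0, 300},   // Display new amount due, change
        {0, 0, 10},    // Print (assumed to be an output step)
        {0, 96, 0},    // Rewrite (average 3)
        {0, 32, 0},    // Write (average one)
    };
}
```

Summing the rows gives 4000 ms of input, 506 ms of internal processing, and 4060 ms of output, for 8566 ms nonpeak and 11421 ms at peak, so terminal I/O accounts for 8060 ms of the nonpeak total.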

Determine Service Objects

In this section, we list the required service object functionality to show the level of detail and complexity required of true object systems, but without much explanation. We will assume that the Unix/C++ environment being developed for ABC will employ reusable code objects for many service functions. 'Free' code is one of the benefits of using consultants who come with their own implementation modules for many functions. We still need to determine which modules are needed, however. Referring back to Table 12-5, ABC is a sequential, multiuser application with needs for scheduling and multitasking management, in addition to I/O, user, transaction, thread of control, memory, startup/shutdown, and data management. Table 12-A3 lists high level service objects required to support ABC's application.

Input/output is straightforward. There are four I/O functions to design: keyboard, bar code reader, display screen, and printer. We assume that all input interactions are from the keyboard or bar code reader, which read slightly differently. The keyboard is read one character at a time until a field is complete. The bar code reader reads the entire code, or field, at once. Thus, we can use polymorphic modules for GetField and possibly for other functions as well. Likewise, we assume all output interactions are to the display screen and printer. The basic actions for all four devices are to start, synchronize (abbreviated synch from now on), get/put, wait, or stop.
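The polymorphic GetField idea can be sketched as below. The class names and the in-memory buffers are assumptions made so the example is self-contained; a real implementation would read from the device, and the choice of '\n' as the end-of-field character is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Common interface: both input devices deliver one complete field.
class InputDevice {
public:
    virtual ~InputDevice() = default;
    virtual std::string getField() = 0;
};

// Keyboard: read one character at a time until the field is complete.
// Assumption: keystrokes come from a buffer and '\n' ends a field.
class Keyboard : public InputDevice {
public:
    explicit Keyboard(std::string keystrokes) : buf_(std::move(keystrokes)) {}
    std::string getField() override {
        std::string field;
        while (pos_ < buf_.size()) {
            char c = buf_[pos_++];
            if (c == '\n') break;  // end of field reached
            field += c;
        }
        return field;
    }
private:
    std::string buf_;
    std::size_t pos_ = 0;
};

// Bar code reader: the entire code, or field, arrives at once.
class BarCodeReader : public InputDevice {
public:
    explicit BarCodeReader(std::string code) : code_(std::move(code)) {}
    std::string getField() override { return code_; }
private:
    std::string code_;
};
```

Calling code holds only an InputDevice pointer or reference, so rental processing does not need to know whether a field came from the keyboard or the bar code reader.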


[Figure: time-event diagram. Vertical axis: objects (TempTrans, Customer or VideoOnRental, VINV, BCVideo). Horizontal axis: time/events, ending at End Trans.]

Legend:
1 - Get Entry
2 - ReadCust or Read VideoOnRental
3 - Create TempTrans, ReadCust
4 - Retrieve all related VOR, Read VideoInv. and Read BarCode Video
5 - Compute Late Fees
6 - Compute Total Amount Due
7 - Display Temp Trans
8 - Process Returns (includes return to steps 5, 6, 7)
9 - Get new rentals, Read VideoInv. and Read BarCode Video
10 - Format and display new rentals, update Total Amt Due
11 - Process Payment (includes EnterPayAmount, Compute Change, Display Change)
12 - Print TempTrans, Rewrite old VORs, Update BarCodeVideos
13 - Write new VORs, Rewrite BarCodeVideos
14 - End Trans

FIGURE 12-A1 ABC Time-Event Diagram

Waiting requires a queue to manage multiple waiting requests.

User routines initiate an application session and verify user access. The 'put' commands all interface to the screen I/O manager, handing off the message to be displayed. Similarly, the 'get' commands all interface with the keyboard or bar code routines of the I/O manager. The purpose of user logon routines is to identify physical terminal address (TermID) and user (UserID).

The transaction object and its routines manage individual transactions selected from menus. Information is directed to a specific device based on the TermID and UserID passed from the User routines. For instance, Customer Maintenance has four transactions: create, delete, update, and retrieve. Job routines then display menus and alter menu contents based on user logon and access codes. As above, 'puts' interface with the screen or printer routines of the I/O manager objects and 'gets' interface with the keyboard or bar code reader routines. The information passed to the command object for use in process control includes TermID, UserID, and TransCode.

TABLE 12-A2 Rent/Return Transaction Processing Time Estimate

Instruction                                         Input*  Internal  Output   Total
                                                             Process
Get                                                   1000                      1000
Read (average 3; 30 ms each plus data                             96              96
  transfer of 6 ms each)
Compute late fees                                                 30              30
Compute amount due                                                10              10
Display (average 20 lines, 150 ms/line)                                 3000    3000
Get Returns (30% of transactions)                     1000                      1000
Retrieve VOR (average 3)                                          96              96
Compute late fees and amount due (10 ms each)                     30              30
Display 3 lines                                                          450     450
Get Rental (assume one)                               1000                      1000
Retrieve 3 DBs                                                    96              96
Compute amount due                                                10              10
Display rental line, amount due line                                     300     300
Process payment-enter amount                          1000                      1000
Compute change                                                    10              10
Display new amount due, change                                           300     300
Print (assumes automating queuing and time                                10      10
  to transfer queue address)
Rewrite (average 3)                                               96              96
Write (average one)                                               32              32
Subtotal (nonpeak time)                               4000       506    4060    8566
Time in queue (average .33 trans waiting                                        2855
  during peak times transaction time)
Total peak processing time                                                     11421

* All times are in milliseconds.

TABLE 12-A3 Service Objects Required for C++ ABC Application

Object                          Processes

I/O Manager: Keyboard           Get character until end of field
                                Ready to receive (Sync keyboard)
                                Start keyboard entry
                                Reset keyboard
                                Send entry to screen formatter

I/O Manager: Bar Code Reader    Start reader
                                Sync reader
                                Get bar code
                                Send bar code to calling routine

I/O Manager: Display Screen     Identify screen location and type interaction
                                Format screen protected lines
                                Format screen data lines
                                Put keyboard entry in field
                                Set field attributes
                                Check allowable value
                                Get error message
                                Send entry to calling routine
                                Put screen
                                Put screen line

I/O Manager: Printer            Sync printer
                                Start print
                                Put lines until end of print
                                Stop printer
                                Get print lines until end of print
                                Wait to print
                                Store print lines for 60 seconds
                                Queue address, length of print information

User Object                     Put logon prompt
                                Get logon
                                Verify logon
                                Put error
                                Put password prompt
                                Get password
                                Verify password
                                Put password error

Transaction Object              Put menu
                                Get selection
                                Verify selection
                                Get memory
                                Release memory
                                Set up global user area
                                Release global user area
                                Call defrag for user area

Thread of Control-              Get memory address of data
Command Object                  Get memory
                                Set status
                                Queue instructions for execution (i.e., call object/process)
                                Transfer control to TempTrans or Data
                                Enqueue transaction
                                Dequeue transaction
                                Execute instruction
                                Check status
                                Create status
                                Delete status
                                Release memory

Memory Manager                  Allocate memory
                                Deallocate (free) memory
                                Defrag memory (i.e., defragment)
                                Queue memory request
                                Dequeue memory request

(Table continues on next page)

Thread of control is handled by a command object and routines which manage atomic processes, that is, they supervise execution of code modules. The object reads code into memory, passes one instruction at a time to the CPU for execution, and interfaces to the other manager routines to perform I/O, memory, and data management. The command object uses the fields passed from the transaction object and adds to them the task and task status.

Memory management is designed simply to allocate the maximum amount of space for a transaction to any request. The largest transaction is a rental/return, which is estimated to take 13,860 bytes as follows:

Design Element                         Bytes
Screen 80 x 22                         1,760
Max fields 100 bytes x 10 lines        1,000
Attribute bytes three/field              300
Miscellaneous data area                  800
Code                                  10,000
Total                                 13,860
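The maximum-size allocation rule can be expressed in a few lines. This is a sketch under stated assumptions: the constant names and the blocksAvailable helper are hypothetical, and the pool size in the usage note is an arbitrary example, not a figure from the ABC design.

```cpp
#include <cassert>

// Sizes from the rental/return estimate above, in bytes.
constexpr int kScreenBytes    = 80 * 22;   // 1,760
constexpr int kFieldBytes     = 100 * 10;  // 1,000
constexpr int kAttributeBytes = 300;
constexpr int kMiscBytes      = 800;
constexpr int kCodeBytes      = 10000;

// Every request receives one block sized for the largest transaction.
constexpr int kMaxTransactionBytes =
    kScreenBytes + kFieldBytes + kAttributeBytes + kMiscBytes + kCodeBytes;
static_assert(kMaxTransactionBytes == 13860, "estimate should total 13,860 bytes");

// Number of concurrent transactions a memory pool of poolBytes can hold
// when each one is given a maximum-size block.
int blocksAvailable(int poolBytes) {
    assert(poolBytes >= 0);
    return poolBytes / kMaxTransactionBytes;
}
```

For example, a hypothetical 64K pool (65,536 bytes) would hold four such blocks, regardless of how small the individual transactions actually are.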

While this over-allocates memory, the alternative, to size memory to each transaction, is more complex. If memory becomes scarce, the change to transaction size allocation can be made. To contrast the amount of memory required, a Customer Create transaction takes approximately 5K memory.

Startup and shutdown could be handled as part of the user object, but a cleaner implementation is to design them as separate. This start/shut object allocates memory and initiates application and DB processing, including bringing all transaction code and DB indexes into memory. In C++ implementation terms, the start/shut object will be the main() routine that initiates ABC processing.

Last, data management could be by file or by function. By file is simpler and easier for novices to maintain, but it also requires much more code and, therefore, more maintenance. Here we will define one set of generic CRUD functions for the data object with each requiring the specific DB name and data. If necessary, polymorphic processes for the CRUD functions can be customized for each database.

After the service objects are developed, they are allocated to the four subdomains of data, hardware, software, and human interface as shown in Table 12-A4. Allocation of keyboard and bar code to hardware would be a possible choice. They are left with the human interface because they are closely related to the display processes which mirror all of their input. Keeping these processes together reduces the object-switching overhead required to change from one object context to another.

TABLE 12-A3 Service Objects Required for C++ ABC Application (Continued)

Object                          Processes

Start/shut Main()               Set up all memory
                                Initiate managers
                                Load application code
                                Allocate transaction code locations
                                Store application code
                                Get DB indexes
                                Store DB indexes
                                Start DBs
                                Close DBs
                                Transfer to User

Data Manager                    Open DB (Open Index, Read Index into memory, Position Index, Open DB files)
                                Close DB (Write Index, Close Index, Release Locks, Backup DB, Backup Indexes, Close DB files)
                                Read DB
                                Write DB
                                Rewrite DB
                                Position DB
                                Determine physical location
                                Request Read
                                Wait read
                                Request Write/Rewrite
                                Position Index
                                Read Index
                                Wait write/rewrite
                                Check item locks
                                Enqueue item lock
                                Dequeue item lock
                                Wait for item lock

TABLE 12-A4 Service Object Allocation

Data            Hardware     Process                                Human
Data Manager    I/O-Print    User Manager                           I/O-Keyboard, Display,
                             Memory Manager                           and Bar code reader
                             Transaction Manager
                             Command Manager (Thread of Control)

[Figure: Booch diagram of the Hardware, Data, Human, and Process subdomains.]

FIGURE 12-A2 Subdomain-Level Booch Diagram
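The by-function approach to data management, with one set of generic CRUD functions that each take the specific DB name and data, might be sketched as follows. The DataManager interface shown here is hypothetical, and an in-memory map stands in for the database files so that the example is self-contained.

```cpp
#include <cassert>
#include <map>
#include <string>

// One set of CRUD functions; every call names the database it applies to.
// Assumption: records are keyed strings held in memory, standing in for files.
class DataManager {
public:
    // Create fails if a record with the same key already exists.
    bool create(const std::string& db, const std::string& key, const std::string& rec) {
        return dbs_[db].emplace(key, rec).second;
    }
    // Retrieve copies the record into out and reports whether it was found.
    bool retrieve(const std::string& db, const std::string& key, std::string& out) const {
        auto d = dbs_.find(db);
        if (d == dbs_.end()) return false;
        auto r = d->second.find(key);
        if (r == d->second.end()) return false;
        out = r->second;
        return true;
    }
    // Update rewrites an existing record; it does not create a new one.
    bool update(const std::string& db, const std::string& key, const std::string& rec) {
        auto d = dbs_.find(db);
        if (d == dbs_.end()) return false;
        auto r = d->second.find(key);
        if (r == d->second.end()) return false;
        r->second = rec;
        return true;
    }
    // Remove deletes the record and reports whether anything was deleted.
    bool remove(const std::string& db, const std::string& key) {
        auto d = dbs_.find(db);
        return d != dbs_.end() && d->second.erase(key) == 1;
    }
private:
    std::map<std::string, std::map<std::string, std::string>> dbs_;
};
```

The same four functions serve Customer, VideoInventory, and the other databases; a database that needs special handling could override them in a derived class, which is the polymorphic customization the text mentions.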

Develop a Booch Diagram

The first Booch diagram in Figure 12-A2 shows the subdomain-level communication. To simplify the communications in the system, based on the subdomain message interchanges, we will define a generic message for use in most communications. The second Booch diagram, shown in Figure 12-A3, is at the object level and is obviously more complex than the SQL solution.

There are several major differences between the SQL and C++ designs. First, the scheduler in SQL is a mainline routine that determines the next code to execute and is a centralized controller of the application. That function is performed to some extent by the command manager objects in the C++ design, but the scheduler functions are at a lower level and spread over the service objects. At this level, the specific processes are not shown because the diagram would be more complex than necessary. Instead, we have shown the service and data objects only. To implement the application, we would complete that detail.

FIGURE 12-A3 Object-Level Booch Diagram

The design as shown in Figure 12-A3 is still incomplete for the data part of the processing. Figure 12-A4 develops the next lower level of detail to show the complexity of the data objects.

Based on this diagram, we might decide to denormalize the data to provide minimal accessing of databases during rental processing. For instance, we might replicate all VideoInventory information in each BarCodeVideo object to eliminate the need to access another object as part of rental processing. Similar denormalization might be done with Customer and VideoOnRental. Before a prototype could be built, a second design iteration on all objects and complete design of the details is required.


FIGURE 12-A4 Object-Level Booch Diagram with Data-Object Detail


Define Message Communications

The message list is shorter than that of the SQL solution if we use a generic message as described above. The generic message list for the C++ Booch diagram is shown as Table 12-A5. If we do not use a generic message, the number of connections increases from the SQL number of about 30 messages to over 170 messages for C++ as shown in Figure 12-A5, which depicts all connections in the Booch diagram, summarizing the processing for Command and I/O manager objects. In Figure 12-A5, the processes with no specific arrows have multiple calling routines and return to the caller. The other routines with arrows are chained as shown.

In the SQL design, the network operating system and SQL shielded the application programmer from most of the complex elements, namely the service objects. With C++, the increased number of connections also increases the application's complexity. If we cannot use DB user views, there are more data objects on the diagram. If we do not have a sophisticated operating system to monitor execution and physical I/O aspects of the application, the capability must be part of the application. By using generic messages, we reduce the complexity somewhat by reducing object abends for wrong message type and by allowing generic code for message reception and interpretation.
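A generic message of the kind described might look like the sketch below. The field list follows the message contents given in Table 12-A5 (task, terminal, thread, and database IDs, type of request, return code, and data); the C++ names, the integer ID types, the RequestType values, and the convention that a return code of 0 means success are all assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

// Assumed request types, drawn from the action types in Table 12-A5.
enum class RequestType { Create, Retrieve, Update, Delete, Open, Close,
                         Input, Display, Print };

// One message layout shared by most object-to-object communications.
struct Message {
    int taskId;
    int terminalId;
    int threadId;
    int databaseId;
    RequestType typeRequest;
    int returnCode;     // set by the called object; assumption: 0 means success
    std::string data;   // request data going in, result data coming back
};

// Generic reply construction: echo the routing fields of the request so the
// caller can match the reply, then add the return code and result data.
Message makeReply(const Message& request, int returnCode, const std::string& data) {
    Message reply = request;
    reply.returnCode = returnCode;
    reply.data = data;
    return reply;
}
```

Because every object receives the same layout, the code that accepts and interprets a message can be written once and reused, which is the reduction in complexity the text attributes to generic messages.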

Develop Process Diagram

The process diagram has no changes from Figure 12-22, which is redrawn here as Figure 12-A6.

Develop Package Specifications and Prototype

Package specifications for SQL would be simple compared to those of C++. One package description/program specification is shown below for customer data. The specification identifies public and private parts, plus the processing to be performed. Following the specification is an example of a C++ code module to read the customer file based on a location that is passed to the read module.

Customer Specification

Item                         Description

Name:                        Customer

Documentation:               The customer database contains information about legal customers for ABC. All access is through the data manager routines. All data is passed to using routines.

Visibility:                  Private

Cardinality:                 400-600

Hierarchy:
  Superclass                 Customer
  Class                      Cust
  Metaclass                  None

Generic parameters:          &custloc &custrec

Interface-Implementation:
  Public:                    Only through passed parameters
  Protected:                 Uses: Customer class
                             Fields =
                               char custphon [10];
                               char custln [50];
                               char custfn [25];
                               char custadd1 [50];
                               char custadd2 [50];
                               char custcity [30];
                               char custstat [2];
                               char custzip [10];
                               char cctype [1];
                               char ccno [17];
                               date ccexp [8];
                               date entrydat [8];

Operations:                  Add (put)
                             Seek (read)
                             Update (put)
                             Delete

Persistence:                 Static


TABLE 12-A5 C++ Design Message List for ABC Rental Processing

Calling Object:  TempTrans, Start/Shut
Called Object:   Data Manager
Input Message:   Task ID, Terminal ID, Thread ID, Database ID, Type Request, Data
Output Message:  Task ID, Terminal ID, Thread ID, Database ID, Type Request, Return Code, Data
Action Type:     CRUD, Open, Close
Return Object:   Caller

Calling Object:  Print TempTrans
Called Object:   Hardware-Print
Input Message:   Data Address, Type Print
Output Message:  None
Action Type:     Print
Return Object:   None

Calling Object:  TempTrans, Start/Shut
Called Object:   I/O-Bar Code Reader, I/O-Keyboard
Input Message:   Task ID, Terminal ID, Thread ID, Database ID, Type Request
Output Message:  Task ID, Terminal ID, Thread ID, Database ID, Type Request, Return Code, Data
Action Type:     Input
Return Object:   Caller

Calling Object:  Start/Shut, User Mgr, Trans Mgr, Human Interface, Data Mgr
Called Object:   I/O-Display
Input Message:   Task ID, Terminal ID, Thread ID, Database ID, Type Request, Data
Output Message:  ACK or Task ID, Terminal ID, Thread ID, Database ID, Type Request, Return Code
Action Type:     Display
Return Object:   Command

Calling Object:  System
Called Object:   Start/Shut
Input Message:   Begin
Output Message:  None until shut down
Action Type:     Process
Return Object:   User Mgr

Calling Object:  Start/Shut
Called Object:   User Mgr
Input Message:   Term ID
Output Message:  I/O-Display-Logon screen request (no message return to caller)
Action Type:     Put Prompt
Return Object:   I/O-Display

Calling Object:  Command
Called Object:   TempTrans
Input Message:   Task ID, Terminal ID, Thread ID, Database ID, Type Request, Data
Output Message:  Depends on next called routine, either Task ID, Terminal ID, Thread ID, Database ID, Type Request, Data or Task ID, Terminal ID, Thread ID, Database ID, Type Request, Return Code, Data
Action Type:     Process
Return Object:   Either Command, Human Mgr, Data Mgr, HW-Printer, I/O Mgr

[Figure: all connections among the Memory, Transaction, I/O, Main, User, Command, TempTrans, Human, Hardware, and Data objects and their processes.]

FIGURE 12-A5 ABC Process Diagram

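Program fragment to read the customer data:

// seekc.cpp
// Read a particular customer using a passed customer location.
#include <fstream>   // file stream

// Fixed-length customer record; fields follow the Customer specification.
// The two date fields are held here as 8 characters each.
struct customer {
    char custphon[10];  char custln[50];    char custfn[25];
    char custadd1[50];  char custadd2[50];  char custcity[30];
    char custstat[2];   char custzip[10];   char cctype[1];
    char ccno[17];      char ccexp[8];      char entrydat[8];
};

// custloc is the relative record number (starting at 1) passed by the caller.
// fileName is a placeholder; the original fragment does not name the file.
bool readCustomer(const char* fileName, long custloc, customer& cust)
{
    std::ifstream infile(fileName, std::ios::binary);  // establish customer file
    if (!infile) return false;

    const std::streamoff recsize = sizeof(customer);
    infile.seekg(0, std::ios::end);               // go to 0 bytes from end
    std::streamoff endposition = infile.tellg();  // find file position
    std::streamoff n = endposition / recsize;     // number of customers on file
    if (custloc < 1 || custloc > n) return false; // location is not on the file

    // (relative location # - 1) * record size locates the individual record
    std::streamoff position = (custloc - 1) * recsize;
    infile.seekg(position);
    infile.read(reinterpret_cast<char*>(&cust), sizeof(customer));  // read customer information
    return infile.good();
}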

[Figure labels: File Server, Impact Printer, All Processing, Personal Computer.]

FIGURE 12-A6 ABC Process Diagram

CHAPTER 13 SUMMARY AND FUTURE OF SYSTEMS ANALYSIS, DESIGN, AND METHODOLOGIES

INTRODUCTION

There are an unlimited number of ways in which the methodologies discussed in the preceding six chapters might be compared and analyzed. In addition, significant research is proceeding on individual methods as well as on integrating different methods. To confuse matters, new technologies introduced daily profoundly impact our ability to develop applications and will require equally profound changes in methodologies to be used efficiently and effectively. In this chapter, we first compare the three methodologies to get a fix on their completeness and ability to be used to analyze and design applications. Next, computer-aided software engineering (CASE) tools are critiqued and summarized. The deficiencies and usefulness of CASE are discussed and related both to development of current applications and to the future applications that companies now desire to build. Then, the changes in organizational and technological environments that will require continuous evolution of methodologies are described and related to problems in application development.

COMPARISON OF METHODOLOGIES

In this section, we take two different approaches to summarizing the usefulness and sophistication of the three methodologies discussed in the preceding six chapters. In the first analysis, the phases, information developed, characteristics, and decisions made in the three classes of methodologies are traced following the work of Olle et al. [1988] and expanding the information analyzed for each of the methodologies. Then, Watts Humphrey's maturity framework is described and applied to the methodologies to describe which, if any, might be appropriate for use in a maturing IS organization. In the concluding remarks in this section, we summarize the findings and discuss the future of the methodology classes and, in particular, the three methodologies discussed in this text.

Information Systems Methodologies Framework for Understanding

In their classic work, Olle et al. [1988] developed the information systems methodology framework to compare methodologies, discuss the representation forms, and identify information supported in methodologies available for use in the mid-1980s, including the process methods and data methods analyzed in this text. Here, we summarize the framework to analyze activities and phases supported by the three representative methodologies. Then we extend the analysis to evaluate the phases in which information becomes known, the general capabilities of the methodologies, and the sophistication of resulting designs. Before the evaluation, please be cautioned that these analyses are not intended to condemn or otherwise pass a value judgment on the methodologies presented in this text. If they were not the best of their class, they would not have been selected in the first place. Rather, any shortcomings in the methodologies only point out that an organization must compensate for the lacking activities, phases, or decisions by providing its own guidelines and methods, or by hoping that its analysts have the requisite skills to perform these tasks on their own.

Activities and Phases

This section analyzes the phases of application development work that may begin at the organization level to develop information systems plans (ISPs) based on business objectives. An ISP is an analysis of both data and processes that includes manual or automated work to capture a snapshot of the work performed in an enterprise. The ISP is modified to provide the basis for organizational reengineering analysis as discussed in Chapter 5 (which is not part of Olle et al.'s work). Work proceeds according to the framework to include business process, entity, and feasibility analysis for a given application. Analysis and design are discussed in terms of the orientation of the majority of tasks performed during those phases. Support for human interface design, allocation of work to hardware or firmware, and DBMS design are all included. Maintenance, the final phase of a project's life, is considered in the extent to which it is supported in the methodology.

Table 13-1 shows the ratings of the process, data, and object methodologies from Chapters 6-12 on these activity and phase criteria. The process method, including the work of DeMarco and Yourdon & Constantine, is most focused, including only analysis, design, and program development techniques and methods.

The information engineering (IE) data methodology is the most complete, covering all phases of the life cycle except maintenance explicitly, and covering all design items to some extent (see Table 13-1). The support for hardware/firmware design is limited to allocation of tasks and data to distributed environments. There are no decisions in IE for how to allocate work to hardware or firmware as in object orientation.

The enhanced Booch and Coad & Yourdon object-oriented (OO) approach ignores front-end tasks, including organization level, business analysis of entities and process, and feasibility analysis. Rather, it assumes that these tasks have been performed before object-oriented methods begin to be used. Object orientation is more specific in its approach to analysis and design than process orientation, and, for some items, than data orientation. OO examines and selects the objects and processes of interest in developing the application during the analysis process. These are then subsequently refined and further defined until design primitives are developed. Object design explicitly discusses the control structure of the application in the form of service objects which can support either batch, interactive, or real-time applications with any number of users, in addition to providing for distributed computing through the development of process diagrams. The other two methodologies do not specifically address design differences that relate to timing or number of users for an application.


To summarize, information engineering (IE) covers more phases of the life cycle and more specific activities as identified by the Olle framework. Object orientation (OO) has more depth to the design phase by providing for design of problem domain, hardware, and service object activities. The guidance provided by IE for distributed computing decisions is significantly more detailed than the heuristics provided by object-oriented design for allocation of work to processors.

TABLE 13-1 Methodology Comparison: Activities and Phases

Knowledge
  Business objectives as basis for applications
  Organization Level Analysis
  Business Process Analysis
  Business Entity Analysis
  Feasibility Study
  Analysis
  Design
  Program Development
  Human Interface Guidelines
  Hardware/Firmware Attention
  DBMS Design Attention
  Maintenance Support

Process
  Process-Oriented
  Program design has some heuristics but relies on personal expertise of SEs

Data
  Yes
  Yes-Information Systems Plan (ISP) or Organizational Reengineering
  Yes
  Balanced Data and process analysis
  Balanced Process data integration
  Program design has some heuristics but assumes use of CASE which generates code
  Yes
  Distribution analysis
  Yes-Assumes 3rd normal form relational DBs

Object
  Objects incorporate both Data and Processes and are defined during Analysis
  Encapsulated Object-Oriented
  Iterative prototype development is an integral part of the methodology ... some methods are oriented to specific languages
  Yes
  Assumes independent modules which should be easily maintained

Where Information Becomes Known

Next, we evaluate the phases in which information becomes known by classifying data, processes, relationships, and module information at different levels of detail.

Table 13-2 shows that both data and object methodologies provide analysis of all the items but some items are completed in different phases. Major entities and processes can be known during the information systems planning (ISP) activity of IE, if it is conducted. In addition, the current automation state of the entities and processes is identified during ISP as well. The same items, using the term object for entity, are defined during object-oriented analysis and are subject to refinement during object-oriented design. There is no explicit identification of current automation status for any of the items in OO methods.

Business events and processing triggers are both identified in IE and object orientation. The timing of events, via event diagrams, is analyzed in more detail in object-oriented design, providing a basis for concurrent processing decisions. In IE, events are used to identify triggers for processing and to show where external data entry is performed in the application. Process methods identify necessary data flows into and out of the application, but they are not specifically tied to business events or triggers. The event/trigger distinction is important because it identifies necessary and sufficient inputs whereas data flow identification leads to continuation of past data interactions without consciously reflecting on their need.

The process method does not provide for data relationship analysis, nor is data structure analyzed at either the logical or physical levels. The process method explicitly ignores timing and interprocess relationships.¹ The lack of relationship analysis means that the resulting designs will be less likely to mirror the business requirements of the application. Even Yourdon's 1989² update to the methodology fails to integrate data with process analysis.

¹ This explicit ignoring of process timing and relationships is in DeMarco and Yourdon & Constantine. In extensions of process methods for real-time systems, these are both analyzed explicitly. For a discussion of the real-time extensions, see Ward, P. T., and S. J. Mellor, Structured Development of Real-Time Systems (three volumes). NY: Yourdon Press, 1985.

² See Yourdon, Edward, Modern Structured Analysis. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1989.

Object orientation appears more complete for real-time and database applications in explicit analysis and decisions for system, database, or software-specific attributes and processes that might be required of the application. The event diagram more explicitly identifies opportunities and requirements for concurrency than the other methodologies. The reliance of both process and data methodologies (with or without extensions) on designer knowledge and experience leaves too much to chance and puts pressure on designers to remember these tasks (i.e., concurrency analysis and software-specific data design).

General Capabilities

In this section, the methodologies are compared according to the extent to which they support analysis and design of the application characteristics described in Chapter 1: inputs, data, outputs, and constraints. In addition, processes and management of different sources of complexity are analyzed to complete the general description of an application. Inputs include the extent to which information and events that trigger processing are included in the analysis and design of the application. Data are internal, computerized representations of facts about entities in the real world that are stored in the database for the application. Outputs are information that leaves the computer system, whether to a display, to paper, or to some other (e.g., image) medium. Processes describe the activity being automated, for instance, transaction, decision, or inferential processing.

Constraints define restrictions on objects, entities, data, relationships, or processes within an application. Constraint types include prerequisites, temporal, inferential, structural, and control constraints.

Although not explicitly defined in Chapter 1, the ability of the methodology to facilitate management of problem complexity is a key concern to developers. Complexity stems from several sources, including management of the number of elements in the application, the degree and types of interactions, and the need to support novelty and ambiguity.

558 CHAPTER 13 Summary and Future of Systems Analysis, Design, and Methodologies

TABLE 13-2 Methodology Comparison: General Capabilities

Knowledge

Entities/Objects

Entity Attributes

Entity Identifiers

Entity Class/Object Structure

Data Relationships

Specific attributes required of operating system, DBMS, or software

Physical Data Design

General Processes

Detail Process Logic

Data relationship to processes

Events, Triggers

Process relationships

Module Structure

Module Specifications

Process

Feasibility-Begun Design-Complete Terminology differs

Design Terminology differs

Design-Required knowledge of designers, not part of methodology

Design, Programming

Feasibility-Begun Design-Complete

Feasibility-Begun Analysis-Complete

Design

None-Analysis includes identification of external entity inputs only.

Design

Data

During ISP if done Feasibility-High level fully known Analysis-Complete

Analysis Design-Complete

Analysis

Design

Analysis-Entity Hierarchy

Design-Required knowledge of designers, not part of methodology

Design, Programming

During ISP if done Feasibility-High level fully known Analysis-Complete

Analysis Design-Complete

Design-Process Triggers on PDFD

Analysis Design-Complete

Design

Object

Analysis-May be revised during iterations

Analysis Design-Complete

Analysis, subject to change during Design

Analysis-Object Lattice Hierarchy

Design-Specifically part of the methodology

Design, Prototyping

Analysis

Design

Analysis Design-Complete

Design-Event Diagrams State Transition Diagrams

Process Timing defined in Analysis with State-Transition and in Design with Event/Triggers

Design


TABLE 13-3 Methodology Comparison: General Capabilities

Knowledge

Inputs

Data

Output

Prerequisite Constraints

Temporal Constraints

Inferential Constraints

Structural Constraints

Controls

Complexity Management

Management of Novelty

Management of Ambiguity

Process

None

Minimal

None

Top-down perspective

Relies on SE skill for proper manual decomposition

None

Data

Trigger Identification; Screen Design Heuristics

Entity Relationship Diagram, DBMS, Normalization

Screen Design Heuristics

Yes

Limited

None

Data only

Problem domain

Top-down perspective

Relies on SE skill for proper manual decomposition

None

Object

Event Analysis; State Transition Analysis

Object Analysis; Object Attribute Analysis

None

Yes

None

Hierarchic inheritance for data and processes

Includes both problem and service domains

Round-trip Gestalt perspective

Allocate processes to hardware, software, DBMS, and human interface; treat as four separate elements

None

As Table 13-3 shows, none of the methodologies are complete in providing for analysis of all types of design criteria. None of the methodologies support design of inputs or outputs, even though both data and object methods identify the need for inputs via event/trigger identification.

None of the methodologies deal with inferential constraints (see Table 13-3). Remember, the fact that constraints might be missing from a methodology does not mean that they cannot be in the resulting application, only that they must be remembered and designed outside of the methodology and rely on designer skills. Process methods are the most limited in providing no constraint identification and processing as part of the methodology. In contrast, object-oriented analysis specifically provides a step to identify and define the constraints on processing and structural constraints as they relate to both data and processes. IE and data methods are in the middle with prerequisite constraints shown on action diagrams, while structural constraints are limited to those expressed in a class hierarchy for data. Controls are explicitly provided for in both data and object methods and are absent from process methods.

Complexity management is similar in data and process methods since both take a top-down perspective and are controlled through SE skills. IE decomposition is somewhat easier when an ISP is performed, because the software decomposition follows from primitive business processes which translate into computer processes. The SE skills required, then, are for further decomposition of computer processes into modules and execution units that provide for desired software characteristics such as minimal coupling, maximal cohesion, and so on.

The OO design perspective of round-trip gestalt and explicit use of iterative prototype development supports complexity management to some extent by providing increasingly detailed abstractions of the application with each iteration. OO design also manages complexity through inheritance, which minimizes the replication of both data and processes, and by allocation of processes to hardware, software, DBMS, and human interface. Through the allocation of objects and processes to each subdomain, the subdomains can be considered independently, even by different design groups. The only need for intergroup coordination is for interprocess message definition.
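As a minimal sketch of how inheritance limits replication, assume a hypothetical application with two kinds of customers. The class, attribute, and method names are illustrative assumptions only; they do not come from any of the methodologies discussed.

```python
class Customer:
    """Superclass: holds the data and processes shared by all customers."""

    def __init__(self, name, phone):
        self.name = name
        self.phone = phone

    def contact_label(self):
        # Defined once here; every subclass inherits it unchanged.
        return f"{self.name} ({self.phone})"

    def credit_limit(self):
        # Hypothetical default rule for ordinary customers.
        return 100


class CorporateCustomer(Customer):
    """Subclass: adds only what differs from the superclass."""

    def __init__(self, name, phone, account_manager):
        super().__init__(name, phone)  # reuse the inherited data setup
        self.account_manager = account_manager

    def credit_limit(self):
        # Override only the one process that differs.
        return 10 * super().credit_limit()
```

The subclass carries one extra attribute and one overridden process; the shared name, phone, and contact_label logic is written only once.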

For complexity management of ambiguous or novel requirements, none of the methodologies provides guidance.

None of the methodologies guide input/output design. Process and object methods are unlikely to be useful in identifying conversion requirements of an application, since they do not differentiate automated from manual data as IE does. Similarly, process and object methods are not likely to lead to well-defined databases since the methods do not provide guidelines for database design.³ IE provides explicitly for normalization and logical database design while recognizing the need for physical design based on data usage requirements.

None of the methodologies are perfect at complexity management. Object orientation appears to facilitate complexity management more than the other methodologies through its support for inheritance and allocation of processes to subdomains.

3 Attempts by Booch (1991), for instance, to design databases into an OOD and by Yourdon (1989) to integrate entity-relationship and data analysis in Modern Structured Analysis are incomplete and cursory.

Novelty and ambiguity of requirements are not addressed by any methodologies.

Sophistication in Explicit Design Decisions

Sophistication means "developed in form or technique,"⁴ complex, or worldly. In this section, we rate the methodologies in their ability to guide the development of sophisticated modules, programs, and applications that exhibit characteristics of reusability, modularization, information hiding, maximal cohesion, and minimal coupling. The issue is not whether the methodologies can use or result in modules with these characteristics; the answer is absolutely yes, they can. The issue is the extent to which the methodologies explicitly provide guidelines and validation heuristics for reaching designs that exhibit these characteristics.

Neither data nor process methodologies provide for information hiding, maximal cohesion, or minimal coupling beyond somewhat arbitrary heuristics. Only object orientation specifically can result in a clean design (see Table 13-4), but it can also be corrupted if the designers significantly change intraobject and class/object structures or relationships during design. By early encapsulation of objects and processes during analysis, object orientation automatically imbeds cohesion in the application. By only allowing communication via minimal messages, object orientation automatically provides minimal coupling and information hiding. When implemented using object-oriented DBMSs and languages, object designs should have these properties.

Problems and a loss of minimal coupling and information hiding will occur if nonobject languages or software are used to implement OO designs. For instance, COBOL is the antithesis of object orientation. COBOL assumes global data and cannot manage encapsulated objects because it assumes separation of data and process. Therefore, if COBOL is the target language, object orientation would not be a good choice of methodology, all other things considered.
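The contrast between global data and encapsulated objects can be sketched in a few lines. This is an illustration under stated assumptions, not a rendering of COBOL: the account names and the withdrawal rule are hypothetical, and Python hides attributes only by convention (the leading underscore), not by enforcement.

```python
# Style 1: global data kept separate from process. Any routine in the
# program may read or change the record directly.
account_balance = {"A-1": 50}

def withdraw_global(account_id, amount):
    # No object guards the data, so nothing stops an overdraft.
    account_balance[account_id] -= amount


# Style 2: encapsulated object. Other code communicates only via messages
# (method calls), and the object enforces its own rule.
class Account:
    def __init__(self, balance):
        self._balance = balance  # hidden by convention

    def withdraw(self, amount):
        """Message: request a withdrawal; refuse if funds are insufficient."""
        if amount > self._balance:
            return False
        self._balance -= amount
        return True

    def balance(self):
        """Message: ask for the current balance."""
        return self._balance
```

In the first style every routine that touches account_balance is coupled to its layout; in the second, callers depend only on the two messages.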

4 From Webster's New World Dictionary, pocket edition. NY: Popular Library, 1973, p. 544.


TABLE 13-4 Methodology Comparison: Explicit Design Decisions

Extent of Information Hiding
  Process: NA
  Data: NA
  Object: Analysis-Begun, Design-Complete

Extent of Modularization
  Process: Heuristics rely on SE skill
  Data: Uses process-design heuristics and SE skill
  Object: Forces design until primitives, highly dependent on implementation language. Relies on SE skill and prototyping.

Extent of Maximal Cohesion
  Process: Heuristics rely on SE skill
  Data: Heuristics rely on SE skill
  Object: Analysis-Begun, Design-Complete

Extent of Minimal Coupling
  Process: Heuristics rely on SE skill
  Data: Heuristics rely on SE skill
  Object: Forced by the methodology but could be subverted by SE errors

Supports reusable object design
  Process: No
  Data: No
  Object: Heuristics and procedure for identifying reusable objects

Supports reusable module/object use
  Process: Yes
  Data: Yes
  Object: Includes heuristics and limited procedure for identifying reusable objects

Extent of Reusability
  Process: Relies entirely on SE skill
  Data: Relies entirely on SE skill
  Object: Can be 80%+; organization dependent

The other measure of sophistication is the extent to which the methodologies support reusability and reusable module/object design. Only object orientation provides for explicit identification of potential reusable processes and objects. Once the reusable items are identified, object orientation does not provide further guidance in how to actually design reusable modules; nor should it necessarily provide such guidelines.

IE covers the whole life cycle, something both process and OO methodologies need to provide for application development. The IE data methodology provides more human interface design guidance and is the only methodology that covers the complete life cycle of an application. IE's disadvantage is that many activities rely on SE skill and experience to know the activity should be performed rather than incorporating the need for the activity in the methodology. When data are complex, when nonobject software (either DBMS or language or both) is used, or when human interface design is paramount, information engineering would be the choice.

Structured analysis and design, the process methodology, is the least prescriptive in telling users how to perform the various activities, and it has the fewest activities in the methodology.

Overall, object-oriented methodologies would be expected to lead to a design that more closely resembles the functional requirements, if the functional requirements are adequately stated before OO analysis begins. The lack of front-end activities in OO hinders its usefulness in business. Keep in mind that although object orientation is the most explicit methodology, it is weak in actual data design and human interface design, and it must be used with object-oriented languages in order to realize the benefits from its use. Also, every author has a different OO methodology with different notation and different reasoning. As a result, the fledgling OO methodology will change and be refined over the next decade. Large-scale commitment to OO without attaining some consensus and stability of methods certainly adds risk to application development.

Humphrey's Maturity Framework

Humphrey's maturity framework⁵ was developed for the Department of Defense as a self-assessment framework that identifies levels of computing and application development process maturity. The goal of the framework is to provide a means of assessing and accelerating technology transfer from research to practice throughout the Department of Defense. According to Humphrey, the ideal software process is predictable, consistent, measurable, and monitored according to objective standards. The maturity levels are initial, repeatable, defined, managed, and optimizing (see Figure 13-1).

At the initial level, neither measures (i.e., statisti- cal control) nor orderly progress are possible. This is the level at which organizations operating under no methodology and no life cycle operate. Managerial oversight for quality, productivity, and change con- trol to provide some stability to project schedules are required organizational supports that must be present to even attain the initial level.

At the repeatable level the organization has introduced managerial controls in the form of project management cost, schedule, and change controls. Project team members are expected to commit to their tasks and be measured against their commitments. While never actually saying the words, the repeatable level implies the recognition of both a life cycle and a methodology, that is, a repeated set of global level tasks with deliverable products that implicitly become the measures of schedule and cost performance and that are performed within a definable process. Humphrey's reason for having a life cycle/methodology is to provide a framework within which to address the risks to a development project from new tools, methods, and/or technologies. Organizational support in the form of providing for walk-throughs, formal design methodologies, configuration management for code, and application testing standards and methods is required at the repeatable level to continue to the next stage. Humphrey argues the need for a process group (i.e., a Standards group) that defines the steps to making orderly progress in project work and provides a nucleus for transferring the process knowledge to the working groups.

5 See Humphrey, Watts, Managing the Software Process. Reading, MA: Addison-Wesley Publishing, Inc., 1989.

FIGURE 13-1 Humphrey's Five Levels of Maturity (from lowest to highest: Initial, Repeatable, Defined, Managed, Optimizing; the steps between levels are labeled Basic Management Control, Process Definition, Process Measurement, and Process Control)

The defined level requires the definition of the software development process, which defines the methodology in sufficient detail to guide the work process and define detailed subphase products that collectively become the phase deliverables needed to further manage the tasks. Each deliverable product has process and product measures of quality and productivity that are aggregated to the phase and project level for managerial oversight and assessment. At this stage, a quality assurance group that performs independent analysis of product and application quality is formed to report to management on a product-by-product basis. At the defined level, a process database is established and all SEs are trained in the use of the information to provide history for the organization on the use and productivity of each project and tool.

At the managed level the organization initiates "comprehensive process measurements, beyond those of cost and schedule" [Humphrey, 1988, p. 302]. The managed process requires analysis of the process database measures to ensure that comparable statistics are available and can be universally interpreted, and that project-specific data that highlight unique characteristics or aspects of application development projects are stored and interpreted properly. At the managed process level, the data for the process database should be gathered automatically and used to modify the process to "prevent problems and increase efficiency" [Humphrey, 1988, p. 306]. Humphrey takes pains to point out that the database should not be used to penalize either project teams or individuals, although managers could put it to that use. One example of measures is function points.
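To make the function point measure concrete, the following sketch computes an unadjusted function point count. It assumes the commonly published average-complexity weights for the five counted item types; the weights, the item counts, and all names are assumptions for illustration, and a real count would also classify each item by complexity and apply an adjustment factor.

```python
# Assumed average-complexity weights per counted item type.
AVERAGE_WEIGHTS = {
    "external_inputs": 4,
    "external_outputs": 5,
    "external_inquiries": 4,
    "internal_files": 10,
    "external_interfaces": 7,
}

def unadjusted_function_points(counts, weights=AVERAGE_WEIGHTS):
    """Sum each counted item type multiplied by its weight."""
    return sum(weights[kind] * n for kind, n in counts.items())

# Hypothetical counts for a small application.
example_counts = {
    "external_inputs": 6,
    "external_outputs": 4,
    "external_inquiries": 3,
    "internal_files": 2,
    "external_interfaces": 1,
}

total = unadjusted_function_points(example_counts)  # 24 + 20 + 12 + 20 + 7
```

Every term in the sum counts something visible at the boundary of the application (inputs, outputs, inquiries, files, interfaces); nothing in the count reflects the internal processing logic.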

The optimizing level is one at which the organization continues improvements begun at the managed level and starts development process optimization. The optimizing level, ideally, allows SEs to identify many types of errors in advance of their causing delays and problems on a current project by analyzing and identifying the patterns of mistakes from other projects based on information in the process database. In my opinion, this is truly an ideal at this point in time since our ability to detail the steps to what appear to be random incidences of Murphy's Laws is rudimentary, at best, and nonexistent, in practice.
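The idea of finding patterns of mistakes across projects can be sketched very simply, assuming a process database that records an error category for each problem. The record layout, the sample data, and the function name are all hypothetical; the framework does not prescribe any of them.

```python
from collections import Counter

# Hypothetical process-database records: (project, phase, error category).
process_db = [
    ("billing", "design", "missing interface field"),
    ("billing", "code", "off-by-one loop"),
    ("orders", "design", "missing interface field"),
    ("payroll", "design", "missing interface field"),
    ("payroll", "test", "wrong test data"),
]

def recurring_errors(records, min_projects=2):
    """Return error categories seen on at least min_projects distinct projects."""
    projects_by_error = {}
    for project, _phase, category in records:
        projects_by_error.setdefault(category, set()).add(project)
    # Count distinct projects per category, most widespread first.
    counts = Counter({c: len(p) for c, p in projects_by_error.items()})
    return [c for c, n in counts.most_common() if n >= min_projects]
```

A category that recurs across several projects is a candidate for a checklist item or process change on the next project; deciding which categories matter remains a judgment call.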

While Humphrey's framework is useful for discussing key differences between methodologies, it is not without problems. First, it is based on Humphrey's and others' experiences in the field but has never been subjected to empirical validation of its definitions. Humphrey asserts that the maturity framework "represents the actual ways in which software-development organizations improve" [Humphrey, 1988, p. 307]. The stages are presented as distinct and sequential, with the implicit understanding that to attain, for instance, the optimizing level, an organization must have moved through all previous levels. There is no basis for this supposition. In fact, the framework represents Humphrey's ways of attaining software development maturity without recognizing that it may not fit all situations. The second drawback to the framework in analyzing methodologies is that many of the requisite support activities are organizational, not methodological. For instance, walk-throughs, configuration management software, and testing standards are outside the scope of methodologies. We assume they are not an issue in this discussion.

Despite these criticisms, the framework is still useful for discussing problems with methodologies that relate to the extent to which they define development activities and support phase work.

Table 13-5 shows my subjective ratings of the methodologies with respect to Humphrey's framework. None of the methodologies has a uniformly high rating in all of the categories.

In general, process methods are the least predictable, consistent, measurable, or monitorable because they leave so many activities to SE skill and omit specific activities from the methods. At worst, process methods are at Humphrey's initial stage; at best, they are repeatable. Because the focus is on process, I would assume that consistency and measurability of processes should be medium, that is, different people should arrive at similar analyses. In fact, we think they are low to medium. Designs would be expected to vary most because the heuristics are vague. Data analysis, data design, and human interface design, which some authors add on as an afterthought, would all be expected to vary significantly across different SEs because they are not explicitly part of the methodology.

TABLE 13-5 Methodology Comparison: Humphrey's Framework

Predictable
  Process: Low
  Data: Medium-High
  Object: Medium

Consistent
  Process: Low-Medium
  Data: Medium-High
  Object: Low-Medium

Measurable
  Data: Medium
  Object: Medium-High

Monitored
  Data: Medium
  Object: Low-Medium

Measurability is low to medium. Assuming function point metrics, measurability is low because function points concentrate on externals (e.g., numbers of interfaces, files, I/Os, and so on) and not on processing complexity.

The ability to monitor the methodology-defined tasks is probably about medium. The ability to monitor process-oriented applications is low when only methodology-supported phases and tasks are monitored and would be inconsistent if monitored tasks were defined by project.

The data methodologies have slightly better overall ratings. In Humphrey's framework they are, at worst, repeatable and, for some activities, reach the defined level. IE is reasonably predictable in having a set of activities defined into phases for ISP, feasibility, analysis, design, and program design. If using, for instance, Texas Instruments' version of IE, there are many more tasks that are not all necessary for a given application; thus the activities are not completely predictable across projects. The activities should provide a level of consistency across SEs who should be expected to define the same entity-relationship diagram and the same activities even though details would probably differ. Therefore, consistency should range from medium to high. The extent to which IE analyses and designs are measurable is ranked as medium. If function point analysis is used and baselines for the company have been defined, the measurability is probably medium since IE analyzes the major function point items. The extent to which IE can be monitored is medium. IE defines more tasks and activities and follows more phases of the application life cycle; therefore, its ability to be monitored is greater than that of process and object methods. However, all projects are subject to unforeseen problems that require unplanned time, and monitoring cannot assist in foreseeing those problems. Therefore, not all tasks and activities can be monitored to the extent that they eliminate problems during the development process. If a CASE tool, such as IEF, is used for development, monitorability is high because the entire life cycle has well-defined stages, products, and reports on status that can be tracked for all phases.

Object orientation, in the form of the enhanced and integrated Booch/Coad & Yourdon methodology, is similar to IE in predictability and measurability. Consistency is lower and varies from low to medium because individual SE skill is required to define the calling sequences and ultimate operational structure of the application, even though the definition of the object pieces is fairly well described. The difference between a good calling sequence and message set and a bad one is difficult to define in abstract, procedural terms; it can only be noticed through prototyping and actual comparison of different schemes. Monitorability is less because of the ill-defined nature of service-object identification and of language-specific OO requirements. Moving targets, like OO, are hard to measure. OO is repeatable at best in Humphrey's framework.

The bottom line on methodologies and Humphrey's framework is that the methodologies alone do not offer enough guidance to support the defined level of application development management, let alone get to the optimizing level. For this reason, more work on methodologies, life cycle, and development activities is needed to accommodate the variety of work for different types of applications. Having said this, we also need to be realistic about just how much predefinition of decision processes can, in fact, be imbedded in methodologies. Two things seem obvious. One is that we can define some of the methodology-driven activities more completely. The other is that the engineering nature of the SE task is that each application will require unique characteristics and design that cannot be codified!

In summarizing this section, no single methodology appears to be complete and sufficient for all the tasks and activities performed during an application development. There is no silver bullet that will solve our application development problems or provide a complete cookbook for the development process. For these reasons, there will always be a need for SE expertise in application development. There is also a need for continued definition of tasks needed during application development and the continuous evolution of techniques that are integrated into the various methodologies to guide those tasks.

COMPARISON OF AUTOMATED SUPPORT ENVIRONMENTS

There is a marked degree of consensus on many design features of the ideal CASE environment. Table 13-6 summarizes many features and functions that Pressman, Gane, Booch, Martin, and McClure recommend. The curiosity is that the vendors do not seem to listen. Take three general requirements as an example: integration, intelligence, and multiuser support.

CASE integration is the absence of barriers between one graphical or text form and others. The experts agree that the most useful CASE should support all project life-cycle activities within an integrated environment. The rationale for this position is that tools that support only application development, even if they include project management, address only a small, possibly noncritical, portion of the SE discipline. Further, the integration should be seamless, that is, transparent to users. Transparent integration includes the automatic conversion of diagrams and design text into other forms of documentation or program code with little or no manual intervention. The integration should be both between tools and between life-cycle phases. This level of integration implies that some resolution of fundamental semantic and syntactic differences between phases is required. Specifically, differences between analysis and design should be eliminated through CASE use. To reach this sophisticated level of integration, the methodologies require some redesign to remove their own built-in lack of seamlessness between phase activities. For instance, in process methods, one major intellectual stumbling block is the transition from data flow diagram (DFD) in analysis to structure diagram in design. Many people ask, Why not develop a structure diagram in analysis instead? Or, conversely, Why not carry DFDs through to design?

Next, intelligence in tools is desirable. Artificial intelligence (AI) in CASE facilitates reusability and provides consistency and completeness checking within and between graphical and text forms. AI routines can be used to implement the concepts of reusable analysis, design, program specifications, and code. The routines can locate, retrieve, and select specifications matching design parameters and can identify specification fragments that do not match what is required. Other applications of AI are the analysis of completeness and consistency of requirements or code. Other checking is between phases to match logical design to physical design to code. This use of AI is technically feasible and not particularly difficult. What we don't know about AI for these uses is what to match, how to match it, and when the best time for matching occurs. New meta-language descriptions of analysis and design requirements will be required to fully exploit AI in CASE. These meta-languages must also be consistent with, and add no burden to, the other integration and multiuser support requirements of CASE.
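The simplest form of consistency and completeness checking between forms is a plain name comparison, sketched below. This is ordinary rule-based matching, not AI, and it assumes that a diagram and the repository can each be reduced to a list of names; the function and the sample names are hypothetical.

```python
def check_consistency(dfd_data_stores, repository_entities):
    """Compare names used on a diagram with names defined in the repository."""
    stores = set(dfd_data_stores)
    entities = set(repository_entities)
    return {
        # Used on the diagram but never defined: a completeness gap.
        "undefined_on_diagram": sorted(stores - entities),
        # Defined but never used: a possible inconsistency or leftover.
        "unused_in_repository": sorted(entities - stores),
    }

report = check_consistency(
    dfd_data_stores=["Customer", "Order", "Invoice"],
    repository_entities=["Customer", "Order", "Product"],
)
```

Exact name matching is the easy case. The open questions raised above (what to match, how, and when) begin as soon as two forms describe the same thing under different names or at different levels of detail.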

TABLE 13-6 Desired Computer-Aided Software Engineering Features and Functions

Project Management:
  Work breakdown
  Cost estimation
  Person/task scheduling
  Monitoring allocated vs. actual times
  Budget creation
  Monitoring budget vs. actual money spent

Documentation for all Work
  Word processing editor functionality
  Integration of text and graphics
  Nesting of text, graphics, and so on with recall at all levels
  Document templates, predefined and customizable
  Query capabilities to all parts of the graphical and text definitions
  Version/release control support
  Change control support

Analysis
  Graphical and text support for specific methodology
  Intelligent syntactic evaluation of completeness and correctness
  Repository (i.e., dictionary) support for all graphic and text information with nesting and linkage within and between levels
  Support for reusable component recognition, definition, use
  Human interface definition support
  Prototyping support
  Customizable reporting facility

Design
  All analysis functions above
  First-cut of next step graphical form from analysis via automated functions
  Support for program definition language (PDL) with interface to code generators for several languages
  Bi-directional interface to analysis and code from design
  Sensitivity analysis on designs

Code
  All above plus
  Source code templates
  Source code syntax checking and comparison to requirements
  Automated code generation
  Automated third normal form database definition from repository data definitions
  Automated minimal test set definition ... with generation of test data
  Integration to software configuration management tool

General
  Consistent interface with function keys having identical uses across phases
  On-line documentation, suggestions for problems
  Adaptability to local conventions for methodology use
  Support on any operating system, hardware platform, DBMS generation, and if not, machine independence of designed application
  Interfaces to other tools and products

One consistently recurring theme in all CASE products and research is concern over the replacement of one sort of complexity with another sort of complexity. The solution to software development productivity, quality, and reliability problems is to build tools that, in hiding some complexities of the development process, are necessarily complex themselves. The hidden complexities require absolute accuracy and reliability themselves to make their use worthwhile; the systems will have to reveal themselves upon request so users may understand internal processing. With AI routines that, for instance, learn to predict what is required for code based on design specifications, these revelations are crucial to guaranteeing CASE's continued use.

The integration of phases and tools must also be multiuser. Multiuser CASE support implies some sort of centralized repository of information about the application that is accessible by any number of people concurrently. Warnings to users when a component is changed and automatic version control are desired features. Multiuser support extends to group work collaboration, scheduling, tracking, sensitivity analysis, and electronic meeting support.
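Two of the multiuser features named above, automatic version control and change warnings, can be sketched with a toy in-memory repository. The class and method names are hypothetical, and the sketch deliberately omits what a real multiuser repository would need most: persistent storage and locking for truly concurrent access.

```python
class Repository:
    """Toy central repository: keeps every version and warns subscribers."""

    def __init__(self):
        self._versions = {}     # component name -> list of definitions
        self._subscribers = {}  # component name -> list of callbacks

    def subscribe(self, name, callback):
        """Register interest in a component; callback receives warning text."""
        self._subscribers.setdefault(name, []).append(callback)

    def save(self, name, definition, author):
        """Store a new version and warn everyone subscribed to the component."""
        history = self._versions.setdefault(name, [])
        history.append(definition)
        version = len(history)  # automatic version number, starting at 1
        for notify in self._subscribers.get(name, []):
            notify(f"{name} changed to v{version} by {author}")
        return version

    def get(self, name, version=None):
        """Return the latest definition, or a specific earlier version."""
        history = self._versions[name]
        return history[-1] if version is None else history[version - 1]
```

A designer who depends on the Customer definition would subscribe once and then be warned on every later save, while earlier versions remain retrievable by number.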

Now, let's first examine the extent to which the methodologies themselves exhibit the properties thought to be desired for CASE, then extrapolate from that to determine the level of support for these features we can realistically expect from CASE products.

First, integration across phases and graphical forms is important to building intelligence into CASE. If we examine the three methodologies described in this text, structured analysis and design (SA), information engineering (IE), and object orientation (OO), we would find the most integration in OO with less in IE and even less in SA. OO begins with tables that are increasingly elaborate but whose contents can be traced from the beginning of analysis through to development of module specifications. There is no shift in thinking required once the data and processes become encapsulated, because they continue to be encapsulated throughout the remaining steps.

IE has less integration because there are two fairly distinct paths of thought in IE, one for data and one for processes. Within each path, the level of integration is consistent and high, but between paths, the integration is less consistent and there are few guidelines for integrating the two. One example of this lack of consistency is that, depending on the author, IE should not have data files or entities shown on action diagrams; action diagrams should remain a graphical form for process sequencing and event trigger identification. If this line of reasoning is followed, data and processes are integrated only at the program specification level. Program specification work is micro-design that could then miss major global problems because of the lack of data-process integration.

SA is even less integrated than IE because data are not specifically addressed in the methodology. The analyst is supposed to know what 'data stores' are required and the appropriate contents of those data stores. Some authors [6] assert that a data store can refer to a group of related normalized relations, while others [7] assert that a data store is a third normal form relation. When data analysis is not an official activity, by definition it cannot easily be integrated into the methodology. Similarly, there are numerous texts that describe how to use SA for developing real-time applications [8] and that provide a foundation for several of the graphical forms used in OO. But close analysis of the Ward & Mellor methodology, for instance, identifies a very different approach to developing applications from the original DeMarco and Yourdon & Constantine approaches.

[6] See Gane, Chris, Computer-aided Software Engineering: The Methodologies, The Products, and The Future. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1990.

[7] See Yourdon, 1989.

[8] Ward, P. T., and S. J. Mellor, Structured Development for Real-Time Systems (three volumes). NY: Yourdon Press, 1985, is one of the most commonly used.

Given the levels of integration as low for SA, medium for IE, and medium to high for OO, the greatest potential for CASE to provide seamless, complete integration of functions seems most likely for object orientation. Further, the higher the level of integration, the greater the intelligence that can be built into the software, once again identifying OO as the most likely to provide extensive use of AI. Does that mean that AI cannot be used for the other methodologies? Absolutely not! It means that sophisticated AI that recognizes reusable analysis, design, or code fragments and that performs significant semantic analysis of the contents of diagrams and the interdiagram relationships is most likely in OO. Anyone using any CASE tool today knows that these tools already provide fairly extensive syntactic evaluation intelligence that will tell you, for instance, if your connections on a data flow diagram (DFD) are all legal, or that the external entity interactions from the context diagram are all accounted for in the DFD.
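The two syntactic checks just mentioned can be illustrated with a short Python sketch. It is illustrative only: the node kinds, the rule that every flow must touch a process at one end, and the names `Node`, `illegal_flows`, and `missing_externals` are assumptions made for this example, based on common DFD conventions rather than on any particular CASE product.

```python
from dataclasses import dataclass

# Node kinds in a data flow diagram (DFD); these labels are assumed.
PROCESS, STORE, EXTERNAL = "process", "data store", "external entity"


@dataclass(frozen=True)
class Node:
    name: str
    kind: str


def illegal_flows(flows):
    """Return flows that do not touch a process at either end.

    Assumed convention: data cannot move between data stores and
    external entities without a process to move it.
    """
    return [(src, dst) for src, dst in flows
            if PROCESS not in (src.kind, dst.kind)]


def missing_externals(context_externals, flows):
    """Return external entities named on the context diagram that never
    appear in the lower-level DFD."""
    used = {node.name for flow in flows for node in flow
            if node.kind == EXTERNAL}
    return sorted(set(context_externals) - used)


# Hypothetical diagram for illustration.
customer = Node("Customer", EXTERNAL)
take_order = Node("Take Order", PROCESS)
orders = Node("Orders", STORE)
context_externals = ["Customer", "Vendor"]

flows = [
    (customer, take_order),   # legal: external entity to process
    (take_order, orders),     # legal: process to data store
    (customer, orders),       # illegal: no process involved
]

bad = illegal_flows(flows)
unaccounted = missing_externals(context_externals, flows)
```

Here `bad` holds the single Customer-to-Orders flow, and `unaccounted` reports that Vendor appears on the context diagram but not in the DFD. Checks of this kind are purely syntactic; they say nothing about whether the diagram's contents make sense.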

From the discussion of the previous two issues, you should be able to figure out that multiuser support in products also lags behind the desire for its sophistication in industry ... and it will continue to do so for at least five years. Multiuser support adds a level of underlying complexity because of the need for locking mechanisms, access security, and concurrent multiplatform hardware support that impedes vendor development. Since there are no competitive reasons for developing multiuser capabilities, that is, no other vendors have them either, vendors are not spending their resources on multiuser support. Current tools with a central repository allow segmenting of repository items, such as an ERD. When multiple users want to change the ERD, they check out segments and work on their respective segments individually. The completed checked-out segments are checked in to a reconciliation procedure that frequently fails because of inconsistencies that are then manually reconciled. In a truly concurrent environment, multiple concurrent users would be supported without segmenting and check-out processing, through locking mechanisms similar to those used in DBMS software.
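To see why reconciliation of checked-in segments can fail, consider the following minimal Python sketch. The segment layout, the `reconcile` function, and the sample entities are hypothetical; real CASE repositories store far richer structures, and this shows only one kind of inconsistency (a relationship that refers to an entity no longer defined).

```python
def reconcile(segments):
    """Merge checked-in ERD segments and list relationships that refer to
    entities not defined anywhere in the merged diagram."""
    entities = set()
    relationships = []
    for segment in segments:
        entities |= set(segment["entities"])
        relationships += segment["relationships"]
    conflicts = [(a, b) for a, b in relationships
                 if a not in entities or b not in entities]
    return entities, conflicts


# Hypothetical segments after two users worked independently.
# User 1 removed the Customer entity from the first segment ...
segment_1 = {"entities": ["Product"], "relationships": []}
# ... while user 2 added a relationship that still depends on it.
segment_2 = {"entities": ["Order"],
             "relationships": [("Order", "Customer"), ("Order", "Product")]}

merged_entities, conflicts = reconcile([segment_1, segment_2])
```

The Order-to-Customer relationship ends up in `conflicts` because neither user could see the other's change while working. With DBMS-style locking, the second user would instead have been blocked or warned at the moment of the change.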

568 CHAPTER 13 Summary and Future of Systems Analysis, Design, and Methodologies

What does the state of integration and AI mean for CASE? CASE tools are necessarily limited in the number of processes, number of entities, number of attributes, complexity and detail of description, and so on. These limitations are more likely candidates for removal by vendors than are the three more abstract concepts: integration, intelligence, and multiuser support. The CASE industry has entered a push-pull stage of product development. The push comes from the ever-increasing desire of client companies to develop ever more complex and sophisticated applications, and their recognition that CASE can be used to deploy ITs to their competitive advantage. The pull comes from the products on the market and their growing sophistication. As soon as one vendor provides a feature or function, others feel obligated to offer it too, or risk losing market share. Many vendors try to support as many methodologies as they can, frequently without regard to underlying differences in the thought processes required to comply with the methodologies. So, for instance, DeMarco's SA and IE analysis might both be advertised as supported by the same vendor. But DFDs are not action diagrams and vice versa, nor will they ever be. So, when vendors claim multimethodology support, beware of the claim.

RESEARCH RELATING TO ANALYSIS, DESIGN, AND METHODOLOGIES

There are two growing bodies of research [9] relating to methodologies and the application development process. The first is attempting to reconcile the differences in methodologies to develop an improved hybrid. The second studies the decision processes that occur in analysis and design activities. Both of these lines of research are described in this section and related to future changes that we might expect in methodologies and application development.

[9] See Adelson & Soloway, 1985; Guindon & Curtis, 1988; Guindon, Krasner, & Curtis, 1987; Pennington, 1987; Vessey & Conger, 1993.

The methodology research consists of normative and descriptive writing on the procedures and application focus in analyzing application problems. From this body of work, we have over 60 identifiable methodologies with primary concentrations, such as SA, IE, and OO, described in this text. Unfortunately, the value of these methodologies has not been studied. There is no evidence that any of these methodologies is better than any other. Nor is there any evidence that any methodology is more appropriate for a particular problem domain than any other. Intuitively, they can't all be best in all situations. Current research is taking two directions to follow on this idea: one line of research attempts to integrate methods to create an improved hybrid; the other is trying to determine when and which methodologies are appropriate for different types of problems.

Current research in building hybrid methodologies is primarily applied. All authors, so far, are seeking to integrate OO notions and notations with some other methodology, including structured analysis, Jackson systems design, information engineering, and others. [10] This research is purely prescriptive, of the form: "If I were going to put OO together with structured analysis, here's what I would do." While this research is promising, the lack of researcher attention to the differences in reasoning and thinking processes of the methods needs to be resolved. Also, these authors will need to offer evidence of the synergy they promise; currently they offer none.

The second type of research discusses methodology learning by novices. For novices who have learned COBOL or another procedural language, learning structured analysis is easier and more accurate than learning other methodologies. [11] Since there is less to learn, this is not surprising. In addition, this research notes that the thought processes of OO are decidedly different from those of SA and IE. We would conclude then that novices who learn Ada first, for example, would have an easier time learning OO than structured analysis, and their OO designs would be more accurate. This is a promising line of work that needs much more study, including analysis of real analysts doing real work, before any results applicable to business use of methodologies can be expected.

[10] See, for example, Sanden, 1989; and Ward, 1989.

[11] See Vessey and Conger, 1993.

The study did find that analysts' development of a mental model is crucial to complete solution of a task. The process followed by successful analysts includes development, expansion, and simulation of a mental model that uses personal problem-solving plans to elaborate constraints, and note making as a means of deferring work until a later time. Many of the skills that Chapter 2 recommended you keep in mind while studying the text were identified through this research.

Also, some comments about easy and hard features of methodologies can be developed. The easy features of OO are those that automatically lead to information hiding, minimal coupling and maximal cohesion, the traceability of information throughout the process, and the essential continuity of the method (i.e., building tables and progressively adding details to the information). The hard OO features are the extensive experience in operating systems required to determine service object requirements and the significant coupling between the implementation language and the application design.

The easy features of IE are entity analysis, the full life cycle approach including enterprise through maintenance phases, the methods for deciding distribution, and the balanced thinking given to both data and processes. The hard IE features are the mental shifts required to move from design to program specification and from an action diagram to its components. The decisions about the size and content of components are left to the SE.

The easy feature of SA is the simplicity of the thought process, which is easily grasped by most people. The hard SA features are the disjoint phase relationships in moving from DFD to structure diagram and in decomposing the structure diagram into modules. These actions, like similar ones of IE, are left to SE skills and have few guidelines.

To summarize the application development literature, we know that skills needed seem to vary by activity both across and within phases of a system development life cycle, that task domain knowledge facilitates the process of building a mental model of the problem solution, and that different types of domain knowledge exist, including methodology and task domains.

For SEs, this research has several implications. First, the entire field of methodology research is in its infancy. As it matures, both the methods and the way we use them should be expected to change. Second, hybrid methodologies that attempt to integrate methodologies requiring different mental models of a problem, for instance, structured analysis and OO, are unlikely to be very productive. Rather, we need to identify which methodological orientation best fits different problem domains, concentrating on methodology improvement and use in the appropriate domains.

Last, since methodologies do not provide complete analysis of all aspects of problem domains, by definition, CASE tools based on the methodologies will also provide partial task coverage. The more complete the methodology, the more complete the CASE tool. Some vendors add tasks that complete the coverage, for example, code generation; these CASE tools are even more complete than those that are only methodology-based. The most notable example of a more complete tool is Texas Instruments' Information Engineering Facility (IEF).

Applying Humphrey's framework to research in IS, methodologies are in either the initial stage or the defined stage. CASE tools help methodologies attain the defined stage, but sometimes impose such rigidity in doing so that usage is constrained and might not fit either the way SEs work or the work itself.

BUSINESS AND TECHNOLOGY TRENDS THAT IMPACT APPLICATION DEVELOPMENT

There are several trends in application management and development that will dramatically change the way business computing is performed in the next ten years. The trends are both technological and business related, including management of legacy systems and data, client/server computing, development of repositories and data warehouses, multimedia application development, and business globalization. Each of these trends is briefly described with its impact on application development and software engineering.

Legacy Systems

Legacy means handed down as from an ancestor. Legacy systems are applications that are in a maintenance phase but are not ready for retirement. Legacy systems are most often mainframe COBOL applications that were probably built using no methodology and no life cycle. Such applications are frequently referred to as 'held together with spit and glue' because they are fragile, that is, susceptible to introduction of errors caused by unrelated changes. In short, they are a liability. The reason these systems are not all rewritten and done away with is the tremendous investment in their development.

A related concept is legacy data, which are data used by outdated applications and required to be maintained for business records. Legacy data are as much as 50% incorrect and may be in an unusable form without considerable expenditures of time and money. In short, they are a liability. The reason legacy data are not reformatted for some new DBMS that can optimize storage and access time is the inherent cost of correcting the data, which could be ten or more times the cost of reformatting.

The impact of legacy data and systems is to inhibit and slow the integration of data across organizations and applications, and to inhibit the integration of technologies for application use. Ultimately, companies with significant legacy problems will be forced, for competitive reasons, to spend the money to transform the systems and data into useful items or to abandon them and write off the expense.

The impact of legacy systems and data on SE is to continue to inhibit new application development by requiring attention. The new pulls from industry include the need for reengineering data, methods, and software that support data scrubbing to remove anomalies and errors. These are nontrivial needs that will divert some industry resources away from methodologies toward these very practical and real problems.

Repositories and Data Warehouses

A related issue is the notion that organizations no longer want to discard data. For instance, the maintenance of legacy data sometimes is mandated by the government. The means to store unlimited, continuously growing databases currently are called data warehouses.

Similarly, all of these data must have meta-data that define each attribute and its related entities (or objects), the applications and software allowed to access the data, and the allowable using organizations. The meta-data definitions are in a repository which, in its most sophisticated form, is a data dictionary for data, processes, hardware, and software.
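As a rough illustration of what one meta-data entry in such a repository might hold, consider the Python sketch below. The class name `AttributeMetadata`, its fields, and the sample values are assumptions chosen to mirror the description above; they do not reflect the schema of any actual repository product.

```python
from dataclasses import dataclass, field


@dataclass
class AttributeMetadata:
    """One hypothetical repository entry describing a single data attribute."""
    name: str
    definition: str
    entities: list = field(default_factory=list)      # related entities or objects
    applications: set = field(default_factory=set)    # software allowed to access it
    organizations: set = field(default_factory=set)   # allowable using organizations

    def may_access(self, application, organization):
        """True only when both the application and the organization are allowed."""
        return (application in self.applications
                and organization in self.organizations)


# Invented sample entry.
policy_value = AttributeMetadata(
    name="policy_value",
    definition="Face value of an insurance policy",
    entities=["Policy"],
    applications={"claims", "underwriting"},
    organizations={"Claims Department"},
)

allowed = policy_value.may_access("claims", "Claims Department")
denied = policy_value.may_access("payroll", "Claims Department")
```

In this sketch the claims application is allowed and the payroll application is refused. A full data dictionary would hold comparable entries for processes, hardware, and software as well.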

Repositories control and centralize management of data as an organizational resource. Distributed repositories will be developed in the future but are currently only available as one-user chunks of a centralized repository that must be reintegrated with the centralized, official data.

Both repositories and data warehouses have significant overhead (i.e., human) costs associated with managing and tracking all of the information actually managed by the software. Because of this overhead expense, companies must choose carefully those items they really want to maintain indefinitely. The luxury of being a 'data packrat' has currently unknown costs.

The impact of data warehouses will be felt in the need to design time-dependent databases [12] that have associative relationships and to migrate legacy data to the warehouse. Associative data relationships are irregular, dictated by data content rather than abstractions such as normalization. An example might be an image document that describes an insurance policy. That policy needs to be related to the insured, the owner, the beneficiaries, and its value over time so that its status at any single point in time can be completely reconstructed. Existing database products can support temporal databases but are not specifically designed for temporal data. This implies the development of a specialized temporal database type, or the extension of existing database products to accommodate temporal data definitions.

[12] Time-dependent databases are also referred to as temporal databases and have an entire body of research associated with their definition and use.
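The point-in-time reconstruction described for the insurance policy can be sketched in Python as follows. This is a simplified illustration, assuming each history row carries a validity period; the field names, the function `status_as_of`, and all data values are invented for the example.

```python
from datetime import date

# Each row records one state of the policy and the period during which it
# was valid. valid_to=None means "still current". All data is invented.
policy_history = [
    {"value": 50_000, "beneficiary": "A. Smith",
     "valid_from": date(1990, 1, 1), "valid_to": date(1992, 6, 30)},
    {"value": 75_000, "beneficiary": "A. Smith",
     "valid_from": date(1992, 7, 1), "valid_to": date(1993, 12, 31)},
    {"value": 75_000, "beneficiary": "B. Jones",
     "valid_from": date(1994, 1, 1), "valid_to": None},
]


def status_as_of(history, when):
    """Return the row that was valid on the given date, or None."""
    for row in history:
        ended = row["valid_to"]
        if row["valid_from"] <= when and (ended is None or when <= ended):
            return row
    return None


mid_1993 = status_as_of(policy_history, date(1993, 6, 15))
before_issue = status_as_of(policy_history, date(1989, 1, 1))
```

A query for mid-1993 returns the second row, while a date before the policy was issued returns nothing. Keeping every superseded row, rather than overwriting it, is what makes the reconstruction possible, and it is also why such databases grow continuously.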

Client/Server

Client/server computing describes a situation in which multiple processors share responsibility for managing pieces of an application. Currently, the pieces include data, presentation software for the human interface, and application. For a given processing request, one processor acts as a client requesting that a processing service be provided; the other processor is the server that executes the request. In this context, examples of a service request are to access data, perform a routine, or display data on a terminal screen. In a true client/server environment, any processor can be a client and any processor can be a server. The same processor might be a client for some actions and a server for others. Therefore, the client/server environment, in its truest form, describes a peer-to-peer networking scheme in which intelligent sharing of resources and data across multiple processors takes place.

The state of client/server development changes almost daily, so by the time you read this, Figure 13-2 will be out of date. Don't worry; it is only meant to give an example of the alternatives and confusion in the client/server marketplace. The figure shows the alternative configurations of presentation software, data (and DBMS), and application software, with traditional, centralized mainframe resource management on the upper left of the diagram. Moving down the diagonal to fully distributed client/server processing, we have first presentation software that resides both on the mainframe and on a PC. The PC software interfaces to the mainframe presentation software and is translated for use by the application. At the next level of sophistication, the presentation software is offloaded to the PC completely. Then data are partitioned (i.e., split by columns or rows or both) and accessible via DBMS in both places. Next, the data are moved fully to the distributed environment, possibly with replication (i.e., multiple data copies). At the next stage, some application functions are performed on a PC and others on a mainframe. In its most advanced state, all functions (or pieces of each) are stored both on mainframes and PCs, with access determined by the closest processor with available CPU time.
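Partitioning by rows and by columns, as mentioned above, can be shown with a small Python sketch. The table contents and the helper names `split_by_rows` and `split_by_columns` are hypothetical; a real DBMS would handle placement, keys, and reassembly itself.

```python
# Invented sample table.
rows = [
    {"id": 1, "region": "East", "name": "Ann", "balance": 100},
    {"id": 2, "region": "West", "name": "Bob", "balance": 250},
    {"id": 3, "region": "East", "name": "Cy", "balance": 75},
]


def split_by_rows(table, predicate):
    """Row partition: each whole row goes to one site or the other."""
    here = [r for r in table if predicate(r)]
    there = [r for r in table if not predicate(r)]
    return here, there


def split_by_columns(table, columns, key="id"):
    """Column partition: each site keeps some columns plus the key, so the
    original rows can be rebuilt by matching keys."""
    here = [{c: r[c] for c in (key, *columns)} for r in table]
    there = [{c: r[c] for c in r if c == key or c not in columns}
             for r in table]
    return here, there


east_rows, other_rows = split_by_rows(rows, lambda r: r["region"] == "East")
name_part, rest_part = split_by_columns(rows, ["name"])
```

Replication, by contrast, would place a full copy of `rows` at each site, trading storage and update effort for faster local access.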

In client/server's most advanced form, for example, simple functions might be on a LAN and complex processing functions on a mainframe. The data might be anywhere. The application part closest to the request decides the type of processing to be performed and ships the request off to be executed in the most efficient place. If that location is busy, its software might forward the processing request to another processor until idle CPU cycle time is found. The executing processor would obtain the nearest version of the data and perform the requested service. The result is sent back to the requesting processor.
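The forwarding behavior described above, in which a busy location passes the request along until an idle processor is found, can be sketched in Python. The processor names, the `busy` and `next` fields, and the function `route_request` are assumptions for illustration; actual products decide routing in their own ways.

```python
def route_request(processors, start):
    """Forward a request from processor to processor until an idle one is found.

    `processors` maps a processor name to a dict holding a `busy` flag and
    the name of the processor it forwards to (`next`). Returns the name of
    the processor that executes the request, or None if every processor
    reachable from `start` is busy.
    """
    visited = set()
    current = start
    while current is not None and current not in visited:
        visited.add(current)
        node = processors[current]
        if not node["busy"]:
            return current
        current = node["next"]
    return None


# Hypothetical three-processor network.
processors = {
    "lan_server": {"busy": True, "next": "midrange"},
    "midrange": {"busy": True, "next": "mainframe"},
    "mainframe": {"busy": False, "next": "lan_server"},
}

executor = route_request(processors, start="lan_server")
```

In this invented network the request lands on the mainframe after two forwards. The `visited` set keeps the request from circulating forever when every processor is busy.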

Client/server processing is sometimes confused with downsizing. Downsizing is the shifting of processing and data from mainframes to some other, less expensive environment, usually to a multiuser mid-size machine, such as an IBM AS400, or to a LAN of PCs. Downsizing can occur with or without client/server computing. The reasons for buying mainframes are diminished with the availability of client/server computing, but the compelling argument for maintaining an existing mainframe environment is to obtain the most benefit from the tremendous start-up and maintenance costs associated with it. Downsized environments also have large start-up costs that sometimes are equivalent to mainframe start-up cost.

The impact of client/server computing on SE is here now. There is tremendous demand for SEs who know how to integrate data, applications, and presentation software over multiple processors and networks. The large accounting companies, such as Ernst & Young, who also do consulting, have found a niche in providing leading-edge services of this type. But the need is in every size of company, even those that cannot afford a large consulting company's fees. The pressure on SE professionals then is to develop the integration skills to develop and support these applications as fast as possible.

FIGURE 13-2 Client/Server Alternatives (figure not reproduced; its labels are Mainframe Environment, Local Area Network Environment, No Client/Server Support, Limited Client/Server Support, and Full Client/Server Support. Legend: D = Database Management System, P = Display Presentation Software, A = Application Software)

Multimedia

Multimedia is a term that describes the integration of object orientation, database, and storage technologies in one environment. By the 21st century, multimedia will transform both applications and the way we interact with them. To be useful in business organizations, new technologies must be capable of being incorporated into traditional application processing. By defining equipment as objects and storing the object definitions in a database repository, integrating new equipment and technologies in traditional applications becomes not just possible, but fairly easy.

SEs developing multimedia applications require new skills for authoring the contents of multimedia systems, and for developing the applications that make the information accessible in a meaningful manner. For graphic design, video direction, and so on, one strategy has been to hire graphic artists or movie school graduates, for instance, to be multimedia authors rather than to teach an SE about video production. This splitting of duties still requires SEs to develop skills in integrating multimedia in applications. At present, the skills required include OO analysis and design, media knowledge, and human interface design incorporating moving and still-motion video, graphics, text, and data in the same interface.

Globalization

Globalization is the movement of otherwise local businesses into world markets. In 50 short years, business organizations worldwide have evolved from national to multinational to global enterprises. As with all trends, there are forces that both ease and inhibit movement into global markets. In general, information technologies enable globalization; and, in general, cultural differences and history inhibit globalization. The technology enablers are application and communications technologies that remove the barriers of geography and time, while providing equal access to multimedia applications. The historical and cultural barriers inhibit cross-cultural exchange of ideas, technologies, and methods of work. Dealing successfully with both the technological and cultural issues is a challenge to information systems professionals and business managers. Preparing yourself for deploying globalizing technology is the challenge to SEs today.

There are three main social barriers to globalization of businesses: infrastructure differences, technology transfer differences, and political and cultural differences. Infrastructure usually refers to the installed base of equipment and services for communications, transportation, and services of a geographic entity (i.e., a country). Infrastructure relates to computers, telecommunications, and supporting software, including, for instance, database and networking software.

There are two infrastructure challenges to SEs. The first challenge is technical, learning both current and past technologies, and devising sometimes messy ways to integrate them. The second challenge is social, developing and presenting alternatives and trade-offs for imaginative, practical, cost-effective applications in developing countries.

Technology transfer is a large-scale introduction of a new technology to some previously nontechnical environment. Transfers of computing and communications technologies to all developing countries in Eastern Europe, Asia, Latin America, and Africa are needed. History leaves me pessimistic about such transfers taking place easily, smoothly, or soon. Broad-scale transfers of such disparate technologies as farming methods, birth control, building of dams, and water purification have failed simply because technologists fail to contend with cultural differences and resistance. [13] Technology transfer suffers from the same bias from which diffusion of innovation theory in general suffers: If the technology is not accepted, there is something wrong with the intended user, not the transfer agent or the technology. Naively, we think our ways of implementing and using technology are the right ways, as if no other way is as good. The concept of equifinality (many paths lead to the same goal) eludes most Westerners. We fail to evaluate the technology within the context of the intended cultural structure. We assume stupidity on the part of the users and also assume this stupidity can be corrected by sufficient education. What we forget is that projects fail when planning is incomplete, potential difficulties are not assessed or are misassessed, and cultural impacts of projects are insufficiently analyzed. The challenge to SEs is not to oversimplify projects and the circumstances of their implementation that inhibit technology transfer, but to attend to the cultural aspects of implementations.

[13] See Hirschman, A. O., Development Projects Observed. Washington, D.C.: The Brookings Institution, 1967.

In any technology transfer project, it is imperative that sensitivity to local differences be maximal. Teaching and training in a different culture do not mean making the target audience the same as you. Equifinality must be allowed. SEs' roles change from doer to facilitator, with less control than usual over outcomes. Successful globalization of applications and technologies requires considerable breadth of experience for SEs; for those who can develop and integrate the necessary business skills with their technical skills, the rewards will be huge.

Client/server and multimedia are technologies that enable globalization and require different ways of thinking in a global context. The most effective placement of data, database software, software, storage media, and computers is the main issue. Distribution of data and functionality will require new decision criteria. Before distributed applications, decisions were based on what the software and hardware could do. Constraints drove the decision process. Now we can have anything anywhere. The decision criteria shift from being technologically driven to being business driven. Why do we need data x for y PCs in location z if we can have data a for b PCs in location c? What business requirements demand this placement of data, hardware, and so on? The extent of distributed multimedia access and enabling of peoples in far-off locations that takes place will become a conscious business decision. Ethical, political, and practical issues inform distributed media placement decisions.

Multimedia applications, because they support data, graphics, photos, audio, and video images, also have a significant cultural component in a global application. Design of culture-free or culturally rich applications becomes a decision. Is it truly possible to design culture-free applications? My feeling is no; all applications have cultural assumptions at least implicit in their design. Multimedia will make obvious our assumptions about appropriate words, pictures, and ideas for users. Biases that surface will relate to information system developers, user designers, and manager approvers. When applications go global, assumptions that survive in the United States, in all likelihood, will be inappropriate globally. The assumptions will require development of the same application with different media components to fit the using culture. SEs will need to learn how to surface cultural assumptions of application developers and how they carry over to the finished product. SEs will need to make assumptions explicit, then use the assumptions to design cultural diversity into applications.

In summary, business and technical trends are pointing toward breadth and depth of skill levels in SEs in many different areas. Methodologies do not support these trends today. Therefore, continued evolution and change to methodologies can be expected.

SUMMARY

Two methods of analyzing methodology classes were used in this chapter. The first, the information systems methodology framework, was extended to include the characteristics of applications from Chapter 1 and the desirable characteristics of applications. From the analysis we know that both information engineering (IE) and object orientation (OO) are more complete in describing applications than structured analysis (SA), but each addresses different phases of the life cycle. IE is more complete in coverage of organization-level information systems planning and analysis, both of which precede design and implementation. OO is more detail- and programming-oriented, resulting in a deeper level of design by the end of the design phase. SA is so process-oriented that data, input, output, and other detailed aspects of the application are left to SE skill and are not specifically addressed by the methodology.

The second analysis of methodologies used Humphrey's maturity framework to discuss the maturity of methodologies. Humphrey discusses the initial, repeatable, defined, managed, and optimizing levels of maturity. The results of this analysis show that no methodologies are currently beyond the defined level and that SA is only at the initial level. There are too many activities that are not addressed by SA to reach the repeatable level for all requisite tasks. At the repeatable level, different people would arrive at the same design. IE is at worst repeatable, and, when completed in a CASE tool, may reach the defined level. OO is at the repeatable level for many early activities, but is at the initial level for package and message communication design.

CASE tools were discussed in terms of their ability to provide three key design objectives: integration, intelligence, and multiuser support. The ability of CASE is hampered by methodologies that are not themselves integratable because of shifts in thinking that must be made from one phase of work to another. In general, SA and IE characterize such shifts and have relative difficulty in CASE interphase integration of work. In contrast, OO is more consistent in the thinking and documentation forms both within and between phases; thus, the CASE tools supporting OO are more highly integrated and represent the ever more detailed thinking required in OOD, and do so within similar graphical and text forms throughout the CASE tools.

Next, business and technology trends that impact application development were discussed, including legacy systems, repositories and data warehouses, client/server computing, multimedia applications, and business globalization. Legacy systems and data are historical leftovers from premethodology days that may have errors and structural flaws that make their conversion to new environments costly and difficult. In particular, client/server, data warehouses, and repositories are three emerging technologies to which companies want to migrate the legacy systems and data. Client/server environments provide for storage and processing of data wherever it is most needed by the organization in a peer-to-peer network. Data warehouses are storage technologies that provide for massive amounts of historical data. Repositories are versatile means of storing information about data, applications, hardware, and software that provide the definitions of interchangeable technology components. Multimedia applications will use repositories to define the integration of object orientation, database, and storage technologies in one application environment.

Globalization is the movement of businesses into worldwide markets. Global application developers must deal with difficulties in development due to infrastructure differences and technology transfer difficulties. Technology transfer is the large-scale introduction of new technology to a new environment, usually a developing country. Problems in technology transfer relate to cultural and political differences more than to the new technology. SEs developing global applications will need to attend to the culture and politics to be successful. Client/server technology enables global applications. Multimedia was discussed as one type of application with a significant cultural component.

REFERENCES

Adelson, B., and E. Soloway, "The role of domain experience in software design," IEEE Transactions on Software Engineering, SE-11, Vol. 11, 1985, pp. 1351-1360.

Bergland, Gary D., "A guided tour of program design methodologies," IEEE Computer, October 1981, pp. 13-37.

Card, David N., Frank E. McGarry, and Gerald T. Page, "Evaluating software engineering technologies," IEEE Transactions on Software Engineering, Vol. SE-13, #7, July 1987, pp. 845-851.

Conger, S. A., "Teaching globalization in information systems courses," in Global Information Technology Education: Issues and Trends (M. Khosrowpour and K. D. Loch, eds.). Harrisburg, PA: Idea Group Publishing, December 1992, pp. 313-353.

Datamation, "The best in client/server computing," Special Issue, October 1, 1991, pp. 1-24.

Dunsmore, H. E., W. M. Zage, D. M. Zage, and G. Cabral, "Building an empirical case for CASE," Software Engineering Research Center Report SERC TR-8-P, Lafayette, Indiana, December 16, 1987.

Episkopou, D. M., and A. T. Wood-Harper, "Towards a framework to choose appropriate information systems approaches," The Computer Journal, Vol. 29, #3, 1986, pp. 222-228.

Gane, Chris, Computer-Aided Software Engineering: The Methodologies, the Products, and the Future. Englewood Cliffs, NJ: Prentice-Hall, 1990.

Guindon, R., and B. Curtis, "Control of cognitive processes during software design: What tools are needed," CHI Proceedings. ACM: 1988, pp. 263-268.

Guindon, R., H. Krasner, and B. Curtis, "Breakdowns and processes during the early activities of software design by professionals," in Empirical Studies of Programmers: 2nd Workshop (G. Olson, E. Soloway, S. Sheppard, eds.). Norwood, NJ: Ablex Publishing Co., 1987, pp. 65-82.

Hirschman, A. O., Development Projects Observed. Washington, D.C.: The Brookings Institution, 1967.

Humphrey, Watts S., "Characterizing the software process: A maturity framework," reprinted in Milestones in Software Evolution, Paul W. Oman and Ted G. Lewis, eds. Washington, D.C.: IEEE Press, 1988, pp. 301-307.

Humphrey, Watts, Managing the Software Process. Reading, MA: Addison-Wesley Publishing, Inc., 1989.

Iivari, Juhani, "Levels of abstraction as a conceptual framework for an information system," Proceedings of IFIPS WG 8.1: Information Systems Concepts: An In-Depth Analysis, Belgium, October 18-20, 1989, pp. 122-151.

Kelly, John C., "A comparison of four design methods for real-time systems," ACM SIGSOFT Software Engineering Notes, Vol. 12, 1987, pp. 238-251.

Keys, Paul, "A methodology for methodology choice," Systems Research, Vol. 5, #1, 1988, pp. 65-76.

McClure, Carma, CASE Is Software Automation. Englewood Cliffs, NJ: Prentice-Hall, 1989.

Olle, T. William, Jacques Hagelstein, Ian G. McDonald, Colette Rolland, Henk G. Sol, Frans J. M. Van Assche, and Alexander A. Verrijn-Stuart, Information Systems Methodologies: A Framework for Understanding. Wokingham, England: Addison-Wesley Publishing Company, 1988.

Panzl, David J., "A method for evaluating software development techniques," The Journal of Systems and Software, Vol. 2, 1981, pp. 133-137.

Pennington, N., "Stimulus structures and mental representations in expert comprehension of computer programs," Cognitive Psychology, Vol. 19, 1987, pp. 295-341.

Pressman, Roger S., Making Software Engineering Happen: A Guide for Instituting the Technology. Englewood Cliffs, NJ: Prentice-Hall, 1988.

Sorenson, Paul G., Jean-Paul Tremblay, and Andrew J. McAllister, "The metaview system for many specification environments," IEEE Software, March 1988, pp. 30-38.

Wand, Yair, and Ron Weber, "On the deep structure of information systems," Information Systems Research, Vol. 4, #2, 1993, pp. 23-45.

Ward, P. T., and S. J. Mellor, Structured Development for Real-Time Systems (three volumes). NY: Yourdon Press, 1985.

Yourdon, Edward, Modern Structured Analysis. Englewood Cliffs, NJ: Prentice-Hall, 1989.

KEY TERMS

AI in CASE
associative data relationships
CASE integration
client/server
complexity
data warehouse
downsizing
equifinality
fragile applications
globalization
Humphrey's defined level
Humphrey's initial level
Humphrey's managed level
Humphrey's maturity framework
Humphrey's optimizing level
Humphrey's repeatable level
information systems methodology framework
information systems plan (ISP)
infrastructure
legacy
legacy data
legacy systems
multimedia
multiuser CASE
peer-to-peer network
process groups
repository
seamless CASE
technology transfer

EXERCISES

1. Write a three- to five-page paper describing some new technology, such as distributed database (e.g., Informix or Sybase), multimedia, Simple Network Management Protocol (SNMP, a networking protocol), or imaging. Predict how the technology will change in use in applications in the next five years. Predict IS and user organizational changes as well as design changes.

2. Discuss globalization of businesses and other changes to software engineering activities that might be required.

3. Compare the methodologies using your own technique. What are the important methodology issues to you? How easy or hard do you find the work involved in describing the ABC application in each methodology? How easy or hard is it to really learn each methodology? Which are you most likely to continue using? How likely do you think these methodologies are to be useful for the emerging technologies of client/server and multimedia? How would you change any or all of the methodologies to make them more usable? How might methodologies become less tied to technology? (Please send your responses to the author.)

STUDY QUESTIONS

1. Define the following terms: client/server, downsizing, equifinality, globalization, Humphrey's maturity framework, legacy data, repository.

2. What phases of application development are in the Olle et al. information systems methodology framework?

3. Describe the features of the Olle et al. approach to comparing methodologies and identify the sophistication of the three methodologies on each feature.

4. Why do you think the ISP was left out of the process methods of Tom de Marco and Ed Yourdon? (You might refer back to Chapter 1's historical discussion for a hint.)

5. Object-oriented methodologies all ignore the front-end tasks of feasibility and data collection. Why? Can they continue to ignore those actions and still be useful in business applications? Why?

6. The Olle et al. framework was expanded to analyze the phases within each methodology where information is expected to become known. Describe this framework extension and identify, for data, processes, relationships, physical database model, and event triggers, where this information is known in each of the three methodologies.

7. What is the position of process methodologies with respect to data and data modeling? What is the significance of this position? How integrated is data to process description? What is the significance of this level of integration?

8. List three sources of application complexity. How does each source add to the complexity of an application?

9. Which methodology handles complexity the best and why? What is deficient about the other methodologies' handling of complexity?

10. To what extent do the three methodologies discussed guide input/output design? What is the significance of this?

11. Rate the three methodologies on desirable application characteristics: minimal coupling, maximal cohesion, and information hiding. Justify your ratings.

12. What is Humphrey's maturity framework? How is it used to assess IS organizations? How is it used to assess IS methodologies for application development?

13. What are three shortcomings of Humphrey's framework? How might they be eliminated?

14. List and describe the five levels of maturity in Humphrey's framework.

15. Do many organizations or methodologies reach the optimizing level of Humphrey's framework?

16. Describe the three methodologies in terms of Humphrey's framework.

17. If you have access to a CASE tool, use Table 13-6 to analyze the sophistication of your tool. List five ways in which the tool you use could be improved to contain more of the desired CASE features and functions.

18. Three issues in CASE are discussed: integration, intelligence, and multiuser support. How does the author view current products on the market? How does a CASE tool you use rate on these three criteria? What changes might be made to the tool you use to improve its integration, intelligence, and multiuser support?

19. Describe the research that seeks to integrate the best of all methodologies into a new, improved hybrid. Critique the utility of such a methodology and identify three of the problems with this approach. What benefits might accrue from a hybrid methodology? Why is it such a popular topic of research?

20. Describe the research that studies novice analysis of problems and relate this research to that which seeks to integrate the best of all methodologies into a new, improved hybrid. How can the analyst research be used to improve methodologies? What effect will hybrids have on novice learning?

21. What impact do legacy systems and data have on the use of new methodologies and CASE tools?

22. Define and discuss the issues of legacy systems and data.

23. Define a data warehouse and explain why companies are moving toward implementation schemes of this concept.

24. What is an associative data relationship and why does it impact data storage techniques?

25. Define client/server computing and downsizing. Discuss how they relate.

26. What is multimedia and how does it relate to application development and methodologies?

27. Describe some of the cultural issues in global information systems development.

28. What are the main issues in deploying global applications?

* EXTRA-CREDIT QUESTION

1. Change the scenario for ABC Video. Assume ABC is an international organization that not only rents videos but also sells concert tickets, CDs, and other related entertainment and musical merchandise. What cultural assumptions are in the case description of ABC Video that need to be reexamined for an application to be used in locations all over the world? What other changes might be required for worldwide use of the rental application? Don't concentrate on merchandise; concentrate on the cultural and equipment differences. If each of 3,000 stores in 60 countries sends information to a single site in, let's say, Los Angeles, once each day, what technology considerations might be required?

CHAPTER 14

THE FORGOTTEN ANALYSIS AND DESIGN ACTIVITIES

INTRODUCTION

The forgotten activities of systems analysis are design of the human interface, conversion/implementation process, and user documentation. This chapter concentrates on human interface because the guidelines are not context specific and are based on research as well as practice. Rules of thumb for the other activities are discussed. Both the human interface and conversion are planned for ABC Video's rental processing application.

HUMAN INTERFACE DESIGN

The presentation of information for selection and data entry is the single most important design item in an application. The format, type, size, color, and content of the display all are important to a user locating, controlling, entering, or monitoring information. A badly designed screen makes a user tire faster, make more mistakes, and miss information that might have disastrous effects on decision making. Misrepresented data can have the same effects. The user's perception of the application and how it helps or hinders in performing his job is directly related to the human interface. If a user perceives the application as helpful and facilitating productivity, the application will be used with a high degree of satisfaction. If a user perceives the application as difficult, obscure, or reducing productivity, the application will not be used voluntarily and user satisfaction will be low.

Interface design is one of the most intensely researched areas of computing, yet much of the research has not found its way to business application design. In this section, we try to remedy that situation. First, the conceptual foundations of interface design are reviewed briefly. Then the options and guidelines for each major activity during interface design are presented. Following each section, we discuss how to apply screen design guidelines to ABC rental processing.

Conceptual Foundations of Interface Design

Research, theory, and practice combine to provide the guidelines for interface design. In general, interface design needs to answer questions about when, what, and how to enter data into, and present data from, applications.

First, when to collect data has been resolved through long experience and research. The ideal data entry point is at the data source. There should be no creation and collection of paper from which data is then keyed into a machine. The more people who touch a transaction, the more errors it will have. Therefore, eliminate all middlemen and enter data at its source, and errors are greatly reduced.

Second, which data to collect and display are also issues. The general answer, based on practice, is all data required for business reasons. Data may be expanded to include company specific requirements. Also, data items IS staff think might some day be necessary, but for which users have no current or future business need, should not be collected or displayed.

Last, and most complex, is how human-computer interactions should be structured and presented to ease learning, minimize errors, and facilitate use. Research and theory on physical and cognitive aspects of memory, information processing, pacing of work, color perception, icon perception, and keystroke effectiveness all are used to determine guidelines for interface design. Compared with not applying the research, applying it results in increased productivity and reduced errors. Since the research is so voluminous, it is presented in the context of the chapter.

With all the choices and research recommendations, deciding how to actually design functional screens can be a confusing exercise. In the next sections, practical guidelines from research and practice are developed. Information from the analysis phase is used to define the display requirements of the human interface. The analysis information is used to define a task profile for the application. Then, a profile of users is developed to identify screen requirements that relate to users rather than to functions of the system. The task profile is matched to guidelines for the application type to define and select the general interface as menus, windows, or commands. Application type also suggests functional screens as forms oriented, questions and answers, or direct manipulation. Once the general and functional interfaces have been defined, individual field presentation is defined and formatted for the screen. Finally, extra field characteristics, such as color, are decided and added to the design. Each of these topics is summarized below and addressed in the following sections.

1. Define task profile.
2. Define user profile and application design response.
3. Choose option selection screen type.
4. Choose functional screen type.
5. Design option selection interface.
6. Design functional screen interface format.
7. Choose field format options for normal, abnormal, alert, and alarm data conditions.
8. Design on-line user documentation, error messages, and abnormal processing for all interfaces.
9. Design reports as required.

Develop a Task Profile

Guidelines for Developing a Task Profile

The first activity is to develop a task profile which summarizes work requirements of the application. The level of detail in developing a task profile depends on the type of application being developed. The first task, then, is to classify the application as either transaction, query, DSS, ESS, or process monitoring and control (a special type of TPS). Since transaction processing is the most frequent application type in businesses, it is discussed here. The level of detail and activities for task profile development are summarized in Table 14-1 for the above application types.

For each activity, a hierarchy of processes is defined. This is the basis for screen navigation design. The top activities identified become selection options on a menu. Upon selection, the entries at the second hierarchic level are presented, and so on until a functional work screen is presented. The level of detail for the hierarchy should match the level of processing detail for the application type.
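
The mapping from process hierarchy to menu navigation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Process` class, the helper functions, and the sample menu entries are hypothetical names chosen for the sketch, not part of any methodology or of the ABC analysis.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """One node in the task hierarchy (all names here are illustrative)."""
    name: str
    children: list["Process"] = field(default_factory=list)

    def is_functional_screen(self) -> bool:
        # A process with no subprocesses is elementary: selecting it
        # presents a functional work screen instead of another menu.
        return not self.children


def menu_options(node: Process) -> list[str]:
    """Return the selection list presented after `node` is chosen."""
    return [child.name for child in node.children]


def navigate(root: Process, selections: list[str]) -> Process:
    """Follow a sequence of menu selections down the hierarchy."""
    current = root
    for choice in selections:
        matches = [c for c in current.children if c.name == choice]
        if not matches:
            raise ValueError(f"{choice!r} is not an option under {current.name!r}")
        current = matches[0]
    return current


# Hypothetical two-level hierarchy, for illustration only.
root = Process("Main Menu", [
    Process("Rental Processing", [Process("Rent Video"), Process("Return Video")]),
    Process("Customer Maintenance", [Process("Add Customer")]),
])
```

Calling `menu_options(root)` lists the top-level activities as the first menu; each further selection walks one level down until an elementary process, and therefore a work screen, is reached.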

Next, required and optional data are defined for each task (see Table 14-1). For business applications, following the methodologies discussed in Chapters 7-12, required and optional data for entities should have been defined and documented in the data dictionary. For most business applications, this information can be developed at the entity/relation level rather than the attribute/field level. The idea is to identify multivariate dependencies which, in real-time systems, may need to meet synchronization and timing constraints.

TABLE 14-1 Task Profile Development Activities

| Activity | Transaction | Query | DSS | ESS | Process Control |
|---|---|---|---|---|---|
| Define Task Hierarchies | Process Level | Activity Level | Activity/Process Level | Activity/Process Level | Process Level |
| Define Required/Optional Data | Transaction/Field Level | Entity Level | Entity Level | Entity Level | By input source |
| Define Data Precision | Only if greater than 2 decimal places for numbers | Only if greater than 2 decimal places for numbers | Only if greater than 2 decimal places for numbers | Only if greater than 2 decimal places for numbers | For each field |
| Define Data Source | Process/Transaction Level | Activity Level | Activity/Process Level | Activity Level | Field Level |
| Define Purpose | Entity/Transaction Level | Entity Level | Entity Level | Entity Level | Field Level |
| Define Accuracy | Only if it varies from 100% | Only if it varies from 100% | Only if it varies from 100% | Only if it varies from 100% | Field Level |
| Define Domain | Field Level | Field Level | Field Level | Field Level | Field Level |
| ID Specific Display Criteria | Field Level | Field Level | | | |

If not already defined, precision requirements should be specified, by field, for all numeric fields (Table 14-1). Precision requirements specify the number of decimal places and special display characters required for numeric information. Precision is very important in mathematical, statistical, and process control applications. Precision beyond two decimal places is frequent in business applications dealing with large financial transactions. Banking applications, for instance, frequently require precision to five decimal places for computing interest due and paid. Specific maximum field size, need for sign (e.g., +), and need for debit/credit indicators [e.g., CR or ( )] should all be defined. For text fields, the maximum length should be defined, if not already done. Possible edit characters for numeric and text fields might be blanks, commas, or slashes. These definitions limit the number of data fields on a line while defining specific screen contents.
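
A precision definition of this kind translates directly into a display routine. The following is a minimal sketch, assuming Python's standard `decimal` module; the function name `format_amount` and its `negative_style` option are hypothetical, introduced only to show how decimal places, comma edit characters, and a CR or ( ) indicator might be applied to one numeric field.

```python
from decimal import Decimal, ROUND_HALF_UP


def format_amount(value: Decimal, places: int = 2, negative_style: str = "CR") -> str:
    """Format a numeric field to a fixed number of decimal places.

    negative_style is a hypothetical option for this sketch:
    "CR" appends a credit marker, "()" wraps the value in parentheses,
    and anything else falls back to a leading minus sign.
    """
    quantum = Decimal(10) ** -places                  # places=2 -> Decimal("0.01")
    rounded = value.quantize(quantum, rounding=ROUND_HALF_UP)
    body = f"{abs(rounded):,.{places}f}"              # commas act as edit characters
    if rounded >= 0:
        return body
    if negative_style == "CR":
        return f"{body} CR"
    if negative_style == "()":
        return f"({body})"
    return f"-{body}"
```

A money field would call `format_amount(amount)` with the default two places, while an interest computation of the banking kind mentioned above might pass `places=5`. Using `Decimal` rather than binary floating point is a design choice of the sketch, made so that rounding happens at the stated precision.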

The source of data for each process should be identified next (see Table 14-1). Data can be user-provided through data entry, measured by an instrument, or system-derived through computation. The key to identifying source, if it is unknown, is to determine where users go when they have a question about data on a screen. The answer might define a user, instrument type, or application as the information source. When user data entry is the source, training and help facilities are required to ensure proper entry. Edit checks for entry errors are required. When instrument measures are the source of data, the signal-to-noise ratio should be analyzed to determine the need for filtering devices or software. Fields for which the application is the source are called derived fields, for which data entry is not allowed.
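
One way to record the data source in a task profile is to attach it to each field definition, as in the sketch below. The `Source` and `FieldSpec` names, the sample field names, and the decision to treat measured values as not keyable are assumptions made for illustration; the design needs listed per source paraphrase the paragraph above.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    USER_ENTRY = "user entry"
    MEASURED = "instrument measurement"
    DERIVED = "system computation"


# Design needs per source, paraphrased from the paragraph above.
DESIGN_NEEDS = {
    Source.USER_ENTRY: ["edit checks", "user training", "help facilities"],
    Source.MEASURED: ["signal-to-noise analysis", "filtering if needed"],
    Source.DERIVED: ["no data entry allowed"],
}


@dataclass(frozen=True)
class FieldSpec:
    name: str          # hypothetical field name
    source: Source

    def accepts_entry(self) -> bool:
        # Assumption of this sketch: only user-entered fields are keyed by
        # a person; measured and derived values arrive from an instrument
        # or a computation.
        return self.source is Source.USER_ENTRY

    def design_needs(self) -> list[str]:
        return DESIGN_NEEDS[self.source]


fields = [
    FieldSpec("customer_phone", Source.USER_ENTRY),
    FieldSpec("late_fee", Source.DERIVED),
]
entry_fields = [f.name for f in fields if f.accepts_entry()]
```

Filtering on `accepts_entry()` gives the list of fields that need edit checks and help text, which is the practical use of the source definition during screen design.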

Next, the purpose of every entity or field should be defined, depending on the type of application. Possible choices for purpose are forms completion, information, alert, or alarm. Business applications data purposes are usually forms completion and information. Rarely are data items used to alert or alarm the user. Because alarms are rare in business, entity level checking is sufficient for all but critical applications. For each entity, then, the task profile identifies needs to send alert or alarm signals to the user based on data changes or system process outcomes. For critical and process control applications, each data element should have its purpose defined since the task of process control is to monitor changes in a system and correct any abnormal or undesired processes. Alerts to changed conditions and alarms to abnormal conditions are an integral part of process control interface design.

The accuracy needed for each task and, if it is less than 100%, for the data processed should be assessed. In business applications, this definition should be provided only when it varies from 100%. Typically, variation in business is in query or ESS applications for which ballpark numbers are acceptable for many types of processing.

For instance, a marketing person may want to target a product to one or more specific demographic groups. If the target mailing is 1,000,000 pieces, the marketer needs to know how many groups he needs to meet this goal. A sample based on selection criteria (e.g., age, education level, and zip code) can be used to project the size of the population ±5%. In this case, a 0.1% sample might be sufficient. Rather than read a 20,000,000 record file, only 20,000 records are needed.
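
The arithmetic behind this example is short enough to write out. In the sketch below, the function names and the count of 1,050 matching records are hypothetical; only the 20,000,000-record file and the 0.1% sampling fraction come from the text. The sketch scales a sample count up to the file and does not itself establish the ±5% margin, which depends on the data and on how the sample is drawn.

```python
def sample_size(population: int, fraction: float) -> int:
    """Number of records to read for a given sampling fraction."""
    return round(population * fraction)


def project_count(sample_hits: int, sample_n: int, population: int) -> int:
    """Scale a count observed in the sample up to the whole file."""
    return round(sample_hits / sample_n * population)


FILE_RECORDS = 20_000_000          # file size from the example
n = sample_size(FILE_RECORDS, 0.001)   # 0.1% sample

# Hypothetical result: 1,050 sampled records match the selection criteria.
projected = project_count(1_050, n, FILE_RECORDS)
```

Here `n` is 20,000 records, and the hypothetical 1,050 matches project to 1,050,000 records in the full file, which would meet a 1,000,000-piece mailing target with one demographic group.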

The last two pieces of task information, domain and display criteria, are defined if not already complete. The domain is the set of allowable values for each field. Special display criteria might include translations of data to text (or vice versa), or a special color for some field, and so on.

All of these task characteristics are used to determine the type of interface in system terms, and to determine training needs for users.

ABC Rental Application Task Profile

There is no special complexity in the ABC rental application, so completion of the task profile is relatively simple. We are using the Information Engineering analysis in Chapter 9 as the basis for this discussion. The first action is to create the task hierarchy. Using activity level as the top of each hierarchy, we rearrange the processes as the next level and their subprocesses as the third level, continuing until all processes are elementary (see Figure 14-1). This diagram is the basis for navigation between screens. Each leg of each level on the hierarchy is translated eventually into a menu selection list.

As of the analysis, all data were required for all entities (see Table 14-2). Precision for money fields is two decimal places. All other numeric fields are dates or integers. The source of Customer, Video, and Copy data is user data entry, so extensive edits will be needed in the entry programs to ensure that only correct data enters the system. End of day, Video History, and Customer History are all derived by the system and have no human interaction. The derived relations identify testing requirements for specific verification. The Rental relation is a combi- nation of entered and derived data which identifies both edit and testing requirements.

The purpose of the entities is either forms completion or information with one Rental relation exception. The credit field will be used to deny rental privileges to customers who have a poor credit rating. Some special processing may be desired to highlight bad credit ratings. The possibility of highlighting bad credit rating information should be discussed with Vic and his approval obtained. No decision is made at this time.

Accuracy for all maintenance, rental, return, and query tasks is assumed to be 100% (see Table 14-2). If Vic, while performing ad hoc querying, chooses to sample the data rather than read the entire database, that is okay, but not of interest for this definition.

The domains of each field are in the data dictionary. No special display criteria are identified at this time.

FIGURE 14-1 Process Hierarchy Chart for ABC Rental Application

[Figure not reproduced. Process labels recoverable from the chart: Get Valid Customer, Get Valid Video, Get Return ID, Add Return Date, Get Open Rentals, Check Late Fees, Print Receipt.]

Develop a User Profile

Guidelines for User Profile Development

A user profile is developed to determine the need for special interface design requirements that relate to the user rather than the task. User profile criteria include physical, educational, computer, and task capabilities (see Table 14-3). At the same time the user profile is developed, a matching profile for the application and how it will address the user needs is also developed.

Information in the user profile is obtained from users through interviews, questionnaires, or personnel file searches. If personnel file searches are performed, only average ratings of user skills should be computed unless each employee gives permission to use his or her information. Use of employee records for other than personnel purposes without permission is considered an unethical violation of privacy rights.

TABLE 14-2 ABC Rental Task Profile

| Activity | Transaction | ABC Rental |
|---|---|---|
| Define Task Hierarchies | Process Level | See Figure 14-1 |
| Define Required and Optional Data | Transaction/Field Level | All data required |
| Define Data Precision | Only if greater than 2 decimal places for numbers | None. Dollar amounts have 2 decimal places. |
| Define Data Source | Process/Transaction Level | User Entry and Derived, See Data Dictionary |
| Define Purpose | Entity/Transaction Level | Form Completion, Information |
| Define Accuracy | Only if it varies from 100% | 100% |
| Define Domain | Field Level | See Data Dictionary |
| ID Specific Display Criteria | Field Level | ( ) required for Change field negative values. No other special requirements. |

In critical applications with possible life-threatening consequences, each individual user should be profiled and reviewed for proper skills, computer experience, and task expertise before being assigned to use the new application. Education can take care of some deficiencies in skill levels, but with some critical applications, people may be reassigned to other jobs when their knowledge does not match the application requirements. For noncritical applications, the profile can average user skills for each characteristic. User profile is used to determine sophistication of the interface and training needs.

Physical skills include color perception, typing skill, and physical disabilities that might be present in the user population. Color perception problems mean that reds and greens might not be perceived. If colors are used, users should be screened to ensure that they can recognize the selected colors. Also, color selection should relate to conventional meanings for each color used. For instance, red is the usual alarm-signaling color. In an application using red to signal an alarm condition, then, all users should be screened for their ability to perceive the color red.

Typing is the other typically used physical skill. If user typing skills are low, either the application must be designed not to require typing, or typing training should be provided to users.

Education and math profiles can be either individual or average analyses (see Table 14-3). Education level determines the level of writing required to explain errors. For math-intensive or numerical control applications, specific math skills might also be required of users. When this is the case, math skills needed are defined for each task (e.g., one task might need algebra, one might need the ability to interpret geometric drawings, and so on). Users whose profile does not match the required skill levels are trained or reassigned. Many companies, such as Texas Instruments, Chevron Oil, and others, retrain their employees in math skills needed to manage complex computerized manufacturing equipment.

TABLE 14-3 User Profile and Application Response

| User Characteristic | Description | Application Response |
|---|---|---|
| Physical Skills: | | |
| Color Perception | Red/Green/Blue color perception | Either design application without the problem colors or reassign the users. |
| Typing | Ability in words/minute | Either design the application to fit the skill level or schedule typing training to increase skill level. |
| Disabilities | Sight, hearing, or physical impairment that might change application hardware, software, or interface design | Either design application to accommodate impairments or reassign the users. |
| Educational Skill: | | For both education and math, design application help and training to ensure users can learn and use the application. |
| Education Level | Average or actual level of highest degree | |
| Math Proficiency | Average or actual level of math proficiency | |
| Language: | | |
| Native | All native languages not the same as intended implementation language | International applications should use language native to the region for the application interface. |
| Proficiency with application language | Average or actual level of proficiency | Training and text descriptions in application can be no more difficult than the average level of proficiency. Training should be provided to ensure that all users attain the average level (i.e., the average becomes the minimum). |
| Computer Proficiency: | | Design the application help, messages, and user documentation to ensure understanding of all functions, messages, and menu options. Define training method and requirements. Define level of supervision after training is complete. |
| Average Proficiency | Average or actual level of proficiency in years of experience | |
| Number of packages | Number and type of packages with which users are familiar | |
| Job Characteristics: | | Determine interface option selection type. Determine level of help and location (automated vs. manual and immediate screen message vs. requested help). |
| Turnover | Average % new employees per year | |
| Experience | Average years task experience | |

When the average education and math levels are lower than high school-graduate level, the application interface must be designed as simply as possible. Instructions and text help must be written using sentences under 25 words and use words averaging less than three syllables. Different indexes can be used to compute reading level of text. For instance, the software RightWriter©¹ provides the Kincaid reading grade level (scale of 1-16), Flesch index of readability (scale of 1-10), and a fog index (ratio of nouns and verbs to total words in a sentence) as measures of text difficulty.
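The guideline above (sentences under 25 words, words averaging fewer than three syllables) can be checked mechanically. The following is a minimal Python sketch, not a description of how RightWriter works. The function names are hypothetical, the syllable counter is a rough vowel-group heuristic (an assumption that will miscount some words), and the grade estimate uses the commonly published Flesch-Kincaid grade-level formula.

```python
import re

def count_syllables(word):
    """Rough heuristic (assumption): one syllable per run of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def check_help_text(text, max_words=25, max_avg_syllables=3.0):
    """Check help text against the sentence-length and syllable guideline."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence")
    # Length in words of the longest sentence
    longest = max(len(re.findall(r"[A-Za-z']+", s)) for s in sentences)
    avg_syllables = sum(count_syllables(w) for w in words) / len(words)
    # Flesch-Kincaid grade level: based on words per sentence
    # and syllables per word
    grade = 0.39 * len(words) / len(sentences) + 11.8 * avg_syllables - 15.59
    return {
        "longest_sentence": longest,
        "avg_syllables": avg_syllables,
        "grade": grade,
        "ok": longest < max_words and avg_syllables < max_avg_syllables,
    }

if __name__ == "__main__":
    report = check_help_text("Scan the tape. Then type the amount paid.")
    print(report["ok"], report["longest_sentence"])
```

A check like this is only a screening aid; text that passes can still be unclear, so help messages should also be tried out on representative users.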

Information about native language is important in determining the language of the interface. As globalization of the economy and development of global organizations increase, the need to implement the same system worldwide will become commonplace. When applications are implemented in other countries, the native language should be used as much as possible. From research we know that errors are reduced and some user satisfaction comes from working with applications in one's native tongue. Sometimes, this requirement is government imposed. For instance, in the early 1980s, the King of Saudi Arabia declared that as of 1990 all communications, documentation, and application interfaces used in the kingdom would be in Arabic. This posed a tremendous challenge to every company doing business with Saudi Arabia because Arabic is read right to left, frequently omits vowels, and has as much as 50% of every sentence in a local dialect. At the time of the declaration, there was no one recognized Arabic dictionary for the Arab world. Rather, each country had its scholars map the language for their country. In general, the more critical the application for controlling some potentially catastrophic process, the more important native language processing becomes. I would not like to think of a person who barely speaks English as the controller of a nuclear power plant with all systems and manuals in English!

Next, computer experience is profiled. The average and range of number of years' experience, number of software packages used, type of software (e.g., spreadsheet), and whether the individual develops his or her own software are all important to know. The level of computer experience, coupled with the skill level required of the application, determines the type of training that is most effective. For applications that are complex, critical, or have many variable activities, classroom and hands-on training would be indicated. For applications that are simple and have few activities, classroom, computer-based training, or on-the-job training is sufficient. Assignment of new staff on the job might require close supervision for a period of time to ensure that they possess the skills to use the system properly. Close supervision should be used for all critical applications regardless of complexity or method of training.

¹ RightWriter is a copy-protected product of RightSoft, Inc.

The level of task turnover, the next rating category, determines which of the training methods is actually used. If turnover is low, classroom or computer lab training reaches the most people at once and is the cheapest. If turnover is high, some method of individual training is required. Some alternatives for individual training are on-the-job training, programmed instruction manuals, or computer-based programmed instruction. All can be effective means of training.

Finally, task experience is estimated. If the average user has a high level of task experience, the labels for fields can be more abbreviated, less text is needed to guide data entry, and an expert mode of operation might be preferred. If the average user has a low level of task experience, or experience is variable, novice and expert modes might both be needed.

Task experience and turnover information together determine the mode of interface as novice or expert, and the extent to which on-line help should be provided. Figure 14-2 shows that with low experience levels, novice-only modes are required. With a high experience level, either a mixed mode or an expert-only mode is required.

Figure 14-3 shows that the type of message and extent of on-line assistance also vary with experience and turnover. Low experience with low turnover suggests use of meaningful text error messages with on-line help to elaborate on the error messages. With high turnover, the on-line help should include information on menu options, fields to be completed, and error messages for data entry errors. With high experience levels, the on-line messages can be abbreviated (or eliminated with use of a beep instead of any text message), and with high turnover, supplemented with a paper manual documenting errors and error recovery.
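The two decision matrices described above (Figures 14-2 and 14-3) amount to a lookup keyed on task experience and turnover. A minimal Python sketch follows; the dictionary and function names are hypothetical, and the two-level "low"/"high" rating is the simplification used in the figures.

```python
# Keys are (task_experience, turnover). Values follow Figure 14-2.
INTERFACE_MODE = {
    ("low", "low"): "novice",
    ("low", "high"): "novice",
    ("high", "low"): "expert",
    ("high", "high"): "mixed novice and expert",
}

# Same keys. Values follow Figure 14-3.
ONLINE_ASSISTANCE = {
    ("low", "low"): ["extensive error messages with on-line help"],
    ("high", "low"): ["simple, short error messages only"],
    ("low", "high"): [
        "extensive error, field, and menu prompting",
        "extensive on-line help for all functions and options",
    ],
    ("high", "high"): [
        "simple error messages",
        "paper manual for look-up of help",
    ],
}

def interface_recommendation(task_experience, turnover):
    """Return the interface mode and assistance level for a user profile."""
    key = (task_experience.lower(), turnover.lower())
    if key not in INTERFACE_MODE:
        raise ValueError("ratings must be 'low' or 'high'")
    return {"mode": INTERFACE_MODE[key], "assistance": ONLINE_ASSISTANCE[key]}

if __name__ == "__main__":
    # Example: a profile rated low on experience and high on turnover
    print(interface_recommendation("low", "high"))
```

Keeping the rules in tables rather than nested conditionals makes it easy to review them with the user and to change a cell if the organization's experience differs.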

| | Task Experience: Low | Task Experience: High |
|---|---|---|
| Turnover: Low | Novice Mode | Expert Mode |
| Turnover: High | Novice Mode | Mixed Novice and Expert Modes |

FIGURE 14-2 Turnover and Task Experience Determine Mode of Processing

| | Task Experience: Low | Task Experience: High |
|---|---|---|
| Turnover: Low | Extensive error messages with on-line help | Simple, short error messages only |
| Turnover: High | Extensive error, field, and menu prompting. Extensive on-line help for all functions and options | Simple error messages. Paper manual for look-up of help |

FIGURE 14-3 Turnover and Task Experience Determine Level of On-line Assistance

Last, effective training for the application type, user education level, and experience level can be decided. Training choices include classroom instruction, computer-based training (CBT), or on-the-job training (OJT). Classroom training is the most cost-effective for groups of students. Students can ask questions and receive personalized training while a number of people are being trained simultaneously. The disadvantages of classroom training are high cost and the fact that training cannot be repeated without additional cost.

CBT is most effective for training one or a small number of people simultaneously and at different rates. CBT is self-paced, low pressure, and does not require a senior person to monitor the training. The major disadvantage of CBT is its cost, which is steadily dropping. Much training in business will be computer-based by the year 2000 because, by then, it will be cost-effective for most business uses.

On-the-job training is cheap but requires a senior person to teach trainees. The senior person is assumed to be a good teacher who can explain all necessary variations to someone else. These assumptions may not be valid. If OJT is used, some manager or senior staff person should monitor training and privately correct the teacher if a problem arises.

ABC Rental User Profile

Video stores hire younger people, who are frequently in high school. The turnover is high because it is part-time work with mostly evening hours (prime date time) and because the business is somewhat cyclical in video rental patterns. Since the specific users are not known, the average user is estimated based on the four current ABC employees. The analysis is summarized in Table 14-4.

In the ABC example, current employees have no physical impairments and none are anticipated. Typing skill is expected to be low. No particular prohibitions on color or special equipment will be needed except to compensate for the lack of typing skills.

The application will use a bar code reader, as suggested by Vic, to replace the need to type most information. The bar code reader minimizes the keystrokes required of users. The reader will scan user IDs, if they are used, and video bar codes to enter the information into the computer. If user IDs are not used, the phone (or other ID) number will be typed. Another typed entry is the total amount paid. This should not be too error prone because most people pay in even dollars, receiving change. If the need to enter a few numbers really worries Vic, user ID cards can replace the need to type user IDs or, alternatively, key pads, which are less error prone than typewriter keyboards, could be used.

The average education and math levels of employees are expected to be at the 10th-grade level. This means that algebra is the most abstract level of math skill. The system design criterion is KISS (keep it simple, stupid) so that 10th graders can do the work easily. The math level should be acceptable since the only skills required are to enter the amount paid and to make change.

The language of the employees and the language of the application are both expected to be English.

Task turnover is high and task experience varies from low to high. Vic has one employee who has worked there four years and two who have been there two months. The task experience of the longer-serving employee is significantly greater than that of the other two. While the video rental business is not complex, the two newer employees cannot be expected to perform all functions. The system design criterion in response to high turnover and variable task experience is to provide a simple interface with message help on request for all selections, fields to be completed, and error messages.

Computer experience is also expected to be variable but generally low. Number of years' experience for the three employees ranges from zero to two years. Number of software packages ranges from zero to three. The software used is word processing by two people, and database and spreadsheet by one person. One person wrote his own software.

With little computer experience, high turnover, low task experience, low task complexity, and 10th-grade education, two alternatives are recommended. First, individual, self-paced, computer-based instruction (CBT) is recommended because the students can come in on their own time to train whenever it is convenient. When the store is not busy, they might continue their on-the-job training using the CBT. The method would be to give the person one each of the different transaction types. The person would enter the information and the computer would automatically do all subsequent processing. Then, the person would do several of each type of transaction completely. The system would intercept their entries and prompt them for correction, displaying reasons for the correction when they made errors.

Second, if CBT is too costly, on-the-job training (OJT) with a senior person monitoring and assisting the trainee should be sufficient. If this is the chosen alternative, the trainees should learn rental and return processing first. This can be followed with less important tasks after several days. If OJT is the preferred training method, Vic should monitor the trainer(s) and trainee(s) closely for several days to ensure that the trainers cover all alternatives, pace the instruction to fit the person, and make no assumptions about the trainees' skills.

TABLE 14-4 ABC User Profile and Application Response

| User Characteristic | Description |
|---|---|
| Physical Skills: | |
| Color Perception | No Problems |
| Typing | Less than 15 WPM |
| Disabilities | None |
| Educational Skill: | |
| Education Level | 10th Grade |
| Math Proficiency | Algebra |
| Language: | |
| Native | English/Spanish |
| Proficiency with application language | High |
| Computer Proficiency: | |
| Average Proficiency | Low, 0-3 yrs. Average = 1 Yr. |
| Number of packages | 0-3, Lotus, WP |
| Job Characteristics: | |
| Turnover | 65% per year |
| Experience | Low to High, Average = Low |

Application Response:

- None
- Design to minimize data entry by using bar code reader for Video ID, Copy ID; data to be entered: Customer Phone, Amount Paid
- On-line help
- Needs no special design. Users must be able to make change.
- None unless Vic wants to verify user ability to read all display text
- English will be the implementation language.
- Training in basic computer skills, startup, shutdown, etc., required.
- Use extensive on-line help for all options, entry types, data types, forms fields. Provide expanded on-line help to supplement messages for errors.
- Provide extensive training in all transaction types, beginning with turning on the machine. Monitor performance for first week on the job to ensure that training was sufficient.


Option Selection

Once the user profile is complete, the general form of the human interface is decided by mapping the user and task to the implementation environment. When this activity is complete, all interface recommendations are presented to the user for discussion and decision. Two choices are made from the mapping of user and task to implementation environment. Either or both of the choices may be constrained by particular hardware and software if these are already known. The choices are for general option selection screens and general functional screens. Each of these is summarized in Table 14-5. Each set of alternatives and guidelines is followed by a description of how to apply the information to screen design for ABC rental.

TABLE 14-5 Summary of Interface Choices

| Interface Level | Alternatives |
|---|---|
| Option Selection | Menu, Window, Command Language |
| Functional Screen | Form, Question & Answer, Direct Manipulation |
| Data Presentation | Analog, Binary, Digital, Bar Chart, Column Chart, Point Plot, Pattern Display, Mimic Display, Text, Text Form |
| Screen Item | Color, Size, Type Font, Type Style, Blink |

Option Selection Alternatives

Choices for interface option selection design are menus, command languages, and windows for getting to some functional screen. Menus are lists of items from which a selection is made. Command languages are high-level programming languages that communicate with software to direct its execution. Windows are a form of direct manipulation environment that combines full screen, icon symbols, menus, and point-and-pick devices to simplify the human interface by making it represent a metaphorical desk environment.

In general, menus and windows are novice modes of operation, while command languages are expert modes. Windows are the interface design most recommended because they simulate an office desk and present the most familiar interface to users. The next section presents design guidelines for the selection level of processing. Details of menu and window design are presented in the following sections.

General Option Selection Guidelines

General design guidelines relate to the development of a consistent, standardized interface, consisting of a header, a body, and a footer (Figure 14-4). The screen may include error message lines and command entry lines as well. Many companies have standards for screen design, so much of the work is already complete.

The header section of the screen should contain an identifier of the application, function, date, time, screen ID, and program ID. An example is shown in Figure 14-5.

The body of the screen contains variable information (see Figure 14-4). In hierarchic menu processing applications, the body contains menu selection, forms for completion, graphics output, or graphical monitoring measures. The body of the screen is subject to many other guidelines which are discussed in the next section.

IBM standards also suggest a user message line and an error message line (see Figure 14-6). Defining user commands and error message lines as fixed may take too many lines away from the screen, so these are optional.


FIGURE 14-4 General Screen Design

    Scr001            Company Name Header            MM/DD/YY

FIGURE 14-5 Screen Header Example


    Scr001            Company Name Header            MM/DD/YY
    Task:Main Menu    Task/Menu Header               HH:MM:SS

    xxx001 Error Message
    Command:

FIGURE 14-6 Command and Error Line Examples

The footer screen section contains indicators of navigation choices. Navigation choices should identify which key to select for each allowable movement option. Movement can be within a screen, between screens, or between menus and functional screens. Usually, screen navigation actions are taken by using special keys: escape (ESC), delete (DEL), or programmed function keys (PF or F keys). The allowable actions should be identified at the bottom of the screen in a manner similar to that shown in Figure 14-7. The identifiers should always contain a connector (such as a colon) between the key label and the action label. The action labels should be concise, clear, and consistent across the entire application (see Figure 14-7). Ideally, only actions allowed from the current screen should be shown. Others might be blanked out or muted to indicate that they cannot be chosen here.
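The header, body, and footer layout described above can be sketched for a character-based display. This is a minimal Python illustration under stated assumptions: an 80-column by 24-line screen, a one-line header, and a one-line footer. The function names and sample values are hypothetical, and a real application would follow its own company screen standard.

```python
def header_line(screen_id, title, right_text, width=80):
    """Screen ID at the left, title centered, date or time at the right."""
    line = list(title.center(width))
    line[:len(screen_id)] = screen_id
    line[width - len(right_text):] = right_text
    return "".join(line)

def footer_line(actions, width=80):
    """Format the allowed actions as KEY:Action, with a colon connector."""
    text = "  ".join(f"{key}:{label}" for key, label in actions)
    if len(text) > width:
        raise ValueError("too many actions for one footer line")
    return text.ljust(width)

def render_screen(screen_id, title, date_text, body, actions,
                  width=80, height=24):
    """Return the screen as a list of fixed-width lines."""
    body_rows = height - 2  # one header line and one footer line
    if len(body) > body_rows:
        raise ValueError("body does not fit on the screen")
    lines = [header_line(screen_id, title, date_text, width)]
    lines += [row.ljust(width)[:width] for row in body]
    lines += [" " * width] * (body_rows - len(body))
    lines.append(footer_line(actions, width))
    return lines

if __name__ == "__main__":
    # Only the actions allowed from this screen are passed in
    screen = render_screen("Scr001", "Company Name", "01/31/93",
                           ["Task: Main Menu"],
                           [("F7", "End Trans"), ("ESC", "Quit")])
    print("\n".join(screen))
```

Passing the list of allowed actions per screen is one simple way to show only the navigation choices that are valid from the current screen.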

Menu Standards

The research on menu processing has given us guidelines for location and ordering of menu options. User/SE choices prevail for menu option names and option selection technique. First, based on the number of items on the menu, location is decided. If the number of options is fewer than 10, the items should be centered as a left-justified list of options. If numbers or letters are assigned to the options, they should be right-justified, followed by a period, and two spaces to the left of the corresponding choice (see Figure 14-8).

When the number of options is 10 or greater, you should experiment with different layouts to make the menu simple and easy to use. If the options are all independent, separating sequences of four or five options by blank lines enhances understandability (see Figure 14-9). If list options are interrelated, then experiment with segmenting the screen into different areas with each area containing an area ID and a centered, justified list of options for the area (see Figure 14-10).
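The placement rules above (a centered block of left-justified option names, right-justified option numbers followed by a period and two spaces, and blank lines between groups once the list reaches 10 options) can be sketched as follows. The function name and the group size of four are assumptions made for illustration.

```python
def format_menu(options, width=80, group_size=4):
    """Lay out a numbered menu as a centered, left-justified block."""
    if not options:
        return []
    id_width = len(str(len(options)))  # width of the largest option number
    rows = [f"{number:>{id_width}}.  {name}"
            for number, name in enumerate(options, start=1)]
    block_width = max(len(row) for row in rows)
    indent = " " * max(0, (width - block_width) // 2)
    lines = []
    for index, row in enumerate(rows):
        # For 10 or more options, separate groups with a blank line
        if len(options) >= 10 and index > 0 and index % group_size == 0:
            lines.append("")
        lines.append(indent + row)
    return lines

if __name__ == "__main__":
    print("Customer Maintenance".center(80))
    print()
    print("\n".join(format_menu(["Add", "Delete", "Update", "Query"])))
```

Because every row shares the same indent, the option names line up in a left-justified column while the block as a whole is centered on the screen.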

The options for menu selection are entry of an option ID without cursor movement, point and pick, or entry of an option ID with cursor movement. Either of the first two is recommended, and selection should be based on user preference (see Figure 14-11). The third option requires more keystrokes and is more error prone; therefore, it is not recommended. Option IDs can be alphabetic or numeric; alphabetic options can be the first letter of the option or letters assigned from the alphabet in sequence. Again, there is no one right answer and user preference should prevail. If a point-and-pick device, such as a mouse, is used, no option IDs are required.

    Scr001            Company Name Header            MM/DD/YY

    F7:End Trans   F9:Pg Dn    F11:Sh L       Tab:Nxt Fld
    F8:Pg Up       F10:Sh R    ^Tab:Lst Fld   ESC:Quit

FIGURE 14-7 Screen Footer Example with Function Keys

    Customer Maintenance

    1.  Add
    2.  Delete
    3.  Update
    4.  Query

FIGURE 14-8 Numbered Menu Option List, Less than 10 Choices

     1.  Consultant Assessment
     2.  Consultant Selection
     3.  Applicant Scheduling
     4.  Consultant Maintenance

     5.  Consultant Contract Creation
     6.  Interview Scheduling
     7.  Client Maintenance
     8.  Client Contract Creation

     9.  Query Consultants
    10.  Query Clients
    11.  Client Billing
    12.  Consultant Payment

    13.  Business Trend Analysis
    14.  Accounts Payable
    15.  General Ledger
    16.  Payroll

FIGURE 14-9 Menu Option List, More than 10 Independent Choices

    Rental/Return Processing
        Create
        Update
        Query

    Customer Maintenance
        Add
        Delete
        Update
        Query

    Video Maintenance
        Add
        Delete
        Update Video
        Update Copy
        Query

    Periodic Processing
        End of Day Report
        Startup
        Shutdown
        History Update
        Query

FIGURE 14-10 Menu Option List, More than 10 Interrelated Choices

    Data Entry without Cursor Movement

        1.  Create
        2.  Delete
        3.  Update
        4.  Query

        Enter Selection:

    Cursor Movement and Selection

        Cursor to the Option, Press Return:

        Create
        Delete
        Update
        Query

    Data Entry with Cursor Movement

        Move to Option, Enter Number

        _1. Create
        _2. Delete
        _3. Update
        _4. Query

FIGURE 14-11 Menu Selection Options

In all cases, when entry of a selection option is used, the message requesting the data entry should be centered on the screen, two lines under the last menu item, and should be in this location on all screens. This means that the location of the entry line should be two lines under the longest list in the entire application, and that it is always displayed on that line.

The listing of options within the menu should be based on frequency of choice when point-and-pick selection is used, and should be based on alphabetic order of choices when entry of a selection ID is used. Frequency listing is used for point-and-pick selection because the cursor should be positioned automatically at the most frequent choice (see Figure 14-12). The positioning by frequency of use minimizes keystrokes when moving to other choices. Alphabetic sequence of choices is used when a selection ID is entered, because users can read and understand an alphabetic list faster than a random list (see Figure 14-13). Both alternatives assume a novice user who does not know the options from memory.
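The ordering rule above can be sketched as a small function: options are sorted by frequency of use for point-and-pick selection and alphabetically when a selection ID is entered. The function name, the method labels, and the usage counts below are hypothetical.

```python
def order_options(options, selection_method, frequency=None):
    """Order menu options according to the selection technique."""
    if selection_method == "point_and_pick":
        # Most frequent first, so the cursor can start on the likeliest choice
        counts = frequency or {}
        return sorted(options, key=lambda name: -counts.get(name, 0))
    if selection_method == "enter_id":
        # Alphabetic lists are read faster than random lists
        return sorted(options, key=str.lower)
    raise ValueError("selection_method must be 'point_and_pick' or 'enter_id'")

if __name__ == "__main__":
    usage = {"Add": 50, "Change": 30, "Query": 15, "Delete": 5}  # assumed counts
    print(order_options(["Add", "Change", "Delete", "Query"],
                        "point_and_pick", usage))
    print(order_options(["Query", "Add", "Update", "Delete"], "enter_id"))
```

Usage counts would normally come from observing the task or from transaction logs; until those exist, the user's own estimate of frequency is a reasonable starting point.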

The last issue in menu design is option names. Some authors² recommend specific names even if it means repeating some information (see Figure 14-14). Other authors³ recommend concise but meaningful names with no repetitive information (see Figure 14-15). Combining these guidelines, we can design screens that are easily understood and used. First, the option names should be listed to completely define the process and entity(s) (as in Figure 14-14). Then, any information repeating in all entries should be removed and placed in a header for the menu list (see Figure 14-16). The result is the concise list from Figure 14-15 with a short header providing the additional information from Figure 14-14.
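The combining step described above (list complete names, then move whatever repeats in every entry into a header) can be sketched in a few lines. This hypothetical helper only looks for trailing words shared by all option names, which is an assumption that fits names of the form "process entity"; the designer would still word the final header (for example, "Customer Maintenance").

```python
def factor_menu_names(option_names):
    """Split complete option names into a shared header and concise names."""
    if not option_names:
        return "", []
    split = [name.split() for name in option_names]
    common = []
    # Collect trailing words shared by every name, always leaving
    # at least one word in each option
    while all(len(words) > len(common) + 1 for words in split):
        position = -(len(common) + 1)
        candidate = split[0][position]
        if any(words[position] != candidate for words in split):
            break
        common.insert(0, candidate)
    header = " ".join(common)
    concise = [" ".join(words[:len(words) - len(common)]) for words in split]
    return header, concise

if __name__ == "__main__":
    header, names = factor_menu_names(
        ["Add Customer", "Change Customer", "Query Customer", "Delete Customer"])
    print(header)
    print(names)
```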

To summarize, menu applications should be designed in the context of a standard screen format that is used throughout the application. Menu items should be centered, selection action should be obvious, and minimal information should be in the body of the screen.

² For instance, Banks & Weimer [1992].

³ For instance, Galitz [1981]; Thomas [1982].

    Customer Maintenance

    Add
    Change
    Query
    Delete

FIGURE 14-12 Menu Options Listed By Frequency

    Customer Maintenance

    Add
    Delete
    Query
    Update

    Enter Selection:

FIGURE 14-13 Menu Options Listed Alphabetically

    Cursor to Selection, Press Enter

    Add Customer
    Change Customer
    Query Customer
    Delete Customer

FIGURE 14-14 Complete Menu List

    Cursor to Selection, Press Enter

    Add
    Change
    Query
    Delete

FIGURE 14-15 Concise Menu List

    Customer Maintenance

    Cursor to Selection, Press Enter

    Add
    Change
    Query
    Delete

FIGURE 14-16 Combined Menu List

Window Standards

Windows are rectangular screen areas used to display information. Window displays differ from menu-driven full-screen displays because users can view different, possibly unrelated information at the same time in different windows. For instance, in ABC's rental application, we might be looking for rental information for Sarah Cropley. We can begin a query function, then type, for example, a '?' in the Customer Name field to indicate a look-up. A new window opens and shows customer names. We select Sarah Cropley, the window closes, the name is moved to the first window, and we continue the query. Look-up and selection of information from a window is simpler than a menu system which uses the entire screen for one thing at a time. Because windows are different from menus, they have different guidelines and standards for their use.

A typical window can have the components shown in Figure 14-17. A Close Box stops processing and is similar to an F3 key use defined for a menu. The Title Bar names the window the same as the header line in the header portion of a menu. Location ID and status indicator identify where the user is in the window and whether or not processing is normal. The zoom box and resize box both are used to change window shape. Zoom toggles between the current size and full screen. Resize allows the user to customize the width and height of the window. Scrolling elements (arrows, bars, and boxes) are used to move vertically and horizontally in the window, and are similar to function keys F8-F11 we defined for the menu system. A scroll box is dragged to move a variable distance, while a scroll bar pages up or down depending on where it is touched, and a scroll arrow moves one line at a time. Most window elements are available for use in a windowing application, such as Paradox, but usage is selected by the programmer. All are recommended if the application contains multiscreen forms completion. At least one type of scrolling element for each dimension should be provided.

FIGURE 14-17 Window Components (labeled parts: Close Box, Menu Bar, Window Title, Zoom Box, Body of Window with variable information, Scroll Arrows, Scroll Box, Scroll Bar)


FIGURE 14-18 Window Component Hierarchy

Windows have two basic varieties: tiled and overlapping. Tiled window systems only create nonoverlapping windows. These work best for process control and non-data-intensive applications. When many functions and types of data may be active at once, overlapping windows might be desired. Overlapping windows layer windows as opened, one on another, until the application maximum is reached. To move from one window to another, the user clicks on the edge of the desired window to bring it to the front of the stack.

Windows are defined as hierarchies of objects for management. Figure 14-18 shows the hierarchy for the window components in Figure 14-17. As new windows are opened, a new hierarchy is built. All of the window hierarchies are managed by a screen manager which links all hierarchies.
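The management of overlapping windows described above can be sketched as a stack held by a screen manager, with each window carrying its own list of components. This is an illustrative Python model only; the class names, the default maximum of eight windows, and the window titles are assumptions, not the design of any particular windowing product.

```python
class Window:
    """A window with a title and its component hierarchy (kept as a list)."""
    def __init__(self, title, components=None):
        self.title = title
        self.components = list(components or [])

class ScreenManager:
    """Links all open windows; the last item in the stack is in front."""
    def __init__(self, max_windows=8):
        self.max_windows = max_windows
        self.stack = []

    def open(self, window):
        if len(self.stack) >= self.max_windows:
            raise RuntimeError("application maximum of open windows reached")
        self.stack.append(window)  # a newly opened window is layered on top

    def bring_to_front(self, title):
        """Model a click on the edge of a window lower in the stack."""
        for window in self.stack:
            if window.title == title:
                self.stack.remove(window)
                self.stack.append(window)
                return window
        raise KeyError(title)

    def close_front(self):
        return self.stack.pop()

    def front(self):
        return self.stack[-1] if self.stack else None

if __name__ == "__main__":
    manager = ScreenManager()
    manager.open(Window("Rental Query", ["Title Bar", "Close Box", "Scroll Bar"]))
    manager.open(Window("Customer Look-up", ["Title Bar", "Scroll Bar"]))
    manager.close_front()  # look-up finished, query window is in front again
    print(manager.front().title)
```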

Windows should be set off from each other and from the background by thick, easily recognized borders. Tiled windows should provide blank space, if it is available, between windows. In current windowed systems, the user has little choice about positioning of selected options for title bar and scroll bars, for example, but, if choice is allowed, the design should be consistent in all tasks. One of the best features of the Macintosh environment is that Apple Computer requires any software operating on the Mac to use exactly the same interface definition as the Apple operating system. All software seems familiar before it is even used. Finally, if no other features beyond a window space are used, scrolling to allow viewing of all window-accessible information should be provided.

Window menu styles include horizontal pull-down, Lotus-style horizontal pop-up, and vertical pop-up. Horizontal pull-down menus show the top-level selection choice across the top of the screen, taking the least screen space of all menu options (see Figure 14-19). When a menu is activated by having the cursor moved to its location, the second-level menu is pulled down from the original entry. To make a selection, the cursor is moved to the desired option and activated. Activation is either through a return key or by pressing a mouse button.

FIGURE 14-19 Horizontal Pull-Down Menu Example

Lotus-style horizontal pop-up menus present a second level of options shown as menu items (see Figure 14-20). The main difference between pull-down and pop-up menus is that pop-up selection continues to show the second-level actions, whereas pull-down menus disappear as a selection is made.

FIGURE 14-20 Lotus-Style Horizontal Pop-Up Menu Example

Vertical pop-up menus are long lists that contain a portion of the list in a scrollable window (see Figure 14-21). To select an action not currently showing, the menu is scrolled until the desired action is visible. Then it is activated. Vertical pop-up menus also disappear once an action is activated. In Figure 14-21, the items that would not be showing on the screen are in the gray area.

FIGURE 14-21 Vertical Pop-Up Menu Example

There is no research on the effectiveness of these three types of menus. In general, though, we know from past research that familiarity with the interface type leads to greater satisfaction with the software. Both horizontal pull-down and Lotus-style pop-up screens are familiar to most PC users. Vertical pop-ups remain useful for long lists.

Both pull-down and vertical pop-up menus offer a simple means for providing expert and novice modes of work. Command keys can be defined for specific functions and shown on a menu for optional use (see Figure 14-22). Novices can use the menu without paying attention to the commands, while experts can learn commands as they need them, becoming proficient in some areas and remaining novices in others. This option, plus the office desk metaphor that people easily relate to, makes windowed environments the preferred development screen style.
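The idea of serving novices and experts from the same menu can be sketched as a menu definition in which each item optionally carries a command key. The function names and key assignments below are hypothetical, chosen only to illustrate the two selection paths.

```python
def menu_rows(items, label_width=12):
    """Display rows: label at the left, optional command key at the right."""
    return [f"{label:<{label_width}}{key}" if key else label
            for label, key in items]

def select_action(items, menu_pick=None, key_pressed=None):
    """Novice path: menu_pick is a label chosen from the displayed menu.
    Expert path: key_pressed is a command key typed without opening it."""
    for label, key in items:
        if menu_pick == label:
            return label
        if key_pressed is not None and key_pressed == key:
            return label
    raise KeyError(menu_pick or key_pressed)

if __name__ == "__main__":
    customer_menu = [("Add", "F5"), ("Change", "F6"), ("Query", None)]  # assumed keys
    print("\n".join(menu_rows(customer_menu)))
    print(select_action(customer_menu, menu_pick="Change"))   # novice
    print(select_action(customer_menu, key_pressed="F6"))     # expert
```

Because the command keys appear beside the labels, a user can pick them up gradually for the functions used most often and keep using the menu for the rest.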


ABC Rental Option Selection

The ABC rental application is mostly transaction processing with some query processing. Both windows and menus are recommended for transaction systems, with windowed query development recommended for query applications. Both graphical and digital presentation are recommended. If hardware has not already been chosen, these recommendations imply math and graphic capabilities for the workstations. Standard displays should be sufficient unless Vic wants many graphics, in which case, one display should be high-resolution for graphical use.

FIGURE 14-22 Function Keys on Pull-Down Menu for Expert Use

The key screen design decision is between windows and full-screen menus for selection. There is no one best choice in this decision. When software is chosen before screen design, software sometimes dictates the interface. For instance, mainframe software, for the most part, does not support windows as this text is written. The most advanced screens require a full-screen menu interface. Conversely, some PC software does not support anything but menu bars and windows. To use full-screen menus in this software is cumbersome and costly. User preference for selection tends to be strong and should be the deciding factor.

Assume no software is selected yet. To give Vic an informed choice, we should sketch both window and menu screens and let Vic choose which he likes best. To do both, we have to design the interface to accommodate the application. For windows, the menu bar should include each major entity and/or process. The menu bars and subchoices for ABC rental processing are shown in Figure 14-23. This design might change with software selection, such as dBase IV, so a sample menu bar with subchoices for dBase is also shown as Figure 14-24. Next, a hierarchic menu system is defined for contrast (see Figure 14-25). The hierarchy menus mirror the task hierarchy defined above. One menu is present for each activity and for its successive levels of subactivities until the functional screens are reached.

The recommended design uses windows. Vic selects windows with the Figure 14-23 menus to be used. He dislikes the dBase menu because none of the functions relate to his applications. Finally, Vic requests a 'quick look' at the screens on the computer to confirm his choice.

Functional Screen Design

Functional Screen Design Alternatives

Once all navigation through menus or commands is complete, the functional level of screen is presented for the real work of the application. Functional level screen choices are direct manipulation, question and answer, and form filling. Direct manipulation interactions are those in which the user performs an action directly on some display object. CAD/CAM, CASE, and some computer-based training (CBT) systems have direct manipulation interfaces.

Question and answer (Q&A) interfaces are those in which progressively more focused dialogue takes place based on responses to preceding questions. Artificial intelligence applications and some CBT systems are the most common uses of the Q&A format.

Form-filling interfaces are most common in transaction processing applications but can be used for any application needing to collect discrete, single values for variables. Form-filling interfaces present the user with labels and indicators of where data is to be entered. Users are led through the form completion process by cursor movement and messages from the software.

FIGURE 14-23 Menu Bar for ABC Rental Processing

Functional Screen Design Guidelines

In general, the application type determines the most appropriate functional screen design. Recommended interface designs are shown in Table 14-6 for all application types. Windows are the preferred method of selection presentation because they can be layered to keep track of thinking processes during long selection sequences, and because their pop-up action matches the way people think more closely than menus. Command languages are not preferred for

FIGURE 14-24 dBase IV Menu Bar for ABC Rental Processing


Option 1: All Menu Choices on One Screen

Customer Maintenance

Create Rent/Return Update

Delete Query

Video Maintenance Periodic Processing

Create End of Day

Update History

Delete Update

Query Query Startup Shutdown

Option 2: Individual Menus for Each Level of Choice

SCR01    ABC Video    mmddyy
Rental Processing Application
Main Menu
Move the cursor to your choice, Press Enter
Rental Processing    Customer Maintenance    Video Maintenance    Periodic Processing
F1:Hlp  F3:End

SCR02    ABC Video    mmddyy
Rental Processing Application
Customer Maintenance Menu
Move the cursor to your choice, Press Enter
Create    Update    Delete    Query
F1:Hlp  F3:End  F5:Main

SCR03    ABC Video    mmddyy
Rental Processing Application
Video Maintenance Menu
Move the cursor to your choice, Press Enter
Create    Update    Delete    Query
F1:Hlp  F3:End  F5:Main

SCR04    ABC Video    mmddyy
Rental Processing Application
Periodic Processing Menu
Move the cursor to your choice, Press Enter
End of Day History Update Query Startup Shutdown
F1:Hlp  F3:End  F5:Main

FIGURE 14-25 Hierarchic Menu Set for ABC Rental Processing


TABLE 14-6 Interface Design by Application Type

Application Type

DSS and ESS

Process Monitor/Control

Query

Transaction Processing

Selection

Window Menu Command Language

Window Menu

Window

Command Language

Window

Command Language

Window

Command Language

DSS and ESS because the users of these applications are usually managers who should not be expected to know a command language. DSS and ESS may be used infrequently and the interface should chauffeur and lead the user as much as possible. Command languages are the third choice for all application types because they assume expert level knowledge both of the task and of the computer system doing

Function

Q&A Form
Forms Windows
Analog display
Mimic display for multivalued or multidimensional data
Digital display for specific numbers with symbols, numbers or indicators (e.g., alert)
Command Language
Direct Manipulation
Window
Form
Command Language
Forms

Display

Text short answer is usual display; could also include graphic results.
Graphical: bar, column, point plot
Digital
Need help and cautionary comments for inappropriate output form use.
N/A
Graphical: bar, column, or point plot
Digital
Forms

the task. Ideally, a combination of windows with optional expert commands should be provided.

For transaction applications, forms completion screens are preferred for functional processing. Q&A is much less efficient for transaction applications (TPS) than forms because line-by-line entry takes longer and is fatiguing. Direct manipulation is inappropriate for TPS.

For query applications, all options can be used for selection and query generation. Query generation is the functional processing in a query application. For query generation, windows with query criteria are preferred. For experts, direct command language use is preferred. Query results can use graphical or digital styles of presentation.

DSS and ESS should use a consistent interface until data results are presented. Either window selection with window request formulation or menu selection with form request formulation is recommended. Results screens can combine any graphic and digital presentation styles, although warning messages for inappropriate display selections might be desirable.

Artificial intelligence applications usually result in a Q&A format. Each AI language environment uses its own method. For instance, Turbo Prolog™⁴ uses a combination of windows and command language to initiate processing. A text answer, which may have an associated probability of correctness, is the usual AI output. Some AI language environments also support limited graphical display.

Last, in process control applications, the functional display is the results display. Analog, mimic, and graphical display are all common in process control, sometimes on the same screen. The display usually has a command line at the bottom of the screen. Commands are limited to requesting additional information about a certain measurement or part of the system being monitored, or requesting a different display. The most flexibility and sophistication of design are required in process control applications because they are most likely to be critical in terms of having life-sustaining responsibility.
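The guidelines above can be read as a lookup from application type to preferred functional and display styles. The Python sketch below encodes only what the preceding paragraphs state; the dictionary keys, the field names, and the function name `recommend_interface` are illustrative assumptions and do not come from the text.

```python
# Illustrative lookup built from the prose guidelines above.
# Keys and field names are hypothetical; values paraphrase the text.
RECOMMENDATIONS = {
    "transaction": {
        "function": ["form"],
        "display": ["form"],
    },
    "query": {
        "function": ["window with query criteria", "command language (experts)"],
        "display": ["graphical", "digital"],
    },
    "dss_ess": {
        "function": ["window request formulation", "form request formulation"],
        "display": ["graphical", "digital"],
    },
    "ai": {
        "function": ["question and answer"],
        "display": ["text"],
    },
    "process_control": {
        "function": ["analog", "mimic", "graphical"],
        "display": ["analog", "mimic", "graphical"],
    },
}


def recommend_interface(app_type):
    """Return the recommended function and display styles for an application type."""
    try:
        return RECOMMENDATIONS[app_type]
    except KeyError:
        raise ValueError(f"unknown application type: {app_type!r}") from None


print(recommend_interface("transaction")["function"])
```

A table-driven lookup like this keeps the design rules in one place, so a change in standards touches data rather than program logic.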

ABC Functional Screen Selection

ABC rental processing is a TPS and will use forms for the data entry functions. The forms screens for data entry include rental, return, customer maintenance, video maintenance, periodic, and query selection processing. These screens should not change regardless of which option selection interface is selected. Therefore, they could be designed at the same time the general interface is being decided. In any case, the forms screens should be presented to Vic to get general comments and to correct any design he might dislike before a prototype is built.

4 Turbo Prolog is a trademarked product of Borland International.

Presentation Format Design

Once the general form of the interface is decided, details of display are decided. The first set of choices are for data presentation based on the type of data. The second set of choices are for specific field formats. Presentation format describes the method of displaying data on a screen.

Presentation Format Design Alternatives

The options for presentation format include analog, digital, binary graphic, bar chart, column chart, point plot, pattern display, mimic display, text, and text forms.

ANALOG. Analog displays are for continuously variable data (see Figure 14-26) and are usually used in direct manipulation interfaces. Analog displays use a pointer of some kind to show a position that is analogous to the value the position represents. All analog displays should have a scale, a pointer, a direction indicator of increasing/decreasing measure, and an indicator of normal/abnormal measures. For instance, analog display is effective for showing pressure in pounds per square inch (psi) as a measure of exerted force. Another example from manufacturing is the continuous flow of various densities of oil from a cracking plant, which is effectively conveyed via analog display.

The scale is a numeric indicator of the item measures. A pointer indicates the current position on the scale. Pointers might be arrowheads or needles and may be fixed or moving. The indicator of increasing/decreasing direction is usually a combination of arrows and text to indicate the meaning of direction of pointer movement. Normal and abnormal measures can be indicated by a shaded section of the scale, different colors to scale numbers, a change in color of the pointer, a tone for abnormal measures, or some means of showing expected and unexpected numbers.

[Figure 14-26, first panel: Numeric Scale, Normal Range Indicated, Arrow Moves]

FIGURE 14-26 Examples of Analog Displays

The guidelines for analog displays are summarized as follows:

Display Contents

• Scale to which the measure applies
• Pointer to indicate position on the scale
• Indicator of increasing/decreasing direction
• Normal/abnormal measures indicated

Display Design

• Use conventional user mental model of item
• Use moving pointers on fixed scales
• Use same analog design for all analog measures on display
• Use design method (circular scale or open, partial circle scale) to facilitate user recognition

Usage

• Rate of change
• Range of values for continuous data
• Determine acceptable operation

In general, the most effective displays fit the users' mental model of the measure, use moving pointers on fixed scales, and are consistently designed when more than one analog measure is used. If numeric analog values must be tracked, a semicircular open scale using a fixed pointer with a moving scale allows faster numeric recognition.

[Figure 14-26, second panel: Partial Circle, Numeric Scale, Amount and Direction of Change Shown, Arrow Moves; sample annotation "Change in last hour: +.2"]

Analog displays are best used for monitoring rate of change, monitoring a range of analog values, or for determining ranges of acceptable operation. Examples of rate of change are the flow of oils in a cracking plant or the voltage fluctuation in cables. A monitoring example is a speedometer for speed limit. Pressure gauges in a nuclear power plant or bond ratings selections that must fall within company guidelines are examples of ranges of operation.
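As a rough illustration of the four required elements of an analog display (scale, pointer, direction indicator, and normal/abnormal indicator), the following Python sketch draws a character-based gauge with a moving pointer over a fixed scale. The function name `render_gauge`, its parameters, and the characters used for drawing are assumptions made for this example only.

```python
def render_gauge(value, low, high, normal_low, normal_high, width=40):
    """Draw a fixed scale with a moving pointer and a marked normal range."""
    if not low <= value <= high:
        raise ValueError("value is outside the scale")

    def column(x):
        # Map a measurement onto a character position on the scale.
        return round((x - low) / (high - low) * (width - 1))

    scale = ["-"] * width
    for i in range(column(normal_low), column(normal_high) + 1):
        scale[i] = "="  # '=' marks the normal operating range
    pointer = [" "] * width
    pointer[column(value)] = "v"  # the pointer moves; the scale stays fixed
    status = "normal" if normal_low <= value <= normal_high else "ABNORMAL"
    return "\n".join([
        "".join(pointer),
        "".join(scale),
        # End labels of the scale plus a direction-of-increase indicator.
        f"{low:<{width // 2}}{high:>{width - width // 2}}  increasing ->",
        f"reading: {value} ({status})",
    ])


print(render_gauge(72, 0, 100, 40, 60))
```

The sketch follows the guideline of moving pointers on fixed scales; a fixed pointer with a moving scale would need a different renderer.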

DIGITAL. A digital display is used to convey exact numerical information. Digital displays are most effective when used for variables that have one value at a time. Each value requires a label to identify the data value.

Guidelines for digital data and an example are shown in Table 14-7. In general, only that data of required precision for accuracy should be displayed. Field size should provide for the maximum and minimum values. If data displayed changes frequently, as in a stock trading application, the data should stay on the screen long enough for comprehension, about five seconds, before being changed. If the user is monitoring change, an arrow, plus/minus signs, or other indicator of direction of change might be shown.
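A minimal Python sketch of these digital display guidelines follows. It labels each value, limits the displayed precision, appends a plus or minus sign when the value has changed, and checks the five-second hold time. The names `format_digital`, `may_refresh`, and `MIN_HOLD_SECONDS` are hypothetical, and the sample label and prices are made up.

```python
MIN_HOLD_SECONDS = 5  # the text suggests about five seconds for comprehension


def format_digital(label, value, previous=None, decimals=2):
    """Format one labeled reading, with a sign showing direction of change."""
    text = f"{label}: {value:.{decimals}f}"
    if previous is not None and value != previous:
        text += " +" if value > previous else " -"
    return text


def may_refresh(seconds_on_screen):
    """Allow a new value only after the old one has been readable long enough."""
    return seconds_on_screen >= MIN_HOLD_SECONDS


print(format_digital("XYZ", 101.5, previous=101.25))
```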

BINARY. Binary means having two parts. A binary display shows some graphic to indicate a two-value selection option. Usually, we think of binary items as having on-off, or yes-no, or zero-one values.

Binary interface information can be presented using text or graphics in several ways (see Figure 14-27). The binary item can be displayed in text using the words yes-no or on-off, or with letters 'y', 'n'. A menu can list the option with a check mark to indicate an 'on' condition. A graphical button, or circle, can be used: when the button is empty, the item is not on; when the button is filled in, the item is on.

By itself, binary indicator selection may not be a major decision. It becomes important when used with other information on the screen at the same time. If used in a menu, a check mark, change of color intensity, or change of color can all be used to effectively indicate an 'on' condition without using any extra characters. If used within a line of text, text presentation (e.g., 'y' or 'n') is more effective.

TABLE 14-7 Guidelines and Example of Digital and Binary Data

Display Contents

a. 'Y' or 'N' or other character
b. ○ or ●
c. 'On' or 'Off'
d. 1 or 0 (one or zero) or other numerals
e. ✓ or blank

Display Design

• If text form, use contents a, c, or d above
• If analog display, use b or e
• If in a menu list, use b or e

Usage

• To indicate an item that is 'turned on' or 'turned off'
• To indicate a two position setting

[Table 14-7 also includes an example of a digital time display.]

BAR CHART. A bar chart summarizes numeric data as one or more horizontal bars whose lengths correspond to the values of related variables (see Figure 14-28).

By convention, bar charts show increases in value as the chart is read left to right. Bar charts are effectively used to show task plans over time, percentage of task completion, comparisons of item values (i.e., item 1 value vs. item 2 value), and cyclic data (e.g., product sales over a fixed period). In business applications, bar charts are rarely used on screens with other graphic displays; they are generated by applications as summary output for managers, and can be easily generated on-line by many software packages.

COLUMN CHART. A column chart is a bar chart using vertical bars rather than horizontal ones. Bar charts are most often used when time is a fixed period (or is not relevant). Column charts are most often used when time varies and is shown on the x-axis (across the bottom). For instance, cyclic data is most effective in a bar chart when comparing a fixed period (see Figure 14-29). When comparing cyclic data over periods, a column chart is more effective.

The general rule is to use column charts for multiple time periods, to compare different items on the same scale, or for consistency with cultural conventions which assume a vertical scale (e.g., plotting temperatures, times, revenues, sales).
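The choice between the two chart styles can be sketched in a few lines of Python. The example below picks a bar chart for a single fixed period and a column chart when several periods are compared, then renders a horizontal bar chart as text. The function names and the sales figures are invented for illustration and are not taken from the figures in this chapter.

```python
def choose_chart(period_count):
    """Bar chart for one fixed period, column chart when time varies."""
    return "bar" if period_count <= 1 else "column"


def bar_chart(values, scale=1):
    """Render horizontal bars, one '#' per `scale` units, longest label aligned."""
    width = max(len(name) for name in values)
    return "\n".join(
        f"{name:<{width}} | {'#' * round(amount / scale)} {amount}"
        for name, amount in values.items()
    )


# Made-up figures (in millions), for illustration only.
september = {"Blue Jeans": 15, "Sweaters": 10, "T-Shirts": 5}
print(choose_chart(period_count=1))
print(bar_chart(september))
```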

POINT PLOT. A point plot is a column chart that shows the x-y points on the diagram with or without a line connecting them (see Figure 14-30). Point plots might have trend lines generated to show the direction of change. A band chart is a special type of point plot that plots several variables on the same diagram. Band charts use shaded areas of the diagram to show variable participation. Band charts are most effective for showing cumulative variable participation or percentage of participation of each variable (see Figure 14-31).

Example of Alphabetic Listing Using Y/N Indicators

Name             Sex   Married?   Deceased?
Jones, Sandra    F     N          N
Andrews, Darcy   F     Y          Y
Lane, Bruce      M     Y          Y

Example of Menu List Using ✓ or Blank Indicators

10 Pt.    12 Pt.    ✓ 14 Pt.
Cairo    Helvetica    ✓ New Century    Times Roman

FIGURE 14-27 Examples of Binary Indicators

PATTERN DISPLAY. Pattern recognition is a human strength. When designing displays that are monitored for change in complex systems, patterns are effective. Pattern displays repeat the same graphic several times with identical 'normal' displays (see Figure 14-32a). When a change to one portion of the pattern occurs (see Figure 14-32b), it is easily perceived by users. These are not very common in business applications.

[Figure 14-28: horizontal bars for John, Jane, and Marsha; x-axis: Years of Education, 8 to 20]

FIGURE 14-28 Example of Bar Chart

[Figure 14-29: Bar Chart of Sales Data by Product for Month of September, 1993, with bars for Blue Jeans, Sweaters, and T-Shirts on a scale of 5MM to 15MM; Column Chart of Sales Data for Sept., Oct., and Nov. on a scale of 5MM to 20MM, with a legend for T-Shirts, Sweaters, and Blue Jeans]

FIGURE 14-29 Bar and Column Charts of Sales Data

MIMIC DISPLAY. A mimic display shows a schematic or other replica of a system to allow the user to monitor its functioning (see Figure 14-33). Because mimic displays usually symbolize complex systems, the information presented should be kept to the minimum needed to control, monitor, or obtain information needed. The symbols, spacing, and relative sizes of symbols used in the display should conform to business conventions to convey immediately meaningful information. For example, Figure 14-33 shows an electrical diagram, not a plumbing diagram; therefore, the users should be electricians or electrical engineers.

Mimic displays are best used when a monitoring application requires a view of the whole system. They provide understanding of system component relationships and can be more easily understood than other types of graphics for the same information. Colors can be used to highlight abnormal functioning of components. In business applications, mimic displays are effective for monitoring network components, telecommunication linkages between networks, and even for tracking problems in application interfaces.

NARRATIVE TEXT. Text is verbiage in which words, rather than numbers or symbols, are used to describe the intended information. Text is hard to read, time consuming to understand, and requires a high skill level of the user. Ideally, text is minimized; but some applications require comments or special, noncodable instructions that must be in text format. Some guidelines for text usage are the following:

• Use no more than 60 characters per 80-character line.

• Wrap text as a word processor does. Do not require the user to change lines.

• Use abbreviations common to the work context, and use abbreviations sparingly.

• Allow users to scroll, change paragraphs, and control the text creation process.
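The first two guidelines in the list above can be sketched with Python's standard `textwrap` module, which breaks text at word boundaries so the user never has to change lines by hand. The function name `wrap_narrative` and the sample comment text are assumptions for this example.

```python
import textwrap


def wrap_narrative(text, width=60):
    """Wrap free text at word boundaries, at most `width` characters per line."""
    return textwrap.wrap(text, width=width)


comment = (
    "Customer asked that late fees be waived for the two tapes returned "
    "on Monday because the store was closed over the holiday weekend."
)
for line in wrap_narrative(comment):
    print(line)
```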

TEXT FORMS. One of the major uses of displays in business applications is for data entry that corresponds to a form. Form screens present a series of labeled fields of information for which some information is completed by the user and some information is generated by the application. Forms screens simulate paper forms that they replace or automate. Because forms automate information from paper, the format, sequence, spacing, and information to be completed should mirror that of the analogous paper form.

[Figure 14-30 panels: Connected Line Plot, Sales By Month; Unconnected Line Plot, Sales By Month; Sales By Month By Location; x-axis: months J through D]

FIGURE 14-30 Example of Point Plots

Forms screens should have standard header, instruction, body, and footer information that differs from the general screen format (see Figure 14-34). The different areas should be clearly delineated, grouping information that is related (e.g., the header) or that repeats. The header should contain an application identifier, function identifier, date, time, and a screen/program ID as discussed above. The header may be the same as the general screen header.

The instructions can be in the form of screen text, help availability, or a short description of expected action. As much as possible, the screen should provide intuitive guidance. Instructions should lead the user to supply information to get to the next step.

The body of a form contains the labeled fields to be entered in an easily understood, contextually related format. The footer should provide screen summary totals or other summarizing information. Footers and instructions are optional. The body, then, is the main focus of attention.

[Figure 14-30 panel: Trend Line Shows Average Change; x-axis: months J through D]

The body of the form should be partitioned or windowed to mirror sections of data to be entered (see Figure 14-35). The screen in Figure 14-35 shows a simple Customer Add screen for ABC Video. All information relates to the customer and there is no additional family member information in the application. If additional family members are added to the membership, the Customer Add screen might look like Figure 14-36, which shows two sections, one for general customer information and one for additional family members.

[Figure 14-31: Cumulative Revenue By Product for T-Shirts, Sweaters, and Blue Jeans; y-axis: $25MM to $100MM]

FIGURE 14-31 Example of Band Chart

Each field or group of fields should be clearly labeled to identify the required information. Customer preferences are needed to design identification for some fields. For instance, three variations of name and address information are shown in Figure 14-37; all three conform to different, good design guidelines. The first variation shows each field labeled. The second shows major fields labeled and minor fields with understood labels. The third shows one heading for all fields; this heading minimizes the text on the screen. No one of these is preferred over the others. Rather, the customer should be allowed to choose the preferred design.

a. Normal Pattern Display for 100 Indicators

00000 00000 00000 00000 00000

b. Abnormal Pattern Display for Several of 100 Indicators

00000 00000 00000 00000 00000
0000. 00000 00000 00000 00000
00000 00000 00000 00000 00000
00000 00000 00000 0000. 00000

FIGURE 14-32 Normal (a) and Abnormal (b) Pattern Displays

Labels, and any codes designed as well, should be designed to be familiar, less than five characters long, and include letters and numbers. For instance, Figure 14-38 shows four possible codes for a Customer ID. The first alternative, 913-8041, is a phone number. It is low in recognition for the clerks in the store, but the highest of any choice for the customer. Who doesn't know their own phone number? For that reason, high customer recognition, a phone number is a good choice for Customer ID.

[Figure 14-33 annotation: Connections that can fail are exaggerated to ease monitoring.]

FIGURE 14-33 Mimic Display for Electrical Monitoring System

The second choice, CONG001, is a combination alpha and numeric code. The first four characters are the first four letters of a last name and the last three characters are a sequential number. This is also high in recognition for both customers and clerks. It is less recognizable than a phone number, but a good choice in any case. The next code, 03001, uses '03' to denote 'C' and a sequential number '001' to denote sequence within the Cs. The purely numeric code is cryptic but short. It is less useful than the first two choices.
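The two generated code styles described above can be sketched in Python as follows. The function names `alpha_code` and `numeric_code` are hypothetical, and the sketch assumes last names that begin with an unaccented letter A through Z and sequence numbers from 1 to 999; the text does not say how the application handles other cases.

```python
def alpha_code(last_name, sequence):
    """First four letters of the last name plus a three-digit sequence number."""
    prefix = "".join(ch for ch in last_name.upper() if ch.isalpha())[:4]
    return f"{prefix}{sequence:03d}"


def numeric_code(last_name, sequence):
    """Two-digit alphabet position of the initial plus a three-digit sequence."""
    initial = last_name.strip().upper()[:1]
    if not ("A" <= initial <= "Z"):
        # Assumption: only A-Z initials are supported in this sketch.
        raise ValueError("last name must start with a letter A-Z")
    position = ord(initial) - ord("A") + 1
    return f"{position:02d}{sequence:03d}"


print(alpha_code("Conger", 1))
print(numeric_code("Conger", 1))
```

Names shorter than four letters produce a shorter alphabetic prefix in this sketch; a real application would need a padding rule.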

[Figure 14-34: screen regions from top to bottom: Header, Instructions, Body, Totals, Footer]

FIGURE 14-34 Sections of a Form Screen

Text information, such as names, should always be left-justified. Ideally, text fields should be long enough to provide for the maximum length of the information. This is difficult with names, especially hyphenated names. Each application defines its own maximum; but, in general, over 90% of names in the United States are shorter than 35 characters. If disk storage space is tight, shortening fixed-length text fields is one way to conserve space; another is to define a variable length field that does not store unused spaces.

After the individual labels, fields, and field codes are defined, the next task is to position them on the screen. The design is context related and should group fields that logically go together. From cognitive psychology research we know human brain capacity is limited to holding 5-7 bits of information called 'chunks' in our short-term memories. Short-term memory (STM), also called 'active' memory, is what is in your head while you are thinking. STM is measured in nanoseconds of response time for processing and is analogous to the arithmetic/logic unit (ALU) on a computer where all processing takes place. In designing presentation formats, we try to group items to take advantage of the chunking phenomenon. For instance, in the Customer Add screen in Figure 14-35, address and credit card information form two natural groupings of information that should be on the screen as a group. Another aspect of short-term memory chunking is to position required fields first, followed by optional fields. This placement should allow users to signal completion of data entry without having to tab through or touch unneeded fields.

ScrCM1    ABC Video Rental Processing    12/12/93
          Customer Maintenance           2:30:15

Create a Customer

Name: _____
Address: _____
City: ___  St: _  Zip: __
Credit Card Type: _ (A, V, M)
Credit Card Number: _____
Expiration Date: _/_/_

F1:Hlp  F3:Quit  F5:Undo  F6:End Ent  F7:Save  Tab:Nxt  Shift-Tab:Last  ESC:Del Ent

FIGURE 14-35 Customer Add Screen

ScrCM1    ABC Video Rental Processing    12/12/93
          Customer Maintenance           2:30:15

Create a Customer

Customer Number: aaa999
Name: _____
Address: ____________
City: ___  St: _  Zip: __
Credit Card Type: _ (A, V, M)
Number: _____
Date: _/_/_
Additional Members: First Name    Last (if Different)

F1:Hlp  F3:Quit  F5:Undo  F6:End Ent  F7:Save  Tab:Nxt  Shift-Tab:Last  ESC:Del Ent

FIGURE 14-36 Customer Add Screen with Additional Family Members

We must also account for long-term memory processing in screen design. Long-term memory is what is stored in your brain, similar to disk storage for a computer. Retrieval of stored information uses a schema, or mental model, of what are effectively primary and secondary keys for retrieving information. Retrieval time is measured in 100s of milliseconds or slower. When chunking cannot be done, screen items should be spatially separated to allow users to switch contexts as they move their eyes from one section of the screen to another.

When positioning information on screens, you should also consider possible reusability for screens. For instance, the Customer Add screen above could also be used for delete verification, updating, and individual customer query.

When positioning is complete, each screen should be given a system name that is added to the task hierarchy to relate screens to tasks.

It used to be thought that the shortest possible terminal interaction time was desirable, but this is not true any more. Research shows that we need to pace work so that 'psychic overload' does not occur. Chunking items for data entry that logically go together is one way of pacing work. Another is in pacing the response time for different types of work. Long transactions can take a relatively long time, up to 20 seconds, while short transactions should take a short time, less than five seconds. Keystroke response is a simple, direct interaction and should have immediate response from the computer. A query, request to activate a function, and selection of a menu item are all examples of simple interactions. Examples of complex interactions are a database update, saving a word-processed document, or sending a facsimile transmission of several pages. Delays of up to 20 seconds are acceptable if the user is kept informed on the status of the processing. Some methods of telling the user the system is working are a message, '... Working ...', a clock icon with hand movement synchronized to different percentages of completion, or a whirring sound from the equipment.
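The pacing limits quoted above can be expressed as a small check, sketched here in Python. The constant and function names are hypothetical. The rule that a complex interaction finishing in under five seconds needs no status message is an assumption of this sketch, not a statement from the text.

```python
# Limits in seconds, taken from the figures quoted in the text.
SHORT_LIMIT = 5
LONG_LIMIT = 20


def response_ok(kind, seconds, status_shown=False):
    """Check a measured response time against the pacing guidelines."""
    if kind == "simple":
        # Menu selections, queries, and function requests should be short.
        return seconds < SHORT_LIMIT
    if kind == "complex":
        # Long delays are acceptable only when the user is kept informed.
        # Assumption: under SHORT_LIMIT, no status message is required.
        return seconds <= LONG_LIMIT and (seconds < SHORT_LIMIT or status_shown)
    raise ValueError(f"unknown interaction kind: {kind!r}")


print(response_ok("complex", 15, status_shown=True))
```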

a.) All Fields Labeled

Last Name: ________  First: ______
Address: ________
City: _____  State: __  Zip: _____-____

b.) Major Fields Labeled

Name: ____________
Address: ____________
---------

c.) One Heading, All Fields

Name and Address
---------

FIGURE 14-37 Label Variations for Name and Address Information

Other field definitions for forms relate to character entry and default values. Guidelines for character entry are listed below; examples of field guidelines are shown in Table 14-8.

• Always display keyed information.

• Never require delimiters to be keyed. For instance, in a social security number, provide dashes to split the numeric parts: xxx-xx-xxxx.

• Do not require entry of leading zeros for numeric fields or of following blanks for text fields.

• Make areas of the screen not used for input inaccessible to the user.
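The character entry guidelines above translate directly into small input helpers. The Python sketch below supplies the dashes for a social security number, accepts numbers keyed without leading zeros, and pads text fields so that trailing blanks never have to be keyed. All three function names are assumptions for this example.

```python
def format_ssn(keyed):
    """Insert the dashes so the operator keys digits only."""
    digits = "".join(ch for ch in keyed if ch.isdigit())
    if len(digits) != 9:
        raise ValueError("a social security number has nine digits")
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"


def parse_numeric(keyed):
    """Accept a numeric field keyed without leading zeros."""
    return int(keyed.strip())


def parse_text(keyed, length):
    """Pad or trim a text field so trailing blanks need not be keyed."""
    return keyed.rstrip().ljust(length)[:length]


print(format_ssn("277426639"))
```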

a.) Customer ID is Phone Number

913-8041

b.) Customer ID is Alphanumeric Code

CONG001

c.) Customer ID is Numeric

03001

FIGURE 14-38 Variations for Customer ID Code

Guidelines for default values are:

• Display all defaults before any data entry begins.

• Confirm defaults by tabbing past the field.

• Default replacement should not alter current default value. For instance, if the default date is today's date, and the operator places yesterday's date in the field, the next transaction should still have the default of today's date.
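One way to honor the third guideline is to compute the default afresh for each transaction rather than storing the last keyed value. The Python sketch below illustrates the idea; the class name `DateField` and its methods are hypothetical.

```python
import datetime


class DateField:
    """A field whose default is recomputed for every transaction."""

    def __init__(self, default_factory=datetime.date.today):
        self._default_factory = default_factory

    def new_transaction(self):
        # Display the default before any data entry begins.
        return self._default_factory()

    def resolve(self, keyed=None):
        # Tabbing past the field (keyed is None) confirms the default.
        # A keyed value applies to this transaction only and is not stored.
        return keyed if keyed is not None else self._default_factory()


field = DateField()
yesterday = datetime.date.today() - datetime.timedelta(days=1)
print(field.resolve(yesterday))   # operator overrides the default once
print(field.new_transaction())    # next transaction shows today's date again
```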

ABC Rental Presentation Format

First, design the standard interface for all functional screens in the application. This should include header, date, time, screen ID, and program ID (see Figure 14-39).

Next, design the keys for navigation, error correction, and help and design the footer to identify them and their functions. The standard used here is fairly common. Program keys and their meanings are shown in Figure 14-40.

We need to know when a portion of processing is done, for instance, when returns are complete (F6), and we need to know when the transaction is complete for inputting the total amount paid (F7). The F8-F11 functions are used for retrieval and query processing to browse through multiscreen output (F8-F9) that is longer than 80 characters (F10-F11). The other keys are for changing actions during data entry.

The designations for F1, F3, and F8 through Escape (ESC) are IBM standards that have been followed by many PC applications. The remaining keys, F2 and F4-F7, are open to definition. F2 and F4 are not used here and can be used for future changes. We could have assigned the End Entry type and End Transaction functions to F2 and F4 as easily as to F6 and F7 (see Figure 14-40). F2 and F4 are not used to minimize the probability of hitting the wrong key and canceling a good transaction. If either of these keys is pressed accidentally, it should have no effect.
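A key assignment of this kind is often held in a dispatch table. The Python sketch below is one possible arrangement, with the action names taken from the footer of Figure 14-35 and from the discussion of F6 and F7 above; the table contents, the action strings, and the function name `handle_key` are assumptions, since Figure 14-40 is not reproduced here.

```python
# Hypothetical key table; action names follow the screen footer in Figure 14-35.
KEY_ACTIONS = {
    "F1": "help",
    "F3": "quit",
    "F5": "undo",
    "F6": "end entry",
    "F7": "end transaction (save)",
}

# Keys deliberately left unassigned so an accidental press does nothing.
UNASSIGNED = {"F2", "F4"}


def handle_key(key):
    """Return the action for a function key, or None for an unassigned key."""
    if key in UNASSIGNED:
        return None
    try:
        return KEY_ACTIONS[key]
    except KeyError:
        raise ValueError(f"key not defined in this sketch: {key!r}") from None


print(handle_key("F6"))
print(handle_key("F2"))
```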

Finally, we design the detail form screen for rental/return processing. The periodic processing and customer and video maintenance screens are left as assignments at the end of the chapter. Rental/return processing includes chunks for Customer information, Open Rental information, New Rental information, and Payment information. Corresponding to the chunks of information, the screen can be thought of as having four sections. The middle two sections are identical except that New Rentals cannot have return dates, late fees, or other fees applied. So, we design three different sections. Each section is designed separately, keeping in mind that there are 20 usable lines on the screen and that we want about 75% blank space. For this screen design, we assume a screen size of 24 lines by 80 characters per line.
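The blank-space target can be checked mechanically once a screen is drafted. The Python sketch below counts non-blank characters against the 20 usable lines of 80 characters assumed above; the function name `blank_fraction` and the way blank space is measured (every space character counts as blank) are assumptions of this sketch.

```python
def blank_fraction(lines, usable_lines=20, columns=80):
    """Fraction of the usable screen area that holds no character."""
    if len(lines) > usable_lines:
        raise ValueError("too many lines for the usable screen area")
    if any(len(line) > columns for line in lines):
        raise ValueError("a line is wider than the screen")
    used = sum(len(line.replace(" ", "")) for line in lines)
    return 1 - used / (usable_lines * columns)


draft = [
    "Customer: CONG001",
    "Name: Sue Conger",
]
print(round(blank_fraction(draft), 3))
```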

The sections of screen information should be prioritized for condensation and crowding if it becomes necessary. For ABC rental processing, the priorities are highest to lowest: rentals, payment information, returns, and customer. Since new rentals are generating the payment information, they are most important. Payment information is second because it must be accurate and easily understood for the clerk to handle money properly. Returns are a low priority here because 90% of returns are on time; Customer information is only important for the clerk to verify the customer name. If necessary, the remaining customer information could be condensed onto one line for display.

The first section of the screen is for Customer information. The information to be included is name, address, city, state, zip, phone number, and credit status.

The first issue to be decided is what type of field labels to use. For example, the options for Customer are individual field identifiers, only a Customer identifier, or some combination of the two (see Figure 14-41). To minimize information on the screen, we use only the word Customer (Option 2, Figure 14-41). This also makes sense since the Customer ID probably is to be scanned to minimize data entry and the Customer information is displayed automatically.

TABLE 14-8 Field Format Guidelines

Content

Guideline | Poor Design | Better Design
Do not intersperse letters and numbers | A1B1C1 | ABC001
Use alpha mnemonics that are meaningful, predictable, easy to remember, distinct | ZXCVB001 | Video001
Try not to mix special characters with letters and numbers | User types: $123.45 | Preformatted $___.__; user types: 12345
Break long codes into groups of three and four digits | 277426631 | 277-42-6639
Do not use frequently confused letters in codes | o and 0, 1 and I | Use zero, 0, only; use one, 1, only
Identify maximum number of spaces for item data entry; replace space marker as data is entered | Enter Vid-ID | Enter Vid-ID: _____ ; after three char., Vid-ID: 123

Labels

Guideline | Poor Design | Better Design
Use abbreviations and contractions | Video Identification | Video ID
Try to keep labels less than eight characters long | Customer name and address | Customer:
Design abbreviations to be less than five characters | Ident |
Separate mnemonics by hyphens | VidID | Vid-ID
Place label to left of single occurrence field | Name: Sam Jones | Name: Sam Jones
Place label over column of repeating information | Name: Sam Gerry Leonard Jesus | Name: Sam Gerry Leonard Jesus
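Two of the content guidelines in Table 14-8, preformatted money fields and long codes broken into groups, can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `format_money_entry` and `group_code` are invented here, and the sample values are the ones shown in the table.

```python
def format_money_entry(typed_digits, decimals=2):
    """Place typed digits into a preformatted money field, e.g. '12345' -> '$123.45'."""
    if not (typed_digits.isascii() and typed_digits.isdigit()):
        raise ValueError("Numerics expected.")
    # Pad on the left so there is always at least one digit before the decimal point.
    padded = typed_digits.rjust(decimals + 1, "0")
    return f"${int(padded[:-decimals])}.{padded[-decimals:]}"


def group_code(digits, group_sizes):
    """Break a long code into hyphen-separated groups, e.g. sizes (3, 2, 4)."""
    if len(digits) != sum(group_sizes):
        raise ValueError("Code length does not match the group sizes.")
    groups, start = [], 0
    for size in group_sizes:
        groups.append(digits[start:start + size])
        start += size
    return "-".join(groups)


print(format_money_entry("12345"))         # $123.45
print(group_code("277426639", (3, 2, 4)))  # 277-42-6639
```

The clerk types only digits; the dollar sign, decimal point, and hyphens come from the field format, which is the point of the "Better Design" column.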

The second issue is format of the information. The options in Figure 14-41 all follow a conventional post office address format. The address need not be formatted in that manner, but the high recognizability of addresses in this format is a strong inducement to keep it the same. Unless screen space is a major problem, the post office format will be kept.

Two fields remain: Customer ID and Credit Status. Customer ID is an important field as the identifier of the information and should be positioned in a way that highlights its presence. Conversely, Credit Status is only important when it is the cause of a canceled request. So, Credit Status needs some sort of 'alert' design but, otherwise, can be positioned to conserve space. Several alternatives for Customer ID and Credit Status formats are shown in Figure 14-42. All alternatives are acceptable; the third option is selected because it minimizes labels and has credit in an easy-to-spot location, the upper right corner of the screen.

TABLE 14-8 Field Format Guidelines (Continued)

Error Messages

Guideline | Poor Design | Better Design
Use upper and lower case if possible | ALL UPPER CASE IS DIFFICULT TO READ | Mixed case is preferred to enhance readability.
Only use asterisks in extreme situations | *****This ***** is *****very ***** distracting *****. | *****ALERT***** The database may have been destroyed.
Error IDs should be in a consistent location | PF001 Error 001; Error 002 PF002 | PF001 Error 001; PF002 Error 002
Should be brief | Numerics were expected by the application but you entered some nonnumeric information. | Numerics expected.
Should be positive | You entered an illegal date format. | Enter date format mm/dd/yy
Should be constructive | You idiot! This mistake should NEVER occur. | Reconstruct database and begin again.
Should be specific | Illegal entry or ? | Enter data format mm/dd/yy
Should be comprehensible | FACDB 29081230123 | Database error. Call the DBA at x3456.
Should allow the user to feel as if they control the system rather than the system controlling them | | To undo, press F5.
Provide levels of messages with less detail for error message and more detail for requested help | |
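The error-message guidelines of Table 14-8 (brief, positive, specific, with more detail available on request) can be illustrated with a small date check. This is a minimal sketch, not part of the ABC design: the function name `check_date` and the wording of the longer help text are assumptions made for the example; only the brief message comes from the table.

```python
from datetime import datetime

# Brief, positive message taken from Table 14-8.
BRIEF_MESSAGE = "Enter date format mm/dd/yy"

# Hypothetical longer text, shown only when the user asks for help.
HELP_MESSAGE = (
    "Type the date as a two-digit month, day, and year "
    "separated by slashes, for example 12/02/94."
)


def check_date(text, want_help=False):
    """Return (date, None) for a valid mm/dd/yy entry, else (None, message)."""
    try:
        return datetime.strptime(text, "%m/%d/%y").date(), None
    except ValueError:
        return None, HELP_MESSAGE if want_help else BRIEF_MESSAGE


print(check_date("12/02/94"))    # valid entry: a date and no message
print(check_date("1994-12-02"))  # invalid entry: no date and the brief message
```

The message tells the clerk what to do next rather than what was done wrong, and the detailed text is held back until help is requested.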

The second section of the screen is for Open Rentals information. The information needed on the screen includes Video ID, Copy ID, Description, Rental Prices, Rental Date, Return Date, Late Fees, and Other Fees. By convention, a typical bill, invoice, purchase order, or shipping papers list the item identifier followed by its description. We follow this convention for ABC. Two basic alternatives for fees and dates are shown in Figure 14-43. Since the same line design will be used for the New Rentals screen section, the alternatives as they would display for new rentals are also shown.

Screen ID     mm/dd/yy     ABC Video Rental Processing     hh:mm:ss
Activity Name     Screen Function
Body
Allowable Function Keys

FIGURE 14-39 Standard for ABC Video Functional Screens

Key | Function
F1 | Help
F2 | Not Used
F3 | Quit/No Save
F4 | Not Used
F5 | Undo Last Entry
F6 | End Entry
F7 | End Trans/Save
F8 | Page Forward
F9 | Page Back
F10 | Shift Page Right
F11 | Shift Page Left
DEL | Delete Character
ESC | DEL/Cancel Field
TAB | Go to Next Field
Shift/Tab | Go to Last Field

FIGURE 14-40 Program Keys and Functions

The alternative which is easier to read and understand should be selected. If neither is obviously easier to read, the user should be consulted. The choice here is the first alternative. Keeping the dates together allows fast understanding of a tape's lateness, while keeping the rental information and return information separate allows fast understanding of rental fees owing. Vic has stated that no rentals are made without payment of rental fees, so the second option loses some appeal. The first option is selected then on the basis of keeping like things together: dates with dates and money with money. When returns are processed, the default of today's date should be placed in the Return Date field.

The third section of the screen is for New Rentals information. For this section, we use the Open Rentals line definitions and blank out the fields for return dates, late fees, and other fees (Figure 14-43a). A default of today's date should be placed in the Rental Date field. The only issue is how many tapes a customer should be allowed to rent at any one time. There are arguments for any number one can select and they all are determined by opinion. Therefore, Vic should select the number of allowable tapes out on rent at any one time.

When asked, Vic wants no restrictions at first. Then, he reconsiders. "If I allow unlimited tapes, someone could theoretically give me a stolen credit card as identification, rent many tapes, leave town, and I'm out the tapes. Maybe I should limit the number. But, one or two does not seem enough. What if they are short, like music videos? What if they want to watch movies all day? Why should I stop them? Hmmmm. I think someplace between 10 and 20 is probably okay because most people would never rent that many. My biggest customer is George Anderson and he takes out about six tapes at a time. So, I guess 10 is a reasonable limit."

With ten tapes as the limit, the screen needs no scrolling because all information will fit on one screen. Because this choice turns out to be an important design decision, Vic should be reconsulted and told that scrolling will not be available for rent/return processing. If he chooses to change the number, or asks for scrolling, it should be provided.

a.) Label Each Field

Customer Name: ____________
Address: ____________
City: ______ St: __ Zip: _____-____

b.) Customer Only

Customer:

FIGURE 14-41 Customer Name Screen Options

The fourth section of the screen is for Payment information. For payments, the fields are the Total Amount Due, Total Amount Paid, and Change. These could be on one line, two lines, or three lines as shown in Figure 14-44.

The choices for payment should be first, readability and understandability, and second, space available. For ABC, all information can fit on the screen with three-line spacing and still have room left over. So, the last alternative (Figure 14-44c) is selected as most easily read. The money fields should be right-justified with one set of numbers on the rental/return lines. The title fields should be right-justified for the group of three lines.

a.) Label Each Field, Position on Same Line for Easy Location

Customer ID: ______ Credit: _
Name: ____________
Address: ____________
City: ______ St: __ Zip: _____-____

b.) Label Each Field, Position Separately

Customer ID: ______
Name: ____________
Address: ____________
City: ______ St: __ Zip: _____-____
Credit: _

c.) Minimal Labels, Position on Same Line

Customer:

d.) Minimal Labels, Identify Main Fields

Customer: ______     Credit: _

FIGURE 14-42 Alternatives for Customer ID and Credit Status

Alternative A. Dates First, Fees Second

Video Copy                               Rental   Return   Rent  Late  Other
ID    #    Description                   Date     Date     Fees  Fees  Fees
xxxxx xx   xxxxxxxxxxxxxxxxxxxxxxxxxxxxx 99/99/99 99/99/99 99.99 99.99 99.99
xxxxx xx   xxxxxxxxxxxxxxxxxxxxxxxxxxxxx 99/99/99          99.99

Alternative B. Rental Information First, Return and Extra Fees Second

Video Copy                               Rental   Rent  Return   Late  Other
ID    #    Description                   Date     Fees  Date     Fees  Fees
xxxxx xx   xxxxxxxxxxxxxxxxxxxxxxxxxxxxx 99/99/99 99.99 99/99/99 99.99 99.99
xxxxx xx   xxxxxxxxxxxxxxxxxxxxxxxxxxxxx 99/99/99 99.99

FIGURE 14-43 Alternatives for Dates and Fees

A. One line

Total Due 999.99   Total Paid 999.99   Change 999.99

B. Two lines

(the same three fields arranged on two lines)

C. Three lines

Total Due  999.99
Total Paid 999.99
Change     999.99

FIGURE 14-44 Alternatives for Payment Information

Last, we consider placement of the entire screen in the blank area between the standard screen header and footer. So far we have 24 lines accounted for in the rental screen: two standard header, one screen header, two footer, four customer, ten rent/return, two rent/return header, and three total lines. There are no extra lines on the screen (see Figure 14-45). Ideally, one blank line should separate the header and footer from the body. Also, one blank line is desired to separate the rental/return information from customer information. To provide blank lines, we either delete a header line or change the arrangement of information on the screen. According to our priorities, customer information should be condensed onto fewer lines to gain the blank lines. The Customer ID can be added to the customer name line and given its own label to specifically identify it (Figure 14-46a). This makes reading the Customer ID somewhat more difficult but adds to the readability of the rental information. A better choice is to redesign the standard header and make it two lines, with the second line identifying the function, and to display only the available function keys, using one line. This screen (Figure 14-46b) is preferred and recommended. In the end, Vic should select his preferred screen and it should be the final design. Vic selected the recommended screen for the same reasons that informed its design.
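The line budget for the rental screen can be checked with simple arithmetic. The sketch below is illustrative only: the dictionary keys are names invented for this example, and the second layout is one reading of the recommended redesign (a two-line header that includes the function name, and a one-line function key footer), not a specification taken from the figures.

```python
SCREEN_LINES = 24  # assumed screen size: 24 lines by 80 characters

# Lines used by each part of the first rental screen layout.
alternative_1 = {
    "standard_header": 2,
    "screen_header": 1,
    "customer": 4,
    "rent_return_header": 2,
    "rent_return": 10,
    "totals": 3,
    "footer": 2,
}

# Assumed reading of the recommended redesign: the function name moves into
# the second standard header line and the footer shrinks to one line.
alternative_2 = dict(alternative_1, screen_header=0, footer=1)


def blank_lines(layout, screen_lines=SCREEN_LINES):
    """Return the number of screen lines left over for blank separators."""
    return screen_lines - sum(layout.values())


print(blank_lines(alternative_1))  # 0
print(blank_lines(alternative_2))  # 2
```

With no lines to spare in the first layout, every blank separator has to be paid for by condensing some other section, which is why the priorities set earlier matter.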

Field Format Design

Field Format Alternatives

Field format design selects the characteristics of individual fields or values of fields on a screen. The alternatives for field format design include size, font, style, color, and blink for individual field values, and include coding options for field labels.

SIZE. Size is an issue in field attribute definition when it is selectable. For many software platforms, the size, spacing, and selection of characters is fixed within the application. Size of characters is measured in points. A point is a measure of type that is approximately 1/72 of an inch (about 0.35 mm). In general, the size of characters should be no less than 10 points and no more than 14 points unless an alert or alarm situation is being shown. These sizes are in the range of normal printed point sizes for display processing. An example of the range of point sizes is shown in Figure 14-47.

[Screen mock-up. Header text: "SCRR01", "ABC Video Rental Processing", "12/02/94 02:03:15", "Rent/Return Processing", and "Rentals and Returns". Customer block: "Customer: #xxx999" with name and address lines, and "Cr: x" at the upper right. Ten rental/return lines under the column headings Video ID, Copy #, Description, Rental Date, Return Date, Rent Fees, Late Fees, and Other Fees. Totals: "Total Due: 999.99", "Amount Paid: 999.99", "Change: 999.99". Function key footer: "F1: Hlp F3: Quit F5: Undo F6: End Ent F7: End Trans F8: Pg Up F9: Pg Dn F10: Sh R F11: Sh L Tab: Nxt Fld ^Tab: Lst Fld ESC: Cncl".]

FIGURE 14-45 Alternative 1 for ABC Rental Screen

The default in most applications is 12-point type. As you can see from Figure 14-47, the larger the point size, the fewer characters fit on a screen. At 18 inches, the minimum point size should be about 9 and a comfortable point size is 12. The further away from the screen the user is, the larger the point size should be. At 30 inches, the minimum point size should be 10 points and either 12 or 14 points print size are acceptable. At 10 feet, the size should be about 72 points, or one inch.
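The relationship between points, inches, and viewing distance can be sketched as follows. This is a hedged illustration: the function names are invented here, and treating the three distances quoted in the text as a step lookup (including the 10-foot figure as a minimum) is an assumption made for the example, not a rule stated by the guidelines.

```python
POINTS_PER_INCH = 72  # one point is approximately 1/72 of an inch
MM_PER_INCH = 25.4


def points_to_inches(points):
    """Convert a type size in points to inches."""
    return points / POINTS_PER_INCH


def points_to_mm(points):
    """Convert a type size in points to millimeters."""
    return points_to_inches(points) * MM_PER_INCH


# (viewing distance in inches, point size quoted in the text for that distance)
POINTS_BY_DISTANCE = [(18, 9), (30, 10), (120, 72)]


def minimum_point_size(distance_inches):
    """Return the point size quoted for the nearest listed distance at or beyond the viewer."""
    for listed_distance, points in POINTS_BY_DISTANCE:
        if distance_inches <= listed_distance:
            return points
    raise ValueError("No guidance is given beyond 10 feet.")


print(points_to_inches(72))        # 1.0
print(round(points_to_mm(12), 2))  # 4.23
print(minimum_point_size(24))      # 10
```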

FONT. Most software applications have a fixed default for type font as well as type size. Most applications default to a serif style such as that used in this text. A serif font has been proven easier to read and faster to comprehend than a sans-serif style such as this. If fonts are selectable, the rule of thumb is to select one or, at most, two fonts and use them consistently throughout the application for obvious distinctions. For instance, use one font for all field labels and another font for all information entered by the application user. Do not mix fonts for the same purposes or users will get confused and error rates will increase.

STYLE. Type styles might include regular, bold, italic, outline, reverse video, SMALL CAPS, ALL CAPS, underline, or strike through. While the options make for interesting reading, interchanging the styles on a form to be completed makes it much harder to comprehend and will increase error rates. In general, regular print is acceptable in all applications for text display. For general purpose, noncritical text, regular print is recommended.

a.) Customer ID on Customer Name Line

ID: xxx999 Customer: xxxxxxxxxxxxx xxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxx, xx 99999-9999

b.) Recommended Screen Design

[Screen mock-up. Header text: "SCRR01", "ABC Video Rent/Return Processing", and "12/02/94 02:03:15". Customer block: "Customer: #xxx999" with name and address lines, and "Cr: x" at the upper right. Ten rental/return lines under the column headings Video ID, Copy #, Description, Rental Date, Return Date, Rent Fees, Late Fees, and Other Fees; five lines show return dates and all fees, and five show rent fees only. Totals: "Total Due: 999.99", "Amount Paid: 999.99", "Change: 999.99". Function key footer: "F1: Hlp F3: Quit F5: Undo F6: End Ent F7: End X Tab: Nxt Fld ^Tab: Lst Fld ESC: Cncl".]

FIGURE 14-46 Alternative 2 for ABC Rental Screen

Bold print and reverse video are useful to call attention to a field if it is warranted. For instance, bold type style is effective for alert field values on a monochrome screen. A common use of reverse video is to show cursor position. The character at which the cursor is positioned is shown in reverse video and switches back to normal as soon as the cursor is moved.

Italics and outline are not generally used because they are harder to read and, therefore, increase comprehension time. Strike-through and underline are used mostly in word processing applications and can be effective in that context. For most forms-completion TPS applications, neither of these is recommended. Finally, research studies have shown that use of all capital letters increases comprehension time and they are not recommended.

This is 10 point type.
This is 12 point type.
This is 14 point type.
This is 18 point type.
This is 24 point type.

FIGURE 14-47 Sample Point Sizes

COLOR. Color can be an effective addition to screen design, or it can seriously detract from the understandability and readability of the information. For indicating binary or ternary conditions, color is faster and easier to comprehend than any other coding scheme.

Research provides clear guidance on appropriate and inappropriate uses of color for application displays. Color is most effectively used for search tasks in which the goal is to find one or two objects (of the same color) that differ from surrounding objects. This type of search does not occur often in business applications. Color coding also is effective for:

• unformatted display of information
• symbols which may be within a high density of information on the screen
• tasks in which the position of the item to be identified is not known but the color is
• screens for which color relates to the task
• user tasks involving search and recognition of differences in symbol color

Color is least effective for tasks in which a large number of colors are indiscriminately used, for which colors selected do not differ sufficiently to enable distinction, and for tasks in which the goal is to identify large numbers of objects (of the same color) when surrounded by a large number of objects of other colors. These ineffective color uses result in problems of discrimination. Research findings show that performance deteriorates with more than six colors on a screen. Many writers suggest using no more than four colors at any one time for business tasks.

Research on color selection recommends selection by wavelengths, ensuring sufficient contrast to speed comprehension. For instance, Figure 14-48 shows common colors on a spectrum by wavelength. Poor choices would be blue, blue-green, and green for different meanings on the same screen. Good choices would be red, yellow, and blue, because they are sufficiently different to facilitate understanding.
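The idea of selecting colors by wavelength separation can be sketched as a simple palette check. Everything numeric here is an assumption for illustration: the wavelengths are rough nominal values, the 60 nm minimum separation is a hypothetical threshold, and the limit of four colors follows the suggestion many writers make for business tasks.

```python
from itertools import combinations

# Rough nominal wavelengths in nanometers (illustrative assumptions).
WAVELENGTH_NM = {
    "violet": 420,
    "blue": 460,
    "blue-green": 490,
    "green": 520,
    "yellow": 570,
    "red": 640,
}

MAX_COLORS = 4          # suggested limit for business tasks
MIN_SEPARATION_NM = 60  # hypothetical threshold for "sufficiently different"


def check_palette(colors):
    """Return a list of problems found in a proposed set of screen colors."""
    problems = []
    if len(colors) > MAX_COLORS:
        problems.append(f"too many colors: {len(colors)} > {MAX_COLORS}")
    for first, second in combinations(colors, 2):
        gap = abs(WAVELENGTH_NM[first] - WAVELENGTH_NM[second])
        if gap < MIN_SEPARATION_NM:
            problems.append(f"{first} and {second} are only {gap} nm apart")
    return problems


print(check_palette(["red", "yellow", "blue"]))        # []
print(check_palette(["blue", "blue-green", "green"]))  # two close pairs reported
```

A check like this does not replace user testing for color perception problems; it only flags palettes that are likely to be hard to tell apart.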

Because color blindness and other color perception problems are common, user profiles and user testing should be used to guarantee that all users can recognize all colors on a screen. Bold or odd colors of any type, for example, olive-green, should be avoided.

Common meanings ascribed to colors should be used in the application, and the common meanings which change by culture should be adapted. The government recommends using red only for alert conditions, yellow for warning, and green for normal because that is the common, conventional use for these colors. The use of a flashing red signal should be limited to an emergency condition requiring immediate action.

[Vertical scale of wavelength in nanometers, with ticks at 420, 460, 500, 540, 580, 600, and 640, labeled from shortest to longest wavelength: Violet, Blue, Green, Yellow, Red.]

Adapted from Banks, William W., & Jon Weimer, Effective Computer Display Design. Englewood Cliffs, NJ: Prentice-Hall, 1992, p. 128.

FIGURE 14-48 Color Spectrum

BLINK. Blinking characters or 'flashing' is a useful attention-getting device for monochrome or limited color displays. Blinking is considered more annoying than color codes by most users and should be limited to no more than one field at a time or one meaning at a time. An example of effective flashing would be to flash all data entry fields in error. As errors are corrected, flashing stops.

The flash rate, or speed of blinking, needs to be monitored. The optimal flash rate is 2-3 times per second with equally spaced intervals for on and off. Rates of 8-12 flashes per second, while discernible, can cause nausea and even seizures in people with photo-epilepsy. For those of us over age 30, a phenomenon called flicker fusion causes us to see constant light when the flash rate is very high, over 50 times per second.
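The flash rate ranges above can be expressed as a small check, together with the on and off intervals for a given rate. This is a sketch only; the function names and the label used for rates outside the stated ranges are choices made for the example.

```python
def classify_flash_rate(flashes_per_second):
    """Classify a blink rate against the ranges given in the text."""
    if flashes_per_second <= 0:
        raise ValueError("Flash rate must be positive.")
    if 2 <= flashes_per_second <= 3:
        return "optimal"
    if 8 <= flashes_per_second <= 12:
        return "hazardous"
    if flashes_per_second > 50:
        return "appears constant"
    return "outside stated ranges"


def on_off_seconds(flashes_per_second):
    """Return equal on and off durations, in seconds, for one flash cycle."""
    half_cycle = 1 / flashes_per_second / 2
    return half_cycle, half_cycle


print(classify_flash_rate(2.5))  # optimal
print(classify_flash_rate(10))   # hazardous
print(on_off_seconds(2))         # (0.25, 0.25)
```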

Guidelines for Field Format Design

Assignment of field format characteristics is a judgmental activity based on SE experience and common sense. Follow the tenet 'less is more' in defining field formats that add formatting options. The use of these options diverts attention, causing a delay in the thinking process. If delay and attention shift are not desired, the result will increase error rates and reduce productivity.

Effective uses of color, blink, or audio sound for directing attention should be considered; however, user approval should be obtained before adding formatting changes to the screens.

ABC Field Format Design

One field on a rental screen, credit standing, might be worth highlighting. In addition, when processing takes place, several other items might be highlighted. In particular, data entry errors, insufficient payments, late tapes, and special fees should be considered for use of color, blinking, or bold type. These items are chosen because they represent all of the abnormal conditions that occur during rental processing.

A customer's credit standing is acceptable unless it is specifically changed by Vic during an update process. Since its change requires management action, a customer with a poor rating should probably be denied rental rights. This process has never been discussed with Vic and needs verifying. If he approves, the credit standing for poor ratings only could be displayed as a red or a blinking field to highlight credit status.

Data entry errors can also be highlighted. Since red is being used to signify denial of rental rights, a different color should be chosen. If data entry errors are highlighted, the recommended colors are either yellow or blue to make them distinct from the red used for credit standing.

Insufficient payment occurs when the Change Amount is a negative number. The current design calls for moving the cursor to the payment field which is updated with the new Total Amount Due. Since this is not an expected occurrence, clerks might miss the cursor movement and complete the transaction even though insufficient payment has been made. Some method of highlighting is also desirable to ensure against such mistakes. The recommendation is to blink all money fields and move the cursor to the new Total Amount Due.

Late tapes might cause a justifiable denial of rental rights, but this has also never been discussed with Vic. The number of days that constitutes significant lateness needs to be defined. If monitoring of lateness is desired, a red, blinking value in the rental date field could be used to represent significant lateness.

Last, special fees, which require management update, might also be highlighted and a cause for rental denial. The use of special fees is not well-defined to the project team at this point. Presumably Vic is using special fees for lost or damaged tape assessments. Perhaps if the fees are over a certain amount, to be defined, Vic would want the field highlighted and, unless paid, rentals would be denied. If Vic wants this highlighting, a red, blinking field, consistent with other rental denial fields, would be suggested.

A long conversation with Vic resolves all of these issues. The recommendations for errors, credit problems, and insufficient payments are all accepted. Vic likes the idea of denying rental rights when tapes are over 10 days late. He questioned the use of the same blinking red signal, however, thinking that white blinking might be more effective. The SE explained that if one signal, blinking red, is used for rental denial regardless of reason, it will be more easily learned by the clerks. Vic agrees with the recommendation. He does not want special fees highlighted, nor does he want rental denied. He is using special fees for the two purposes described, but he also is using it for tape purchases with money still owing, a usage never before defined.
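The highlighting decisions reached with Vic can be summarized as a small rule table. The sketch below is one possible reading, not the ABC implementation: the class and field names are invented, blinking red is used for both rental denial conditions, and yellow is picked for data entry errors from the two colors the text recommends.

```python
from dataclasses import dataclass

LATE_DENIAL_DAYS = 10  # rentals are denied when tapes are over 10 days late


@dataclass
class RentalState:
    credit_ok: bool        # False when Vic has set a poor credit rating
    entry_errors: bool     # True when any data entry field is in error
    total_due: float
    amount_paid: float
    max_days_late: int     # lateness of the customer's most overdue tape


def highlights(state):
    """Map abnormal conditions to the display attribute for the affected fields."""
    result = {}
    if not state.credit_ok:
        result["credit_status"] = "red blink"
    if state.max_days_late > LATE_DENIAL_DAYS:
        result["rental_date"] = "red blink"
    if state.entry_errors:
        result["fields_in_error"] = "yellow"
    if state.amount_paid - state.total_due < 0:  # negative Change Amount
        result["money_fields"] = "blink"
    return result


normal = RentalState(True, False, 7.50, 10.00, 0)
problem = RentalState(False, False, 7.50, 5.00, 12)
print(highlights(normal))   # {}
print(highlights(problem))  # credit, rental date, and money fields are flagged
```

Special fees are deliberately absent from the rules, since Vic does not want them highlighted.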

Design of Report Output

In many companies, formal reports are no longer produced from application systems. Instead, users are provided with a query language and told to develop ad hoc reports as they are needed. When formal reports are required, they usually are based on queries of the same information. The guidelines for reports, then, are similar to those for screens.

1. Design a standard header and footer and be consistent in the general format on all reports.

2. Keep report body as close to query screens as possible.

3. If query screens are not present for the specified reports, follow the design guidelines for screens. Define clearly identifiable areas for grouping information that is related or that repeats. Follow reasoning for individual fields on a report that parallels the reasoning used for screen design.

The ABC rental receipt is shown in Figure 14-49 as an example of a report that follows the design of its related screen. Notice that while the receipt has a header, it is preprinted and differs from that of the screen. Preprinted information is most effective when it is printed in some unobtrusive color, such as turquoise, which users can ignore when they become familiar with the report format.

[Receipt layout. Preprinted header: "ABC Video Rental, 5930 Preston Rd., Atlanta, Ga. 30303" with the date as MM/DD/YY. "Customer Information:" block with customer ID, name, and address. Rental/return lines under the column headings Video ID, Copy #, Description, Rental Date, Return Date, Rent Fees, Late Fees, and Other Fees. Totals: "Total Fees Due: 999.99", "Total Paid: 999.99", "Change: (99.99)". Signature line: "Accepted By: ________".]

FIGURE 14-49 ABC Rental Receipt

CONVERSION

Conversion of applications is a systems analysis and design in miniature. The activity is only concerned with transforming data from its current format and storage media into a new application's format and storage media. Conversion is usually concurrent with design and done as a side activity by a small group of one to three people who report to the PM and work with the DBA to define and populate the new database environment. The activities of conversion are:

1. Identify current and future locations for all data items.

2. Define edit and validate criteria for all attributes.

3. Define data conversion activities.

4. Define options for data conversion.

5. Recommend and gain approval for data conversion strategy.

6. Develop a schedule for data conversion based on estimates of time to convert one data item.

7. Define options for application conversion and implementation.

8. Recommend and gain approval for implementation strategy.

9. Develop a schedule for application implementation.
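The schedule estimate in activity 6 is simple arithmetic once the time to convert one data item is known. The following sketch shows the calculation; the function name, parameters, and sample figures are all hypothetical and are not taken from the ABC case.

```python
import math


def conversion_days(item_count, minutes_per_item, staff, hours_per_day):
    """Estimate working days needed to convert all items, rounded up."""
    total_hours = item_count * minutes_per_item / 60
    return math.ceil(total_hours / (staff * hours_per_day))


# Hypothetical example: 5,000 items at 2 minutes each, two clerks,
# six productive hours per clerk per day.
print(conversion_days(5000, 2.0, 2, 6.0))  # 14
```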

Identify Current and Future Data Locations

The first task is to identify the data being converted. A matrix listing every relation with its attributes/fields is developed. Then, in one column, the present location of each attribute is identified. An automated data field entry has the current file, relative address in the logical record, length, type characters, and current data name. A manual field entry identifies the data source and person responsible for data accuracy.

A third column is created to identify specific conversion errors if they are known.
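The conversion matrix described above can be represented as one record per attribute. This is a minimal sketch under stated assumptions: the class names, field names, and the two sample rows are invented for illustration and do not come from the ABC data dictionary.

```python
from dataclasses import dataclass


@dataclass
class AutomatedLocation:
    file: str
    relative_address: int  # position in the logical record
    length: int
    char_type: str
    data_name: str


@dataclass
class ManualLocation:
    data_source: str
    responsible_person: str


@dataclass
class MatrixRow:
    relation: str
    attribute: str
    current: object        # AutomatedLocation or ManualLocation
    known_errors: str = ""


# Hypothetical sample rows.
matrix = [
    MatrixRow("Customer", "Phone",
              ManualLocation("customer card file", "store clerk"),
              "some cards lack area codes"),
    MatrixRow("Video", "Title",
              AutomatedLocation("VIDEO.DAT", 6, 30, "alphanumeric", "VID-TITLE")),
]


def manual_attributes(rows):
    """List the attributes that must be keyed in because they are not automated."""
    return [f"{row.relation}.{row.attribute}"
            for row in rows if isinstance(row.current, ManualLocation)]


print(manual_attributes(matrix))  # ['Customer.Phone']
```

Keeping the matrix in a structured form makes it easy to count how much of the conversion is manual, which feeds directly into the schedule estimate.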

Define Attribute Edit and Validate Criteria

For attributes that are simply being moved from one location to a new location, the edit and validate criteria should already be defined in a data dictionary. If this information is not already defined, the conversion team defines and documents necessary edit and validate criteria.

When attributes are being encoded to use a shortened storage format, the encoding scheme must have been defined. If a coding scheme is not already defined, the conversion team works with the design team to define and document the encode-decode scheme.

Define Data Conversion Activities and Timing

Three major issues relate to data conversion. First is the automation status, either automated or manual; second is data accuracy and reliability; third is the ease of mapping from the old data storage technique to the new data storage technique (see Figure 14-50).

To the extent that data are already automated, clean, and simply mapped from the old to the new data storage technique, conversion is simple. When data are manual, inaccurate, or not easily mapped, conversion is difficult. When data are all three (manual, inaccurate, and not easily mapped), conversion becomes a critical task that may define the critical path for the application development.
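The three conversion issues combine into eight cases, which can be sketched as a lookup. The outcome wording follows the leaves of Figure 14-50, but the assignment of each outcome to a particular combination of answers is an assumption: it takes the leaves in order from best case to worst case, branching first on automation, then on accuracy, then on ease of mapping.

```python
# Keys are (automated, accurate, easy_mapping); the pairing of outcomes
# with keys is an assumed reading of the decision tree.
OUTCOMES = {
    (True, True, True): "Best case: simple programmable conversion",
    (True, True, False): "Difficult programmable conversion",
    (True, False, True): "Significant edit/validate with clerical clean-up, but still programmable",
    (True, False, False): "Significant edit/validate with clerical clean-up, difficult to program",
    (False, True, True): "Create program with all edit/validate, high clerical support",
    (False, True, False): "Multiple create/merge programs, difficult verification, high clerical support",
    (False, False, True): "Create program with all edit/validate, high clerical support",
    (False, False, False): "Worst case: multiple create/merge programs, difficult verification, "
                           "high clerical and management support",
}


def conversion_approach(automated, accurate, easy_mapping):
    """Return the expected conversion approach for one set of data."""
    return OUTCOMES[(bool(automated), bool(accurate), bool(easy_mapping))]


print(conversion_approach(True, True, True))
print(conversion_approach(False, False, False))
```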

Manual data that must be automated require extensive edit and validation criteria in the data entry program to prevent bad data from getting into the database. Data that are not easily mapped may have no simple way for conversion staff to verify accuracy of processing; therefore, testing and test verification with user assistance become critical tasks in determining data conversion success.

Data that are inaccurate require several things. First, the conversion team must define what the possible correct data values are. Second, the conversion team and user must define the mapping from incorrect values to correct values. Then, any new values that might change the mapping from old to new storage technique must be reviewed with the systems design team to ensure that the application design is still valid. Third, an army of clerks must be hired to correct the errors. This means that special training for data correction is required. Fourth, training for the new application must address the data inaccuracies, the new values, and their interpretation for all current data users.

Data that have combinations of problems require multiple skills of conversion team members and complicate the conversion process. Data conversion planning should be complete early in the design stage. The planners should know which types of these problems are present and how the conversion team is planning to minimize their impact.

Select and Plan an Application Conversion Strategy

The methods of conversion are direct cutover and gradual conversion. Both methods may or may not be supplemented by continuing parallel execution of the old application to allow comparison of results and verification of processing.

Direct cutover means that on the set day, the old way of work is abandoned and the new way begins to be used. This is a risky method since few applications work perfectly the first time. There is no way to compare results and verify correctness of the new processing.

Gradual, or incremental, cutover is a conversion approach in which the new application is implemented in some piecemeal form. The implementations may be geographic, functional, iterative, or some combination of these. Geographic conversion is an approach in which the entire application is implemented in each location, one location at a time. The application that is used to account for pay telephones in the United States, COIN, has several different versions in operation across the country at a time. As a new version is implemented, one of the locations volunteers to be the first to use it. It is implemented in that one geographic location for six months. Then another location is added. After another six months, a third location is added. The timed geographic technique keeps the lives of the implementers relatively stable and allows the distributed companies using the software to choose their own implementation times.

[Decision tree with eight outcomes, listed from best case to worst case:]

Best Case. Simple programmable conversion
Difficult programmable conversion
Significant edit/validate with clerical clean-up, but still programmable
Significant edit/validate with clerical clean-up, difficult to program
Use create program with all edit/validate, high clerical support
Use multiple create/merge programs, difficult verification, high clerical support
Use create program with all edit/validate, high clerical support
Worst Case. Use multiple create/merge programs, difficult verification, high clerical and management support

FIGURE 14-50 Decision Tree on Ease of Data Conversion

Functional conversion has three variations. First, work functions can be cut over one at a time to the new application. This is a local version of the geographic conversion method. Second, incremental software development can place specific work functions into production use as soon as they are tested. Third, small numbers of transactions or one type of transaction might be implemented first using transaction conversion. Then, as the users gain experience and the application stabilizes, more transactions are cut over until all are in production. In the first variation, the entire application is implemented in one department or group at a time. In the second, pieces of the application are implemented one at a time, and may be in production company-wide or by group. In the transaction variation, the whole application is complete, but it is implemented piecemeal by transaction type.
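The transaction variation can be pictured as a routing rule: each transaction type is handled by the new application only once that type has been cut over. The sketch below is a minimal illustration under that assumption; the transaction names and the functions route and cut_over_next are hypothetical.

```python
# Transaction types already cut over to the new application (hypothetical names).
CUT_OVER = {"rental"}


def route(transaction_type, cut_over=CUT_OVER):
    """Return which method of work should process a transaction of this type."""
    return "new" if transaction_type in cut_over else "old"


def cut_over_next(transaction_type, cut_over):
    """Return a new set with one more transaction type moved into production."""
    return set(cut_over) | {transaction_type}
```

As users gain experience, cut_over_next is applied for each additional transaction type until every type routes to the new application.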

When a new application changes the old method of work, or when a specific problem is highlighted during feasibility or analysis for immediate implementation, some form of functional, incremental conversion is useful. Both of these circumstances occur in large business applications. Small applications may not have enough functionality to allow iterative conversion, requiring the complete application to be placed into production at one time.

Gradual conversions cannot always be done. When the new application is automating a previously manual process, gradual conversion may be difficult unless unrelated transactions can be identified. When this occurs, the project team should develop a final test using live data that parallels daily production and can, therefore, be checked for accuracy.

Parallel conversion means that the new and old methods of work, including any applications work, are both done every day for some period, usually one or two cycles of processing. Parallel conversions only work if the new application produces the same outputs as the old application and has comparable formulae and processing on the data. In the parallel method, the people using the application would do their jobs in the new way and follow it by doing the work in the old way with the same data. That is, the same information is processed twice. If the formulae, processing, or outputs are very different, parallel processing might not work. Parallel conversion is also difficult when the number of people doing the work is insufficient for processing the double volume of work. Then, if parallel conversion is desired, some gradual method should be coupled with parallel execution.
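Because parallel conversion depends on the two methods producing the same outputs, the daily check amounts to comparing results transaction by transaction. The following is a minimal sketch, assuming each run's outputs can be keyed by a transaction ID; the function name and the data layout are illustrative assumptions.

```python
def compare_parallel_run(old_outputs, new_outputs):
    """Compare outputs keyed by transaction ID from the old and new methods.

    Returns a list of (transaction_id, old_value, new_value) for every
    transaction whose results differ or that is missing from one run
    (a missing result is reported as None).
    """
    mismatches = []
    for txn_id in sorted(set(old_outputs) | set(new_outputs)):
        old_value = old_outputs.get(txn_id)
        new_value = new_outputs.get(txn_id)
        if old_value != new_value:
            mismatches.append((txn_id, old_value, new_value))
    return mismatches
```

An empty result for one or two processing cycles is the kind of evidence that would support ending the parallel period.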

ABC Conversion Strategy

Conversion in ABC is from a totally manual to a totally automated application. This means that the planning for conversion should follow the need for data. Each relation is examined individually to determine its criticality for processing on the first day of Rental/Return use (see Table 14-9).

TABLE 14-9 ABC Rental/Return Data Relations and Conversion

Relation | Status | Priority
Rental/Return | Derived from Processing | 0
Customer | Manual/Clean | 1
Video | Manual; clean if known | 2
Copy | Manual; need a count | 3
Customer History | Derived from Processing | 0
Video History | Derived from Processing | 0
End of Day | Derived from Processing | 0

Of the seven relations in the application, four (i.e., Rentals, Customer History, Video History, and End of Day) are derived from processing and need no conversion. The other three (Customer, Video, and Copy) are manual and needed the first day of operation. All could have the same priority because the application cannot be tested without all three relations. The customer relation is given the highest priority because it has accurate data from the card file, and therefore, should be more easily converted. Another reason for choosing the customer relation first is that if it turns out to be error-ridden, the other two files can be assumed to be as bad or worse. Customers tend to overestimate the quality of their data, and errors become known when the method of processing changes.
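The priorities in Table 14-9 can be read as a conversion order: relations with priority 0 are derived and skipped, and the rest are converted in ascending priority. A small sketch of that reading follows; the tuple layout and function name are assumptions made for illustration.

```python
# Table 14-9 as data: (relation, status, priority); priority 0 means derived.
RELATIONS = [
    ("Rental/Return", "derived", 0),
    ("Customer", "manual", 1),
    ("Video", "manual", 2),
    ("Copy", "manual", 3),
    ("Customer History", "derived", 0),
    ("Video History", "derived", 0),
    ("End of Day", "derived", 0),
]


def conversion_order(relations):
    """Return the names of relations needing conversion, highest priority first."""
    to_convert = [r for r in relations if r[2] > 0]
    return [name for name, _status, _priority in sorted(to_convert, key=lambda r: r[2])]
```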

The strategy, then, is to convert the customer file from the existing card files, followed by the video and copy information. The next issue is who is to do the data entry. The clerks might enter Customer information during nonbusy work hours or could be hired for extra hours of work. The estimate of conversion for customer information is approximately 70 hours (4 minutes * 1,000 customers / 60 minutes in an hour). This assumes four minutes of data entry time for each of 1,000 customers. The ideal solution is to hire clerks for extra work so their entire attention is only on conversion at the time. This speeds the process and minimizes errors that might occur from interruptions during the work day.
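The customer estimate is a straightforward rate calculation, shown here as a short sketch. The function name is an assumption; the four-minute and 1,000-customer figures are the ones given in the text.

```python
def data_entry_hours(minutes_per_record, record_count):
    """Estimate data entry effort in hours from a per-record time in minutes."""
    return minutes_per_record * record_count / 60


customer_hours = data_entry_hours(4, 1000)
print(round(customer_hours, 1))  # 66.7, which the text rounds up to about 70 hours
```

The same function can be reused for other relations by substituting their per-record time and record count.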


One alternative for doing the data conversion is to hire the current staff to work more hours. If three ABC clerks each worked two extra hours each day, and all work a five-day week, the customer conversion would take between ten days and two weeks. This alternative is attractive because the current clerks know the data. The disadvantage of this alternative is that the clerks don't type and the four minute estimate might be very low for them. Another disadvantage is that because the clerks' typing skills are low, name and address errors, which are very difficult to identify via computer, might get into the file.

A better alternative is to hire one or more experienced data entry people from a temporary agency. The cost is not too high, $10-14/hour, and their accuracy will be greater. For an experienced typist, the four minutes is probably a high estimate.

The next relations to be converted are Video and Copy. One issue in this conversion is the high amount of time for bar coding each copy of a video. Assignment of bar codes affects database design. Alternatives are to use the bar code to identify each tape uniquely and duplicate video information in the copy relations, or identify each video with a portion of the bar code and identify each copy by a unique sequence number within bar code. The preferred solution from a data perspective is to generate one Video ID bar code that is the same for all copies of a tape. Database storage and typing time are minimized, and retrievals will be faster. This solution is recommended. The only advantage to the other alternative is that no sorting of the physical inventory is required. The disadvantage of the alternative that uses a unique bar code for each tape is that video information is replicated a number of times, thus increasing the time for data entry, error rates, and retrieval time.

The related issue in video-copy conversion is the physical inventory identification of all copies of each video for entry into the application. The scheme we chose of one Video ID bar code for all copies of the same tape makes data entry easy but makes the physical work more difficult. The people doing this work must sort all of the tapes by video, assign the Video ID, and generate and affix the bar codes to each copy. Last, each copy's bar code must be entered into the system. Since we chose one Video ID bar code for all copies, we can enter the video information and a count of copies and have the application generate all Copy relations. Part of the change procedure for a video, then, must include changing the number of copies. Increasing the number poses no problems. Decreasing the number means that a check for outstanding or past rentals must be made and, if any exist for a copy number to be removed, that number may not be removed. These maintenance requirements should be discussed with the design team to ensure that they treat video processing in this way.
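The copy-generation and copy-count rules described above can be sketched as follows. This is an illustration only: it assumes copies are numbered 1 through the count within a Video ID and that a decrease removes the highest-numbered copies, neither of which the text specifies. The function names are hypothetical.

```python
def generate_copies(video_id, count):
    """Create one (video_id, sequence) pair for each copy, numbered 1..count."""
    return [(video_id, seq) for seq in range(1, count + 1)]


def change_copy_count(video_id, copies, new_count, rented_or_history):
    """Return the copy list after changing the number of copies.

    `rented_or_history` is the set of (video_id, sequence) pairs that have
    outstanding or past rentals; such copies may not be removed.
    """
    if new_count >= len(copies):
        # Increasing the number poses no problems: append new sequence numbers.
        start = len(copies) + 1
        return copies + [(video_id, seq) for seq in range(start, new_count + 1)]
    to_remove = copies[new_count:]
    blocked = [c for c in to_remove if c in rented_or_history]
    if blocked:
        raise ValueError(f"copies with rentals cannot be removed: {blocked}")
    return copies[:new_count]
```

Raising an error on a blocked decrease is one possible design; the design team might instead prefer to mark such copies inactive.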

The last issue to decide about data conversion is who should do the video and copy conversion data entry. The estimated time for a complete physical inventory is about 28 hours. This number assumes six seconds of inspection time per tape for 10,000 tapes, plus four seconds overhead for extra movement of tapes to make room for the sorted ones (i.e., 10 seconds * 10,000 tapes / 60 seconds per minute / 60 minutes per hour = approximately 28 hours). This includes sorting the tapes by title alphabetically and keeping them in that order until the data are completely entered. Tapes out on loan must be included in each day's conversion process to ensure 100% conversion coverage. Once the tapes are in sequence, the clerks putting tapes back into inventory are assumed to alphabetize them automatically, adding no extra time to the conversion.

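The data entry, because of the coding scheme defined, should take only about two minutes per entry for a total time of about 33 hours. This total corresponds to about 1,000 entries (i.e., 2 * 1,000/60), because each video is entered once with a count of its copies; entering all 10,000 tapes individually at two minutes each would take about 333 hours (i.e., 2 * 10,000/60). The total conversion time for the ABC rental/return application is about 120 hours, or about three weeks.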

Again, the clerks, who know the inventory best, could be hired for extra hours to work on conversion sorting and data entry, or Vic might hire outside workers to come in daily for 8-10 hours for several days.

If Vic wants his current clerical staff to use otherwise idle time, the amount of time for conversion is 120 hours divided by the number of idle hours per day. If the three clerks are idle a total of six hours per day, the conversion will take approximately 20 days. This is a long period of time and usually, the longer conversions continue, the greater the likelihood of errors. The recommended approach is to hire temporary data entry clerks to sort the tapes, assign bar codes, and enter the data into the system.
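The elapsed-time comparison between staffing alternatives is a single division, sketched below. The function name is an assumption, and the eight-hour day used for the temporary workers is an assumed figure for illustration; the 120 hours and six idle hours per day come from the text.

```python
import math


def conversion_days(total_hours, hours_per_day):
    """Working days needed to finish the conversion, rounded up to whole days."""
    return math.ceil(total_hours / hours_per_day)


# Three clerks with a combined six idle hours per day.
print(conversion_days(120, 6))      # 20

# Two temporary workers, assuming an eight-hour day each.
print(conversion_days(120, 2 * 8))  # 8
```

Under that assumption, two full-time temporaries finish well inside a two-week period, which leaves some allowance for estimates that prove too low.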

The alternatives and recommendations are presented to Vic for his approval. He chooses to hire two temporaries for two weeks to work full-time on converting all data. His rationale is that he really wants his clerks to concentrate on customers, and he decides they can help with the physical inventory sort in their spare time. The remainder of the time they should be working at helping customers. If videos are missed during the inventory sort, they will be found as they are rented and their information will be entered into the application then.

USER DOCUMENTATION

Mix of On-Line and Manual Documentation

User documentation is important because it is usually the first information about an application that new employees are given. Therefore, it should be developed and maintained to disclose accurate usage information about an application. User documentation is started after analysis and can be a parallel activity to design. Some researchers and practitioners recommend developing the user documentation before design begins. The application is then designed to meet the requirements of the user documentation.

Frequently, users develop the manual documentation and define what they would like for on-line help and messages. At the least, users should participate in developing user documentation. The arguments for having users develop their own documentation are:

• Users are less likely to assume knowledge that SEs take for granted (e.g., how to start an application).

• Users know what to do better than SEs.

• Users who develop their own documentation require less training because they already know how the system will work.


With complete novices who have never used a computer system, having them develop the user manuals is NOT a good idea.

Contents of the user documentation vary with each project and company. In general, the writing style should not be patronizing, but should take the users' general level of computer expertise into consideration. This means that documentation written for experts can be concise, use jargon, and have less explanatory information about how to get started. Documentation written for novices should begin at an elementary level, for example, "The button to turn on the machine is located .... "

An outline for general contents of user documentation is provided in Table 14-10. First, any document should contain a table of contents. A system overview describing the scope of processing is next. Assumed level of user and expected system-user interactions should be included in the overview. Diagrams should be frequent and 'understood by your mother.' Also in the overview, include information about whom to call for help and what kind of help they offer. For instance, Operations provides assistance if the terminal malfunctions, or the Information Center assists in developing ad hoc queries.

Describe the hardware, software, and at a very high level, how the equipment is connected. This is especially important when LANs, distributed applications, or PCs hooked to mainframes are being used and some functions are local and some remote. Be specific about what work is performed in what location and how to determine problems.

Next, describe the general formats for screens and functions. Begin the details of system operation with startup and shutdown, including security information, without documenting security codes! Describe all function keys and what they do.

Then, for each screen in the application, present the screen and the required/optional entries made by the operator. Be specific about the type of data to be provided. Present an example of a correct screen and of an incorrect screen with error messages. Sequence this information by logical groupings of activities. For instance, for ABC, there would be four functional description sections: rental/return, customer maintenance, video maintenance, and periodic processing. For each screen, describe normal, error, optional, and required processing.

TABLE 14-10 User Documentation Contents

Introduction
    Application Overview
    Special Features
    Format of Document
    Support Group Services, Contacts

General System Information
    Obtaining a User ID
    Starting the Machine
    Shutting the Machine Down

System Access Procedures
    Logon Procedures
    Logoff Procedures

General Data Entry Information
    Menus and Menu Selection with examples of all screens
    Data Entry Screen Format with one example screen
    Function Key Assignments

Rent/Return Procedures
Customer Maintenance Procedures
Video Maintenance Procedures
Periodic Processing Procedures
Backup/Recovery Procedures
Error Recovery Procedures
Error Messages

For each section:
    List screen(s)
    Required entries
    Optional entries
    Procedure for screen completion

Include backup and recovery information if the user is expected to perform those activities. Be specific about what actions are performed and the sequence of actions. If recovery must be activated from a specific terminal, for instance, begin the instructions with something like the following. "At Terminal 011, located on the 2nd floor of 235 West Covina in the southwest corner, and labeled 'MAIN OPERATOR TERMINAL,' enter the following."

In an appendix, provide a list of all error messages, by message ID with a detailed description of how to correct the error. Format the appendix to correspond to the sequence of functional sections in the body of the report.
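One way to keep the error message appendix aligned with the body of the manual is to sort messages first by functional section, in the order the sections appear, and then by message ID. The sketch below assumes the four ABC functional sections named earlier; the message IDs, message texts, and the function name are hypothetical.

```python
# Functional sections in the order they appear in the body of the manual.
SECTION_ORDER = [
    "Rent/Return",
    "Customer Maintenance",
    "Video Maintenance",
    "Periodic Processing",
]


def appendix_order(messages):
    """Sort (message_id, section, correction) tuples by section, then message ID."""
    rank = {section: i for i, section in enumerate(SECTION_ORDER)}
    return sorted(messages, key=lambda m: (rank[m[1]], m[0]))


# Hypothetical messages, listed out of order.
messages = [
    ("VM002", "Video Maintenance", "Re-enter the Video ID."),
    ("RR010", "Rent/Return", "Scan the copy bar code again."),
    ("CM001", "Customer Maintenance", "Enter a phone number."),
    ("RR003", "Rent/Return", "Enter a valid customer ID."),
]
```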

AUTOMATED SUPPORT FOR FORGOTTEN ACTIVITIES

Many products are available to support the activities in this chapter. For screen design, screen 'painters' and application generators both provide screen design. Screen painters are forms-oriented design tools that allow fast prototyping and layout of screens that then generate coded descriptions of the screens. A user identifies that screen design is desired; if the relation is described in the tool, the fields can be listed to provide screen design guidance, and the user 'paints' the screen by placing labels and field names on the screen in the target location. When complete, the screen can be called up to allow printing and viewing of the screen as it would be presented to the data entry clerk. Screen painters can be stand-alone software packages but are more frequently a function of CASE environments.

A second type of software support for screen design is available in application generator software. The screens for menus are designed first with menu entries typed in by the software user. Then as functional screens are reached, the program code to generate the requisite screen interaction (e.g., SQL) is coded. If custom form design for data entry is required, some packages include that activity, too; others require the designer to generate the code within the package.

Conversion software support is mostly in the form of utility programs that allow easy reformatting of data to move from a current automated file to one or more new files. Merging of information from two sources to create new composite files is sometimes provided but requires more complex software coding.

Manual-to-automated data conversion ideally uses the application code for data creation to further test it and increase estimations of reliability. Several application generator packages, for example, Focus™ (footnote 5), provide automatic screen generation with no underlying edit or validation for 'quick and dirty' data entry. This is useful in prototyping and demonstrating prototypes, but should not be used for the production application. Focus generates the screen by sequentially listing the fields as defined in the database. As a line fills up with data, a new line is generated. This automatic screen utility only works on files with no repeating information and cannot join files for combined data entry.
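The sequential layout rule described above (list fields in definition order and start a new line when the current one fills) can be illustrated with a short sketch. This is not how Focus itself is implemented; the function, the field list format, and the label style are assumptions made only to show the idea.

```python
def generate_screen(fields, line_width=80):
    """Lay out (name, length) fields in definition order, wrapping full lines.

    Each field is shown as 'name: ____' with one underscore per character.
    """
    lines, current = [], ""
    for name, length in fields:
        item = f"{name}: {'_' * length}"
        candidate = item if not current else f"{current}  {item}"
        if current and len(candidate) > line_width:
            lines.append(current)  # line is full; start a new one
            current = item
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


# Hypothetical customer fields laid out on a 40-column screen.
for line in generate_screen([("Name", 10), ("Phone", 10), ("City", 8)], line_width=40):
    print(line)
```

A layout like this carries no edits or validation, which is why such generated screens suit prototypes rather than production data entry.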

Help packages are now plentiful in the marketplace. Help used to be totally manual and all messages had to be in the user documentation. As help has moved to become an on-line function, more messages are documented on-line than in manuals. The advantage of a Help package that is independent of specific software is that it, and its messages, can be used across applications and software environments. This cross-application use can help ensure that definitions are consistent throughout the company and can make data administration standards compliance easier to monitor.

The automated packages supporting the screen design, conversion, and help processing are summarized in Table 14-11.

SUMMARY

In this chapter, human interface, conversion, and user documentation were discussed as three required activities during analysis and design that are omitted from many methodology discussions.

Human interface design focuses on screen interactions between users and the application. Using a task profile and user profile to guide the design process, first the option selection method is chosen. The alternatives for option selection are menus, windows, or command languages. Then, the presentation format(s) most effective for the data to be displayed are decided. Presentation formats include analog, digital, text, text form, bar chart, column chart, point plot, pattern, and mimic displays. Within the presentation format, each screen item's characteristics of size, type font, style, color, and blink rate are defined. In designing forms, decisions about the chunks of data to be presented and formatting of chunks on the screen are required.

5 Focus is a product of Information Builders, Inc., New York.

Conversion alternatives are direct conversion or incremental conversion. Incremental conversion may be geographic or functional (by transaction, by department function, or by application function). Direct conversion has the highest risk of failure because the old method disappears at conversion; therefore, when an incremental alternative is available, the incremental approach is usually recommended. The type of incremental conversion selected is determined by the context of the application.

Reports are designed following the same general guidelines as those of screens. Whenever a report presents information that is also displayed on a screen, both screen and report should use the same format.

User documentation is an important introduction to an application for many new employees. As such, it should be easy to read, oriented toward the education and computer experience level of the reader, and should include all information for normal and abnormal processing of an application. Lists of contacts for different types of problems should be identified.


TABLE 14-11 Automated Support for Interface Design, Conversion, and On-Line Documentation

Product | Company | Technique
APS Dev. Center | Sage SW, Rockville, MD | Screen/Form/Report Painters
Deft | Deft, Ontario, Canada | Form/Report Painter
Easytrieve | Ribek, Inc., Tacoma Park, MD | Data Conversion Utility
Focus | Information Builders, Inc., New York, NY | Prototyper; Screen Generator; Application Generator
Foundation | Arthur Andersen & Co., Chicago, IL | Prototype Generation; Screen Design; Version Control
IEF | Texas Instruments, Dallas, TX | Dialog Flow; Screen Design
IEW, ADW (PS/2 Version) | Knowledgeware, Atlanta, GA | Screen Design
PacBase | CGI Systems, Inc., Pearl River, NY | Screen Flow
Teamwork | Cadre Technologies Inc., Providence, RI | Screen Painter
Telon and other products | Pansophic Systems, Inc., Lisle, IL | Screen/Report Layout
Visible Analyst | Visible Systems Corp., Newton, MA | Screen Painter/Prototyper


KEY TERMS

analog display; band chart; bar chart; binary; binary display; body of form; body of screen; classroom instruction; close box; column chart; command language; computer-based training (CBT); derived field; digital display; direct cutover; direct manipulation; direction indicator; field format; flash rate; flicker fusion; footer screen section; form screen; functional conversion; geographical conversion; gradual cutover; header screen section; horizontal pull-down menu; incremental cutover; incremental software development; location ID; long-term memory; Lotus-style horizontal pop-up menu; menu; mimic display; normal/abnormal measures; on-the-job training (OJT); option selection; overlapping windows; parallel conversion; parallel execution; pattern display; paint; point; point plot; pointer; precision requirements; question & answer format; resize box; scale; screen painter; scroll arrow; scroll bar; scroll box; scrolling elements; short-term memory; status indicator; task profile; text; tiled windows; title bar; transaction conversion; user profile; vertical pop-up menu; window; zoom box

EXERCISES

1. Complete the screen design for Customer and Video data entry for ABC Video. For video data entry, keep in mind how conversion defines the add function to automatically provide for Copy relation creation. Specifically, identify reused portions of screens or whole screens for different functions. Discuss why complete reuse of Create Video screens is not possible for Video Update processing.

2. For the CCD Medicaid case described in Appendix A, design windowed menus for the application. Design the screen for Patient Information Creation. How much scrolling is necessary? What colors, type, style, font, and so forth, do you recommend for each field?

STUDY QUESTIONS

1. Define the following terms: analog display; field format; flash rate; form; horizontal pull-down menu; OJT; scrolling elements; task profile; user profile.

2. Why is the data source the best location at which data should be entered into automated applications?

3. Why should screen design guidelines be followed?

4. Describe a task profile and how it is used in the application development screen design and conversion.

5. When should individual users be profiled and when can average user information be used?

6. Describe how novice/expert modes of operation should be determined.

7. Describe how extent and type of on-line messages and help are defined.

8. Describe the option selection choices and how you decide which to use.

9. Why is command language use by itself rare?

10. What is a screen window and why are they popular?

11. How many scrolling options are available? What is the minimum scrolling that should be provided in an application?

12. What are the differences between tiled and overlapped windows?

13. Why should function keys be consistent?

14. Describe general screen design contents.

15. What is direct manipulation interface?

16. What application types use forms as the most common functional screen design?

17. List and define five data presentation alternatives. For each alternative, describe one possible business application use.

18. When are bar and column chart use recommended?

19. How are fields positioned on a screen? On a line?

20. Why are short-term memory (STM) and long-term memory (LTM) important in screen design?

21. When is color effective in screen design? How many colors should be used on screens at any one time?

22. How can type font be varied for effective screen design?

23. What are three options for incremental conversion? How do you choose which to use?

24. Discuss issues in data conversion.

25. Why should users do user documentation? Why should application developers do user documentation?

26. Discuss how contents of user documentation can be varied to match user skills and computer expertise.

* EXTRA-CREDIT QUESTION

1. Define a poorly designed menu and functional screens for ABC Customer Maintenance. Use at least 10 bad design elements. Then, fix the design problems and define effective screens for the same function. Describe the guidelines followed in defining each element of the good screens. Write a paragraph discussing the kind of errors that users might make from using the poorly designed screens.

PART IV IMPLEMENTATION AND MAINTENANCE

The five chapters in this section discuss implementation and maintenance issues. An application is never completed until it is retired. After analysis and design, we must be able to implement the design on computer hardware using computer software or our work is useless. The first three chapters in this section relate to implementation issues: selecting a computer language; evaluating and selecting hardware, software packages, or consulting services; and testing/quality assurance of the finished product.

Chapter 15 defines characteristics of languages, to allow us to distinguish between ten languages that are evaluated. Then, the languages are matched to the application types discussed in Chapter 1 and to the methodologies discussed in Chapters 7-12. Language selection, rather than code structure, is emphasized because of the increased use of computer-aided software engineering (CASE) tools to generate code. The language selected must be able to support the application requirements. In Chapter 15, we first describe identifying characteristics of languages. Then, the implementation of each characteristic is described for ten languages. Based on the language characteristics, we define the types of applications for which each language is best suited.

Similarly, outsourcing and use of software packages are growing in all industries because it is frequently cheaper to buy rather than build an application and/or its environment. In Chapter 16, we discuss the evaluation process and highlight the types and alternatives for soliciting bids from vendors. Sections and contents of a request for proposal (RFP) are defined and developed for the ABC case to show what they look like. Hardware, software, and consulting services might all be contracted for in the same request, or could individually be the subject of RFPs. Examples of RFP expectation criteria for each type of work are provided to give a sense of the level of detail to which work is defined in an RFP. Then, vendor proposal evaluation alternatives are defined and discussed in relation to ABC Video's application.

Regardless of the development product (packaged software, generated CASE code, or manually programmed code), proving that the software works by testing it at various levels of detail and aggregation is required. Chapter 17 defines the different strategies for testing and types of testing performed. Test types are matched to strategies to develop an effective overall strategy for testing applications. For each level of testing, key issues in test case development are identified. Based on research on testing errors found, guidelines for deciding when to stop testing at each level are provided. The ABC case is then analyzed to demonstrate how the theories apply in practice.

The last two chapters relate to change. Chapter 18 discusses application change management, which takes place throughout the life of a project. Change is a way of life in computing and application development is no exception. In Chapter 18, we first discuss how to design for reusability by using templates and reusable modules. Then, change management techniques that apply to documents, decisions, software, and application configurations are presented. The automated tools section includes software representative of each type of change management.

Documentation for project work can be thousands of pages long. Since errors in code usually begin to be traced through documentation, it is important to identify changes to facilitate the error tracing process. Also, users and maintenance personnel who might only infrequently review documentation should be directed to the new information rather than having to read entire documents each time. Techniques for identifying changes easily are described in Chapter 18.

Similarly, application decisions might provide a useful trace of the considerations and discarded ideas throughout a project's life. Few project teams keep such a decision trace because, historically, to do so meant maintenance of more thousands of pages of paper. With automated decision support and sophisticated word processing, keeping a record of decision history is now feasible and can be useful in organizations with rapidly changing management or on projects that support business functions that are subject to rapid industry change.

Software changes and application configuration management are the other major topics of Chapter 18. A recent buzzword identifies software reengineering, also called reverse engineering, as the backward design of undocumented programs and applications that were probably built without the team having followed a methodology to guide the work. Also called spaghetti code, such applications can be maintained beyond a useful life. In the chapter, we describe how to decide when to reverse engineer, reengineer, or retire applications and/or individual programs. Once the decision is made to maintain software, management of the software maintenance process is an important task in determining that the correct configuration of modules, functions, programs, and so on, is in production. The issue of configuration management is more complicated when multiple versions of software, such as DOS and MVS versions, exist. Techniques and management practices for configuration management are described in the chapter.

Finally, your career is important and requires management by you for your working life. It is difficult to plan a career without having a sense of what opportunities and expectations are available. First, the typical job levels and types of jobs found in businesses are described. Then, one way to plan a career, by thinking through your wants and requirements for technical work, job, company, geography, and opportunities for advancement, is developed. A method for defining your chances of job success is presented next. Trends of IS jobs over the last five years by geography, salary, and industry are discussed. Part of developing yourself into a professional and having a career is to maintain your professional status. Techniques for maintaining professional status and building on knowledge areas, including education, professional association membership, accreditation, and reading, are all defined, with suggested approaches to applying the information to your own situation.

CHAPTER 15

CHOOSING AN IMPLEMENTATION LANGUAGE

INTRODUCTION

In this chapter, we discuss the selection of a language for implementing an application. Programming is the process of designing and describing an algorithm to solve a class of problems. As any programmer knows, any activity can be programmed in any language ... just not necessarily as effectively or completely in each language. When working on an application, we do not always have a choice of the language we use. But with the selection of the wrong language, we constantly compromise the requirements to fit the constraints of the language. In this chapter, we discuss characteristics of languages and how to select a programming language based on requirements of an application so that, if there is a choice to be made, an appropriate language can be selected. The activity of programming is not discussed in this text because, with CASE environments and tools, much program code is automatically generated.

First, the characteristics of languages are defined. Then 10 computer languages (SQL, Focus, BASIC, COBOL, Fortran, C, Pascal, Ada, PROLOG, and Smalltalk) are evaluated according to the characteristics. These languages represent the major programming paradigms, including procedural (Fortran, COBOL, BASIC, Pascal), object orientation (Smalltalk, Ada), declarative processing (SQL, PROLOG), fourth-generation languages (SQL, Focus), and expert systems (PROLOG). They also represent the most popular languages in use in business organizations today and in the years to come. Then, languages are matched to different types of applications and methodologies. Finally, automated support for programming is discussed. First, we develop the characteristics that distinguish between languages.

CHARACTERISTICS OF LANGUAGES

To differentiate languages, we must evaluate how each language deals with data definition and processing, mathematical and logical processing, control, conditional, array, input/output, and subprogram processing, in addition to a nontechnical assessment of each language's ease of use, portability, and maintainability. Finally, available automated development aids such as CASE and code generators are noted.

Data Types

Each language supports some data types. A data type is a language-fixed definition of data. All languages support variables and constants for numeric and character data. The universally supported data types are integers, real numbers, and character strings. Examples of each are shown in Figure 15-1. Integers are whole numbers such as one, two, or three. Real numbers include positive and negative continuous numbers, including all decimals. Character strings are any legal combination of alphanumeric characters.

Data Type           Example
Integers            1, 2, 3
Real                -1.01, 3.21
Character/String    Abc12;'.

FIGURE 15-1 Examples of Universal Data Types

Fewer languages support one or more of logical, Boolean, pointer, object, bit, date, or user-defined data types. Logical data types are notation providing for nonnumeric comparison, including and, or, and not processing (see Figure 15-2 for examples). The comparison operators used in logical data types also include all variations of equality and inequality operators.

Data Type    Example
Logical      And, Or, Not, <, >, =, ≤, ≥, ≠
Boolean      True, False
Pointers     16F26 (where 16F26 is a valid memory address)
Object       Customer=12346, Add, Change, Delete, Inquire
Bit          0, 1
Date         022893

FIGURE 15-2 Examples of Nonuniversal Data Types

Boolean operators generate binary true/false indicators based on some logical comparison (see Figure 15-2). Pointers are addresses of other program or data constructs that are used for reference within a program.

Objects are programmed encapsulations of data with methods. The example in Figure 15-2 shows only the name and ID of an object with the names of the methods or program modules that can manipulate the data. In actuality, an object contains all of the data and all of the program code for the methods.

A bit is an individual binary digit (see Figure 15-2). Bit manipulation is highly desirable in programs using binary status indicators. In an eight-bit character set, use of one bit rather than eight to indicate a single value can save millions of characters of storage space.
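The storage saving can be pictured with a minimal sketch in Python (a language outside the ten evaluated in this chapter, used here only for illustration). The three indicator names are hypothetical; each occupies one bit of a single integer rather than one character apiece.

```python
# Hypothetical status indicators, one bit each (names are illustrative only).
ACTIVE = 0b0000_0001
OVERDUE = 0b0000_0010
PREFERRED = 0b0000_0100


def set_flag(status: int, flag: int) -> int:
    """Turn an indicator on with bitwise OR."""
    return status | flag


def clear_flag(status: int, flag: int) -> int:
    """Turn an indicator off by ANDing with the flag's complement."""
    return status & ~flag


def has_flag(status: int, flag: int) -> bool:
    """Test an indicator with bitwise AND."""
    return (status & flag) != 0


# One small integer carries all three indicators.
status = 0
status = set_flag(status, ACTIVE)
status = set_flag(status, PREFERRED)
status = clear_flag(status, ACTIVE)
```

After these three steps only the PREFERRED bit remains set, so up to eight such indicators would fit in the space of one eight-bit character.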

Date data types define combinations of months, days, and years that support only legal date entries (see Figure 15-2). Rather than writing routines to validate dates, the language may have built-in validation processing.

Finally, user-defined data types are data definitions that become fixed within a program or application. User-defined data types can be for any application-specific combination of legal characters. A common user-defined data type is for a date construct when the language does not provide a date data type.
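As a hedged illustration of a user-defined date type, the following Python sketch defines a hypothetical `SimpleDate` that accepts only legal month, day, and year combinations. The class and helper names are assumptions made for this example; only `dataclasses` and `calendar` from the standard library are used.

```python
import calendar
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleDate:
    """A hypothetical user-defined date type that admits only legal dates."""

    month: int
    day: int
    year: int

    def __post_init__(self) -> None:
        if not 1 <= self.month <= 12:
            raise ValueError(f"illegal month: {self.month}")
        # calendar.monthrange returns (weekday of first day, days in month).
        days_in_month = calendar.monthrange(self.year, self.month)[1]
        if not 1 <= self.day <= days_in_month:
            raise ValueError(f"illegal day: {self.day}")


def is_legal_date(month: int, day: int, year: int) -> bool:
    """Return True when the combination forms a legal date."""
    try:
        SimpleDate(month, day, year)
    except ValueError:
        return False
    return True
```

With this definition, the date 022893 of Figure 15-2 (February 28, 1993) is accepted, while February 29, 1993 is rejected because 1993 is not a leap year. The validation lives in the type itself, so no separate validation routine is needed wherever a date is created.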

Data Type Checking

Data type checking refers to the extent to which a language enforces matching of specific data definitions in mathematical and logical operations. There are four levels of type checking, ranging from typeless to strong checking. Which level is required depends on the application type. In general, the more stringent the requirements for accuracy and consistency of processing, the more desirable strong type checking becomes. With object methodologies, strong checking is desirable because of polymorphism, the ability to have multiple modules processing the same function but on different data types; strong type checking reduces the probability of errors.

    01  COBOL-INFO.
        05  EXAMPLE-NUMBER    PIC 9(5).
    01  TARGET-INFO.
        05  TARGET-NUMBER     PIC 9(5).

    PROCEDURE DIVISION.
        MOVE 'A124X' TO COBOL-INFO.              *** Causes no errors ***
        MOVE COBOL-INFO TO TARGET-INFO.          *** Causes no errors ***
        MOVE EXAMPLE-NUMBER TO TARGET-NUMBER.    *** Abend: illegal data in EXAMPLE-NUMBER ***

FIGURE 15-3 COBOL Typeless Checking

Typeless checking means that there is no explicit checking performed. In typeless languages, such as BASIC or COBOL, alphanumeric characters are allowed in an integer field, but might cause an abend if the field is referenced as an integer (see Figure 15-3). Operations using typeless fields are not guaranteed to execute successfully. Typeless field processing is not consistent across languages or compilers.

The next level provides automatic type coercion, in which mixed data types are allowed but incompatible types are converted when they are used together. Also called mixed mode type checking, this level converts different data types within a category (e.g., numeric) to a single target type for mixed mode operations. In Fortran, for instance, mixing a real and an integer number in a mathematical operation leads to unpredictable results because the target type is determined by the result field definition (see Figure 15-4). If the result field is defined as real, the process will yield a real number. In Fortran, the first character of a field name determines its data type. Names beginning with A-H and O-Z are real; names beginning with I-N are integer. In Figure 15-4a, the result field begins with B; therefore, the result field is a real number. If the result field is defined as integer, the process rounds the answer and the result is an integer. In the example in Figure 15-4b, the answer is either zero or one depending on the computer system and how it rounds integers. Obviously, without detailed knowledge of the internal language processing, programming errors can result.
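The effect of the target type can be sketched in Python. This is an analogy only: Python's coercion rules are its own, not Fortran's, and the variable names merely echo the I, A, B, and J of Figure 15-4. The sketch shows that the same mixed-mode quotient gives different answers depending on whether the result is kept as real, truncated, or rounded.

```python
i = 5        # integer operand, like I in Figure 15-4
a = 10.0     # real operand, like A in Figure 15-4

# Mixed-mode division: Python coerces the integer operand to a float.
b = i / a

# Forcing an integer result requires choosing a conversion rule.
j_truncated = int(i / a)     # int() discards the fractional part

# A second quotient (an assumed value, not from the figure) shows the two
# rules disagreeing: truncation gives 0, rounding to nearest gives 1.
q = 7 / a
q_truncated = int(q)
q_rounded = round(q)
```

A programmer who does not know which conversion rule a language applies cannot predict which of these integer answers will be stored.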

Pseudostrong type checking, the third level of data type checking, permits operations only on data objects of the same data type when they are defined in the same module. But, unlike strong type checking languages, there are language inconsistencies, or undocumented features, that allow programmers to mix data types. Pascal is a pseudostrong type checking language in that it supports strong typing within modules, but has no type checking across modules. So, data passed from one module to another for processing may be combined in the called module with another data type with no penalty.

At the highest level of data type checking, languages with strong type checking permit operations only on data objects of the same, prespecified data type whether in the same or other modules. If a module contains an illegal data type, the application would stop processing and issue an error message. Ada provides strong type checking.
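A rough sketch of the strong checking idea follows, again in Python and again only as an analogy (Python checks types at run time, whereas Ada checks at compile time). The helper `add_strict` is hypothetical: it refuses any operation whose operands are not of exactly the same type and raises an error instead of coercing.

```python
def add_strict(x, y):
    """Add two values only if they have exactly the same type."""
    if type(x) is not type(y):
        raise TypeError(
            f"cannot mix {type(x).__name__} and {type(y).__name__}"
        )
    return x + y


def try_mixed_add() -> str:
    """Show that Python itself rejects mixing a string and an integer."""
    try:
        return "5" + 10
    except TypeError as exc:
        return f"rejected: {exc}"
```

Calling `add_strict(2, 3)` succeeds, while `add_strict(1, 1.0)` stops with an error message rather than silently converting the integer to a real number.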

a. The formula is: I/A = B
       5/10.0 = 0.50
   The data are converted to real because B is a real name.

b. The formula is: I/A = J
       5/10.0 = 1.0 or 0.0
   Data are converted to integer and rounded. Results vary depending on the computer system.

FIGURE 15-4 Mixed-Mode Data Type Checking

Language Constructs

Language constructs determine what and how operations on data are carried out. They provide for sequencing, iteration, selection, and data structure processing, and differ for each language classified. In general, the richer the language, the more these constructs will be present. However, with the richness comes a trade-off in language complexity that forces users to learn more language details to become proficient.

The need for rich language constructs depends somewhat on the language paradigm. For instance, SQL is a declarative, set processing language that does not need iteration because iteration is embedded in the language. In a declarative language, you code what you want to do, not how. With set processing, you identify the database and the language controls all file manipulation. The more procedural the language, the richer the language constructs need to be. The more detailed the application, the richer the language of the application should be.

Sequencing occurs between and within commands. Between-command sequencing is controlled by you as the programmer who defines the order of commands. Intracommand sequencing is part of language definition and is called operator precedence. Operator precedence is the prioritizing of symbols to manipulate data. All languages have at least four arithmetic symbols in common: + for addition, - for subtraction, * for multiplication, and / for division. Most languages also have many other symbols and operations supporting unary and binary operations, including relational processing (e.g., "less than," "less than or equal," etc.) and logical processing (e.g., "and," "or," or "not"). A list of operators available in different languages is provided in Figure 15-5.
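Operator precedence can be seen in a few one-line expressions. The sketch below uses Python's precedence rules, which are an assumption for this example; other languages order some operators differently.

```python
# Multiplication is applied before addition.
no_parentheses = 2 + 3 * 4

# Parentheses override the default precedence.
with_parentheses = (2 + 3) * 4

# Exponentiation is applied before multiplication.
exponent_first = 2 * 3 ** 2

# Relational operators are applied first, then "not", then "and".
logical = 1 < 2 and not 3 < 2
```

The first two expressions contain the same symbols in the same order yet differ in value, which is why intracommand sequencing has to be learned for each language.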

Control language constructs support iteration, sequential, or selection processing via loops, exits, conditional statements, or case constructs. Loops provide iterative, repetitive processing and are usually supported through structured programming notations such as "do while ..." or "do until ...." Conditional statements support "if ... then ... else" processing. Conditional statements are used in some languages to control iterative loop processing. Common loop notations are shown in Figure 15-6.
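The difference between the two loop notations can be sketched as follows. Python has no "do until" statement, so the test-after loop is emulated here with a conditional exit; the function names are hypothetical.

```python
def count_with_while(limit: int) -> int:
    """Test-before loop ("do while"): the body may run zero times."""
    passes = 0
    while passes < limit:
        passes += 1
    return passes


def count_with_until(limit: int) -> int:
    """Test-after loop ("do until"): the body always runs at least once."""
    passes = 0
    while True:
        passes += 1
        if passes >= limit:  # the "until" condition, checked after the body
            break
    return passes
```

For a limit of three both loops make three passes, but for a limit of zero the "do while" form makes none and the "do until" form makes one. The second function also shows a conditional statement controlling loop processing, as mentioned above.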

Case statements allow identification of code segments that combine to identify the "case." For example, in Focus file maintenance processing, you can code screen processing cases for add, change, and delete. This simplifies the thought processes involved in programming by "chunking" case contents.

Operator            Symbol
Add                 +
Subtract            -
Multiply            *
Divide              /, ÷
Exponent            **, ^
And                 AND
Not
Equal               =
Less                <
Greater             >
Less or equal       ≤, =<, <=
Greater or equal    ≥, =>, >=

FIGURE 15-5 Language Operators
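The chunking effect of case constructs can be approximated in Python with a table of handlers, one per case. This is a sketch under assumed names (`maintain`, `CASES`, and the three handlers are hypothetical) and does not reproduce Focus syntax.

```python
def add_record(store: dict, key: str, value: str) -> str:
    """Case "add": create or overwrite the record."""
    store[key] = value
    return "added"


def change_record(store: dict, key: str, value: str) -> str:
    """Case "change": update only an existing record."""
    if key not in store:
        return "not found"
    store[key] = value
    return "changed"


def delete_record(store: dict, key: str, value: str = "") -> str:
    """Case "delete": remove the record if present (value is unused)."""
    if store.pop(key, None) is None:
        return "not found"
    return "deleted"


# Each case is a self-contained chunk selected by name.
CASES = {"add": add_record, "change": change_record, "delete": delete_record}


def maintain(store: dict, action: str, key: str, value: str = "") -> str:
    """Select the case for the requested action and run it."""
    handler = CASES.get(action)
    if handler is None:
        return "unknown action"
    return handler(store, key, value)
```

Each handler can be read and tested without reference to the others, which is the simplification the case construct is meant to provide.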

Exits leave the current code module and return to the calling module or to some other named module. Exits can be simple returns to the calling module, such as Return, Cut, or Exit statements (see Figure 15-7); exits can indicate the nature of the end, as in PROLOG's Fail exit; or exits can return to a named module in a Goto statement.

Arrays, or tables, are a third type of language construct that may or may not be supported by a language. Linear arrays, or lists, are one type of data that are relatively simple to support (see Figure 15-8). When higher dimension arrays are supported, the maximum number of dimensions is identified. Occasionally a language will support n-dimensional arrays, with a user-defined maximum.

Next, there are four possible alternatives for physical input and output (I/O) of information to and from automated files or data entry fields. First, specific I/O statements (e.g., read/write) for externally stored data may be one of three types: record-oriented, set-oriented, or array-oriented. Record-oriented I/O reads (or writes) a physical record of information that may contain one or more logical records. Recall from database class that records (or tuples in relational terminology) are groupings of related fields. Record-oriented I/O requires opening and closing of files, reading or writing of records, and user management of all file processing, such as checks for end-of-file. COBOL, Fortran, Assembler languages, and Ada are record-oriented.

BEGIN ... END
BLOCK
DO ... ENDDO
FOR ...
FOR ... END FOR
ifFalse ...
ifTrue ...
INDEX ...
LOOP ... ENDLOOP
REPEAT ... END
REPEAT ...
WHILE ...
WHILE ... ENDWHILE
whileFalse ...
whileTrue ...

FIGURE 15-6 Loop Notations

Exit Type    Processing
Return       Return to Calling Module
Cut          Return to Calling Module/Instruction
Exit         Return to Calling Module
Fail         Go to Calling Module/Instruction with Boolean indicating process failure
Goto         Go to Named Module

FIGURE 15-7 Exit Types

Linear Array, List

Two-Dimensional Array of Months and Days

    January
    February
    March
    April

Three-Dimensional Array of Sales by Year by Month

    Year    Month       Sales
    1996    January     220,000
    1996    February    250,000

    Year    Month       Sales
    1995    January     150,000
    1995    February    170,000

    Year    Month       Sales
    1994    January     100,000
    1994    February    100,000

FIGURE 15-8 Types of Arrays
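As a hedged sketch, the sales figures of Figure 15-8 can be held in nested Python lists, with linear arrays supplying the labels for each dimension. The helper `sales_for` and the variable names are assumptions for this example.

```python
# Linear arrays (lists) that label the two index dimensions.
years = [1994, 1995, 1996]
months = ["January", "February"]

# One row per year, one column per month, using the values in Figure 15-8.
sales = [
    [100_000, 100_000],  # 1994
    [150_000, 170_000],  # 1995
    [220_000, 250_000],  # 1996
]


def sales_for(year: int, month: str) -> int:
    """Look up one cell by translating labels into array subscripts."""
    return sales[years.index(year)][months.index(month)]


# Summing across the month dimension gives one total per year.
yearly_totals = [sum(row) for row in sales]
```

Python lists nest to any depth, so this language places no fixed maximum on the number of dimensions; languages with declared array bounds state the maximum explicitly.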

Set-oriented I/O assumes that all records (or tuples) are treated the same and that some selection criteria, when applied, identify the desired information. The language controls all file and read/write processing according to user-defined selection criteria. At the end of a procedure, the set of records (tuples) resulting from the procedure is stored in memory for printing or display. SQL is set-oriented.
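The contrast between record-oriented and set-oriented I/O can be sketched with the Python standard library. The sample data, table name, and function names are hypothetical; an in-memory text stream stands in for an opened file, and the `sqlite3` module supplies the SQL processing.

```python
import io
import sqlite3

# Hypothetical sales records: region and amount, one record per line.
SALES_LINES = "EAST,100\nWEST,250\nEAST,50\n"


def total_record_oriented(text: str, region: str) -> int:
    """Record-oriented: the program opens, reads, tests for EOF, and closes."""
    total = 0
    source = io.StringIO(text)          # stands in for opening a file
    while True:
        record = source.readline()
        if record == "":                # the program checks end-of-file itself
            break
        rec_region, amount = record.strip().split(",")
        if rec_region == region:
            total += int(amount)
    source.close()
    return total


def total_set_oriented(text: str, region: str) -> int:
    """Set-oriented: state the selection criteria; SQL does the reading."""
    rows = [line.split(",") for line in text.splitlines()]
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [(r, int(a)) for r, a in rows],
    )
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE region = ?",
        (region,),
    ).fetchone()
    conn.close()
    return total
```

Both functions return the same total for a region, but only the first contains a loop and an end-of-file test. In the second, the iteration is embedded in the SELECT statement.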

Implicit I/O is similar to set-oriented I/O. Implicit I/O is used in 4GLs in which reading and writing of data is hidden from the user. The user specifies the type of process, for instance, TABLE FILE ... , and the language infers the type of file processing required from the command. Set-oriented I/O is more rigorously defined and has provably correct contents based on the mathematical set theory that underlies relational processing. Implicit I/O, on the other hand, is found in languages that predate relational theory and do not have provably correct results.

Array-oriented I/O reads and writes strings of fields that are assumed to be some sort of array. The user is responsible for defining and manipulating the nature and data type of the array. The language simply reads or writes until the end of the array. Pascal is an array-oriented language.

List-directed I/O is a variant of array-oriented I/O. List-directed I/O is used in Fortran to define a list of variable names to which items are directed as they are read. The language reads until the list is full, then continues processing until the read is again executed. Data items are not specifically formatted; rather, the format is implicit in the variable names.

The extent to which data formats and I/O processing can be defined and controlled distinguishes languages as I/O-oriented versus CPU-oriented in their processing. The more elaborate the I/O processing, the more I/O-oriented the language. The more primitive the I/O processing, the more CPU-oriented the language. Fortran is an example of a CPU-oriented language, while COBOL is an example of an I/O-oriented language.

Modularization and Memory Management

The extent to which modularization and memory management are supported is an indication of language sophistication. Modularization is the creation of subprograms or stored functions. Languages differ in the manner in which subprograms and their data are supported. First, the ability to define subprograms or functions is important to attaining desirable program characteristics such as maximal cohesion. Not all languages allow subprograms. In particular, set-oriented languages (SQL) do not easily support subprograms.

Second, how data in modules is managed is important. Data can be local or global. Local data storage defines data variables and constants that are only used within a given module. Global data are accessible to any module in the application. The ability to have local data is important to attaining information hiding and minimal coupling. The extent to which global data is required limits the quality of resulting programs by limiting information hiding and cohesion.

Subprograms' activation is similar across languages. Called modules are referenced by module name. For instance, "CALL FACTORIAL, 5" might be a subprogram call that passes the value five for factorial computation. Modules must reside in a library that is linked to the calling module via control language (e.g., JCL). Options for call processing include passing of variable data either by name, by address, or directly, by value. Value passing requires local data definition, while passing data by name or address is used with either local or global data.
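The practical difference between passing by address and passing by value can be approximated in Python. This is an analogy only: Python passes references to objects, so the by-value behavior is obtained here by copying the data into a local definition. The function names are hypothetical.

```python
def double_in_place(items: list) -> None:
    """Behaves like passing by address: the caller's data is changed."""
    for index, value in enumerate(items):
        items[index] = value * 2


def double_copy(items: list) -> list:
    """Behaves like passing by value: only a local copy is changed."""
    local_items = list(items)  # local data definition holding the values
    for index, value in enumerate(local_items):
        local_items[index] = value * 2
    return local_items


shared = [1, 2, 3]
doubled = double_copy(shared)   # shared is untouched
double_in_place(shared)         # shared itself is modified
```

After both calls, `shared` and `doubled` hold the same doubled values, but only the in-place call altered the caller's own data.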

Generally, when using subprograms, a main module calls the subprogram, which performs its processing and returns to the calling module. The ability to support subprogram processing requires one or more entry and exit points. Exit and return processing are also important when passing control of processing between modules. In general, the more opportunities to enter and exit a given module, the more proficient the programmer needs to be to ensure proper processing. According to structured programming tenets, a well-designed module should have one entry and one exit point. Some languages, such as Smalltalk and Ada, enforce this idea by allowing only one entry and one exit per module. One entry-one exit modules are less error-prone than modules that allow many alternatives.

The next level of sophistication is the extent to which programmers have control over their own memory management. Memory management refers to the ability of a program to allocate more computer memory as required. This is an option frequently desired in variable list processing and real-time applications that manage multiuser resources. Memory in less sophisticated languages is static: The program is assigned a maximum at the time it is initiated for processing. If more memory than that allocated is needed, the program abends, more memory is requested manually via job control language, and the program is rerun.


With dynamic memory management capabilities, the program monitors its own use of storage and allocates more memory as needed. In sophisticated languages, the capability to dynamically allocate memory is present.

Exception Handling

Exception handling is the extent to which programs can be coded to intercept and handle program errors without abending a program. This capability adds to both the complexity and the range of usefulness of a language. This capability ranges from none to some. For instance, COBOL allows you to intercept data errors such as overflow or divide by zero, but not others, such as invalid data definition or read past end-of-file. In contrast, Smalltalk allows the interception of any error.
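Interception of errors can be sketched in Python, whose exception mechanism is used here only as an illustration; the two function names are hypothetical. The first function intercepts a divide by zero, and the second intercepts a read past the end of the data.

```python
def safe_divide(numerator: float, denominator: float):
    """Intercept divide by zero instead of abending."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return None  # the caller decides how to treat a missing result


def read_all(lines) -> list:
    """Read until the data is exhausted, intercepting the end-of-data error."""
    iterator = iter(lines)
    records = []
    while True:
        try:
            records.append(next(iterator))
        except StopIteration:  # raised on a read past the last record
            return records
```

In both cases the program continues after the error, which is the capability that distinguishes languages with exception handling from those without it.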

Multiuser Support

The extent to which language constructs for memory management, global/local variables, and subprogram management are available determines the extent to which a language can support multiple users. There are three levels of support for multiple users that relate to program modules having the properties of reusability, recursion, and reentrancy. Reusability, also called serial reusability, is a property of a module such that many tasks, in sequence, can use the module without its having to be reloaded into memory for each use (see Figure 15-9). To achieve this property, any local variables that change must be reset to their original contents before processing completes and control returns to the calling module. The easiest way to develop reusable programs is to provide global variables that can change contents and local variables that either cannot change or are always reset after the module's use. Reusable programs can support sequential or interactive processing, but not multiuser or real-time processing.

Recursiveness is a property of modules such that they call themselves or call another module that, in turn, calls them. An example is factorial multiplication, in which the same process is performed on a different number of variables a number of times (see Figure 15-9). Processing with recursion is explicitly outlawed in some languages, while it is considered a main strength of others, such as PROLOG. Recursion requires serial reusability of programs in addition to the ability to maintain a queue (or stack) of outstanding requests to be completed. This queueing support provides for multiple uses of the module by one user.

Reusable Pseudo-code

    Factorial (N, Nfact)
        End = 0
        Nfact = 1
        If N = 0 or 1
            go to Exit.
    Loop.
        If N = 1
            go to Exit
        else
            Nfact = Nfact * N
            N = N - 1
            go to Loop.
    Exit. Exit.

Recursive Pseudo-code

    Function FACT (N)
    Begin
        If N = 0
            Then Factout = 1
            Else Factout = N * FACT(N-1)
    End {Function Fact};

    FACT is a function that recurs continuously until N = 0.

Reentrant Pseudo-code

    Load N, Nfact, First-Exec
    If First-Exec = 0
        Then Nfact = 1
             First-Exec = 1
    If N > 1
        Then Nfact = Nfact * N
             N = N - 1
    Save N, Nfact, First-Exec.

FIGURE 15-9 Examples of Reusable, Recursive, and Reentrant Modules

Reentrancy is a property of a module such that it can be shared by several tasks concurrently. There is a constant part and a variable part to each reentrant module. The constant part is loaded into memory once and it services tasks in a serially reusable manner until it is overwritten by another program. A copy of the variable part is activated for each task when it is initiated (see Figure 15-9). A queueing mechanism keeps track of the user's identification, the location of the variable part, program status word, and register contents for the task. This information is swapped into (or out of) the active area as the user becomes activated (or interrupted). Only one task is active at a time, but several tasks might be in various stages of task completion. Only the property of reentrancy allows true real-time processing and support for multiple concurrent users. Both serial reusability and recursiveness are required to achieve reentrancy in programs.
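The split between a constant part and a variable part can be sketched in Python using the factorial example. All names here are hypothetical, and a simple round-robin loop stands in for the queueing mechanism: `step` is the shared code that holds no data of its own, and each `FactorialTask` is one task's private variable part. A recursive version is included for comparison.

```python
def fact_recursive(n: int) -> int:
    """Recursive factorial: the function calls itself until n reaches 1."""
    return 1 if n <= 1 else n * fact_recursive(n - 1)


class FactorialTask:
    """Variable part: one copy of the working data per task."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.nfact = 1

    @property
    def done(self) -> bool:
        return self.n <= 1


def step(task: FactorialTask) -> None:
    """Constant part: shared code that keeps no state between calls."""
    if not task.done:
        task.nfact *= task.n
        task.n -= 1


def run_interleaved(tasks: list) -> list:
    """Activate each unfinished task in turn, one step at a time."""
    while not all(task.done for task in tasks):
        for task in tasks:
            step(task)
    return [task.nfact for task in tasks]
```

Because `step` touches only the task passed to it, several computations can be in various stages of completion at once without interfering with each other. For example, interleaving tasks for 5, 3, and 0 yields 120, 6, and 1.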

To summarize, programming languages differ in the extent to which they support alternatives for defining data types, input/output processing, mathematical, relational, logical, bit, control, array, subprogram, and memory processing. The less extensive the language constructs supported, the simpler the language, but the more restricted the domain of problems to which it is amenable. The more extensive the language constructs supported, the more complex the language, and the more extensive the domain of problems to which it is appropriate.

NONTECHNICAL LANGUAGE CHARACTERISTICS

Nontechnical characteristics are at least as important as technical characteristics when selecting a language. The nontechnical characteristics evaluated here are uniformity, ambiguity, compactness, locality, linearity, ease of design to code translation, compiler efficiency, and portability. The availability of CASE tools, availability of code generators, and availability of testing aids also add to a language's attractiveness, and are discussed in a later section.

Uniformity is the use of consistent notation throughout the language. An example of nonuniformity in Focus is the use of single quotes for customized report column titles and the use of double quotes for customized report page titles. This type of inconsistency hinders the learning of the language and almost guarantees that novices and infrequent users will make mistakes.

Ambiguity of a language refers to the extent to which humans and compilers will differ in their interpretation of a language statement. Ideally, human interpretation should be identical to compiler interpretation, and compiler interpretation should be intuitive to humans. Unfortunately, ambiguity may be inherent to some problems, such as artificial intelligence applications which reason through a process. As new rules and inferences are added to an AI application, interpretation of existing data and rules might also change, thus introducing ambiguity into a previously unambiguous application.

Compactness of a language is its brevity. The presence of structured program constructs, keywords and abbreviations, data defaults, and built-in functions all simplify learning and programming. Contrast SQL or Focus, both fourth-generation languages, with COBOL, a third-generation language. A report that takes three to five lines in 4GL procedure code requires 50-150 lines of COBOL code (see Figure 15-10). Learning time is considerably shorter for Focus than for COBOL, partly due to the compactness of the language.

In turn, compactness implies locality in providing natural "chunks" of code that facilitate learning, mental visualization of problem parts, and simulation of solutions. Locality is provided through block, case, or other similar chunking mechanisms in languages. Chunks might be implemented via a performed section of code in COBOL, a case construct in Focus, or an object definition in Smalltalk. In all three of these examples, a user's attention is focused only on the chunk of the code present. By being able to ignore other parts of the code, learning of the chunk is simplified.

Linearity refers to the extent to which code is read sequentially. The more linear a language, the easier it is to mentally "chunk" and understand the code. Linearity facilitates understanding and main- tainability. In Figure 15-10, the COBOL code chunks in paragraphs and performed sections; these

648 CHAPTER 15 Choosing an Implementation Language

4GL-Focus TABLE FILE SALES HEADING CENTER 'SAMPLE SALES REPORT' SUM SALES

BY REGION ACROSS MONTH BY YEAR

ON YEAR SUMMARIZE ON YEAR PAGE-BREAK END

3GL-COBOL

WORKING-STORAGE SECTION.
01  CONTROL-TOTALS.
    05  LINE-COUNT          PIC 99      VALUE 55.
    05  END-OF-FILE         PIC 9       VALUE ZERO.
        88  EOF                         VALUE 1.
    05  CURRENT-REGION      PIC 99      VALUE ZERO.
    05  SUM-SALES.
        10  JAN-SUM         PIC 9(5)    VALUE ZEROS.
        10  FEB-SUM         PIC 9(5)    VALUE ZEROS.
        10  MAR-SUM         PIC 9(5)    VALUE ZEROS.
        10  APR-SUM         PIC 9(5)    VALUE ZEROS.
        10  MAY-SUM         PIC 9(5)    VALUE ZEROS.
        10  JUN-SUM         PIC 9(5)    VALUE ZEROS.
        10  JUL-SUM         PIC 9(5)    VALUE ZEROS.
        10  AUG-SUM         PIC 9(5)    VALUE ZEROS.
        10  SEP-SUM         PIC 9(5)    VALUE ZEROS.
        10  OCT-SUM         PIC 9(5)    VALUE ZEROS.
        10  NOV-SUM         PIC 9(5)    VALUE ZEROS.
        10  DEC-SUM         PIC 9(5)    VALUE ZEROS.
01  REPORT-HEADER.
    05  FILLER              PIC X(48)   VALUE SPACES.
    05  HD1                 PIC X(19)   VALUE 'SAMPLE SALES REPORT'.
01  COL-HEADER1.
    05  FILLER              PIC X(132)  VALUE 'REGION MONTH'.
01  COL-HEADER2.
    05  FILLER              PIC X(132)  VALUE
        ' JAN FEB MAR APR MAY JUNE JULY AUG SEPT OCT NOV DEC'.
01  REPORT-DETAIL.
    05  FILLER              PIC XXX     VALUE SPACES.
    05  REGION              PIC XX      VALUE SPACES.
    05  FILLER              PIC X(10)   VALUE SPACES.
    05  SALES               PIC X(84)   VALUE ZEROS.
    05  SALES-NUMERICS REDEFINES SALES.
        10  JAN-SALES       PIC ZZZ,ZZZ VALUE ZEROS.
        10  FEB-SALES       PIC ZZZ,ZZZ VALUE ZEROS.
        10  MAR-SALES       PIC ZZZ,ZZZ VALUE ZEROS.
        10  APR-SALES       PIC ZZZ,ZZZ VALUE ZEROS.
        10  MAY-SALES       PIC ZZZ,ZZZ VALUE ZEROS.
        10  JUN-SALES       PIC ZZZ,ZZZ VALUE ZEROS.
        10  JUL-SALES       PIC ZZZ,ZZZ VALUE ZEROS.
        10  AUG-SALES       PIC ZZZ,ZZZ VALUE ZEROS.
        10  SEPT-SALES      PIC ZZZ,ZZZ VALUE ZEROS.
        10  OCT-SALES       PIC ZZZ,ZZZ VALUE ZEROS.
        10  NOV-SALES       PIC ZZZ,ZZZ VALUE ZEROS.
        10  DEC-SALES       PIC ZZZ,ZZZ VALUE ZEROS.

PROCEDURE DIVISION.
    PERFORM SUMMARY-CONTROL THRU PRINT-REPORT-EXIT.

SUMMARY-CONTROL.
    IF REGION = CURRENT-REGION
        GO TO PAGE-CONTROL
    ELSE
        MOVE SUM-SALES TO SALES-NUMERICS
        MOVE YEAR TO REGION
        WRITE REPORT-DETAIL AFTER 3.
    ADD 3 TO LINE-COUNT.

PAGE-CONTROL.
    IF LINE-COUNT > 50 OR REGION NOT = CURRENT-REGION
        WRITE REPORT-HEADER AFTER PAGE
        WRITE COL-HEADER1 AFTER 2
        WRITE COL-HEADER2 AFTER 1
        MOVE 4 TO LINE-COUNT.
    MOVE REGION TO CURRENT-REGION.

PRINT-REPORT.
    MOVE CORRESPONDING INPUT-SALES-SUMMARY TO REPORT-DETAIL.
    WRITE REPORT-DETAIL AFTER 1.
    ADD 1 TO LINE-COUNT.

PRINT-REPORT-EXIT.
    EXIT.

FIGURE 15-10 4GL versus 3GL Language Compactness


language features facilitate COBOL program understandability.

The ease with which program specifications are translated into code is also important in language selection. In general, more declarative languages, such as SQL, are considered easier to code than more procedural languages such as Fortran. However, PROLOG and other inferential languages, while declarative and simple in developing single rules, are not simple when trying to determine whether the rules aggregate to the proper knowledge structures.

Compiler efficiency is the extent to which a compiled language generates efficient assembler code. Compiler efficiency varies by vendor and by language. Compiled code efficiency is important especially when programming for small computer systems or for embedded applications that interact with other system components as part of a larger system.

Along with efficiency of executable code, portability of code is important. Portability is the ability to transplant the code without change to a different operating platform, which might involve different hardware, a different operating system, or a different software environment. A hardware platform may be a single-user personal computer, a workstation, or a mainframe. Each of these might run the same operating system, for example Unix, or might use a different operating system. The more code that must be changed to accommodate a specific hardware or operating environment, the less portable the language. As global and distributed applications become more prevalent, the need for language portability will increase. Ideally, programs should be able to be developed anywhere for execution on any hardware or operating system platform.

In summary, when technical characteristics do not distinguish languages for application use, nontechnical characteristics of languages become important to their selection. The nontechnical characteristics evaluated here include uniformity, ambiguity, compactness, locality, linearity, ease of code development, compiler efficiency, portability, and availability of automated development tools. In the next section, we discuss ten popular programming languages and the extent to which they contain the language constructs above. Then we discuss application characteristics and how they map to the languages.

COMPARISON OF LANGUAGES

Ten languages are evaluated in this section to highlight the differences across paradigms and language generations for all of the characteristics defined above. The ten languages selected were chosen because of their current and expected future popularity either in academic circles (e.g., Pascal) or in industry. The languages include SQL, COBOL, Fortran, BASIC, Focus, C, Pascal, PROLOG, Ada, and Smalltalk. Each language is discussed briefly below to highlight the characteristics that make it popular and unique. Table 15-1 summarizes the ten languages on all of the characteristics described above.

SQL

As the American National Standards Institute's standard for database query language, SQL has enjoyed a successful life. SQL pervades any database course taught in North America and is a query language front-end to virtually every database package on the market regardless of machine size, number of users supported, or complexity of the database. SQL's virtues are mostly nontechnical: ease of learning, compactness, uniformity, locality, linearity, portability, and availability of automated tools (see Table 15-1). The simplicity of the language is evident in the small number of hours of learning time it takes novices to begin using the language. A novice might begin writing queries in literally minutes. Proficiency, of course, takes longer, but the time to become proficient is shorter than for most database languages.

Many CASE environments that support analysis and design also support logical database design through the process of normalization. Those products also generate SQL database definitions as the logical DB design output. Many of the same

(Text continues after Table 15-1)

TABLE 15-1 Comparison of Languages

Characteristic | SQL | Focus | BASIC | COBOL | Fortran

Data Types
Real | Yes | Yes | Yes | Yes | Yes
Integer | Yes | Yes | Yes | Yes | Yes
Character | Yes | Yes | Yes | Yes | Yes
String | No | No | No | Yes | No
Boolean | No | No | No | No | No
Date | No | Yes | No | No | No
User-Defined | No | No | No | No | No
Pointer | No | No | No | No | No
Bit Identification | No | No | No | No | No
String-Mask | No | No | No | No | No

Data Type Checking Typeless X X Automatic type

coercion X Mixed mode X X Pseudo strong Strong

Operator Precedence 0" */± 0" */± 0" */± 0" */± 0" */±

Operators
Binary and Unary | Yes | Yes | Yes | Yes | Yes
Arithmetic +, -, *, / | Yes | Yes | Yes | Yes | Yes
Relational <, =, >, ≤, ≥ | Yes | Yes | Yes | Yes | Yes
Logical and, or, not | Yes | Yes | Yes | Yes | Yes
Bit | No | No | No | No | No
Type Conversion | No | Yes, Limited | No | Yes, Limited and Inconsistent | Yes, Limited and Inconsistent

Control
Loops | No | No | FOR ... NEXT | PERFORM ... UNTIL | FOR ... CONTINUE
Exits | No | EXIT, GOTO | EXIT, GOTO | EXIT | EXIT, GOTO

Conditional WHERE IF ... IF ... IF ... THEN IF ... Statements ELSE ... ELSE

Case Statements | No | Yes (not in query language) | No | COBOL 88 only | No

Arrays
Linear Arrays | No | No | Yes | Yes | Yes
Multiple Dimensions | No | No | Up to 2 | Up to 3 | Up to 3

Input/Output
I/O of Records | No | No | Yes | Yes | Yes
I/O of Arrays | No | No | No | No | Yes
Implicit I/O | Yes | Yes | No | No | No
Format Control | Automatic or Programmed | Automatic or Programmed | Programmed only | Programmed only | Programmed only
Data-directed I/O | No | No | No | No | Yes

Subprograms
Subroutines | Nested | Yes | Yes | Yes | Yes
Functions | Limited | Yes | Limited | Limited | Limited
Local/Global Storage | No | Yes | Limited | Programmed only | Yes
Static/Dynamic Storage | No | No | No | No | No
Entry Points | No | Yes | Yes | Yes | One
Pass Parameters | No | Yes | Yes | Yes | Yes
Call by Address | No | No | No | No | No
Call by Value | No | No | No | No | No
Call by Name | No | Yes | Yes | Yes | Yes

Reusability | No | Yes | Yes | Yes | Yes
Reentrancy | No | No | No | No | No
Recursion | No | No | No | No | No
Concurrency | Only when used with DB2 | Yes | No | No | No
Exception Handling | No | Limited | Limited | Limited | Limited

Nontechnical
Uniformity | High | Medium-High | Medium | Medium | Medium
Ambiguity | Low-Medium | Low-Medium | Medium | Medium | Medium
Compactness | High | High | Medium-High | Low | Medium-High
Locality | High | High | Programmed only | Programmed only | Programmed only
Linearity | High | High | Low-Medium | Low-Medium | Low-Medium
Ease of design to code | High | High | Low-Medium | Low-Medium | Low-Medium
Compiler Efficiency | Yes, when used as embedded language; otherwise SQL is interpreted | Medium, Mostly Interpreted | Medium, Mostly Interpreted | Medium-High | Medium-High
Source code portability | High | High | Medium | High | High
Availability of CASE tools | Yes | Yes | No | Yes | No
Availability of code generators | Yes | No | No | Yes | No
Availability of testing aids | Yes | No | Yes | Yes | Yes
Maintainability | High | Medium-High | Low-Medium | Low-High | Low-Medium

Characteristic | C | Pascal | PROLOG | Ada | Smalltalk

Data Types
Real | Yes | Yes | Yes | Yes | Yes
Integer | Yes | Yes | Yes | Yes | Yes
Character | Yes | Yes | Yes | Yes | Yes
String | Yes | Yes, Limited | Yes | Yes | Yes
Boolean | No, but can be user defined | Yes | No | Yes | Yes
Date | No | No | No | No | No
User-Defined | Yes | Yes | No | Yes | Yes
Pointer | Yes | No | No | Yes | Yes
Bit Identification | Yes | No | No | Yes | Yes
String-Mask | No | Limited | Yes | No | No

Data Type Checking Typeless X Automatic Mixed mode X Pseudo strong X Strong TurboProlog X X

Operator Precedence () []-> not () ** not abs unary + - (unary) */ div mod + -unary * /mod rem binary ++-! ~ * + and-or mod div + - unary keyword

& size of =<> < <= > >= * / + - & binary (type) <in + - binary relational

* / % +-«» relational logical <= >=!= == operators short -circuit &/\

&&11 ?:

= op=,

No exponent No exponent No exponent No exponent operator operator operator operator

Operators
Binary and Unary | Yes | Yes | Yes | Yes | Yes
Arithmetic +, -, *, / | Yes, also % for modulus | Yes | Yes | Yes | Yes
Relational <, =, >, ≤, ≥ | Yes | Yes | Yes | Yes | Yes
Logical and, or, not | Yes | Yes | Yes | Yes | Yes
Bit | Yes | No | No | Yes | Yes
Type Conversion | No | No | No | No | No

Loops DO WHILE ... Simulated via BEGIN ... END iITrue FOR ... REPEAT ... WHILE ... ifFalse REPEAT ... WHILE ... FOR ... whileTrue

END INDEX ... BLOCK while False LOOP ... END LOOP

Exits RETURN RETURN FAIL EXIT GOTO CUT GOTO

RETURN

Conditional IF ... ELSE IF THEN None IF ... THEN iITrue Statements BEGIN ... ... ELSE ifFalse

END ... ELSEIF whileTrue ELSE ... ; CASE whileFalse

Arrays
Linear Arrays | Yes | Yes | Only as LIST | Yes | Yes
Higher Dimensional Arrays | No limit to number of dimensions | No limit to number of dimensions, Some dynamic allocation support | No | No limit to number of dimensions, Dynamic allocation support | No

Input/Output
I/O Statements | Only using defined function | No | TurboProlog, else No | Yes | Yes
I/O of Arrays | Only using defined function | Yes | No | No | No
Implicit I/O | Only using defined function | No | TurboProlog, else No | No | No
Format Control | Only using defined function | Limited | Yes | Yes | Yes
Data-directed I/O | No | No | No | No | No

Subprograms
Subprograms | Yes | Yes | TurboProlog, else No | Yes | Yes
Functions | Yes Yes Yes Yes
Local/Global Storage | Both | Both | Both | Both | Both
Static/Dynamic Storage | Both | No control | Both | Both | Both
Entry Points | One per function | One per routine | One per program | One per routine | One per object

Parameters
Call by Address | Yes | No | No | Yes | No
Call by Value | No | Yes | No | Yes | No
Call by Name | Yes | Yes | Clause name as subgoal | Yes | Yes

Reusability | Yes | Yes | Yes | Yes | Yes
Recursion | Yes | Yes | Yes | Yes | Yes
Reentrancy | No | Yes | No | Yes | Yes
Concurrency | No, unless C++ | Concurrent Pascal only | Depends on version | Yes | Yes
Exception Handling | Yes | No | Yes | Yes | Yes

Nontechnical
Uniformity | Low-High | Medium-High | Medium-High | Medium-High | Medium-High
Ambiguity | Low-Medium | Low-Medium | Medium-High | Low-Medium | Low-Medium
Compactness | Low-High | Medium-High | Low-High | Low-High | Low-High
Locality | Low-High | Low-High | Low-Medium | Low-High | Low-High
Linearity | Low-High | Low-High | Low-High | Low-High | Low-High
Ease of design to code | Medium-High | Medium-High | Medium | Medium-High | Medium-High
Compiler Efficiency | High | High | Usually interpreted | Medium-High | High
Source code portability | High | Medium-High | Low | Medium-High | Low
Availability of CASE tools | No | In academia, yes | No | Yes | Yes
Availability of code generators | No | No | No | No | No
Availability of testing aids | Yes | Yes | No | Yes | Yes
Maintainability | Low-High | Low-High | Low-High | Medium-High | Medium-High

products also provide code generation of COBOL with embedded SQL providing DB access. Examples of CASE products are ADW™ and IEF™. These products have their own code generators and can interface to code generation software.

In terms of technical capabilities, SQL is limited. It is assumed that complex programming is done in some other language with SQL embedded as described above. SQL can define and modify databases, perform simple mathematical processing on fields for reporting, and generate default or customized reports.
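The embedded use of SQL described above can be illustrated with a minimal sketch, written here in Python with the standard `sqlite3` module rather than COBOL. The table name `sales`, its columns, and the sample rows are assumptions made up for illustration. The host language handles setup and output, while a single declarative statement does the summarizing.

```python
import sqlite3


def regional_sales_totals(rows):
    """Return (region, total) pairs computed by one declarative SQL query.

    The 'sales' table and its columns are hypothetical, chosen only to
    mirror the sales-report example used in this chapter.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # SQL defines the database ...
        conn.execute(
            "CREATE TABLE sales (region TEXT, month TEXT, amount INTEGER)"
        )
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
        # ... and performs simple arithmetic on fields for reporting.
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM sales "
            "GROUP BY region ORDER BY region"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Hypothetical sample data: (region, month, amount)
SAMPLE_ROWS = [("01", "JAN", 100), ("01", "FEB", 150), ("02", "JAN", 200)]

if __name__ == "__main__":
    for region, total in regional_sales_totals(SAMPLE_ROWS):
        print(region, total)
```

The query states what is wanted (totals per region) and leaves the looping and accumulation to the database engine, which is the compactness the comparison table credits to SQL.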

Focus

As a fourth-generation language, Focus consists of a database engine with its own query language, SQL compatibility, a full-screen processor, and language subsets for graphical, statistical, file maintenance, and intelligent processing. Focus DB supports relational, hierarchic, and network files as well as providing an interface to many popular mainframe DBMSs, such as IMS, IDMS, Adabas, Model 204, and so on.

Like SQL, Focus' main strengths lie in the nontechnical characteristics of the language: compactness, locality, linearity, ease of code translation, portability, and availability of CASE tools for documenting analysis and design (see Table 15-1). Occasionally, Focus can be ambiguous in interpreting handling of data across a hierarchy or in multiple joined files.

Focus is a full-function database language. This means that files can be defined, maintained, validated, modified by transaction processing, and queried all in the same environment and the same language regardless of the hardware/software platform. This high level of portability and full-function nature of the processing make Focus a popular 4GL for rapid application development and user query processing.

A reentrant version of Focus is available to support multiuser processing. Application code in Focus is not reentrant. A compiler is available for file modify routines; otherwise, Focus is interpreted. Focus is a language of defaults that does not support user-defined or user-managed resources.

BASIC

BASIC is short for Beginner's All-purpose Symbolic Instruction Code. BASIC is present in this evaluation because of the number of applications written in it, regardless of whether it was appropriate or not. BASIC is, well, basic. Nothing fancy is supported in this language, but all rudimentary processing is present (see Table 15-1). BASIC is fairly easy to learn and write, with reasonable levels of uniformity, compactness, and good automated testing aids. The remaining characteristics vary considerably from one version of BASIC to another. In particular, its portability is low-medium since the I/O commands usually must change to suit a particular environment.

BASIC does standard programming operations, supporting a limited, but standard number of data types, with no type checking. There are language constructs for loop, condition, and array processing. Files can be read and written.

BASIC is popular because a whole generation of college graduates was subjected to it as the basis for learning programming. Provided an application does not require any nonstandard processing, BASIC can perform adequately.

COBOL

COBOL stands for COmmon Business Oriented Language. It is the most frequently used language in computer history and continues to maintain that status even though its demise is regularly reported as imminent. COBOL can be likened to a bus. Buses are uncomfortable, take longer than most other modes of transportation, but are suited to many types of trips. Similarly, COBOL is uncomfortable to code, it takes a long time to develop code, but it is suited to many business problems. As an all-purpose language, COBOL does most everything, and it is written in a language that is close to English.

COBOL input/output processing is consistently superior in efficiency and range of data structures supported (see Table 15-1). COBOL is not good for real-time applications and cannot be used to code reentrant or recursive structures. It is teamed with multiuser software, such as CICS for telecommunication interface processing or IMS DB/DC for telecommunication interface and database manipulation, to build effective interactive, multiuser applications.

In the nontechnical areas, COBOL rates high on availability of CASE tools, code generators, and testing aids. As the most frequently used language, it was first on the list of languages for which automated support was developed. It is a highly portable language and is supported by many efficient compilers. In the other nontechnical areas, COBOL rates less desirable than SQL and Focus, but is comparable to or better than other procedural languages.

Fortran

Shorthand for FORmula TRANslation, Fortran gained popularity as a number-cruncher language in the 1960s and has maintained a dwindling, but steady, popularity ever since. Fortran's weakness is in the data and file structures it supports (see Table 15-1). It does not interface to DBMS software and is limited to sequential, indexed, and direct files. Also, input/output processing of most Fortran compilers is slow, character operations are awkward and not recommended, and data format control is more limited than other languages.

Fortran's strength is in the efficiency of algorithms generated to perform numeric processing. Fortran's compilers usually are accompanied by a subprogram library that includes many frequently used algorithms for sort, statistical, and mathematical processing. Subroutine and subprogram processing is facilitated through easily defined and accessed global and local variables. The mixed mode data typing in Fortran is an important language feature because numeric processing will have different results depending on the definitions of the fields being processed.
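The point about field definitions changing numeric results can be shown with a small sketch. It is written in Python, not Fortran, and the function names are illustrative only. Python's `//` operator stands in for division of integer-declared fields and `/` for division of real-declared fields; for the non-negative values used here, `//` gives the same result as Fortran's truncating integer division.

```python
def average_integer(total: int, count: int) -> int:
    """Divide two integer-declared fields; the fraction is discarded."""
    return total // count


def average_real(total: float, count: float) -> float:
    """Divide two real-declared fields; the fraction is kept."""
    return total / count


if __name__ == "__main__":
    print(average_integer(7, 2))
    print(average_real(7.0, 2.0))
```

The same two values produce different answers purely because of how the fields are declared, which is why the declarations deserve as much review as the arithmetic itself.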

Reusable programs can be developed using Fortran, but no one would use Fortran to develop a complete on-line, interactive system. Rather, Fortran routines for numeric processing might be embedded in a system developed in some other language.

C

C is a high-level language developed to perform low-level processing.1 Its generality and lack of constraints coupled with autonomy of data structure definition and a rich set of operators make it an effective language for many tasks, including interactive, reusable, and recursive applications (see Table 15-1). A C program is a series of functions that are invoked by embedding their names in code. Transfer of control is automatic as is return processing. System operators, called escape sequences, are embedded in the program and recognized by a preceding backslash '\'.

C is a concise, cryptic language that can be efficient in the hands of an experienced, skilled programmer and can be a mess in the hands of a novice or poor programmer. "The language imposes virtually no rules regarding design or structure of programs and enforces nothing at all. This is not a dummy-proof programming language, and it certainly is not for beginners" [Friedman, 1991, p. 398]. As such, the nontechnical aspects of the language all range from low to high because the rating depends on the skill of the programmer. For expert programmers who understand how to build reusable modules, C language provides the capabilities to build reusable libraries with applications built from them.

Pascal

Pascal is a language designed to be unambiguous for teaching students of computer science.2 Programs in Pascal are free-format, but the language contains natural structuring syntax that can be indented to make the language easily readable.

Concurrent Pascal provides for real-time control over processing. Other versions of Pascal support development of reusable and recursive programs and subprograms (see Table 15-1). However, standard Pascal cannot use subroutine libraries since it assumes all program modules are instream, that is, embedded within the code of a single program. There is little control over interrupt processing in the language, so abends cannot be intercepted and redirected. I/O processing is more limited than some languages in not supporting random access files and in very limited string processing.

1 C was developed at Bell Labs by Kernighan & Ritchie, 1978.

2 For instance, Cooper & Clancy, 1985, is a frequently used Pascal text.

Pascal is similar to C on the nontechnical characteristics in that the readability, ambiguity, locality, and so forth of the language are dependent on the author using indentation and separation of statements to ensure these characteristics. But, unlike C, the language constructs of Pascal support readability once the indentation is done. Pascal requires less technical knowledge of hardware or operating systems to be efficient.

Because Pascal was developed as a teaching tool, automated programming support environments are available at least in academic settings.3 These environments require the student to enter the construct desired; the software then displays a template of options for which the student fills in the blanks of the selected subconstructs. There are also many automated testing aids such as visual execution environments available to support Pascal program testing.

PROLOG

PROLOG is short for PROgramming in LOGic. PROLOG is the only strictly artificial intelligence language included in this group. PROLOG was developed at the University of Marseilles in the early 1970s with the most common version in the United States that of David H. D. Warren. PROLOG is a goal-oriented, declarative language with constructs for facts and rules. PROLOG facts are pieces of concrete, factual information. A fact might be: "A part of a widget is a wid." Another fact might be: "A wid weighs 1.25 pounds." PROLOG rules define how facts are assembled to make information. An example of a rule might be: "If a widget is overweight, check the weight and tolerance of each component."

3 Thomas Reps, MIT, developed a Pascal programming environment for Cornell as part of his dissertation [Reps, 1984].

PROLOG goals are data that match some selection criteria, for example, the probable cause of a manufacturing problem specified in the query: What could cause finished widgets to be 3.2 pounds overweight? Subgoals, which would be subprograms in the terminology of the other languages, are determined from the goal. In the example above, widget components, their weight, weight allowances, and how each is used in widget manufacturing might all be subgoal information to be determined to answer the query. Goals are satisfied/answered by satisfying all subgoals. When a subgoal fails, an alternative for arriving at similar information is found via logical backtracking through the rules. The subgoal might remain unsatisfied, leading to a low level of confidence in the deduced answer.

Although the constructs for PROLOG are similar in many ways to those of declarative, procedural, and object languages, there are many significant differences in both data and program processes (see Table 15-1). Data are facts that are normally stored in the program rather than as separate files. This is a limitation in using PROLOG for general purpose business processing.

Program control is maintained through the ordering of clauses for execution and through the use of verbs like fail, which initiates backtracking by failing a subgoal, or cut, which prevents any more backtracking when a subgoal is fulfilled. Subprograms are simulated via call/return processing to clauses. Iteration is performed via recursive processing of rules.
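The goal, subgoal, and backtracking behavior described above can be approximated outside PROLOG. The following is a rough sketch in Python, not PROLOG, and it does not implement logical unification. All of the facts (the component names other than "wid", the measured and nominal weights, and the tolerance) are hypothetical values invented for the widget example. Generators stand in for backtracking: each time a subgoal fails, the search moves on to the next candidate fact.

```python
# Facts, stored in the program itself as in PROLOG (hypothetical data).
PART_OF = [("widget", "wid"), ("widget", "gear"), ("widget", "case")]
MEASURED = {"wid": 1.25, "gear": 2.0, "case": 4.0}   # pounds
NOMINAL = {"wid": 1.25, "gear": 1.5, "case": 1.3}    # pounds
TOLERANCE = 0.05                                     # pounds


def parts(assembly):
    """Subgoal: yield each part of the assembly, one candidate at a time."""
    for whole, part in PART_OF:
        if whole == assembly:
            yield part


def overweight_parts(assembly):
    """Goal: find every part whose excess weight is beyond tolerance."""
    for part in parts(assembly):
        excess = MEASURED[part] - NOMINAL[part]
        if excess <= TOLERANCE:
            continue  # subgoal fails; fall back to the next candidate
        yield part, round(excess, 2)


def first_overweight_part(assembly):
    """Stop at the first solution, loosely analogous to a cut."""
    return next(overweight_parts(assembly), None)


if __name__ == "__main__":
    print(list(overweight_parts("widget")))
    print(first_overweight_part("widget"))
```

Asking for all solutions exhausts every candidate, while asking for only the first one stops the search early, which is the practical difference between letting backtracking run and cutting it off.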

How one rates PROLOG on the nontechnical aspects of the language depends on the size of the problem being automated. For small problems, the language can be compact, local, and linear. For large problems, the language can be highly ambiguous, noncompact, difficult to follow in a linear manner, and without local references to facilitate understanding. Ironically, PROLOG is viewed as a good language for novices with little exposure to procedural languages. It is easy to learn if one can think in the goal-oriented manner of the language.

Smalltalk

Smalltalk was developed as both operating environment and language during the 1970s at the Xerox Palo Alto Research Center by the Learning Research Group. It is an object-oriented language that treats everything as an object, even, for instance, integers. Smalltalk is highly customizable and can, therefore, be used to design efficient applications.

Many important object-oriented concepts are embodied in the language, including abstraction, encapsulation, and some class processing (see Chapters 11 and 12). Abstraction is the definition of identifying characteristics of an object. Encapsulation is the term used to describe the packaging of data and allowable processing on that data together. Objects communicate with each other only by message passing. An individual object is an instance of a class. Classes describe objects that share common data and processes but that also may have data and processes that differ. For instance, the class employee might have subclasses manager, professional, and clerk. All subclasses are also employees and share that data and processing as well as their own. In addition, an individual might be a member of professional and manager classes at the same time.
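The employee example above can be sketched as follows. This is an illustration in Python, not Smalltalk. The method names and the combined class `ManagingProfessional` are assumptions introduced here, and Python's multiple inheritance is used only as a stand-in for the idea of an individual belonging to two classes at once; it is not a claim about how Smalltalk handles that case.

```python
class Employee:
    """Encapsulation: the data (name) and its allowable processing together."""

    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} is an employee"


class Manager(Employee):
    """Shares Employee data and processing, and adds its own."""

    def approve(self, request):
        return f"{self.name} approves {request}"


class Professional(Employee):
    def bill(self, hours):
        return f"{self.name} bills {hours} hours"


class ManagingProfessional(Manager, Professional):
    """Hypothetical class for an individual who is both at the same time."""


if __name__ == "__main__":
    pat = ManagingProfessional("Pat")
    # Calling a method plays the role of sending a message to the object.
    print(pat.describe())
    print(pat.approve("the budget"))
    print(pat.bill(8))
```

Each call on `pat` is answered by whichever class defines that behavior, so the caller never needs to know where in the class hierarchy the processing lives.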

Smalltalk is a full-function, unconstrained programming language that can literally be used to do anything (see Table 15-1). The major weakness of Smalltalk is that it does not specifically support persistent objects, also known as files. But if the file is an object, then it, too, can be processed in Smalltalk.

The strength of Smalltalk is in its use for event-driven processing as in process control, heating system monitoring, or just-in-time notification of manufacturing needs. These types of applications use nonpersistent messages from the external environment to drive the processing done by the application; these applications do not necessarily need files for processing. Similarly, message processing support in Smalltalk assumes point/pick devices, such as a mouse, for interactive, nonpersistent communication with the application user. The only major caveat on Smalltalk use is that object orientation, and therefore object-oriented programming, requires a different kind of thinking than procedural language programming such as COBOL.

Ada

Ada, the official language of the U.S. Department of Defense, with a user population in the hundreds of thousands, has had more thought about its implementation than any other language. Ada was named after Ada, Countess of Lovelace, who originated the idea for stored programs to drive the use of computing devices.

Ada's design by committee has not resulted in a perfect language, but in one that is better than most. Current versions of Ada are object based rather than object oriented. In object-based applications, programs are cooperative collections of objects, each of which represents an instance of some object type. All object types are members of a hierarchy of types which are linked through processing rather than through inheritance relationships. Classes, rather than types, are not formally recognized; there are no persistent objects such as files, and inheritance is not supported (see Table 15-1).

Ada files, as in Smalltalk, are defined as a type within the constructs of the language and all processing is on the type. Also, there is no real message processing in Ada, at least as of 1992. Rather, the system is fooled through function calls and parameter passing to simulate message processing. Like Smalltalk, Ada's strength is its ability to support event-driven processing, like missile guidance in embedded defense-related systems.

Future versions of Ada are expected to adopt multiclass inheritance structures and processing, dynamic binding of objects, real message processing, and persistent objects that provide a variety of data structures. With these extensions, Ada will be suitable for virtually any application. The same warning about the difference in object-oriented thinking expressed about Smalltalk is also appropriate here: Object-oriented design and program development is different in kind from procedural development of applications via languages such as COBOL.

PROGRAMMING LANGUAGE EVALUATION

Two ways of matching program languages are considered in this section. The first is to match the programming language to the application type (from Chapter 1). The second is to match the language to the methodology used for developing the application (from Chapters 7-13).

Language Matched to Application Type

Few heuristics have been available to guide programmers in matching a programming language to application type. The lack of heuristics is due mostly to the newness of most languages and their restricted use in academia (e.g., Pascal and PROLOG). Part of the reason for a lack of heuristics is also because most businesses have developed only transaction processing applications until the late 1980s; one or two languages were sufficient for most computing in the organization. With the development of query languages, AI applications, and object orientation, more languages have proliferated and heuristics have slowly developed. Keep in mind that as experience with emerging paradigms, such as object orientation and intelligent applications, grows, the heuristics will be refined and changed from those presented here. For each application type discussed in Chapter 1, the normally relevant characteristics and language choices are discussed below and summarized in Table 15-2.

Transaction processing applications are divided for classification into batch, on-line, and real-time as the predominant form of processing. For batch applications, COBOL and Focus are best suited (see Table 15-2). For on-line applications, all languages except Fortran and PROLOG might be used. Fortran is excepted because of its poor I/O processing; PROLOG is not recommended because data are usually embedded in the code, precluding most TPS processing. The language actually chosen should be based on the transaction volume, with high volume TPS moving away from the SQL and 4GL languages toward compiled, full-function languages. If there is a DBMS or other special data access software, the choices narrow to Focus or COBOL depending on the specific DBMS.

Some business systems are specialized because they are real-time and have stringent response time requirements in addition to being critical to at least one organization. Examples of real-time TPS include airline reservations, securities transaction processing, manufacturing process control, robotics control, or analog I/O applications. For such systems, the language recommendations are restricted to C, Pascal, Ada, and Smalltalk (see Table 15-2). Any of these languages can be used to develop reentrant, multiuser, real-time applications, although attention to a specific dialect (or vendor version) is required to choose a reentrant version of the language. An alternative is to develop such applications using assembler language as the reentrant base with one or more of the application languages used for individual modules.

Query processing is restricted to SQL, Focus, and PROLOG (see Table 15-2). SQL, Focus, and PROLOG support declarative statements of what is desired without having to anticipate the outcome in advance. As such, they are the only three languages of these ten to support query processing. PROLOG has the added feature that it can explain its reasoning process and provide probabilities of accuracy for its data. Both SQL and Focus assume they are working on complete information and there is only one answer to a given query. PROLOG can be programmed to develop confidence estimates in answers as well as to develop all possible answers to a query.

Data analysis applications are those in which statistical routines, trend analysis, or other mathematical manipulation of data is desired. Data analysis applications can be programmed or can use packages combined with programs. For such applications, Focus, Fortran, Pascal, PROLOG, Ada, and Smalltalk might be used (see Table 15-2). COBOL is conspicuously absent from this list because it is not as adept at data analysis as other languages. Focus provides statistical modeling, financial modeling, graphical processing, and query processing all within its one language. As such, it is the most full-function data analysis tool in this group. The other languages have the individual tools for a programmer to build a data analysis application, but the assumption is that some processing would be done by general purpose modeling languages (e.g., Statistical Analysis System, or SAS⁴). If complex simultaneous equations are required, Focus is not the appropriate language. Then, choices are restricted to Fortran, Ada, or Smalltalk. Fortran does not actually provide simultaneous equation solutions, but it can be 'fooled' into performing as if it does. The other languages are better choices for simultaneous equation processing. Some dialects of C (i.e., Concurrent C) and Pascal (i.e., Object Pascal) might also be used for simultaneous equations.

TABLE 15-2 Application Type Matched to Language

Application Type SQL Focus BASIC COBOL
TPS-Batch X X X X
TPS-On-Line X X X X
TPS-Real-Time
Query X X
DSS/Data Analysis X X
AI/Expert Systems
EIS X

ESS or DSS applications may have changing requirements that are not well understood due to the unstructured nature of the problem domain. For such applications, C, Pascal, PROLOG, Ada, or Smalltalk might be used (see Table 15-2). One or more of these languages might be combined with purchased software packages to provide all the functions of such applications.

4 SAS is a registered trademark of the SAS Corporation, Cary, NC.

TABLE 15-2 (continued)

Fortran C Pascal PROLOG Ada Smalltalk
X X X X
X X X
X X X X X X
X X X

GDSS applications almost always use packages to support group decision processes, but might use C, Pascal, PROLOG, Ada, or Smalltalk for part of the processing, depending on the environment (see Table 15-2).

Finally, artificial intelligence applications, specifically expert systems, might use PROLOG (see Table 15-2). Only PROLOG supports inference through logic programming. None of the other languages is appropriate to AI applications.

Language Matched to Methodology

The experience with methodologies is similar to that of languages in that few heuristics are known to guide methodology selection. Rather, at the present time, a company tends to adopt and learn one methodology and use it for all applications, whether appropriate or not. The position taken here is that the methodology and language should match the application type. In this section, the ten languages are matched to the methodologies that were discussed in Chapters 7-13.

Process methodologies, which prevailed in business until the mid-1980s, are most successfully used with SQL, Focus, BASIC, COBOL, Fortran, C, Pascal, and Ada (see Table 15-3). The other languages require too much attention to data or program design to lead to optimal language use with process methods. Also, process methods should not be used with data-intensive applications because such methods give no specific attention to data. The C language is here because it is process oriented; if C++ were the language, it should only be used with object-oriented (OO) methods. Similarly, Ada can be used here but it is best used with OO methods.

Data methodologies balance the design of processes and data evenly and are useful with SQL, Focus, COBOL, C, and Ada applications (see Table 15-3). For interactive applications in which the programmer needs only limited control, SQL and Focus are useful. For more complex applications, COBOL, with a DBMS and telecommunications monitor, provides interactive processing capabilities. The process discussion on C and Ada applies here; both languages can be used with data methods but are recommended with OO methods.

Finally, for object methodologies, C++, PROLOG, Ada, and Smalltalk are most likely to lead to successful implementations (see Table 15-3). The languages omitted in the object category do not easily support one or more of the object tenets of polymorphism, message passing, class inheritance, or encapsulation.

TABLE 15-3 Application Type Matched to Methodology

Methodology  SQL  Focus  BASIC  COBOL  Fortran  C    Pascal  PROLOG  Ada  Smalltalk
Process      X    X      X      X      X        X    X               X
Data         X    X             X               X                    X
Object                                          C++          X       X    X
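The four object tenets named in this section (polymorphism, message passing, class inheritance, and encapsulation) can be shown in a few lines. The sketch below is an illustration only: it is written in Python, which is not one of the ten languages evaluated in this chapter, and the `Account` and `StudentAccount` classes and the fee amounts are hypothetical.

```python
class Account:
    """Encapsulation: the balance is reached only through methods."""

    def __init__(self, balance):
        self._balance = balance

    def balance(self):
        return self._balance

    def monthly_fee(self):
        return 5.0

    def charge_fee(self):
        # The fee rule is looked up on the receiving object at run time.
        self._balance -= self.monthly_fee()


class StudentAccount(Account):
    """Class inheritance: reuses Account and overrides one behavior."""

    def monthly_fee(self):
        return 0.0


accounts = [Account(100.0), StudentAccount(100.0)]
for acct in accounts:
    # Message passing: the same request is sent to each object.
    # Polymorphism: each object answers with its own fee rule.
    acct.charge_fee()

balances = [acct.balance() for acct in accounts]
print(balances)
```

A language that lacks any one of these mechanisms forces the programmer to simulate it by convention, which is the difficulty the paragraph on object methodologies describes.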

AUTOMATED SUPPORT FOR PROGRAM DEVELOPMENT

In the age of the smart machine, developmental aids, CASE environments, code generators, and testing aids such as debuggers, incremental compilers, windowed execution environments, and so on, all speed development of working code. Any language which has such automated development aids is assumed to lead to increased programmer productivity over languages that do not have such aids (see Table 15-4).

CASE tools frequently have built-in code generators or have interfaces to other vendors' code generators, allowing you to mix and match the development environment and the language generated.

The automated support tools include code generation tools, incremental compilers, and program generation environments. All of these are loosely called Lower CASE or Back-end CASE tools.

TABLE 15-4 Automated Support Tools for Code Generation

Product
ADW-Construction Workbench
C Development Environment, OOSD/C++
Developer Assistant for Information Systems (DAISys), Secure user Programming by Refinement/DAISys
IEW
NeXTStep 3.0
ObjectMaker
Software Through Pictures
System Architect
Teamwork, Ensemble
Visible Analyst Workbench

Company
Knowledgeware, Inc. Atlanta, GA
Environments (IDE) San Francisco, CA
S/Cubed Inc. Stamford, CT
Texas Instruments Dallas, TX
NeXT Computer Redwood City, CA
Mark V Systems
Integrated Development
Popkin Software & Systems Inc. New York, NY
Cadre Technologies Providence, RI
Visible Systems Corp. Newton, MA

Technique
Builds Pseudocode for modules that can be used to Generate Code for MS-DOS, MVS
Object-oriented C++ code development environment
Generates COBOL for IBM mainframe, AS/400, OS/2
Generates C Code for MS-DOS, OS/2
Generates COBOL with Embedded SQL
Generates C Code for MVS, MS-DOS, OS/2
Interfaces to Telon and other Code Generators
Object Oriented DB development environment
Generates C or C++ Code for MS-DOS, VMS, Unix, AIX
Generates C or C++ Code for Unix, AIX
Generates C Code for MS-DOS, OS/2
Generates C or C++ Code for Unix, OS/2, AIX
Generates C Code for MS-DOS

SUMMARY

In this chapter, a number of distinguishing characteristics of languages were defined. These included data type definitions supported, data type checking, operators supported, type of user processing supported, and processing for loops, conditional statements, arrays, I/O, and subprograms. In addition, nontechnical characteristics included uniformity, ambiguity, compactness, locality, linearity, ease of code translation, portability, compiler efficiency, and availability of CASE, code generation, and testing tools. Each of the ten languages was described according to the characteristics. Then the languages were defined as appropriate for supporting different application requirements and were discussed in terms of their support for development of transaction, query, data analysis, DSS, ESS, and ES applications.


KEY TERMS

Ada, ambiguity, array, array-oriented I/O, automatic type coercion, BASIC, bit data type, Boolean, C, case statement, character string, COBOL, compactness, compiler efficiency, conditional statement, control language constructs,
data type, data type checking, date data type, dynamic memory management, ease of code translation, exception handling, exit, Focus, Fortran, global data, input/output (I/O), integer,
language constructs, linearity, list-directed I/O, local data, locality, logical data type, loop, memory management, mixed mode type checking, modularization, object, operator precedence,
Pascal, persistent object, physical I/O, pointer, portability, programming, PROLOG, PROLOG facts, PROLOG goals, PROLOG rules, PROLOG subgoals, pseudostrong type checking,
reentrant, real number, record-oriented I/O, recursive, reusability, set-oriented I/O, Smalltalk, SQL, static memory management, strong type checking, table, typeless checking, uniformity, user-defined data type

EXERCISES

1. For any (or all) of the cases in the Appendix, define the application concept as batch, on-line, real-time, or a mix of these. For the applications you choose, select an implementation language and develop the reasons why the language you recommend is best. What specific features and characteristics of the language make it your preferred choice?

STUDY QUESTIONS

1. Define the following terms:
   Boolean data type
   dynamic memory management
   local data
   modularization
   operator precedence
   pointer
   reentrant
   set-oriented I/O
   static memory management
   type checking
   user-defined data type

2. Why should we concentrate on language selection rather than on programming?

3. In your opinion, is programming going to disappear as an activity? Justify your response.

4. What is a data type and why is it important in language selection?

5. When is strong type checking important?

6. Why do you think type checking is absent from a language like COBOL?

7. Why is type checking important in object-oriented programs?

8. Define three logic-related language constructs and discuss their differences.

9. What is operator precedence? Why, as a programmer, must you be aware of operator precedence in a language?

10. In an ideal program, how many exits should a module contain? Why?

11. Define the three types of arrays that are commonly supported in languages.

12. For SQL, COBOL, Fortran, Ada, C, and Pascal, define the type of I/O orientation as record-oriented, set-oriented, array-oriented, or list-directed. What difference does the I/O orientation make?

13. What are the differences between local and global data? How do they relate to properties of programs such as reusability, reentrancy, and recursion?

14. Contrast static and dynamic memory management.

15. Why is exception handling desirable in a language? Why don't all languages support exception handling?

16. What level of code sophistication is required to support multiple concurrent users? Why?

17. What is the relationship of recursion, reentrancy, and reusability of programs?

18. List three nontechnical language characteristics and describe why they are important in language selection.

19. Define language portability. Is this property of growing or decreasing interest to businesses, and why?

20. What is COBOL's appeal?

21. Why is C a potentially dangerous language?

22. Describe how and why PROLOG differs so much from the other nine languages in this chapter.

23. How does PROLOG handle databases?

24. What are the object-oriented languages? How do they differ from the other languages?

25. Even though SQL and Focus both use implicit I/O, they are different. What is the main difference in the way they treat data? Which language is 'cleaner' in guaranteeing the results of a query?

* EXTRA-CREDIT QUESTIONS

1. PROLOG is not the only logic-oriented, artificial intelligence programming language. Lisp is also popular. Investigate the differences between the two programming languages using the characteristics discussed in this chapter.

2. Object orientation and artificial intelligence are two characteristics of applications that are of growing interest to businesses. Can a typical COBOL transaction processing application incorporate object and AI tenets? Will COBOL change or will other languages come to be used? Can other languages be 'grafted on' or interfaced to COBOL gracefully? Be sure to document your arguments.

CHAPTER 16 PURCHASING HARDWARE AND SOFTWARE

INTRODUCTION

When PC software companies first created the end-user market in the early 1980s, the number of PCs in companies was about one for every 4,000 people. By 1986, the number of PCs was about one for every 100 people; companies had settled on standard, supported products for spreadsheets, databases, and word processing. In the intervening years, there was a mad scramble for market share during which vendors' claims were sometimes unfounded, the notion of vaporware was created, and major evaluations were done by buying companies. For every new market that develops, a similar set of activities takes place. In the 1990s, object-oriented languages, expert systems, imaging systems, multimedia, CASE products, and distributed databases are the new markets that will have developed recognized leaders by the end of the decade. At best, a company selects a product and vendor that will weather the storms of industry growth and emerge a leader. At worst, it purchases several products before settling on one that works for the company.

The purchasing process tries to minimize the guesswork and provide a rational, objective method of selecting hardware, software, or services. The techniques can be used on products of any type. There are two basic processes, one informal and one formal. There is a great deal of overlap in the activities. The major difference is that the formal process is usually conducted in a more open environment, frequently for legal compliance. All governmental contracting for goods and services, for instance, is subject to a formal procurement process that includes the solicitation of proposals from vendors.

In this chapter, we discuss how to evaluate and choose between alternatives for application use. The trade-off between building the item in-house and purchasing it elsewhere is commonly called a make-buy decision. This name is not always accurate, however, because you might be comparing development alternatives, for instance, having a consulting company build a customized application versus purchasing a software package. These alternatives all are considered in the make-buy decision process. RFPs can be used for deciding between vendors that have the same package but are selling turnkey products including all hardware and software in an 'environment,' or for hardware only, software only, services only, or some combination of those three.

In this chapter, we first discuss the formal procurement process, describing the steps performed in the purchasing decision process. The informal process is then described and compared with the formal process. Then, the contents of each RFP section are detailed. Next, we discuss the selection process and criteria that are important to it. Finally, automated support tools for RFP management and evaluation are presented. The ABC case is woven throughout the discussion, providing examples of the major points.

REQUEST FOR PROPOSAL PROCESS

A request for proposal, or RFP, is a formal, written request for bids on some product. In our context, an RFP might relate to hardware, firmware, software, or services such as programming or operations management. Also called RFQ, for request for quotation, an RFP provides formal requirements, ground rules for responses, and, usually, a standard format for the proposal responses. The basic stages of the request for proposal process, which are discussed in the ensuing sections, include the following:

1. Develop and prioritize requirements
2. Develop schedule and cost
3. Develop requests for proposal
4. Receive proposals
5. Evaluate proposals and select alternative

Develop and Prioritize Requirements

The initial step in all software engineering projects, regardless of whether it is going out for bids or not, is to determine the requirements. When proposals are solicited, the requirements define the problem and the features and functions of the solution that will constitute the work of the bidding companies. In general, the requirements provided in an RFP are identical to those developed during analysis. If a requirements specification is available, it should be appended to the RFP and referenced in the document. If no requirements specification has been developed, at a minimum, the topics summarized below should be provided.

1. General instructions
2. Statement of work
3. Technical specifications
4. Management approach
5. Financial requirements
6. Company information requirements
7. Vendor response guidelines
8. Standard contract terms and conditions

The level of detail and specificity of the requirements varies with the context, situation, and company. Some companies spell out every item in excruciating detail, leaving nothing to the vendors' imaginations. The advantage of such detail is that the proposals can be easily compared to the list of requirements to determine compliance with the basic request. Also, the likelihood of misunderstanding of requirements is lower when more detailed descriptions are used. The disadvantage of detailed requirements is that, in information systems work, the complex engineering nature of the work frequently requires creative design that might be stifled or overshadowed by too specific a requirements list. The creative aspects of systems design also provide for cost differentiation that might not otherwise surface. To overcome this problem, when creativity is desired, it can be specifically identified as a selection criterion in the RFP.

There are four types of requirements: technical, managerial, financial, and company. Technical requirements address the specific hardware, software, or services to be provided. Managerial requirements identify the level of detail at which schedule, staff plans, and staff management should be discussed in the proposal. Financial requirements list the type of bid desired and the expected format for the financial portion of the response. Company requirements list the type of vendor information to be supplied to assure the client of vendor ability to complete the work successfully. The details of each section are discussed in the RFP contents section.

Develop Schedule and Cost

The schedule and cost developed during an RFP process are neither as detailed nor as refined as they would be if the item were developed in-house. If the in-house estimate is being compared to the vendors' estimates in a make-buy decision, a detailed schedule and cost should be developed. If the RFP is comparing only external purchase options, less detail and precision are required. In this case, the schedule provides an estimated end-date for the item to be used in comparing the proposals. The expected end-date might be omitted and left as a proposal item, or might be listed as either required or desired in the proposal.

Occasionally, a user manager will mandate the desired completion date for a project. In that case, the in-house estimates are developed to determine the realism of the mandated date. If the date is unlikely because it is very different from the estimate, the vendors can be asked in the proposal requirements how they deal with completion date problems and a tight schedule.

The planning process is the same as that followed in Chapter 6, with the level of precision adjusted to fit the situation. Requirements are converted into a task list. Each task's development time is estimated for the most likely outcome. Sophisticated estimates, including optimistic, average, and pessimistic times, may or may not be developed. During the proposal evaluation process, vendor time estimates are compared to the planned completion date.
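One common way to combine optimistic, most likely, and pessimistic times is the weighted average used in PERT, (O + 4M + P) / 6. The sketch below assumes that formula, which may differ from the method presented in Chapter 6; the task names, the times in weeks, the vendor estimates, and the assumption that tasks run one after another are all hypothetical.

```python
def three_point_estimate(optimistic, most_likely, pessimistic):
    """PERT-style weighted average of three time estimates (assumed formula)."""
    return (optimistic + 4 * most_likely + pessimistic) / 6


# Hypothetical task list: (optimistic, most likely, pessimistic) in weeks.
tasks = {
    "install hardware": (2, 3, 4),
    "convert files": (1, 2, 3),
    "train users": (1, 2, 9),
}

# Assumes the tasks are performed sequentially, so the times simply add.
planned_weeks = sum(three_point_estimate(*times) for times in tasks.values())

# Hypothetical vendor schedule estimates, compared with the plan.
vendor_weeks = {"Vendor A": 7.5, "Vendor B": 11.0}
difference = {name: weeks - planned_weeks for name, weeks in vendor_weeks.items()}

print(planned_weeks)
print(difference)
```

A large positive or negative difference is not a reason to reject a vendor by itself, but it is a signal to ask how the vendor arrived at the schedule.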

A similar activity is done for personnel estimates. A rough estimate of the number of people and their skill levels should be developed, based on the tasks and times for each task. During proposal evaluation, the estimated project team skills are matched against the skills of the people to be assigned to the project by each vendor. The closeness of match indicates several things. First, the closer the match, the more confidence you can have that the vendor understands the problem. Second, the closer the match, the more likely the vendor's reasoning is consistent with your reasoning about the project's needs. Third, the less close the match, the more likely the vendor is staffing the project with people who are learning new skills and who, therefore, will not be fully knowledgeable about the technology or application area of your problem. This third case is not necessarily bad, but it does imply that there will be one, or possibly two, key person(s) on whom the success of the project rests. This places you, as the client, in a somewhat more vulnerable position because you must rely totally on the key person(s), ensuring that they remain on the project until it is operational.
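The closeness of match between estimated and proposed skills can be expressed as a simple ratio. The following is a minimal sketch under stated assumptions: skill levels are rated on a hypothetical 1 to 3 scale, and the skill names, the vendors, and the rule that a skill counts as matched when the vendor's best level meets or exceeds the estimate are all illustrative, not part of the text.

```python
# Hypothetical estimate of skills the project needs (1 = novice, 3 = expert).
required_skills = {"COBOL": 3, "SQL": 2, "telecommunications": 2}

# Hypothetical best skill level offered by each vendor's proposed staff.
vendor_skills = {
    "Vendor A": {"COBOL": 3, "SQL": 3, "telecommunications": 1},
    "Vendor B": {"COBOL": 2, "SQL": 1},
}


def match_ratio(required, offered):
    """Fraction of required skills the vendor offers at or above the needed level."""
    met = sum(1 for skill, level in required.items() if offered.get(skill, 0) >= level)
    return met / len(required)


def gaps(required, offered):
    """Skills where the proposed staff would be learning on the job."""
    return sorted(skill for skill, level in required.items() if offered.get(skill, 0) < level)


for vendor, offered in vendor_skills.items():
    print(vendor, round(match_ratio(required_skills, offered), 2), gaps(required_skills, offered))
```

The list of gaps is the useful output: it names the areas in which the project would depend on one or two key people.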

Staffing estimates are used to develop personnel costs for the project. If the proposal includes hardware or software, each item should be priced at the best retail prices available. For instance, MacWorld and PC magazines include tear-out pages of advertising by discount vendors for both hardware and software. Professional data sources, such as DataPro™,¹ provide retail prices which can be used as a basis against which proposed costs might be evaluated.
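The arithmetic behind such a baseline is straightforward. The sketch below is illustrative only; the roles, head counts, weekly rates, item prices, and the vendor bid are hypothetical figures, not data from the text.

```python
# Hypothetical staffing estimate: (role, head count, weeks, weekly rate in dollars).
staffing = [
    ("analyst", 1, 8, 2000.0),
    ("programmer", 2, 6, 1500.0),
]

# Hypothetical purchased items: (item, quantity, best retail unit price).
items = [
    ("file server", 1, 12000.0),
    ("terminal", 10, 900.0),
]

personnel_cost = sum(count * weeks * rate for _role, count, weeks, rate in staffing)
purchase_cost = sum(quantity * price for _item, quantity, price in items)
baseline = personnel_cost + purchase_cost


def variance_pct(bid, baseline_cost):
    """Percentage by which a vendor bid exceeds (+) or undercuts (-) the baseline."""
    return 100.0 * (bid - baseline_cost) / baseline_cost


print(personnel_cost, purchase_cost, baseline)
print(round(variance_pct(60500.0, baseline), 1))
```

With these figures the baseline is $55,000, so a bid of $60,500 is 10 percent above the in-house estimate.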

Develop Request for Proposal

The steps in developing the RFP are first, to determine likely vendors; second, select from the likely vendors the few that best meet your requirements; and third, develop and send the proposal to the vendors.

Determine Likely Vendors

Several stages of information gathering precede the actual bidding process. First, potential vendors are identified. Vendor identification can be from a commercial information service, such as DataPro™, or from trade magazine advertisements, for instance, from PC Magazine, Computerworld, or Network Week. This process should identify ten or more vendors.

Narrow the Number of Vendors

When potential vendors are identified, they are contacted and requested to send information. Depending on the company and item, this can be an informal telephone call or can be a formal, written request for information (RFI). Documentation on the products requested is reviewed to narrow the number of alternatives to a manageable few, usually between two and five.

The information review frequently identifies a need for more information to differentiate between products. Either requirements are refined or more information is obtained, or both. Another round of information gathering might then take place. At this point, remaining vendors might be called in to present their product(s) and demonstrate how they work. Specific technical questions to provide missing information are asked.

1 DataPro is a trademarked name of DataPro, Inc., Delran, NJ.

The decisions after this round of information gathering depend on the nature and use of the product being purchased. If the number of users is small and the product is inexpensive (e.g., under $10,000), a selection might be made. The more users and the more expensive the product, the more extensive the evaluation. Other companies that use the product might be solicited for experience with the company and product, and perhaps are visited for an on-site demonstration. In these cases, when the field of vendors is narrowed to between two and five, an RFP is developed and proposals are requested.

Develop and Send the Proposal to Vendors

The RFP can be developed in parallel with vendor identification. There is some risk, however, that doing so will produce a biased requirements set that favors one particular vendor. The best approach, therefore, is to develop the requirements first, then search for vendors. When the vendor list has been narrowed to between two and five, the RFP is finalized, vendors are notified that they will receive the proposals, and the proposals are sent or delivered to each vendor. From this point, the requesting company begins to manage the proposal process.

Manage Proposal Process

The proposal process begins with release of an RFP to vendors and continues until the proposals are delivered and the selection process begins. The proposal process might include one or more formal meetings, informal meetings, inquiry sessions, or other methods of information exchange between the vendors and the requesting organization. The more money involved and the more complex the proposed work product, the more process management is needed to ensure equitable treatment of all vendors. Equitable treatment means ensuring that all vendors receive the same information. Firm compliance with due dates and locations for delivery of proposals is maintained. Late or incorrectly delivered proposals are dropped from further consideration, providing equitable treatment of all vendors.

Assume a proposal is being let by the local police department for development of an application that would deploy computer terminals in each police car for interactive look-up of license plates, arrest warrants, and moving violations. The application requires both hardware and software to be developed for 14,000 police cars in a large metropolitan area with over 3,000,000 inhabitants and covering several jurisdictions. Examples might be Washington, D.C., Los Angeles, New York City, Houston, or Chicago. Hardware cost alone is over $2,000,000. The databases each will have millions of entries with issues to be resolved about how and when information is removed from the files. Interfaces to several other applications for license plate information and access to arrest warrants from multiple local and national databases are desired.

The proposed application has several sources of complexity, the least of which is that vendors probably know little about how a police officer spends his day. When New York City let a similar contract for its police force, they had a formal announcement of the proposal to vendors. Vendors were selected and invited to the presentation by mail based on previous contract work or reputation. Nonsolicited vendors were also welcome in response to announcements of the RFP that ran in the local newspaper for several days.

At the formal presentation, each vendor was invited to spend up to four hours traveling with an officer to view firsthand the tasks for which the application would be built. A specific officer was identified as the liaison for these tours.

In addition, the liaison officer was available for questions at any time until proposals were submitted. If questions were asked by a vendor, the question and response were recorded and a list of all such queries was sent to all vendors attending the proposal announcement meeting. The purpose of providing all queries and responses to all vendors was to ensure that information inadvertently left out of the RFP that might alter the decision process could not be used by one vendor to the detriment of the others. By giving everyone all responses, every vendor had the same information.

Halfway through the two-month proposal process, another meeting was held for vendors to come ask more questions and to clarify the requirements from the document. That meeting was well attended but contained no real information. When one person was asked why he bothered attending, he replied, "To see what the competition asked."

Each vendor presented his or her proposal on the due date and left the written copy for NYC review. Each vendor, then, heard the other vendors' proposals and had some sense of the differences between them. Ironically, the company with the best solution lost because the company was too small. One shortcoming of the RFP was that it had not identified company size as a selection criterion; if it had, the vendor would not have wasted his time bidding.

Evaluate Proposals and Select Alternative

The sections of the proposal responses are each evaluated separately, then summarized together. The technical evaluation verifies that requirements are met and scores the proposal based on the priority criteria developed during the preparation of the RFP. A benchmark, or comparison test, might be used to identify differences between hardware or software packages.

The management approach is evaluated for the type, quality, and nature of staff and vendor company resources proposed for the work. A financial evaluation is developed to show the present value of the proposed amount(s). Other analyses, such as payback period or average cost per vendor employee, might be developed for comparison purposes. Next, the vendor's prior experience with the firm, similar applications, and business reputation are ranked to evaluate the vendor's capability to do the proposed work. Finally, each section is weighted again for comparative section importance, creating a summary of the ratings and final weighted score for each vendor. Objectively, the vendor with the highest overall weighted score is selected for the work. Each type of evaluation is discussed in the evaluation sections. After selection, a contract is negotiated and work begins.
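The financial and summary steps of the evaluation can be sketched in a few lines. This is an illustration under stated assumptions, not a prescribed method: the discount rate, payment schedule, section weights, the 0 to 10 rating scale, and the vendor ratings are all hypothetical.

```python
def present_value(payments, rate):
    """Discount a list of yearly payments; payments[0] is due now (year 0)."""
    return sum(amount / (1 + rate) ** year for year, amount in enumerate(payments))


# Hypothetical bid paid as 100,000 now and 50,000 in each of the next two years.
bid_pv = present_value([100000.0, 50000.0, 50000.0], rate=0.10)

# Hypothetical section weights (sum to 1.0) and section ratings on a 0-10 scale.
section_weights = {"technical": 0.40, "management": 0.20, "financial": 0.25, "company": 0.15}
ratings = {
    "Vendor A": {"technical": 8, "management": 6, "financial": 7, "company": 9},
    "Vendor B": {"technical": 9, "management": 5, "financial": 6, "company": 6},
}


def weighted_score(section_ratings, weights):
    """Sum of each section rating multiplied by its comparative weight."""
    return sum(weights[section] * section_ratings[section] for section in weights)


scores = {vendor: weighted_score(r, section_weights) for vendor, r in ratings.items()}
selected = max(scores, key=scores.get)

print(round(bid_pv, 2))
print({vendor: round(score, 2) for vendor, score in scores.items()})
print(selected)
```

In this sketch Vendor B has the stronger technical rating, but Vendor A has the higher overall score once the sections are weighted, which is why the section weights deserve as much care as the ratings themselves.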

INFORMAL PROCUREMENT

Most of the same information required for the RFP is required for the informal procurement process. The major difference is in the approach. In the informal process, few, if any, written documents are used for vendor-client communications. Rather, telephone calls, meetings, and document reviews are the major sources of information. The process of selection is similar to that of the RFP process, including trials and benchmarks for acceptance of the item being procured.

Negotiation is verbal and may go back and forth between the principals for several weeks. Vendors signify agreement with the negotiated terms via a memo. A memo proposal summarizes the main points of agreement, then lawyers are called in, as with an RFP, to add the legal terms.

CONTENTS OF RFP

RFP contents include a summary, information on the technical, managerial, company, and financial aspects of the bid, a schedule of the process, selection criteria, vendor response requirements, and any standard contract terms (e.g., for EEO or OSHA compliance). Each RFP section is detailed below to identify optional and required information.

Vendor Summary

The Vendor Summary section provides a short, one-page summary of the work to be done (see Table 16-1). General terms and conditions of the proposal process usually come first to allow vendors to decide quickly whether or not they are interested in the engagement. The general instructions section should include proposal instructions, location and date for proposal delivery, dates for bidders' conferences, and contacts for status reporting and inquiries.

TABLE 16-1 Detailed RFP Outline

1.0 General instructions

2.0 Statement of work
    2.1 Description of work to be performed
    2.2 Project milestones and deliverable products
    2.3 Criteria for vendor qualification

3.0 Technical specifications
    Technical outlines are in Tables 16-4, 16-5, and 16-7 for hardware, network, or operating system, and customer software or package, respectively.

4.0 Management approach
    4.1 Schedule and staffing
    4.2 Support requirements of vendor
    4.3 Reporting
    4.4 Staff reporting structure and problem management

5.0 Financial requirements

6.0 Company information

7.0 Vendor response guidelines

8.0 Standard contract terms and conditions

Required Information

The requirements list details the requirements of the work as described in the sections on hardware and software. The section can refer to an attached document that might have been developed in-house for functional requirements of the application, hardware, or software. In any case, requirements should be listed and identified as mandatory or optional. A set of prioritized weights for the requirements should also be developed for use in scoring, but weights should not be published in the RFP. There are four general classes of requirements: technical, management, corporate, and financial.
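
One way to picture the mandatory/optional distinction and the unpublished weights is the following Python sketch. It assumes, for illustration only, that mandatory requirements act as a pass/fail screen and that optional requirements earn their weight when met; the text does not prescribe this particular scheme, and the requirement names and weights are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    name: str
    mandatory: bool
    weight: int  # internal priority weight; not published in the RFP


def score_requirements(requirements, met):
    """Return None if any mandatory requirement is unmet (assumed pass/fail
    screen); otherwise the percentage of optional weight the proposal earns."""
    for req in requirements:
        if req.mandatory and req.name not in met:
            return None
    optional = [r for r in requirements if not r.mandatory]
    total = sum(r.weight for r in optional)
    if total == 0:
        return 100.0
    earned = sum(r.weight for r in optional if r.name in met)
    return 100.0 * earned / total


# Hypothetical requirements list for illustration.
requirements = [
    Requirement("DBMS interface to SQL server", mandatory=True, weight=0),
    Requirement("Interface to Accounts Receivable", mandatory=True, weight=0),
    Requirement("Automated conversion tools", mandatory=False, weight=5),
    Requirement("On-site user training", mandatory=False, weight=3),
    Requirement("Computer-based training", mandatory=False, weight=2),
]

proposal_meets = {
    "DBMS interface to SQL server",
    "Interface to Accounts Receivable",
    "Automated conversion tools",
}
score = score_requirements(requirements, proposal_meets)
```

Here the proposal passes the mandatory screen and earns 5 of the 10 optional weight points, a score of 50.0. A proposal missing either mandatory item would return `None` and drop out before scoring.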

Technical Requirements

GENERAL REQUIREMENTS. The requirements should place the company and problem in a context for the vendors. First, a brief overview of the industry, company, and work domain is appropriate. Then, a summary of the problem being automated is presented. The major complexity, such as geographic dispersion across 16 states, should be identified. Then, the details of work to be provided are described.

DETAILED REQUIREMENTS. The work might include hardware, software, programming services, or other IS services. The criteria for each item should be detailed as much as possible. In general, regardless of the type of procurement, the features and functions of the equipment should be described in sufficient detail to enable the vendor to design a solution. Functional requirements (what the item is expected to do) are described in detail. Volume of data, throughput, response times, and growth requirements are identified. The type, contents, timing, and format of interfaces also are provided. A hardware interface might list, for instance, a network interface connection to a fractional T-1 (cable) service for internetwork communication. A software interface might list, for instance, a DBMS interface connection to a SQL server. An application interface might list, for instance, electronic messages to be sent to an Accounts Receivable Application.

For services, the work description varies depending on the work. The two most common service RFPs request proposals for software development and for outsourcing of operations. For software services, the application requirements are the information provided. For outsourcing operations, the business functions included and any existing job descriptions relating to those functions should be provided to the vendors.

Diagrams, tables, and lists should be supplemented by text to provide clarification of incomplete, misleading, or ambiguous diagrams. For instance, a data flow diagram cannot describe timing of processes or process interrelationships that might be important. Data flow diagrams also do not show constraints, the need for simultaneous processes, and so on. Requirements for these items would be described in text as required.

AUDIT AND APPLICATION CONTROL REQUIREMENTS. Recall from Chapter 10 that audit controls are frequently needed to prove processing. The audit and control section of the RFP identifies the minimum acceptable level of auditability required. If audit controls are in compliance with laws or other professional guidelines, the requisite laws and guidelines should be referenced.

Vendors' designs might assume human interventions to ensure accurate application processing. A requirement should be developed to surface such assumptions. For instance, controls might include data integrity, data and process access, exception management, and print control of prenumbered documents (e.g., checks). These examples usually require manual interventions supplemented with interactive processing to recover from failures or to fix hardware problems. For instance, a check might jam in a printer after it is printed. Both software and human procedures are required to reprint the check and to account for the damaged check. (See Chapter 10 for types of failures that should be planned.) Vendors should be required to identify and detail all such interventions as part of their proposal.

PERFORMANCE REQUIREMENTS. Performance requirements include manual, hardware, and software performance. For instance, hardware performance might define acceptable limits for downtime, precision for mathematical computation, or cycle time.

CONVERSION REQUIREMENTS. Recall that conversion requirements define the required changes from the current environment to the new automated environment (see Chapter 14 to review this discussion). The RFP typically identifies data for conversion, including the current format, current volume, and growth required. Conversion timing constraints should be identified if any exist. The vendors' designs should describe the target database for the data and a migration path for conversion. The vendors' conversion plans also should estimate conversion impacts on users, computer operations, and project staffing.

TRAINING. Training to be provided as part of the contract should be listed as a required topic in the vendors' responses. Training options can be left open to vendor proposal or be specified as requirements. Training might be provided for users, software maintenance staff, operations staff, or user support staff.

The type of training can be one-on-one, programmed, individually self-paced, classroom, computer-based training, or some variation of these. Training information provided might include the type, number of sessions, location, and audience for training. The qualifications of expected trainers should also be requested.

ACCEPTANCE. Acceptance criteria, specifying the contents and timing of the acceptance test, should be identified so the vendor knows how work will be judged. Acceptance criteria might include type and amount of test data, length of time for parallel and pilot runs, phased cutover approach and speed desired (e.g., five locations per month for five months), and performance criteria for success (e.g., five consecutive days with all accounts in balance at the end of daily processing).
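
A performance criterion such as "five consecutive days with all accounts in balance" is precise enough to check mechanically. The Python sketch below shows one way to do so; the function name and the daily log format (one true/false entry per processing day) are assumptions made for illustration.

```python
def meets_balance_criterion(daily_in_balance, required_streak=5):
    """Return True once the log contains `required_streak` consecutive days
    on which all accounts balanced at the end of daily processing."""
    streak = 0
    for in_balance in daily_in_balance:
        # An out-of-balance day resets the count of consecutive good days.
        streak = streak + 1 if in_balance else 0
        if streak >= required_streak:
            return True
    return False


# Hypothetical pilot-run log: four good days, one failure, then five good days.
pilot_log = [True, True, True, True, False, True, True, True, True, True]
accepted = meets_balance_criterion(pilot_log)
```

In this log the criterion is met only on the tenth day, because the failure on day five restarts the count.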

Hardware and software packages are usually benchmarked to verify that they perform as advertised. A benchmark is a comparison test between two or more configurations. The contents of the test are a suite of application programs that are representative of the expected work load of the production system. A benchmark test provides you the ability to compare throughput performance with the representative work load. In addition to the benchmark which precedes installation, hardware and software packages might also be run through a trial period similar to that described above for acceptance.
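
The throughput comparison at the heart of a benchmark can be sketched as follows. The sketch assumes each configuration has already run the same representative work load and that the number of transactions completed and the elapsed time were recorded; the configuration names and measurements are hypothetical.

```python
def throughput(transactions, elapsed_seconds):
    """Transactions completed per second for one benchmark run."""
    return transactions / elapsed_seconds


def rank_configurations(results):
    """results maps configuration name -> (transactions, elapsed seconds).
    Returns (name, throughput) pairs, highest throughput first."""
    rates = [(name, throughput(done, secs)) for name, (done, secs) in results.items()]
    return sorted(rates, key=lambda pair: pair[1], reverse=True)


# Hypothetical measurements from running the same work load on each configuration.
benchmark_results = {
    "Configuration A": (12000, 600),
    "Configuration B": (12000, 480),
}
ranking = rank_configurations(benchmark_results)
```

With these figures Configuration B processes 25 transactions per second against 20 for Configuration A. The comparison is only meaningful because both ran the identical work load.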

Management Approach

SCHEDULE AND STAFFING. Vendors should be required to develop a schedule for the proposed work. PERT, critical path (CPM), Gantt charts, or other graphical schedules might be required. Milestones for the project and deliverable work products should be identified as specific requirements. The discussion of work should be required to include number, timing, and skills of the expected employees. For contract software development, vendors frequently attach resumes of the intended project manager(s) and project team members for client information. If the client wants the right of refusal on all employees, a representative set of contractor staff resumes should be provided for client review.
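
When a critical path schedule is requested, the client can check a vendor's stated completion date from the task list alone. The following Python sketch computes the project duration and one critical path by finding each task's earliest finish time. The task names, durations (in weeks), and dependencies are hypothetical, and the sketch assumes the dependencies contain no cycles.

```python
def critical_path(tasks):
    """tasks maps task name -> (duration, list of predecessor names).
    Returns (project duration, task names along one critical path)."""
    finish = {}  # earliest finish time of each task
    driver = {}  # predecessor that determines each task's earliest start

    def earliest_finish(name):
        if name not in finish:
            duration, predecessors = tasks[name]
            start, latest_pred = 0, None
            for pred in predecessors:
                pred_finish = earliest_finish(pred)
                if pred_finish > start:
                    start, latest_pred = pred_finish, pred
            finish[name] = start + duration
            driver[name] = latest_pred
        return finish[name]

    last = max(tasks, key=earliest_finish)
    path, node = [], last
    while node is not None:  # walk back through the driving predecessors
        path.append(node)
        node = driver[node]
    return finish[last], path[::-1]


# Hypothetical vendor schedule, durations in weeks.
schedule = {
    "design": (4, []),
    "code": (6, ["design"]),
    "convert data": (3, ["design"]),
    "train users": (2, ["convert data"]),
    "acceptance test": (2, ["code", "train users"]),
}
weeks, path = critical_path(schedule)
```

For this schedule the project takes 12 weeks, and the critical path runs through design, code, and acceptance test. Conversion and training have one week of slack.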

PROJECT MANAGEMENT. Project management is an important issue in an RFP because it frequently identifies the one or two people the client will work with most closely. The requirements can include reporting structures, management of work, and problem resolution policies of the vendor firm. In general, vendors should identify an on-site manager and a more senior vendor manager to oversee and guarantee the quality and quantity of vendor work. The resumes of one or both of those contacts should be required in the response to allow assessment of the qualifications of the managers for the proposed work.

PROJECT REPORTING. Status reporting form, content, and timing should be requested of vendors. This can be left to the vendor to describe, or can be stated as a requirement for compliance by the vendor. Normally, status meetings are held as required or weekly, whichever is more often. A written status report should be required to identify work completed, progress against the schedule, problems needing resolution for project completion, and work assignments for the next period.

VENDOR ASSUMPTIONS. Special vendor requirements should be identified. The idea behind this section is that there should be no surprises because of erroneous assumptions by a vendor after a selection is made. The vendor's assumptions are stated in the response to ensure that the client also shares the same assumptions. Any hardware, configuration, purchased software, or facilities alterations assumed by the vendor to be available for their use are solicited. For instance, when vendors build custom software, they normally assume that their employees work at the client site, use client computing equipment and software, and follow the client's employment practices.

The vendor's expectations and type of support required from the client should be identified. For instance, copying, clerical, and secretarial support might be expected. In addition, access to the users should be identified with estimates of the number and expected participants for data gathering meetings.

Further assumptions about how application information will be entered into the computer (e.g., keyboard entry by clerks, keyboard entry by programmers), the availability of computer resources for testing, and the frequency of tests for each vendor staff member should be identified.