Chapter 8: Tools and methods for analysis and design

8.1 Introduction

The previous chapter introduced and discussed the approach to setting up and running a development project. In this chapter we look in more detail at the tools and methods that a professional systems developer (a knowledge worker) might use. We focus in particular on analysis, sometimes called systems analysis, and the kind of work that establishes the detail of the information requirements. We also consider some aspects of design – in particular, design of a relational database. This part of the syllabus is directly relevant to your project work.

8.1.1 Aims of the chapter

The aims of this chapter are to:

• introduce you to object oriented modelling for the analysis task and the Unified Modeling Language (UML)

• discuss two UML diagrams, namely the class diagram and the use case diagram

• offer you guidance on how to develop such diagrams for your project work

• introduce the normalisation of a data model.

8.1.2 Learning outcomes

By the end of this chapter, and having completed the Essential reading and activities, you should be able to:

• describe the purpose and uses of the Unified Modeling Language (UML)

• explain the purpose and basic structures found in two UML diagrams: the Use Case diagram and the Class diagram

• undertake analysis work to allow you to draw a simple use case diagram using the UML notation

• undertake analysis work to allow you to draw a simple class diagram using the UML notation

• prepare a set of normalised relations using the entity–relationship model and thus complete your database coursework.

8.1.3 Essential reading

Laudon, K.C. and J.P. Laudon Management information systems: managing the digital firm. (Boston; London: Pearson, 2013) thirteenth edition [ISBN 9780273789970 (pbk)]. Chapters 7, 13 and 14.

Curtis, G. and D. Cobham Business information systems: analysis, design and practice. (London: Prentice Hall, 2008) sixth edition [ISBN 9780273713821]. Chapters 10, 11, 13 and 16.

8.1.4 Further reading

Avgerou, C. and T. Cornford Developing information systems: concepts, issues and practice. (London: Macmillan, 1998) second edition [ISBN 9780333732311]. Chapters 2, 3, 4 and 5 cover systems development in some detail and depth.

IS1060 Introduction to information systems

Fitzgerald, B., N. Russo and E. Stolterman Information systems development: methods-in-action. (Berkshire: McGraw Hill, 2002) [ISBN 9780077098360].

Pressman, R. Software engineering: a practitioner’s approach. (London: McGraw Hill, 2009) [ISBN 9780071267823].

8.1.5 References cited

Kent, W. ‘A simple guide to five normal forms in relational database theory’, Communications of the ACM 26(2) 1983, pp.120–25.

8.1.6 Synopsis of chapter content

This chapter introduces the object oriented approach to systems analysis, the Unified Modeling Language (UML) and the two diagrams from UML that you are expected to understand and use. These are the use case diagram and the class diagram. The chapter goes on to discuss how a class diagram as an analysis output can be used as the basis from which to develop a design for a database obeying the entity–relationship model. Finally, the chapter introduces the process of normalisation that is used to ensure that a database records data efficiently, is maintainable and can safely and successfully be updated. The overall process of developing a database analysis and then a design is the basis for your database project.

8.2 Techniques used in object oriented modelling

Reading activity

Read Chapter 13 of Laudon and Laudon (2013) and Chapter 16 of Curtis and Cobham (2008).

The lifecycle model introduced in the last chapter includes the key activity of ‘analysis’. Once the broad direction of a project has been established, effort is needed to explore, understand and document the new system in both its business processes and its main technical elements. This must be done so that, in the next phase, designers are appropriately briefed. The analyst’s work is to provide the required detail so that a system can be built (programmed, configured, new jobs and roles established, databases established, etc.).

The person who does this analysis work, the systems analyst, needs to be able to provide a high-level overview of:

• what the proposed system will do for its users and sponsors: their requirements, including in particular their information needs

• where data comes from and then goes to in the wider world (remember: information systems are open systems and therefore get data from the world around them and return it to that world)

• what data processing should take place, first at a high level of generalisation and abstraction, and then in increasing detail

• what data needs to be stored, or more specifically, what things in the world will need to be represented by some stored data.

These are almost all posed as ‘what’ questions, and it is a broad but useful generalisation to say that analysis is mostly about ‘what’, while the subsequent phase of ‘design’ is about the ‘how’; in particular the ‘how’ of technology.

For this subject, you are expected to understand and be able to use two basic techniques used in systems analysis as a way to express answers to these ‘what’ questions – the specification of a new system. Both are based on specific diagrams, although they will usually need to be accompanied by some detailed textual descriptions too.

The two diagrams we will use are named:

• use case diagram

• class diagram (which is required to support the data analysis and data model you prepare for your database project).

These two diagrams are a part of the Unified Modeling Language (UML), a modern standard for documenting systems analysis work as well as systems design. UML has been established by an influential industry body, the Object Management Group (OMG), as an open standard. Being an open standard means that UML is freely available to be used by anyone with no licence fees to pay – just as open source software is available without fees. As its name implies, UML is a language – a language for building models of information systems – and UML provides models appropriate for developing new information systems from an idea through to implementing an operational system. (UML version 2.4.1, the latest, released in July 2012, has a 1,000+ page specification, available at: www.omg.org/spec/UML/2.4.1/Infrastructure/PDF/. You will be glad to know that you are not expected to read any of this document!)

As a modelling language UML includes model elements (fundamental concepts), notations (visual versions of model elements) and guidelines (idioms of usage or recommendations).

UML is not, however, a specified process for development activity or some other version of the lifecycle. Another way to say this is that it is not a ‘methodology’ – a methodology being a tightly coupled and prescriptive process for how to develop systems. So we can and will use elements of UML for all kinds of development work and in support of all kinds of development approaches. What UML does, as a language, is allow developers and other people to express and communicate ideas about a new system – often using diagrams and pictures – but it is up to the developers, managers and users to determine what needs to be expressed, and what the right sequence of activities is for developing models and moving forward towards the new information system.

For the purposes of this course we are going to introduce and use only a small sub-set of UML. If you take further courses in this area you will learn more UML, principally in the course IS3139 Software engineering: theory and application. The two diagrams we will use are:

Use case diagram: to capture users’ overall requirements. A use case diagram defines the boundary of the system under analysis and identifies actors (people, and perhaps other information systems) who participate with the (technical) system, and the ‘things that they want to get done’.

Class diagram: to capture the static structure of a proposed system in terms of classes (types of relevant objects or ‘things’) and their relationships to one another. We focus in particular on business classes that relate to relevant things in the domain of a new system. Put in plain English, business classes are the ‘types of things in the real world that we will need to know about’. So, if a new system is going to process orders sent in by customers for various products, we can immediately identify some classes: Customers, Orders, Products. In simple terms, we might say that a ‘class’ is suggested any time we use a noun in our description. Here is another example of a simple description of a system and some classes that emerge.

‘On any given day the new system will allocate available aeroplanes to fly specific flights between two airports and then allocate a qualified pilot to take charge of the flight.’ Here we can immediately see four potential classes (nouns): pilots, aeroplanes, airports and flights. Each is plausibly an important thing that a system would need to store some data about.

8.2.1 Use case diagram

Reading activity

Read Chapter 16 of Curtis and Cobham (2008).

The use case diagram is about actors (people who act, that is, do things in the world) and what they want a system to do to help them. In order to get things done they use or become part of an information system. To say that they ‘become part of’ is an example of a sociotechnical view of information systems, which are always part technical and part social and organisational. Actors might sometimes also include other computers or databases if they will be interacting with the system we are considering. Strictly, we should say that an actor is a ‘role’, not a person, and one person (or even computer) could take on more than one role with respect to a system. In the example below, a person could be both a nurse and a patient – a nurse one day and a patient the next – these being two roles they can play depending on circumstances. Within the broad structure of UML an actor and a use case are model elements, and the diagram below shows the usual notation used to depict them.

[Figure: use case diagram. Actors: Doctor, Nurse, Pharmacist. Use cases: Prescribe medicines, Administer medicines, Review medicines, Supply medicines.]

Figure 8.1: A simple use case diagram showing an electronic prescribing (eP) system for a hospital to support the giving of medicines to patients.

The phrase use case may not sound quite right in English on first hearing. It comes from a Swedish author (Jacobson). Perhaps it sounds better in Swedish. However, the phrase and the concept have caught on and are widely used to convey the notion of ‘a case of somebody using a system to do something’, which is a very appropriate way to document systems requirements during systems analysis, in particular functional requirements. So a use case says that ‘this actor/these actors will use (that is, be involved with) the information system to help achieve this task/these tasks’.

The notation is very simple. A stick figure stands for the (human) actor. An oval represents a whole and complete task the system will do (the use case itself) and this is given a short and imperative name (book course, enter order, check credit, administer medicines). In the example above, the four use cases all relate to medicines being given to patients in a hospital and the activities of various actors. We have also chosen to add two borders on this diagram. Symbolically, at least, the outer border shows the boundary of the information system, and the inner one the boundary of the technical, programmed, computer systems – but such boundaries are not needed unless they really help to explain the context.

All actors shown on a diagram must be related to at least one use case, and all use cases on a diagram must have at least one actor associated with them. An actor sends a message or otherwise stimulates a use case, or the use case sends a message to the actor, and this provokes some response.
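The rule above can be checked mechanically once a diagram is written down as data. The following is a minimal sketch, not part of UML itself: the dictionary layout and the function name check_diagram are assumptions made for illustration, and the links between actors and use cases are taken from the description of the electronic prescribing example in this chapter.

```python
# Hypothetical data layout: each actor maps to the use cases it is linked to.
# Links follow the chapter's description of the electronic prescribing example.
associations = {
    "Doctor": {"Prescribe medicines"},
    "Nurse": {"Administer medicines"},
    "Pharmacist": {"Supply medicines", "Review medicines"},
}
use_cases = {
    "Prescribe medicines",
    "Administer medicines",
    "Review medicines",
    "Supply medicines",
}


def check_diagram(associations, use_cases):
    """Return (actors with no use case, use cases with no actor)."""
    idle_actors = sorted(a for a, linked in associations.items() if not linked)
    covered = set().union(*associations.values())
    orphan_use_cases = sorted(use_cases - covered)
    return idle_actors, orphan_use_cases


print(check_diagram(associations, use_cases))
```

Both lists come back empty for this diagram. Adding an actor with no linked use case, or a use case that no actor touches, would make the corresponding list non-empty, which signals a diagram that breaks the rule.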

Each use case (oval) in the diagram should be a whole operation from the perspective of the actors involved, and provide some value to the actors. The most common error in developing a use case is to break it down into too fine a detail at first. Remember, this is intended to provide a high-level and user-oriented description of what a system can do – not any internal design detail.

A new system under analysis may require many use case diagrams each showing some distinct sub-set of functional requirements. In general there is a premium on keeping a use case diagram simple and direct so everybody can understand it.

There are various elaborations possible in the diagram, with notations to allow use case diagrams to express some more ideas about the general architecture of a system and the tasks it supports. One use case can make use of another in an «include» relationship. An arrow from the using use case to the used one indicates this. This is a simple ‘subroutine’ relationship – one use case uses another for a particular task. A use case can also be modelled in terms of an «extend» relationship, in which one use case is based on another, but extends its functionality or deals with some special case.

In the use case diagram below the cook can make a cake, and to do this they will (always, often, usually) measure flour. But this is a separate use case because other use cases may also need to include the same activity. Sometimes the cook will bake a cake that needs to be iced. So we recognise this as a special case where we must add an extra activity to add the icing. The usual advice is to be very careful in using these extra elements – the principal aim of a use case diagram is to be simple and understandable and detail can come later. You can read more about these two elaborations in Curtis and Cobham (2008) Chapter 16.

[Figure: use case diagram. Actor: Cook. Use cases: Make a cake, Measure flour, Add icing. Make a cake «include» Measure flour; Add icing «extend» Make a cake.]

Figure 8.2: A use case diagram illustrating «include» and «extend» associations.

The use case diagram captures and shows very well the basic ideas of what a system is supposed to do, but a diagram alone is seldom enough to convey all the detailed information that is required. Thus it is usually necessary to provide some textual description to accompany the use case diagram, as well as a textual description of each actor (who exactly they are) and each use case in the diagram. Such a description will include the objectives for the use case, how it is initiated, how it delivers value (for example, supports the overall system’s purpose), and any required preconditions, etc.

To develop a use case diagram (or set of diagrams for real sized systems) we usually start with finding (some of) the actors.

Who needs the system; who uses the system; who supports or manages the system; who benefits from the system? Remember too that an actor does not have to be a human role; an actor could be a device, or another computer based system that is outside the current system’s scope.

Once we have found some relevant actors we can start to express their needs in terms of broad functionality – the use cases. Use cases are there to get things done, to process or retrieve information, or to monitor activity and report on events. By asking these types of question we will be starting to think about the possible shape of a future system.

The example in Figure 8.1 is a use case diagram for an everyday activity familiar to most of us from television and films, if not real life: giving medicines to a patient in a hospital. Information systems are increasingly used to support this activity, often described as electronic prescribing (eP) systems. Three actors are shown in Figure 8.1: the Doctor, who prescribes medicines; the Nurse, who administers them to a patient; and the Pharmacist, who supplies the medicines and also reviews and checks the prescription. Each of these roles (people) will interact with the computer. Each has to get something done to fulfil the overall purpose of the system, which is to provide safe, timely and appropriate medicines to patients.

You may think that a patient should be shown as one of the actors here – indeed, perhaps they should – but in most such systems the patient does not directly interact with the computer system. We certainly could imagine a case where they would, for example, need to confirm they have received their medicines, or to review their own medicine history. Mothers may want to know what inoculations (medicines) their children have had, or on leaving hospital another doctor may want to review the record. With that in mind, try to add to the above diagram a Patient actor and a suitable use case.

For now there are four use cases (ovals) shown in the diagram – that is, four ‘chunks of functionality’ that we think we want the software to incorporate to support the work processes of these medical staff. You should be able to appreciate that the very simplicity of the diagram is its strength. For example, you could have a good discussion with nurses, doctors and pharmacists about this diagram; probably everybody would quickly understand it, and they could probably tell you a lot of extra detail about each use case – for example, detail on the ‘how’ part.

‘How’ is important of course, but the use case diagram is not intended to say much about how things are done by the computer, just what is done and for whom. For example, we may know that there is an implied sequence of events here, probably in the rough order prescribe, check, supply, administer. But the use case diagram does not concern itself with this. At the start of a development effort it is important not to add too much detail – that can come later. Indeed UML has other specific diagrams (tools) to capture the sequence of events and the messages that would link together these use cases and actors – for example, the object-sequence diagram – but we do not consider this diagram in this course.

Activity

The TOPCAR taxi company is working to develop a new information system to support corporate credit accounts. So far they have come up with this loose textual description of the system.

A client company can make a request for a credit account, in which case one or more credit checks is made. If these are positive, a credit account is set up on file.

Thereafter, an authorised person from such a company can phone up a dispatch clerk and request a booking. When this happens, availability of a taxi at the requested time is checked, as is the credit status of the client company. If these two checks are successful a booking is made.

After the customer has used the taxi, the driver sends in a record of the work, including the time and the distance. The cost of the booking is calculated and added to the account. At the end of the month accounts are sent to client companies for settlement.

Sketch a use case diagram for this system. First identify the actors, then the ‘chunks of functionality’ that these people interact with. Try to restrict your diagram to five or fewer use cases. Remember, you are not expected in a use case diagram to be concerned with sequences of events, and the use cases should have no associations between them (other than «include» and «extend»). Remember too that this is an exercise in simplification. You will need to leave out some of the detail while capturing the basic functionality that is needed. This is never easy, whether in exercises or in real life systems development.

8.3 Class diagrams and data models

Reading activity

Read Section 6.2, Chapter 6 of Laudon and Laudon (2013) and Chapter 13 of Curtis and Cobham (2008).

Use case diagrams are about modelling actors and the functionality they expect or need. Another important aspect of information systems development addressed during the analysis phase is establishing in appropriate detail (that is, not too much) the classes of things that a system will hold and use data about. The goal is to establish a logical model of the world around the system and the things there with which it interacts (again – think open systems). Once this logical model is established, designing the database element of a new system can go ahead (a ‘how’ question addressed in the design phase).

Here we must hold on to the distinction between analysis and design. Analysis is about needs, requirements and some idea of the future logical systems we are seeking to build (the what). Design will take that information and add the detail to allow the system actually to be built (the how).

For analysis we develop a class diagram based on UML notation. UML supports an object oriented approach to analysis and design, and the basis for object oriented model building is, unsurprisingly, objects and classes of objects. My VW Golf is an object that belongs to the class of motor cars. The class diagram is concerned with broad types of things (classes; for example, cars) rather than specific examples (objects; for example, my VW Golf). We model orders, not an order; students, not a student; and so on.

A class diagram is the basis for database development including for your project for this course. An object is an individual ‘thing’, a class describes the generality of such things. An object class (or just ‘class’ for short) is an abstraction of a set of real world things – tangible objects (cars, boats, planes, products, books), but may also represent roles (student, customer, reviewer), incidents, events or activities (election, earthquake, examination), interactions (sales contract, review request), or specifications (orders, recipes, prescriptions).

When we build an analysis model we are, at least initially, trying to build a model of the environment within which a system will operate, and with which it has to maintain some level of correspondence. Thus, if we have customers or patients, or medicines or doctors out there in the world, we will want to model them within our new information system. In this way, we identify the object classes (or just classes) and use them to build a ‘map’ or model of the domain. Later on in the development process we will shift our view towards software and detailed processes by which things happen. But for now, in analysis, it is the problem domain that we focus on.

The graphical depiction of a class is a box with up to three compartments. At the top we name the class (a singular noun); in the middle we list the attributes (data values) that each object of the class should retain. At the bottom we can add the operations or functions, the things that the object can do, or the events that it can respond to. This is the basis of full-blown object oriented analysis and design, as taught in course IS3139 Software engineering: theory and application, but for the purpose of this course and your database coursework we ignore operations within the class model.

Identifying relevant classes, at least initially, is quite easy. Think of the electronic prescribing system introduced above. What are the types of ‘things’ that we want to collect and hold data about? Patients, prescriptions, medicines, for a start, plus perhaps administration events such as when medicines are given, checks done by pharmacists, and deliveries of stock to the ward. We probably also need to store information about the actors – doctors, nurses and pharmacists – to record who does what.

To keep it simple we will focus here on the prescribing activity itself. One way to think about a prescription is as an order written by a doctor for some medicines (one or more) to be given to the patient – perhaps once or perhaps regularly for a certain number of days. This structure, of an order for items for a customer (here, a patient), is very common in all sorts of business situations, and is the most common example of a class model/data model found in textbooks. (See, for example, Laudon and Laudon (2013) Figure 6.11.) Below (Figure 8.3) is our version of this general model written using UML notation – and then the model is adapted for the specific case of prescriptions.

For the moment let us take the classes as given. We want to concentrate on the associations between classes. Note also that the class boxes are very simple. This is because we have not, as yet, defined any attributes (or operations). In the very early stages of analysis we do not need to do this, so keep it simple. There is plenty of ‘analysis information’ contained in the diagram already.

What the diagram shows are some associations between classes. In our system we will need to be able to ‘follow’ these associations so, for example, a prescription can be ‘linked’ to a particular patient, or we can answer a query such as ‘name all the patients who are taking medicine X’.

In the diagrams below we can interpret these associations in specific ways. The upper diagram says that there is a relationship between a customer and an order such that a customer can have zero or more orders (0..*), while each order belongs to exactly one customer. Each order is made up of one or more line items (1..*). In other words, it is not possible to have an order for zero items (this may or may not make sense). The black diamond at the Order end of the line says that this association between the two classes is a ‘composition’, a strong form of aggregation. That is, a line item belongs to an order, is a part of it. If we delete the order for some reason, all the line items are also deleted. Note that customers and products are independent classes, hence there are no diamonds on their associations. A line item is associated with one product and any given product can be associated with zero or more line items.

In the lower diagram this same structure tells us the same detail. It says that a prescription will be associated with just one patient, but that a patient can have 0 or more prescriptions. The diagram also tells us that a prescription can be for one or more items, and that each item on a prescription relates to a single medicine. All of this is useful information to check with the medical staff to make sure we have it right!
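To make the multiplicities concrete, here is a minimal sketch of the prescription pattern as Python dataclasses. It is an illustration only, not a database design: the attribute names, the ‘dose’ field and the sample values are assumptions, and the check in __post_init__ is one simple way to express the 1..* rule.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Medicine:
    name: str


@dataclass
class Patient:
    name: str
    # 0..*: a patient may have no prescriptions at all
    prescriptions: List["Prescription"] = field(default_factory=list)


@dataclass
class Item:
    medicine: Medicine  # each item relates to exactly one medicine
    dose: str           # hypothetical attribute, for illustration


@dataclass
class Prescription:
    patient: Patient    # each prescription belongs to exactly one patient
    items: List[Item]   # 1..*: at least one item is required

    def __post_init__(self):
        if not self.items:
            raise ValueError("a prescription needs at least one item (1..*)")
        # Keep the patient's side of the association in step.
        self.patient.prescriptions.append(self)


mary = Patient("Mary Jones")
rx = Prescription(mary, [Item(Medicine("Medicine X"), "one tablet daily")])
print(len(mary.prescriptions))
```

Trying to create a Prescription with an empty list of items raises ValueError, which mirrors the statement that a prescription must be for one or more items.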

Here is a general version of this class diagram.

[Customer] 1 ---- 0..* [Order] ◆---- 1..* [Line Item] 0..* ---- 1 [Product]

Here is a version adapted for prescriptions.

[Patient] 1 ---- 0..* [Prescription] ◆---- 1..* [Item] 0..* ---- 1 [Medicine]

Figure 8.3a and 8.3b: Two examples of a class diagram using the same pattern.

These two diagrams were drawn using software freely available at: http://yuml.me/diagram/scruffy/class/draw

The script used to produce the first diagram is given below. With this software (an example of SaaS), you specify the diagram with simple text and the website draws the picture for you. You can find many other software packages and services that can do the same task.

// Order Order Line Class Diagram

[Customer]1--0..*[Order]

[Order]++--1..*[Line Item]

[Line Item]0..*--1[Product]
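Because the script is plain text, it can also be generated. The sketch below builds association lines in the same syntax as the script above and uses them to produce a script for the prescription version of the diagram. The function name yuml_line and its parameters are assumptions for illustration; only the notation already shown in the script above is relied on.

```python
def yuml_line(left, left_mult, right_mult, right, composition=False):
    """Build one association line, e.g. [Customer]1--0..*[Order]."""
    link = "++--" if composition else "--"
    return f"[{left}]{left_mult}{link}{right_mult}[{right}]"


# Reproduce the three lines of the order script shown above.
order_lines = [
    yuml_line("Customer", "1", "0..*", "Order"),
    yuml_line("Order", "", "1..*", "Line Item", composition=True),
    yuml_line("Line Item", "0..*", "1", "Product"),
]

# The same pattern with the class names of the prescription example.
prescription_lines = [
    yuml_line("Patient", "1", "0..*", "Prescription"),
    yuml_line("Prescription", "", "1..*", "Item", composition=True),
    yuml_line("Item", "0..*", "1", "Medicine"),
]

print("// Prescription Class Diagram")
print("\n".join(prescription_lines))
```

The printed text follows the same three-line shape as the order script, with Patient, Prescription, Item and Medicine in place of Customer, Order, Line Item and Product.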

This class diagram shows the overall structure of the data that a system will need to store, shown in terms of different classes. This provides a useful notation and simple but powerful ideas to work with. Using them on a real problem should convince you.

To take this model forward to form a database design we use the relational database approach, in which data is considered as being stored in square tables, one table per class. The actual details of how the computer stores the data need not concern us in analysis activities – that is largely taken care of by database management software.

A table or relation has a number of columns representing the items of data to be stored about any given object in a class (what we call the attributes). Each row represents one occurrence, holding the data about one particular item of the specified class. Now that we have come to the design phase we change from calling these objects to calling them entities. It is a bit confusing, but the word entity is the older term and the entity-relationship model preceded the object-class model. Just remember: object in analysis = entity in design. In this way, a table of patients may have a layout as shown in Table 8.1; adding more patients would mean adding more rows.

Patient number#   Patient name   Age   Gender   Allergy
58447             Peter Small    23    Male     Nuts
21944             Mary Jones     23    Female   None
55633             Lola Smith     26    Female   Nuts
18647             John Smith     28    Male     None
419745            Mary Jones     18    Female   Milk

Table 8.1: The patient relation.

We can write down the design of this relation in the following form:

Patient(Patient Number#, Patient Name, Age, Gender, Allergy).

We can add this information about the attributes onto the class diagram as follows.

[Patient | Patient No#; Patient Name; Age; Gender; Allergy] 1 ---- 0..* [Prescription]

Figure 8.4: Class diagram for the patient relation.

The relation itself is called PATIENT – it has so far five attributes with the names Patient Number, Patient Name, Age, Gender, Allergy. The field Patient Number has # after it to indicate that this is the key field. The value of the key field must be unique for each entry in the table and here, as in many cases, this is achieved by giving entities unique reference numbers – just as the books on the subject reading list have unique ISBN numbers and any database dealing with book titles might use ISBN numbers as the key.
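The uniqueness of a key field is something a database management system can enforce for us. The following is a minimal sketch using SQLite through Python's standard sqlite3 module, with the rows of Table 8.1. The table and column names, the helper try_insert and the two extra rows in the usage note are assumptions for illustration; your own project may use different software.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary in-memory database
conn.execute("""
    CREATE TABLE patient (
        patient_number INTEGER PRIMARY KEY,  -- the key field (Patient Number#)
        patient_name   TEXT NOT NULL,
        age            INTEGER,
        gender         TEXT,
        allergy        TEXT
    )
""")

table_8_1 = [
    (58447, "Peter Small", 23, "Male", "Nuts"),
    (21944, "Mary Jones", 23, "Female", "None"),
    (55633, "Lola Smith", 26, "Female", "Nuts"),
    (18647, "John Smith", 28, "Male", "None"),
    (419745, "Mary Jones", 18, "Female", "Milk"),
]
conn.executemany("INSERT INTO patient VALUES (?, ?, ?, ?, ?)", table_8_1)


def try_insert(row):
    """Insert a row; return False if the key value is already in use."""
    try:
        conn.execute("INSERT INTO patient VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False
```

For example, try_insert((58447, "Another Person", 40, "Male", "None")) is rejected because 58447 is already used, while a row with a new number is accepted even if the name repeats one already in the table.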

In this simple case, we have only one key field, but it may be the case that two fields taken together form the key – a composite key. An example of a table with a composite key might be:

Vehicle(Model#, Engine Size#, Year#, Price).

None of the attributes taken alone – model, engine size or year – can uniquely identify a particular type of vehicle (for example, a 2012, 1.6 litre Honda Civic), but taken all together they do.
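A composite key can be declared in much the same way as a single key. This sketch again uses SQLite through Python's sqlite3 module; the column names, the helper add_vehicle and the prices are made-up values for illustration only.

```python
import sqlite3

cars = sqlite3.connect(":memory:")
cars.execute("""
    CREATE TABLE vehicle (
        model       TEXT NOT NULL,
        engine_size REAL NOT NULL,
        year        INTEGER NOT NULL,
        price       REAL,
        PRIMARY KEY (model, engine_size, year)  -- composite key
    )
""")


def add_vehicle(model, engine_size, year, price):
    """Insert a vehicle type; return False if the composite key already exists."""
    try:
        cars.execute(
            "INSERT INTO vehicle VALUES (?, ?, ?, ?)",
            (model, engine_size, year, price),
        )
        return True
    except sqlite3.IntegrityError:
        return False


# Prices are invented for the example.
results = [
    add_vehicle("Honda Civic", 1.6, 2012, 15000.0),  # first of its kind
    add_vehicle("Honda Civic", 1.8, 2012, 16500.0),  # differs in engine size
    add_vehicle("Honda Civic", 1.6, 2013, 15500.0),  # differs in year
    add_vehicle("Honda Civic", 1.6, 2012, 14000.0),  # same key: rejected
]
print(results)
```

Rows may share a model, an engine size or a year, but not all three at once, which is exactly what it means for the three fields together to form the key.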

Note that in the Patient example none of the other fields can be used as the key – in particular, we have two people called Mary Jones, two people of 23 years of age and two people with a nut allergy.
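This observation can be checked directly against the rows of Table 8.1. The sketch below is plain Python; the dictionary keys and the function name unique_columns are assumptions for illustration. Note that a column being unique in a small sample does not prove that it will stay unique as more rows are added, so this is a test of the sample, not of the design.

```python
# The five rows of Table 8.1 as dictionaries.
patients = [
    {"number": 58447, "name": "Peter Small", "age": 23, "gender": "Male", "allergy": "Nuts"},
    {"number": 21944, "name": "Mary Jones", "age": 23, "gender": "Female", "allergy": "None"},
    {"number": 55633, "name": "Lola Smith", "age": 26, "gender": "Female", "allergy": "Nuts"},
    {"number": 18647, "name": "John Smith", "age": 28, "gender": "Male", "allergy": "None"},
    {"number": 419745, "name": "Mary Jones", "age": 18, "gender": "Female", "allergy": "Milk"},
]


def unique_columns(rows):
    """Return the columns whose values are all different across the rows."""
    return [
        col for col in rows[0]
        if len({row[col] for row in rows}) == len(rows)
    ]


print(unique_columns(patients))
```

Only the patient number column passes the test, which matches the reasoning in the text.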

The order in which the patients appear in the table is unimportant, unlike in a simple sequential file structure. In the database approach, we assume that we will access records by using the values of the various fields. For example, we might ask for patient number 21944, or for all patients with a nut allergy. Another way of looking at this is to say that we are interested in one class (type of entity) – patients – and we have identified five attributes of a patient, including a key attribute.
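Access by field value, regardless of row order, might look like the following minimal sketch in SQLite (via Python's sqlite3 module). The rows of Table 8.1 are deliberately inserted in a different order from the table; the table, column and function names are assumptions for illustration.

```python
import sqlite3

store = sqlite3.connect(":memory:")
store.execute(
    "CREATE TABLE patient (patient_number INTEGER PRIMARY KEY, "
    "patient_name TEXT, age INTEGER, gender TEXT, allergy TEXT)"
)
# Rows from Table 8.1, inserted in a different order on purpose.
store.executemany(
    "INSERT INTO patient VALUES (?, ?, ?, ?, ?)",
    [
        (419745, "Mary Jones", 18, "Female", "Milk"),
        (18647, "John Smith", 28, "Male", "None"),
        (58447, "Peter Small", 23, "Male", "Nuts"),
        (55633, "Lola Smith", 26, "Female", "Nuts"),
        (21944, "Mary Jones", 23, "Female", "None"),
    ],
)


def patients_with_allergy(allergy):
    """Names of all patients recorded with the given allergy, sorted by name."""
    cursor = store.execute(
        "SELECT patient_name FROM patient WHERE allergy = ? ORDER BY patient_name",
        (allergy,),
    )
    return [name for (name,) in cursor]


def patient_by_number(number):
    """Name of the patient with this key value, or None if there is none."""
    row = store.execute(
        "SELECT patient_name FROM patient WHERE patient_number = ?", (number,)
    ).fetchone()
    return row[0] if row else None


print(patients_with_allergy("Nuts"))
print(patient_by_number(21944))
```

The queries name the values wanted, not a position in the table, so the result is the same whatever order the rows were stored in.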

Doing data analysis and building a data model for a real information system generally will require consideration of more than one class as we have seen. In this example, we have identified another three classes: Prescription, Item and Medicine. This leads to another three relations:

Prescription(Prescription Number#, _Patient Number_, Date, Doctor Issuing, etc.)

Item(Item Number#, _Prescription Number_, _Medicine Number_, Dose, Frequency, etc.)

Medicine(Medicine Number#, Name, Unit Size, etc.)

What is the relation between the entities Patient, Prescription and Item? Well, it is expressed in the class diagram above (see Figure 8.3) as well as in the relations above. In the relations, each underlined attribute (shown here between underscores) links us to another entity. When the database is working as part of the information system we need to be able to make and maintain these relationships. As shown above, we do this by using the key of one relation as an attribute in another. When it occurs in another relation it is known as a ‘foreign key’. So Medicine Number and Prescription Number are shown as attributes of the class Item, each underlined to indicate that it is a foreign key – the necessary links to Prescription and Medicine. In the same way, Patient Number in Prescription is a foreign key that links each prescription to its patient.
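The four relations and their foreign keys can be sketched in SQLite (via Python's sqlite3 module) as below, together with a query of the kind mentioned earlier: ‘name all the patients who are taking medicine X’. This is one possible rendering under stated assumptions: the column names are simplified, the prescription, item and medicine numbers are invented sample data, and only a few attributes of each relation are included.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
db.executescript("""
    CREATE TABLE patient (
        patient_number INTEGER PRIMARY KEY,
        patient_name   TEXT
    );
    CREATE TABLE medicine (
        medicine_number INTEGER PRIMARY KEY,
        name            TEXT
    );
    CREATE TABLE prescription (
        prescription_number INTEGER PRIMARY KEY,
        patient_number INTEGER NOT NULL REFERENCES patient(patient_number)
    );
    CREATE TABLE item (
        item_number INTEGER PRIMARY KEY,
        prescription_number INTEGER NOT NULL
            REFERENCES prescription(prescription_number),
        medicine_number INTEGER NOT NULL
            REFERENCES medicine(medicine_number),
        dose TEXT
    );
    -- Invented sample data
    INSERT INTO patient VALUES (58447, 'Peter Small');
    INSERT INTO patient VALUES (21944, 'Mary Jones');
    INSERT INTO medicine VALUES (1, 'Medicine X');
    INSERT INTO medicine VALUES (2, 'Medicine Y');
    INSERT INTO prescription VALUES (100, 58447);
    INSERT INTO prescription VALUES (101, 21944);
    INSERT INTO item VALUES (1, 100, 1, 'once daily');
    INSERT INTO item VALUES (2, 101, 1, 'twice daily');
    INSERT INTO item VALUES (3, 101, 2, 'once daily');
""")


def patients_taking(medicine_name):
    """Follow the foreign keys from medicine through item and prescription to patient."""
    cursor = db.execute(
        """
        SELECT DISTINCT p.patient_name
        FROM patient p
        JOIN prescription pr ON pr.patient_number = p.patient_number
        JOIN item i ON i.prescription_number = pr.prescription_number
        JOIN medicine m ON m.medicine_number = i.medicine_number
        WHERE m.name = ?
        ORDER BY p.patient_name
        """,
        (medicine_name,),
    )
    return [name for (name,) in cursor]


print(patients_taking("Medicine X"))
```

With foreign key checking switched on, an attempt to insert an item that refers to a prescription number not present in the prescription table is rejected, so the links between relations cannot silently break.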

In the above description of the class diagram we have spoken of one-to-many associations, as indicated by a 1 and a 1..n at the ends of the line. This is known as the cardinality or multiplicity of the association or relationship. In general we might have many types of multiplicity. Consider, for example, the case of a new medicine that becomes available but at the present moment is not prescribed for any patient: such a medicine appears on no prescription items, so a multiplicity of 0..n rather than 1..n would be needed to allow for it.

Activity

Take this simple example of an association: Footballers and Football clubs. (Assume for the moment that a player can only play for one club.) First draw the appropriate class diagram and show the multiplicity of the association. Then, assuming that each resulting relation has a key (Club Number# and Player Number#), how would you represent this relationship? For example, which relation (Club or Player) would contain a foreign key and what would it be? Now see if you can expand your model to allow a player to play for a number of clubs.

The class model introduced above is intended to show us important things about associations out in the world and which an information system and its database needs to reflect. When we leave Analysis and come to Design it is important to ensure these real world relationships can be accurately represented in the developed database. With that in mind, consider the appropriate associations or relationships in the following cases and highlight any issues or questions that may arise:

• husbands and wives (think of King Henry VIII)

• mothers and children

• football teams and players

• books and authors

• cinemas and films

• films and actors

• films and directors.

In general, we will find lots of many-to-many relations (M:N) out in the world (one film has many actors, one actor is in many films). This poses a problem when we come to database design and the use of foreign keys. Similarly, one-to-one relations may be suspect, because they may suggest that the two entities are one and the same. This is not always the case though. A one-to-one relation between two entities might represent a situation in time – as in drivers to cars during a race.

So far, this section has approached data modelling through a simple example and by appealing to a common sense understanding of the way the world is organised. In the situation of developing a data model for a new information system (including when working on your project), the sequence needs to be a bit more formalised:

• Identify and name classes of things about which data will be stored.

• Identify and name the associations among the classes.

• Draw a class diagram.

• Identify the attributes of each entity (example of a class) and select the key attribute(s).

• Ensure that the identified associations are supported through the keys (for example, the key of one relation is an attribute of the other – a foreign key).

8.3.1 Normalisation

At this stage, we may seem to be finished and, indeed, a class diagram alone may be adequate to fulfil the needs of the analysis phase of systems development, but there is one further important step we must undertake as we develop our design – that is, normalisation.

Background reading

Normalisation is not considered in detail in Laudon and Laudon (2013), but Curtis and Cobham (2008) give adequate coverage. One of the best short and easy-to-read explanations of normalisation is found in Kent (1983).

Normalisation is a process by which we make sure that the database we are designing will contain all the information we want, that the data will be accessible to us, and that the data is stored, as far as possible, with minimum redundancy and in a way that supports easy updating.

Normalisation is important for you as you study for this course, because you may get examination questions about it, but also because your database project is expected to include a set of normalised relations.

The ideas behind normalisation are quite simple.

All entities (for example, of a particular class) must contain the same number of attributes – this is called the first normal form. This rule excludes variable repeating groups. Consider this example: students may study a variable number of subjects; this might imply a relation with a variable number of fields. For example, assume that a student could attend as many subjects as they wished:

Student(Student number#, Student name, Address, Subject1, Subject2, Subject3,…)

This is wrong, because some students might have two subject fields, some three, or some five, etc. This does not fit the first normal form rules.

We can approach this problem another way by seeing that it suggests a many-to-many relation between Student and Subject. That is, one Student can take from zero to many subjects, and any one Subject can be taken by many students. The solution is to remove any mention of subjects from the Student relation and add a new relation (a new class), called Course, to ‘link’ a student to as many subjects as they need, and at the same time allow a subject to be linked to as many students as is required. This new relation Course can have as many rows for any one student as needed, each representing one of their selected subjects.

The resulting set of relations is then:

Student(Student number#, Student name, Address)

Subject(Subject number#, Subject name,…)

Course(Student number#, Subject number#, …)

Course is the link between any one Student and any one Subject.

Each entity (row) in Course represents an individual student taking a particular subject; further attributes such as individual attendance or mark achieved could be added here. In this way, the many-to-many relation of students to subjects is made into two one-to-many relations – Student to Course and Course to Subject.
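The linking relation can be sketched with Python's built-in sqlite3 module. The table names follow the relations above; the rows and the helper functions subjects_of and students_on are invented for illustration. Each row of course pairs one student with one subject, and the pair is declared as a composite key so that the same pairing cannot be stored twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_number INTEGER PRIMARY KEY, student_name TEXT);
CREATE TABLE subject (subject_number INTEGER PRIMARY KEY, subject_name TEXT);
-- One row per (student, subject) pair; the pair is the composite key.
CREATE TABLE course (
    student_number INTEGER REFERENCES student(student_number),
    subject_number INTEGER REFERENCES subject(subject_number),
    PRIMARY KEY (student_number, subject_number)
);
-- Illustrative rows only.
INSERT INTO student VALUES (1, 'Student A'), (2, 'Student B');
INSERT INTO subject VALUES (10, 'Databases'), (20, 'Statistics');
INSERT INTO course VALUES (1, 10), (1, 20), (2, 10);
""")


def subjects_of(student_number):
    """Names of the subjects taken by one student (Student to Course to Subject)."""
    rows = conn.execute(
        """SELECT s.subject_name
           FROM course c JOIN subject s ON s.subject_number = c.subject_number
           WHERE c.student_number = ?
           ORDER BY s.subject_name""",
        (student_number,),
    )
    return [name for (name,) in rows]


def students_on(subject_number):
    """Names of the students taking one subject (Subject to Course to Student)."""
    rows = conn.execute(
        """SELECT st.student_name
           FROM course c JOIN student st ON st.student_number = c.student_number
           WHERE c.subject_number = ?
           ORDER BY st.student_name""",
        (subject_number,),
    )
    return [name for (name,) in rows]


print(subjects_of(1))
print(students_on(10))
```

Neither student nor subject has a variable number of fields; the variation is carried entirely by the number of rows in course.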

To summarise the second and third normal forms we can say that, ‘Any attribute in a relation must provide a fact about the key, the whole key and nothing but the key’. We can break this down a bit further. Second normal form is violated if a non-key field is a fact about a subset of a key. In the relation below, ‘Lecture hours’ is a fact about the subject – only one part of the composite key – and so belongs in the Subject relation.

Course(Student number#, Subject number#, Examination mark, Lecture hours)

If the original design were adhered to, the lecture hours would be repeated for every student taking the subject, and any change would require many updates to the database. Also, if there were no students taking a particular subject – perhaps temporarily, for example, during the vacation – then there would be no way of storing the lecture hours data.
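Both problems can be seen in a small sketch using Python's built-in sqlite3 module. The table name course_unnormalised, the column names and all the values are assumptions for illustration. The same change to the lecture hours touches one row per enrolled student in the original design, but a single row once the attribute is moved to subject.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Original design: lecture_hours depends on subject_number alone.
CREATE TABLE course_unnormalised (
    student_number INTEGER,
    subject_number INTEGER,
    exam_mark INTEGER,
    lecture_hours INTEGER,
    PRIMARY KEY (student_number, subject_number)
);
INSERT INTO course_unnormalised VALUES
    (1, 10, 65, 30), (2, 10, 58, 30), (3, 10, 71, 30);

-- Second normal form: lecture_hours is stored once per subject.
CREATE TABLE subject (subject_number INTEGER PRIMARY KEY, lecture_hours INTEGER);
INSERT INTO subject VALUES (10, 30), (20, 24);
""")

# Changing the hours for subject 10 in each design.
rows_touched_before = conn.execute(
    "UPDATE course_unnormalised SET lecture_hours = 36 WHERE subject_number = 10"
).rowcount
rows_touched_after = conn.execute(
    "UPDATE subject SET lecture_hours = 36 WHERE subject_number = 10"
).rowcount

# Subject 20 has no students, yet its lecture hours can still be stored.
hours_without_students = conn.execute(
    "SELECT lecture_hours FROM subject WHERE subject_number = 20"
).fetchone()[0]

print(rows_touched_before, rows_touched_after, hours_without_students)
```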

The third normal form is violated if a non-key field is a fact about another non-key field, as in:

Teacher(Staff number#, Department, Building)

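Building may be a fact about the teacher (where their office is), which would be acceptable, or a fact about the department (where the department office is), which would not, because Building would then be a fact about Department, another non-key field. A better design is: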

Teacher(Staff number#, Department)

Department(Department#, Office location)
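A short sketch of the improved design, using Python's built-in sqlite3 module, shows the benefit. The staff numbers, department name and building names are invented for illustration. Because the office location is held once, in department, moving the office is a single update, and every teacher in that department sees the new location through a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (department TEXT PRIMARY KEY, office_location TEXT);
CREATE TABLE teacher (
    staff_number INTEGER PRIMARY KEY,
    department TEXT REFERENCES department(department)
);
-- Illustrative rows only.
INSERT INTO department VALUES ('Information Systems', 'Building 1');
INSERT INTO teacher VALUES (101, 'Information Systems'), (102, 'Information Systems');
""")

# The department office moves: one row changes, however many teachers there are.
rows_changed = conn.execute(
    "UPDATE department SET office_location = 'Building 2' "
    "WHERE department = 'Information Systems'"
).rowcount

# Each teacher's department office is found by following the foreign key.
locations = conn.execute(
    """SELECT t.staff_number, d.office_location
       FROM teacher t JOIN department d ON d.department = t.department
       ORDER BY t.staff_number"""
).fetchall()

print(rows_changed, locations)
```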

Overall, normalisation helps to ensure that information is stored only once and that inconsistencies do not occur in a database when data is added or deleted.

The set of relations resulting for a college database after some normalisation might be:

Student(Student number#, Name, Address, Staff number (of tutor) )

Course(Student number#, Subject number#, Exam mark)

Subject(Subject number#, Lecture hours)

Teacher(Staff number#, Teacher name, Subject, Department)

Department(Department#, Location)

Note how the relations we need to record and use are supported by using the key or keys of one relation as non-key attributes of another. Even so, this example still has problems: consider the relation TEACHER – is it in third normal form? And are we sure that each teacher has only one subject?

Activity

As an exercise, redesign the model to solve these problems and then draw the appropriate class diagram.

Finally, once you have worked through from a class diagram to a set of normalised relations, it will be easy to implement the design using a database package. The only step remaining is to determine the exact form in which each field will be stored – for example, as integers, real numbers, character strings, dates, etc.
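As one possible illustration of this last step, the sketch below declares storage types for two of the relations from the prescription example, using Python's built-in sqlite3 module. The specific type choices are assumptions, and other database packages offer different type names. SQLite in particular has no dedicated date type, so the date is stored here as an ISO 8601 character string.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE medicine (
    medicine_number INTEGER PRIMARY KEY,  -- integer
    name TEXT,                            -- character string
    unit_size REAL                        -- real number
);
CREATE TABLE prescription (
    prescription_number INTEGER PRIMARY KEY,
    patient_number INTEGER NOT NULL,
    date_issued TEXT NOT NULL,            -- date held as an ISO 8601 string
    doctor_issuing TEXT
);
""")

# Illustrative values only.
conn.execute("INSERT INTO medicine VALUES (1, 'Medicine A', 2.5)")
conn.execute(
    "INSERT INTO prescription VALUES (?, ?, ?, ?)",
    (500, 23817, date(2013, 1, 15).isoformat(), "Dr Example"),
)

# Read back the declared type of each column of medicine.
column_types = {
    row[1]: row[2] for row in conn.execute("PRAGMA table_info(medicine)")
}

unit_size = conn.execute("SELECT unit_size FROM medicine").fetchone()[0]
issued = date.fromisoformat(
    conn.execute("SELECT date_issued FROM prescription").fetchone()[0]
)

print(column_types, unit_size, issued)
```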

Activity

The TOPCAR taxi company is developing a database to be used as part of a real-time dispatch system. The intention is that customers can phone up and make bookings for taxis, requesting a particular type of vehicle (small car, large car, minibus), or a particular driver. Some customers have credit accounts with the company, some do not.

i. Suggest candidate classes for this database, justifying your choices.

ii. Identify and name the associations between the classes, indicating their multiplicity.

iii. Draw the class diagram.

iv. Design the relevant relations with key attributes.

v. Identify and resolve any issues of normalisation you can see.

vi. Show how one identified association can be supported through suitable keys.

8.4 Reminder of learning outcomes

Having completed this chapter, and the Essential reading and activities, you should be able to:

• describe the purpose and uses of the Unified Modeling Language (UML)

• explain the purpose and basic structures found in two UML diagrams: the Use Case diagram and the Class diagram

• undertake analysis work to allow you to draw a simple use case diagram using the UML notation

• undertake analysis work to allow you to draw a simple class diagram using the UML notation

• prepare a set of normalised relations using the entity–relation model and thus complete your database coursework.

8.5 Test your knowledge and understanding

1. Do you believe that the Use Case diagram is too simple to be of much use in systems development work, or is its simplicity its most valuable characteristic?

Prepare a use case diagram for the following situations:

a. An ATM (cash machine) provided by a bank.

b. Ordering books on Amazon.

c. Using an online film database (such as the example discussed in Chapter 2 of the subject guide).

d. Checking in to a hotel at the reception desk.

Keep your answers simple with minimal Actors and carefully chosen Use Cases.

2. Explain clearly the difference between an <<include>> and an <<extend>> in a Use Case diagram. Find relevant examples of the use of each in books or online.

3. a. How does a class diagram help when undertaking analysis work? What essential questions can it allow you to answer or how does it clarify what a system should do?

b. Take the basic class diagram of Customer-Order and rewrite it to fit some other situation that is not usually described in these terms. For example, a situation where somebody or some thing (e.g. a computer) asks for a list of things. Whatever example you choose, make sure that the associations have the right multiplicity.

4. Research the process of normalisation online and in textbooks. On that basis explain as many different types of problem that normalisation is intended to prevent or solve as you can.

5. a. Give three examples of relations that violate the second normal form, and explain why they do.

b. Give three examples of relations that violate the third normal form, and explain why they do.