Database - Discussions

San77
Chapter14_BasicsofFunctionalDependenciesandNormalizationforRelationalDatabases.pdf

Basics of Functional Dependencies and

Normalization for Relational Databases

Dr. Buleje

Outline

 1 Informal Design Guidelines for Relational Databases

 1.1 Semantics of the Relation Attributes

 1.2 Redundant Information in Tuples and Update Anomalies

 1.3 Null Values in Tuples

 1.4 Spurious Tuples

 2 Functional Dependencies (FDs)

 2.1 Definition of Functional Dependency

Outline

 3 Normal Forms Based on Primary Keys

 3.1 Normalization of Relations

 3.2 Practical Use of Normal Forms

 3.3 Definitions of Keys and Attributes Participating in Keys

 3.4 First Normal Form

 3.5 Second Normal Form

 3.6 Third Normal Form

 4 BCNF (Boyce-Codd Normal Form)

1. Informal Design Guidelines for

Relational Databases

 What is relational database design?

 Two levels of relation schemas

 Logical "user view"

 Storage "base relation"

 Functional dependencies and normal forms

 - 1NF (First Normal Form)

 - 2NF (Second Normal Form)

 - 3NF (Third Noferferferfewrmal Form)

 - BCNF (Boyce-Codd Normal Form)

1.1 Semantics of the Relational Attributes

must be clear

 GUIDELINE 1:

 Informally, each tuple in a relation should represent one

entity or relationship instance.

 Attributes of different entities (EMPLOYEEs, DEPARTMENTs,

PROJECTs) should not be mixed in the same relation

 Foreign keys: refer to other entities

 Entity and relationship attributes: kept apart as much as

possible.

Figure: A simplified COMPANY relational

database schema

1.2 Redundant Information in Tuples and

Update Anomalies

 Information is stored redundantly

 Wastes storage

 Causes problems: update anomalies

 Insertion

 Deletion

 Modification

Guideline for Redundant Information in

Tuples and Update Anomalies

 GUIDELINE 2:

 Schema that does not suffer from:

 Insertion,

 Deletion and

 Update anomalies.

 If there are any anomalies exist

 Note such anomalies

1.3 Null Values in Tuples

 GUIDELINE 3:

 Tuples should have as few NULL values

 Attributes that are NULL frequently

 Separate relations (with the primary key)

1.4 Generation of Spurious Tuples (avoid at

any cost

 GUIDELINE 4:

 The relations should be designed to satisfy the lossless join condition.

 No spurious tuples should be generated by doing a natural-join of any relations.

2. Functional Dependencies - Definition

 Functional dependencies (FDs)

 Goodness

 Normal Forms

 Constraints

 A set of attributes X functionally determines a set

of attributes Y if the value of X determines a

unique value for Y

3 Normal Forms Based on Primary Keys

 3.1 Normalization of Relations

 3.2 Practical Use of Normal Forms

 3.3 Definitions of Keys and Attributes

Participating in Keys

 3.4 First Normal Form

 3.5 Second Normal Form

 3.6 Third Normal Form

3.1 Normalization of Relations (1)

 Normalization (Def.)

 Normal form (Def.)

Normalization of Relations (2)

 2NF, 3NF, BCNF

 4NF

 based on keys, multi-valued dependencies :

MVDs;

 5NF

 based on keys, join dependencies : JDs

3.2 Practical Use of Normal Forms

 Normalization is carried out in practice  Resulting designs are of high quality

 Meet the desirable properties

 Questionable when the constraints on which they are based are hard to understand or to detect

 There is NO need not normalize to the highest possible normal form  Typically up to 3NF and BCNF

 Denormalization:  Storing the join of higher normal form relations

3.3 Definitions of Keys and Attributes

Participating in Keys

 Superkey of a relation schema

 If a relation schema has more than one key, each

is called a candidate key.

 Primary Key

 Secondary Key

 Prime attribute

 Nonprime attribute

3.4 First Normal Form

 1NF

 Property of a relation in a relational database

 If the domain of each attribute contains atomic values

 Value of each attribute contains only a single value

from the domain.

 Disallows

 composite attributes

 multivalued attributes

 nested relations; attributes whose values for an

individual tuple are non-atomic

3.5 Second Normal Form (1)

 2NF (Def): two conditions must be met

 Relation must be in the first normal form.

 It does not have any non-prime attribute that is functionally dependent on any proper subset of any candidate key of the relation.

 Definitions

 Prime attribute

 Full functional dependency

3.6 Third Normal Form

 3NF

 A relation schema R is in third normal form (3NF) if it is in 2NF and no non-prime attribute A in R is transitively dependent on the primary key

 Transitive Functional Dependency:

 An indirect relationship between data elements in a database.

 The rule is essentially that A is a transitive dependency of C (A->C) if

 A is functionally dependent on B (A->B), and B is functionally dependent on C (B->C) but not on A (B not->A)

Normal Forms Defined Informally

 1st normal form

 All attributes depend on the key

 2nd normal form

 All attributes depend on the whole key

 3rd normal form

 All attributes depend on nothing but the key

4. BCNF (Boyce-Codd Normal Form)

 A relation schema R is in Boyce-Codd Normal Form

(BCNF) if whenever an FD X → A holds in R, then X is a superkey of R

 Each normal form is strictly stronger than the previous

one

 Every 2NF relation is in 1NF

 Every 3NF relation is in 2NF

 Every BCNF relation is in 3NF

Summary

 Informal Design Guidelines for Relational

Databases

 Functional Dependencies (FDs)

 Normal Forms (1NF, 2NF, 3NF)Based on Primary

Keys

 BCNF (Boyce-Codd Normal Form)