Database - Discussions
Basics of Functional Dependencies and
Normalization for Relational Databases
Dr. Buleje
Outline
1 Informal Design Guidelines for Relational Databases
1.1 Semantics of the Relation Attributes
1.2 Redundant Information in Tuples and Update Anomalies
1.3 Null Values in Tuples
1.4 Spurious Tuples
2 Functional Dependencies (FDs)
2.1 Definition of Functional Dependency
Outline
3 Normal Forms Based on Primary Keys
3.1 Normalization of Relations
3.2 Practical Use of Normal Forms
3.3 Definitions of Keys and Attributes Participating in Keys
3.4 First Normal Form
3.5 Second Normal Form
3.6 Third Normal Form
4 BCNF (Boyce-Codd Normal Form)
1. Informal Design Guidelines for
Relational Databases
What is relational database design?
Two levels of relation schemas
Logical "user view"
Storage "base relation"
Functional dependencies and normal forms
- 1NF (First Normal Form)
- 2NF (Second Normal Form)
- 3NF (Third Noferferferfewrmal Form)
- BCNF (Boyce-Codd Normal Form)
1.1 Semantics of the Relational Attributes
must be clear
GUIDELINE 1:
Informally, each tuple in a relation should represent one
entity or relationship instance.
Attributes of different entities (EMPLOYEEs, DEPARTMENTs,
PROJECTs) should not be mixed in the same relation
Foreign keys: refer to other entities
Entity and relationship attributes: kept apart as much as
possible.
Figure: A simplified COMPANY relational
database schema
1.2 Redundant Information in Tuples and
Update Anomalies
Information is stored redundantly
Wastes storage
Causes problems: update anomalies
Insertion
Deletion
Modification
Guideline for Redundant Information in
Tuples and Update Anomalies
GUIDELINE 2:
Schema that does not suffer from:
Insertion,
Deletion and
Update anomalies.
If there are any anomalies exist
Note such anomalies
1.3 Null Values in Tuples
GUIDELINE 3:
Tuples should have as few NULL values
Attributes that are NULL frequently
Separate relations (with the primary key)
1.4 Generation of Spurious Tuples (avoid at
any cost
GUIDELINE 4:
The relations should be designed to satisfy the lossless join condition.
No spurious tuples should be generated by doing a natural-join of any relations.
2. Functional Dependencies - Definition
Functional dependencies (FDs)
Goodness
Normal Forms
Constraints
A set of attributes X functionally determines a set
of attributes Y if the value of X determines a
unique value for Y
3 Normal Forms Based on Primary Keys
3.1 Normalization of Relations
3.2 Practical Use of Normal Forms
3.3 Definitions of Keys and Attributes
Participating in Keys
3.4 First Normal Form
3.5 Second Normal Form
3.6 Third Normal Form
3.1 Normalization of Relations (1)
Normalization (Def.)
Normal form (Def.)
Normalization of Relations (2)
2NF, 3NF, BCNF
4NF
based on keys, multi-valued dependencies :
MVDs;
5NF
based on keys, join dependencies : JDs
3.2 Practical Use of Normal Forms
Normalization is carried out in practice Resulting designs are of high quality
Meet the desirable properties
Questionable when the constraints on which they are based are hard to understand or to detect
There is NO need not normalize to the highest possible normal form Typically up to 3NF and BCNF
Denormalization: Storing the join of higher normal form relations
3.3 Definitions of Keys and Attributes
Participating in Keys
Superkey of a relation schema
If a relation schema has more than one key, each
is called a candidate key.
Primary Key
Secondary Key
Prime attribute
Nonprime attribute
3.4 First Normal Form
1NF
Property of a relation in a relational database
If the domain of each attribute contains atomic values
Value of each attribute contains only a single value
from the domain.
Disallows
composite attributes
multivalued attributes
nested relations; attributes whose values for an
individual tuple are non-atomic
3.5 Second Normal Form (1)
2NF (Def): two conditions must be met
Relation must be in the first normal form.
It does not have any non-prime attribute that is functionally dependent on any proper subset of any candidate key of the relation.
Definitions
Prime attribute
Full functional dependency
3.6 Third Normal Form
3NF
A relation schema R is in third normal form (3NF) if it is in 2NF and no non-prime attribute A in R is transitively dependent on the primary key
Transitive Functional Dependency:
An indirect relationship between data elements in a database.
The rule is essentially that A is a transitive dependency of C (A->C) if
A is functionally dependent on B (A->B), and B is functionally dependent on C (B->C) but not on A (B not->A)
Normal Forms Defined Informally
1st normal form
All attributes depend on the key
2nd normal form
All attributes depend on the whole key
3rd normal form
All attributes depend on nothing but the key
4. BCNF (Boyce-Codd Normal Form)
A relation schema R is in Boyce-Codd Normal Form
(BCNF) if whenever an FD X → A holds in R, then X is a superkey of R
Each normal form is strictly stronger than the previous
one
Every 2NF relation is in 1NF
Every 3NF relation is in 2NF
Every BCNF relation is in 3NF
Summary
Informal Design Guidelines for Relational
Databases
Functional Dependencies (FDs)
Normal Forms (1NF, 2NF, 3NF)Based on Primary
Keys
BCNF (Boyce-Codd Normal Form)