HomeWork in DB
SamG*
First Normal Form (1NF)
- Unnormalized table : Contains a repeating group
- Eg: from multi-valued attributes
- Eg: from many-many relationship
- Table in 1NF: Contains no repeating groups
- All tables are in 1NF
- Then why have this as a NF ?
- Not true when Codd first defined it.
CSCI 6622 Databases
Winter 14, Week 8
*
*
Pratt and Adamski Figure 5.5: 1NF Example
Unnormalized Table (not in 1NF)
*
Pratt and Adamski Figure 5.6: 1NF Example (con’t.)
Conversion to 1NF
*
Functional Dependencies
- Column B is functionally dependent on Column A (A B) if A’s value determines a single value for B.
- By looking at A, we will know what is the value of B
- For a single value of A, can’t have multiple values of B
- Eg: In Emp, ssn address, ssn salary and others
- Eg: In MyWorksOn (ssn, lname, P#, PName). f.d.s?
- ssn lname, P# Pname , Pname P#
CS 622/215 Databases
Winter 14, Week 8
*
7
*
Functional Dependencies
- Can f.d.s be determined by looking at relational instance alone ?
- No ! Have to look at the semantics of the data
- By looking at relational instance, might be able to say f.d. is false
- But can never say that a f.d. is true
- might happen to be true for that instance, but not always true.
- See examples on next slide
CS 622/215 Databases
Winter 14, Week 8
*
7
*
Pratt and Adamski Figures 5.3-5.4:
Functional Dependence Example
Rep Table Where LastName can determine record
Rep Table Where LastName cannot determine record
*
Pratt and Adamski Figure 5.7:
- Are following f.d.s. consistent with above table ?
- OrderNum PartNum, PartNum Description
- Price PartNum, NumOrdered OrderNum
*
Keys in terms of f.d.s
- Column(s) C is key for table T if:
- All columns in T are functionally dependent on C
- As long as no duplicate rows
C D
x 1
x 1
*
Functional Dependencies
- A B does not imply B A
- Left part can be composite.
- Eg: In WorksOn {ESSN, PROJ#} HOURS
- A A
- f.d is a type of integrity constraint
- Trivial dependencies: A A, key A
- Modification anomalies: caused by tables which have some undesirable type of f.d.
- Normalization: split tables to get rid of this f.d
- Without information loss
CS 622/215 Databases
Winter 14, Week 8
*
10
*
Pratt and Adamski Figure 5.7:
*
Dependency diagram, Pratt and Adamski Eg
- Eg: Consider the table
Orders (OrderNum, OrderDate, PartNum,
Description, NumOrdered, QuotedPrice)
- Apart from trivial f.d.s, other f.d.s ?
OrderNum OrderDate,
PartNum Description
- Dependency diagram: pictorial way of representing f.d.s.
*
Problems with Orders table
- Redundancy ?
- OrderDate 10/20/2007 repeated for OrderNum 21610
- Description Gas Range repeated for PartNum DR93
- Inconsistency ?
- If we change Description Gas Range to Oven in one row for PartNum DR93, but leave as Gas Range in the other row.
CS 622/215 Databases
Winter 14, Week 8
*
7
*
Problems with Orders table
- Update Anomaly?
- If we want to change Description Gas Range to Oven for PartNum DR93, to maintain consistency, have to make this change in two rows.
- Deletion Anomaly?
- If we no longer have OrderNum 21610, lose info that PartNum DW11 is a washer.
- Insertion Anomaly?
- Can’t insert that PartNum AB47 is a range oven till you have an order including it.
CS 622/215 Databases
Winter 14, Week 8
*
7
*
How to fix Orders table ?
- What do we want ?
- Algorithm to identify these problems
- Way of safely breaking up table
- Need to be careful on how to break it up
- So don’t lose info
- Eg of lossy decomposition on Orders table ?
- Non-loss decomposition: a way of breaking table up without loosing information.
- Eg on Orders table ?
- 2NF guarantees non-loss decomposition
CS 622/215 Databases
Winter 14, Week 8
*
10
*
Partial functional Dependencies
- Partial f.d: attribute depends only on a portion of the composite primary key (and not the whole primary key) i.e. we can take away part of the primary key, and still have an f.d. Eg:
OrderNum OrderDate is partial f.d.
PartNum Description is partial f.d.
- We don’t like partial f.d.s: Why ?
- Consider the Orders table again. What is the problem: redundancies, modification anomalies.
- Caused by the partial f.d.s
CS 622/215 Databases
Winter 14, Week 8
*
10
*
Second Normal Form Defn
- Non-key attribute (column): not part of the primary key
- 2nd Normal Form Table: In 1NF and no non-key column is partially dependent on primary key
- i.e. no non-key column is dependent on only a portion of the primary key.
- Want to ensure we don’t have following:
- (part of PK) (non-key attribute)
- If a table does not have a partial f.d., is in 2NF
CS 622/215 Databases
Winter 14, Week 8
*
10
*
Second Normal Form Eg
- Is this true of Orders table?
- No ! What are the f.d.s which are the problem ?
- OrderNum OrderDate is partial f.d.
- OrderDate is non-key, so this is partial f.d.
- PartNum Description is partial f.d.
- Description is non-key
CS 622/215 Databases
Winter 14, Week 8
*
7
*
Second Normal Form Eg
- We can break the table up to get rid of these partial f.d.s by putting the columns for the partial f.d.s into separate tables.
- Guaranteed that this will not cause loss of info.
- How to break up Orders table?
Orders (OrderNum, OrderDate)
Part (PartNum, Description)
OrderLine (OrderNum, PartNum, NumOrdered,QuotedPrice)
Intuition: OrderNum OrderDate is f.d., so can safely move to another table without losing info.
What will be the PK of new table ?
Same with PartNum Description
Is non-loss decomposition?
CS 622/215 Databases
Winter 14, Week 8
*
10
*
Pratt and Adamsi
Figure 5.9: 2NF Example
*
Gotten rid of problems ?
Redundancy:
OrderDate 10/20/2007 repeated for OrderNum 21610 ?
Description Gas Range repeated for PartNum DR93 ?
Inconsistency: If we change Description Gas Range to Oven for PartNum DR93 ?
Update Anomaly: If we want to change Description Gas Range to Oven for PartNum DR93 and maintain consistency, do we have to make this change in two rows ?
CS 622/215 Databases
Winter 14, Week 8
*
10
*
Gotten rid of problems ?
Deletion Anomaly:
If we no longer have OrderNum 21608, do we lose info that PartNum DW11 is a washer ?
Insertion Anomaly:
Can we insert info that PartNum AB47 is a range oven without having an order including it ?
Have we lost any information
Spurious tuples when we join new tables?
No !
CS 622/215 Databases
Winter 14, Week 8
*
10
*
Another 2NF example, is 2NF enough?
- PartSupp( P# , S# , Sname, SAdd, Price)
- Price depends on both P# and S#
- Non-trivial f.d.s?
- S# Sname , S# SAdd
- Problems? 2NF? Modified Tables ?Problems fixed ?
- Table in 2NF may still have problem
- Consider the Customer table from next two figures with primary key CustomerNum
- Is this table OK?
CS 622/215 Databases
Winter 14, Week 8
*
7
*
Pratt and Adamski Figure 5.10: Combined C and R
CSCI 6622 Databases
Fall 13, Week 8
*
*
Problems with Combined C and R
- Redundancy ?
- Rep’s info repeated for each Customer
- Update Anomaly ?
- If we want to change Rep 35’s last name to Green, to maintain consistency, have to change in multiple rows.
- Deletion Anomaly?
- If we no longer have Customer 356, 452, lose name info for Rep 35
- Insertion Anomaly?
- Can’t insert that Rep 45’s name is Bob Smith without a Customer having 45 as their Rep .
CS 622/215 Databases
Fall 13, Week 8
*
7
*
Boyce Codd Normal Form (BCNF)
- Suppose we put Balance, RLName, RFName in second table, and also leave Balance in first table. Is this a good decomposition?
- No. Lossy. Can get spurious tuples.
- When do a join over Balance
- Consider a different decomposition in the next slide:
CS 622/215 Databases
Fall 13, Week 8
*
7
*
Pratt and Adamski
Figure 5.13:
Decomposition Example
*
BCNF example
- Figure 5.13: is this a good way to break up the Customer table ?
- Is this lossy i.e. when we join the two new tables, do we get any spurious tuples?
- No spurious tuples. Why ? Any other problems?
- Update anomalies. Eg?
- If we want to change Rep’s last name
- Also insertion and deletion anomalies
CS 622/215 Databases
Fall 13, Week 8
*
7
*
Pratt and Adamski Figure 5.11:
Customers Dependency Diagram
*
BCNF example
- Is the Customer table in 2NF ?
- Is there a partial dependency ?
- No partial dependency. Why ?
- Not a composite key, so can’t have partial dependency
- No partial dependency : is in 2NF
- What is the problem – which are undesirable f.d.s?
- RepNum Lname, RepNum Fname
- Need to break this table up. How ?
- Want guidelines for doing this.
CS 622/215 Databases
Fall 13, Week 8
*
7
*
BCNF Definition
- BCNF: A table is in BCNF if
- it is in 2NF and
- for any f.d. X A, if X ≠ A, then X is a key
- either primary or secondary key
Customer Table: is it in BCNF ?
No: which are the problem f.d.s?
RepNum Lname, RepNum Fname. Why ?
RepNum is not a key
Can break the table up according to these f.d.s
CS 622/215 Databases
Fall 13, Week 8
*
10
*
Pratt and Adamski
Figure 5.12:
BCNF Example
*
BCNF
- In this new table, have problems been fixed?
- Redundancy ?
- Inconsistency/Multiple rows to update ?
- Deletions ?
- Insertions ?
- Spurious tuples when we join the two new tables?
CS 622/215 Databases
Fall 13, Week 8
*
10
*
BCNF – Another Example
- Eg [Date]: Supp(S# , Sname, City, Status)
City Status
- Problems?
- 2NF?
- BCNF?
- Modified Tables ?
- Problems fixed ?
- Third Normal Form: a slightly weaker version of BCNF. We won’t cover.
- Some books refer to BCNF as 3NF.
CS 622/215 Databases
Fall 13, Week 8
*
7
*
*
Constraints and Assertions
- Use queries to capture constraints
- Useful when more general Integrity Constraints than keys are involved.
- Constraints
- Attribute-based CHECK
- Tuple-based CHECK
- Assertions
- Extended Assertions
- Commercial DBMSs generally support tuple-based check, not assertions, though assertions part of SQL standard
CSCI 6622 Databases
Fall 13, Week 10
*
*
*
Attribute-Based Checks
- Easy way to introduce more powerful tuple-based checks.
- Want to specify that every employee must have a salary of at least 20,000
- Attrribute based check: When we specify attribute in CREATE TABLE
- Follow this by a condition that must hold.
CREATE TABLE EMPLOYEE …
SALARY INTEGER CHECK (SALARY >= 20000)
- Checked only when associated attribute changes
- An insert occurs : new employee inserted
- An update occurs: when SALARY is changed
CS 622/215 Databases
Fall 13, Week 10
*
CS 622/215 Databases
Fall 09, Week 9
*
7
*
*
Attribute-Based Checks
- Condition can also refer to (in subquery)
- other columns
- other tables
- However, condition is only checked when the “main” attribute is changed
- not when other attributes changed.
*
*
Attribute-Based Checks
- Eg: in SP table, want to make S# foreign key to S.
CREATE TABLE SP …
S# CHAR(5) CHECK (S# IN (SELECT S# FROM S))
- If try to insert row with S8 into SP ?
- rejected
- If try to change S1 in SP to S8 ?
- rejected
- If delete S2 from S ?
- Not rejected. Why ?
- Won’t check when changes to S
*
*
Tuple-Based Checks
- Similar to attribute-based check
- but done on whole table
- In S table, we want to enforce :
- all London suppliers have STATUS = 20
- Conceptually we would think of this as
- If a supplier is in London, then the STATUS = 20
- CITY = ‘London’ => STATUS = 20
- Problem : no “=>” in SQL
- How to do ?
CS 622/215 Databases
Fall 13, Week 10
*
CS 622/215 Databases
Fall 09, Week 9
*
10
*
*
Tuple-Based Checks
- We assert: CITY < > ‘London’ OR STATUS = 20
- why is this the same as
- CITY = ‘London’ => STATUS = 20
- If CITY = ‘London’ then the first condition (CITY < > ‘London’ ) is false, so ?
- second condition (STATUS = 20) has to be true
CREATE TABLE S(
S# …
CITY …
STATUS …
CHECK (CITY < > ‘London’ OR STATUS = 20)
CS 622/215 Databases
Fall 13, Week 10
*
CS 622/215 Databases
Fall 09, Week 9
*
10
*
*
Tuple-Based Checks
- Stronger than attribute based check since checked when tuple inserted (but not deleted) or any attribute updated
- With attribute based check, only when associated attribute is update.
- Suppose done as attribute based check
- STATUS INTEGER CHECK (CITY < > ‘London’ OR STATUS = 20)
- Suppose CITY changes ? Problem ?
- Won’t check
- How about with tuple-based ? Problem ?
CS 622/215 Databases
Fall 13, Week 10
*
CS 622/215 Databases
Fall 09, Week 9
*
10
*
*
Tuple-Based Checks
- Condition being checked can be anything that could follow a WHERE
- Can refer to other tables
- No checking done when other tables changed
CREATE TABLE SP(
S# …
CHECK (S# IN (SELECT S# FROM S)
- Will this work ?
- No : not when S# changes/gets deleted in S.
CS 622/215 Databases
Fall 13, Week 10
*
CS 622/215 Databases
Fall 09, Week 9
*
10
*
*
SQL Assertions
- Database-schema constraint
- Not associated with any table
- Used when we want to do a check over multiple tables
- Checked when any involved tables changes.
- Part of SQL standard, not currently supported by most commercial DBMS (eg: Oracle, SQL Server) because of performance penalty
- Syntax:
CREATE ASSERTION name
CHECK (condition);
CS 622/215 Databases
Fall 13, Week 10
*
CS 622/215 Databases
Fall 09, Week 9
*
7
*
*
Assertion Example
- Want to assert that there are at least 4 employees.
- Is it OK if we do as tuple-based check ?
- No: why not ?
- Won’t check in case of DELETION
- Do as assertion
CREATE ASSERTION atleastfouremp
CHECK( (SELECT COUNT (ssn)
FROM EMPLOYEE) >= 4);
*
*
Assertion Example
- Condition: If A then B specified as
- not exists (A and not B)
- Any supplier whose STATUS >= 20 should be supplying at least 3 different parts
CREATE ASSERTION highstat
CHECK( NOT EXISTS
( SELECT * FROM S
WHERE STATUS > = 20 AND
(SELECT COUNT(DISTINCT P#) FROM SP
WHERE SP.S# = S.S#) < 3
));
*
*
SQL3 Extended Assertions [UW]
- SQL2: system decides when a constraint might be violated, decides when to perform check.
- SQL3: programmer specifies when to check.
- The salary of an employee who works in accounting must be at least 30000. In SQL2 with assertions
CREATE ASSERTION acctsalSQL2assert
CHECK (NOT EXISTS (
SELECT * FROM EMPLOYEE, DEPT
WHERE EMPLOYEE.DNO = DEPT.DNUMBER AND
DNAME = ‘ACCOUNTING’ AND
EMPLOYEE.SALARY < 30000 ));
CS 622/215 Databases
Fall 13, Week 10
*
CS 622/215 Databases
Fall 09, Week 9
*
10
*
*
SQL3 Extended Assertions [UW]
- SQL3: with extended assertions
- when should check be performed ?
CREATE ASSERTION acctsalSQL3extassert
INSERT ON EMPLOYEE
UPDATE ON DEPARTMENT
UPDATE OF SALARY, DNO ON EMPLOYEE
CHECK (NOT EXISTS (
EMPLOYEE.DNO = DEPT.DNUMBER AND
DNAME = ‘ACCOUNTING’ AND
EMPLOYEE.SALARY < 30000 ));
CS 622/215 Databases
Fall 13, Week 10
*
CS 622/215 Databases
Fall 09, Week 9
*
10
*
*
SQL3 Extended Assertions [UW]
- Advantages of SQL3 extended assertions ?
- Programmer has more control.
- Eg: for efficiency reason, may choose not to perform check when DNO is changed
- Makes it easier for system implementers
- They don’t have to figure out when violation can occur
- Disadvantages of SQL3 extended assertions ?
- Programmer must discover when violation might possibly occur
- o/w database could go into inconsistent state
CS 622/215 Databases
Fall 13, Week 10
*
CS 622/215 Databases
Fall 09, Week 9
*
10
CSCI 6622 Databases
CSCI 6622 Databases
Winter 14, Week 8
*
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Winter 14, Week 8
*
7
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Winter 14, Week 8
*
7
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Winter 14, Week 8
*
10
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Winter 14, Week 8
*
7
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Winter 14, Week 8
*
7
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Winter 14, Week 8
*
10
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Winter 14, Week 8
*
10
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Winter 14, Week 8
*
10
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Winter 14, Week 8
*
7
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Winter 14, Week 8
*
10
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Winter 14, Week 8
*
10
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Winter 14, Week 8
*
10
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Winter 14, Week 8
*
7
Winter 14, Week 8
CSCI 6622 Databases
CSCI 6622 Databases
Fall 13, Week 8
*
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Fall 13, Week 8
*
7
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Fall 13, Week 8
*
7
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Fall 13, Week 8
*
7
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Fall 13, Week 8
*
7
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Fall 13, Week 8
*
10
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Fall 13, Week 8
*
10
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Fall 13, Week 8
*
7
Winter 14, Week 8
CSCI 6622 Databases
CSCI 6622 Databases
Fall 13, Week 10
*
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Fall 13, Week 10
*
CS 622/215 Databases
Fall 09, Week 9
*
7
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Fall 13, Week 10
*
CS 622/215 Databases
Fall 09, Week 9
*
10
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Fall 13, Week 10
*
CS 622/215 Databases
Fall 09, Week 9
*
10
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Fall 13, Week 10
*
CS 622/215 Databases
Fall 09, Week 9
*
10
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Fall 13, Week 10
*
CS 622/215 Databases
Fall 09, Week 9
*
10
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Fall 13, Week 10
*
CS 622/215 Databases
Fall 09, Week 9
*
7
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Fall 13, Week 10
*
CS 622/215 Databases
Fall 09, Week 9
*
10
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Fall 13, Week 10
*
CS 622/215 Databases
Fall 09, Week 9
*
10
Winter 14, Week 8
CSCI 6622 Databases
CS 622/215 Databases
Fall 13, Week 10
*
CS 622/215 Databases
Fall 09, Week 9
*
10
Winter 14, Week 8