Introduction to data mining


Table 4.7. Data set for Exercise 2.

Customer ID   Gender   Car Type   Shirt Size    Class
1             M        Family     Small         C0
2             M        Sports     Medium        C0
3             M        Sports     Medium        C0
4             M        Sports     Large         C0
5             M        Sports     Extra Large   C0
6             M        Sports     Extra Large   C0
7             F        Sports     Small         C0
8             F        Sports     Small         C0
9             F        Sports     Medium        C0
10            F        Luxury     Large         C0
11            M        Family     Large         C1
12            M        Family     Extra Large   C1
13            M        Family     Medium        C1
14            M        Luxury     Extra Large   C1
15            F        Luxury     Small         C1
16            F        Luxury     Small         C1
17            F        Luxury     Medium        C1
18            F        Luxury     Medium        C1
19            F        Luxury     Medium        C1
20            F        Luxury     Large         C1

4.8 Exercises 199

Table 4.8. Data set for Exercise 3.

Instance   a1   a2   a3    Target Class
1          T    T    1.0   +
2          T    T    6.0   +
3          T    F    5.0   -
4          F    F    4.0   +
5          F    T    7.0   -
6          F    T    3.0   -
7          F    F    8.0   -
8          T    F    7.0   +
9          F    T    5.0   -

(b) What are the information gains of a1 and a2 relative to these training examples?
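
The calculation this part asks for can be sketched in Python. The information gain of an attribute is taken here as the entropy of the full set of class labels minus the size-weighted average entropy of the subsets produced by splitting on that attribute. The rows restate Table 4.8; the +/- target labels used (+, +, -, +, -, -, -, +, -) are an assumption and should be checked against the Target Class column. The names DATA, entropy and information_gain are illustrative only.

```python
from collections import Counter
from math import log2

# Rows of Table 4.8 as (a1, a2, a3, target).
# ASSUMPTION: the +/- target labels below should be checked against the table.
DATA = [
    ("T", "T", 1.0, "+"),
    ("T", "T", 6.0, "+"),
    ("T", "F", 5.0, "-"),
    ("F", "F", 4.0, "+"),
    ("F", "T", 7.0, "-"),
    ("F", "T", 3.0, "-"),
    ("F", "F", 8.0, "-"),
    ("T", "F", 7.0, "+"),
    ("F", "T", 5.0, "-"),
]


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attr_index, label_index=3):
    """Parent entropy minus the weighted entropy after splitting on one attribute."""
    labels = [r[label_index] for r in rows]
    # Group the class labels by the value of the chosen attribute.
    groups = {}
    for r in rows:
        groups.setdefault(r[attr_index], []).append(r[label_index])
    weighted = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - weighted


gain_a1 = information_gain(DATA, 0)
gain_a2 = information_gain(DATA, 1)
print(f"gain(a1) = {gain_a1:.4f}")
print(f"gain(a2) = {gain_a2:.4f}")
```

Under the labels assumed above, the gain for a1 comes out near 0.229 and the gain for a2 near 0.007, so a1 separates the two classes far better than a2.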

(c) For a3, which is a continuous attribute, compute the information gain for every possible split.
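
One common way to enumerate the splits of a continuous attribute is to sort its distinct values and test a binary threshold at the midpoint between each adjacent pair. The sketch below does that for the a3 column of Table 4.8. The midpoint convention is an assumption rather than something the exercise prescribes, as are the +/- target labels, which should be checked against the Target Class column. The names A3, TARGET and split_gains are illustrative only.

```python
from collections import Counter
from math import log2

# a3 values and target labels from Table 4.8, in instance order.
# ASSUMPTION: the +/- labels below should be checked against the table.
A3 = [1.0, 6.0, 5.0, 4.0, 7.0, 3.0, 8.0, 7.0, 5.0]
TARGET = ["+", "+", "-", "+", "-", "-", "-", "+", "-"]


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def split_gains(values, labels):
    """Information gain of each binary split `value <= t`.

    Candidate thresholds t are the midpoints between adjacent distinct
    sorted values (an assumed, commonly used convention).
    """
    n = len(labels)
    parent = entropy(labels)
    distinct = sorted(set(values))
    thresholds = [(lo + hi) / 2 for lo, hi in zip(distinct, distinct[1:])]
    gains = {}
    for t in thresholds:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        weighted = len(left) / n * entropy(left) + len(right) / n * entropy(right)
        gains[t] = parent - weighted
    return gains


gains = split_gains(A3, TARGET)
best = max(gains, key=gains.get)
for t, g in gains.items():
    print(f"a3 <= {t}: gain = {g:.4f}")
print(f"best split: a3 <= {best}")
```

With the nine a3 values in the table this yields six candidate thresholds: 2.0, 3.5, 4.5, 5.5, 6.5 and 7.5. Under the labels assumed above, the split at 2.0 has the largest gain, roughly 0.143.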