Data Mining Questions
Question 3. Anomalies
A data set consists mostly of objects that are related to one another; these are known as normal objects. The same data set may also contain objects that do not conform to the others, known as anomalous objects. A data set therefore consists of both normal and anomalous objects. Anomalous objects attract considerable attention because they carry unique information that warrants closer examination (Hossain, Akhtar, Ahmad, & Rahman, 2019).
Cluster validity measures
Question 4 (a)
Defining normal regions is challenging because the boundary between normal and abnormal regions is often narrow, so the two cannot always be distinguished precisely.
Question 4 (b)
A normal data set has a smaller SSE for K = 10 clusters because its objects are related to one another; the distance from each object to its nearest cluster centroid is therefore short, giving a smaller sum of squared errors than an abnormal data set (Hossain, Akhtar, Ahmad, & Rahman, 2019).
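The effect of anomalous objects on SSE can be illustrated with a minimal sketch. The toy points, the two fixed centroids, and the function names below are assumptions made for illustration only; they are not taken from the cited source, and two clusters are used instead of ten to keep the example short.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def sse(points, centroids):
    """Sum of squared errors: each point contributes its squared
    distance to the nearest centroid."""
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


# Hypothetical toy data: two tight groups near (0, 0) and (10, 10).
normal_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Centroids fixed at the mean of each group (an assumption; a full
# k-means run would estimate these from the data).
centroids = [(1 / 3, 1 / 3), (31 / 3, 31 / 3)]

# The same data plus two anomalous points far from both groups.
with_anomalies = normal_points + [(5, 20), (20, 5)]

sse_normal = sse(normal_points, centroids)
sse_anomalous = sse(with_anomalies, centroids)

print(f"SSE without anomalies: {sse_normal:.2f}")
print(f"SSE with anomalies:    {sse_anomalous:.2f}")
```

Because each anomalous point lies far from every centroid, its squared distance dominates the total, which is why the second SSE is much larger than the first.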
Question 4 (c)
DBSCAN groups data of uniform density into a cluster and classifies the non-uniform data as noise. It also addresses the boundary issue by identifying variations in density and using those variations to cluster the data (Hossain, Akhtar, Ahmad, & Rahman, 2019).
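The behaviour described above can be sketched with a small, self-contained version of the DBSCAN procedure. This is a simplified illustration, not a library implementation: the function names, the toy points, and the parameter values eps = 1.5 and min_pts = 3 are assumptions chosen for the example, and noise is marked with the label -1.

```python
def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points)
            if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps ** 2]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    NOISE, UNVISITED = -1, None
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster += 1           # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
    return labels


# Hypothetical toy data: two dense groups and two isolated points.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5), (20, 0)]

labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1, -1]
```

The two dense groups become clusters 0 and 1, while the isolated points have too few neighbours within eps to join either cluster and are reported as noise.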