Cluster Analysis
Week 4 Cluster Data.xlsx (8.977 KB)
Included with this assignment is an Excel spreadsheet that contains data points, each with two dimension values.
The purpose of this assignment is to demonstrate steps performed in a K-Means Cluster analysis.
Review the "k-MEANS CLUSTERING ALGORITHM" section in Chapter 4 of the Sharda et al. textbook for additional background.
Use Excel to perform the following data analysis.
1. Plot the data on a scatter plot.
2. Determine the ideal number of clusters.
3. Choose random center points (centroids) for each cluster. (Note: Each student will select a different random set of centroids.)
4. Using a standard distance formula, measure the distance from each data point to each center point.
5. Assign each data point to an initial cluster region based on closeness.
6. For each cluster, calculate new center points.
7. Repeat steps 4 through 6.
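The distance, assignment, and update steps above can be sketched in a few lines of Python. This is only an illustration of the loop, not a substitute for the Excel work the assignment asks for. It assumes the "standard distance formula" is Euclidean distance. The six sample points and the two starting centroids are made up for the example and are not taken from the spreadsheet, and the function names are hypothetical.

```python
import math

def euclidean(p, q):
    # Assumed distance formula: straight-line distance between two 2-D points.
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

def assign_clusters(points, centroids):
    # For each point, return the index of the closest centroid.
    return [
        min(range(len(centroids)), key=lambda j: euclidean(p, centroids[j]))
        for p in points
    ]

def update_centroids(points, labels, centroids):
    # New center point = mean of the points assigned to that cluster.
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:
            # An empty cluster keeps its old centroid (a design choice for this sketch).
            new_centroids.append(old)
        else:
            mean_x = sum(x for x, _ in members) / len(members)
            mean_y = sum(y for _, y in members) / len(members)
            new_centroids.append((mean_x, mean_y))
    return new_centroids

def k_means(points, centroids, max_iter=100):
    # Repeat: measure distances, assign points, recompute centroids,
    # stopping once the centroids no longer move.
    for _ in range(max_iter):
        labels = assign_clusters(points, centroids)
        new_centroids = update_centroids(points, labels, centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assign_clusters(points, centroids)

# Made-up sample data, NOT the assignment's 43 points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
start = [(0, 0), (10, 10)]  # hypothetical starting centroids
final_centroids, final_labels = k_means(points, start)
print(final_centroids, final_labels)
```

With this sample data the assignments do not change after the first pass, so the loop stops on the second iteration.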
You will use Excel to help with calculations, but only standard functions should be used (i.e., don't use a plug-in to perform the analysis for you). You need to show your work by doing this analysis the long way. If you were to repeat steps 4 through 6, what will likely happen with the cluster centroids? The rubric for this assignment can be viewed by clicking on the assignment link.
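For the long-hand calculation, the two formulas involved can be written out as follows. This assumes the "standard distance formula" means Euclidean distance. The symbols are introduced here for illustration only: $(x_i, y_i)$ is data point $i$, $(c_{j,x}, c_{j,y})$ is the centroid of cluster $j$, and $S_j$ is the set of points currently assigned to cluster $j$.

```latex
% Distance from data point i to centroid j (assumed Euclidean)
d_{ij} = \sqrt{(x_i - c_{j,x})^2 + (y_i - c_{j,y})^2}

% New centroid of cluster j: the mean of its assigned points
c_{j,x} = \frac{1}{|S_j|} \sum_{i \in S_j} x_i ,
\qquad
c_{j,y} = \frac{1}{|S_j|} \sum_{i \in S_j} y_i
```

In Excel, both can be built from standard worksheet functions such as SQRT and AVERAGEIF, without any plug-in.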
Here is a link to an example spreadsheet using a smaller data set. It contains two tabs. The first tab is the raw data. The second tab contains the analysis that was performed. Make sure that you use different starting center points from the example.
The attached file has 43 data points. Please complete the analysis for all of them.
- cluster_analysis_example.xlsx