WEKA Assignment: Machine Learning
Section 1 — Upload final data file
Section 2 — Business/research questions
1. State the three business or research questions that you have attempted to answer through your analysis, and justify why they are interesting (300 words maximum)
Section 3 — Processing the data
1. Describe how you explored the data, why you did it that way, and what conclusions you drew about it (300 words maximum)
2. Describe the cleaning/fixing you did on the data, and why (300 words maximum)
Section 4 — Data analysis
3. Explain what analysis techniques you used to answer your business/research questions, and why (300 words maximum)
4. Summarise the results of your analysis (300 words maximum)
5. What do the results say in answer to your business/research questions? (300 words maximum)
6. Describe the most salient threats to validity that remain in your analysis (300 words maximum)
Section 5 — Dealing with large data sets
7. Describe how you could represent the data in a relational database — give a suitable schema, and describe a mechanism for converting it to a suitable input form for WEKA (300 words maximum)
8. Describe a way that you could use appropriate technologies to spread the load over multiple computers, and justify why this would be a good approach (300 words maximum)
Section 6 — Privacy
9. List the three most salient privacy issues related to this analysis, and give strategies you could use to address each of them (300 words maximum)
Section 7 — Report references
10. Provide a correctly structured list of references to all the resources used for this development and report (no word limit)