Analyzing and Visualizing Data Assignment
DATASET 5
Dataset
Venkat reddy Maddi
University of Cumberlands
The selected dataset is:
Tesla stock data from 2010 to 2020
Link: https://www.kaggle.com/timoboz/tesla-stock-data-from-2010-to-2020/metadata
Why did you select this dataset?
I selected this dataset because its rows and columns provide more than sufficient data to manipulate and explore.
Examination
Here are the physical properties of the dataset.
Size - 170.98 KB
File type - CSV file
Condition of data - Raw data
Tesla’s stock has been on the rise, with increases of +100% in one month. This dataset is used to examine the history of Tesla’s stock and why it is rising at such a high rate. Here is the view of the data in Excel.
Transformation
Data transformation involves converting, consolidating, cleaning, and creating data. This process should occur both during and after the examination stage: flaws are identified, and cleaning then begins. These steps were applied to this particular dataset with the aim of ridding the data of errors, because data that contains errors contaminates the analysis and produces incorrect results. Any unnecessary data within the dataset is discarded. Renaming the data fields is an efficient way to clean the data, and removing any stray spaces within the data helps eliminate errors. Using a single word for each field name makes the dataset more concise; long sentences or paragraphs should not be used to name fields. The sorting and filtering tools in Excel also helped in cleaning the data. To consolidate the data, Tesla’s financial records for the past 10 years would go a long way toward helping me achieve certain milestones (Piwowar & Chapman, 2009).
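The cleaning steps described above (shortening field names to single words, removing stray spaces, discarding incomplete rows, and sorting) could also be scripted rather than done by hand in Excel. The following is only a minimal sketch using Python's standard library. The column names are an assumption about the CSV layout, and the sample rows are made-up placeholder values, not real Tesla prices.

```python
import csv
import io

# Hypothetical sample rows for illustration only; these are not real Tesla prices.
# The column names are assumed, not taken from the actual file.
RAW_CSV = """Date, Open, High, Low, Close, Adj Close, Volume
2010-07-01, 5.0, 5.2, 4.1, 4.4, 4.4, 41094000
2010-06-30, 5.2, 6.1, 4.7, 4.8, 4.8, 85935500
2010-07-02, 4.6, 4.6, , 3.8, 3.8, 25699000
"""


def clean_rows(text):
    """Return cleaned rows as a list of dicts, sorted by date."""
    reader = csv.reader(io.StringIO(text))
    # Shorten each field name to a single word by removing spaces.
    header = [name.strip().replace(" ", "") for name in next(reader)]
    cleaned = []
    for row in reader:
        # Remove stray spaces around every value.
        values = [value.strip() for value in row]
        # Discard rows with the wrong number of fields or with empty cells.
        if len(values) != len(header) or "" in values:
            continue
        cleaned.append(dict(zip(header, values)))
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    cleaned.sort(key=lambda record: record["Date"])
    return cleaned


rows = clean_rows(RAW_CSV)
for record in rows:
    print(record["Date"], record["AdjClose"])
```

In this sketch the row with the missing value is dropped, and "Adj Close" becomes the single-word field "AdjClose". Whether incomplete rows should be dropped or filled in is a judgment call that depends on the analysis.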
Exploration
In this analysis, the data was used mainly for visual representation. The aim was to obtain a generalized, easier-to-understand view of the data. Exploration focused on visual techniques rather than statistical analysis in order to understand the characteristics of the dataset.
A benefit of data exploration is that it helps analysts better inform themselves about the subject being analyzed. Continued in-depth exploration of Tesla’s stock data since 2010 has increased my understanding of its finer details and revealed interesting patterns and relationships in the data (Morillo, Huanga, & Ferri, 2019).
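As a rough illustration of visual exploration, the sketch below groups closing prices by year and draws a plain-text bar chart. It is not the method used for this assignment. A charting tool or plotting library would normally be used instead, and a text chart is shown here only to keep the example free of dependencies. The sample values and the function names `yearly_average` and `text_chart` are hypothetical.

```python
from collections import defaultdict

# Hypothetical (date, closing price) pairs for illustration only; not real Tesla data.
SAMPLE = [
    ("2010-06-30", 20.0),
    ("2010-12-31", 30.0),
    ("2015-06-30", 240.0),
    ("2015-12-31", 260.0),
    ("2019-12-31", 400.0),
]


def yearly_average(records):
    """Group closing prices by year and return {year: mean close}."""
    by_year = defaultdict(list)
    for date, close in records:
        # The first four characters of an ISO date are the year.
        by_year[date[:4]].append(close)
    return {year: sum(values) / len(values) for year, values in sorted(by_year.items())}


def text_chart(averages, width=40):
    """Render one bar per year, scaled so the largest value fills `width` characters."""
    top = max(averages.values())
    lines = []
    for year, value in averages.items():
        bar = "#" * max(1, round(value / top * width))
        lines.append(f"{year} {bar} {value:.2f}")
    return lines


averages = yearly_average(SAMPLE)
chart = text_chart(averages)
print("\n".join(chart))
```

Even a coarse view like this makes a long-term trend visible at a glance, which is the purpose of visual exploration as opposed to statistical analysis.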
References
Piwowar, H. A., & Chapman, W. W. (2009). Public sharing of research datasets: A pilot study of associations. HHS Public Access, 4(2), 148-156. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3039489/
Morillo, P., Huanga, D. V., & Ferri, C. (2019). A dataset of attributes from papers of machine learning conference. Data in Brief, 24. https://www.sciencedirect.com/science/article/pii/S2352340919301878