---
title: "Correlation"
author: "Enter Your Name"
date: "`r Sys.Date()`"
output: word_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

*Title*: Big Data Analytics Services for Enhancing Business Intelligence

*Abstract*: This article examines how to use big data analytics services to enhance business intelligence (BI). More specifically, this article proposes an ontology of big data analytics and presents a big data analytics service-oriented architecture (BASOA), and then applies BASOA to BI, where our surveyed data analysis shows that the proposed BASOA is viable for enhancing BI and enterprise information systems. This article also explores temporality, expectability, and relativity as the characteristics of intelligence in BI. These characteristics are what customers and decision makers expect from BI in terms of systems, products, and services of organizations. The proposed approach in this article might facilitate the research and development of business analytics, big data analytics, and BI as well as big data science and big data computing.

# Dataset:

- Gender of the participant surveyed on these topics
- Temporality: an average score of the rated ability to adapt to change over time, from 1 (not changing) to 7 (changing a lot)
- Expectability: a rated degree of satisfaction with the BI
- Relativity: an average score rating how much better one system is than another in BI, from 1 (not very good) to 7 (very good)
- Positive emotion: how positive participants felt about BI (scores range from 1 to 7; higher scores are more positive)

```{r starting}
```

# Data Screening:

## Accuracy:

a. Include output that indicates whether or not the data are accurate.
b. If the data are not accurate, delete the inaccurate scores.
c. Include a summary that shows that you fixed the inaccurate scores.

```{r accuracy}
```

## Missing:

a. Since any accuracy errors will create more than 5% missing data, exclude all data pairwise for the rest of the analyses.

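Pairwise exclusion can be sketched with base R's `cor()`. This is only an illustration under stated assumptions: `demo` is a hypothetical stand-in data frame, since the lab's real data file and column names are not given in this template. With `use = "pairwise.complete.obs"`, each correlation is computed from every row in which that particular pair of variables is observed, rather than dropping whole rows.

```r
# Hypothetical stand-in data; the real lab data set is not shown in this template.
set.seed(42)
demo <- data.frame(
  temporality   = sample(1:7, 50, replace = TRUE),
  expectability = sample(1:7, 50, replace = TRUE),
  relativity    = sample(1:7, 50, replace = TRUE)
)

# Plant two out-of-range scores, then set anything outside 1-7 to NA
# (the scores are removed, the rows are kept).
demo$temporality[c(3, 10)] <- 9
demo$temporality[demo$temporality < 1 | demo$temporality > 7] <- NA

# Percent of missing values in each column
percent_missing <- colMeans(is.na(demo)) * 100
percent_missing

# Pairwise exclusion: each correlation uses all rows complete for that pair
pairwise_r <- cor(demo, use = "pairwise.complete.obs")
pairwise_r
```

Functions that do not take a `use` argument generally need their own missing-data handling, so it is worth checking each function's help page.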
```{r missing}
```

## Outliers:

a. Include a summary of your Mahalanobis scores.
b. What are the df for your Mahalanobis cutoff?
c. What is the cutoff score for your Mahalanobis measure?
d. How many outliers did you have?

```{r outliers}
```

# Assumptions:

## Linearity:

a. Include a picture that shows how you might assess multivariate linearity.
b. Do you think you've met the assumption for linearity?

```{r linearity}
```

## Normality:

a. Include a picture that shows how you might assess multivariate normality.
b. Do you think you've met the assumption for normality?

```{r normality}
```

## Homogeneity and Homoscedasticity:

a. Include a picture that shows how you might assess multivariate homogeneity.
b. Do you think you've met the assumption for homogeneity?
c. Do you think you've met the assumption for homoscedasticity?

```{r homogs}
```

# Hypothesis Testing / Graphs:

Create a scatter plot of temporality and relativity.

a. Be sure to check x/y axis labels and length.
b. What type of relationship do these two variables appear to have?

```{r plot1}
```

Create a scatter plot of expectability and positive emotion.

a. Include a linear line on the graph.
b. Be sure to check x/y axis labels and length.
c. What type of relationship do these two variables appear to have?

```{r plot2}
```

Create a scatter plot of expectability and relativity, grouping by gender.

a. Include a linear line on the graph.
b. Be sure to check x/y axis labels and length.
c. What type of relationship do these two variables appear to have for each group?

```{r plot3}
```

Include a correlation table of all of the variables (cor).

a. Include the output for Pearson.
b. Include the output for Spearman.
c. Include the output for Kendall.
d. Which correlation was the strongest?
e. For the correlations with gender, would point biserial or biserial be more appropriate? Why?

```{r correl1}
```

Calculate the confidence interval for the correlation between temporality and relativity.

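A confidence interval for a Pearson correlation can be sketched with base R's `cor.test()`, which returns a Fisher z based interval in `conf.int`. The vectors below are simulated, hypothetical stand-ins for the lab's columns, not the real data. The manual calculation at the end repeats the same interval by hand to show where the numbers come from.

```r
# Hypothetical stand-in vectors; the names mirror the lab's variables only.
set.seed(1)
temporality <- rnorm(100, mean = 4, sd = 1)
relativity  <- 0.5 * temporality + rnorm(100, mean = 2, sd = 1)

# cor.test() reports r and a 95% confidence interval for Pearson's r
result <- cor.test(temporality, relativity,
                   method = "pearson", conf.level = 0.95)
result$estimate
result$conf.int

# The same interval by hand using the Fisher z transformation
r  <- cor(temporality, relativity)
n  <- length(temporality)
z  <- atanh(r)            # transform r to z
se <- 1 / sqrt(n - 3)     # standard error of z
manual_ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)
manual_ci
```

`cor.test()` only reports a confidence interval for `method = "pearson"`; the Spearman and Kendall options return a test statistic and p-value without one.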
```{r cicorrel1}
```

Calculate the difference in correlations for 1) temporality and expectability and 2) temporality and positive emotion.

a. Include the output from the test through Pearson's test.
b. Is there a significant difference in their correlations?

```{r correl2}
```

Calculate the difference in correlations for gender on temporality and relativity.

a. Include the output from the test.
b. Is there a significant difference in their correlations?

```{r correl3}
```

Calculate the partial and semipartial correlations for all variables, and include the output.

a. Are any of the correlations significant after controlling for all other relationships?

```{r partials}
```

# Theory:

- What are we using as our model for understanding the data in a correlational analysis?
- How might we determine model fit?
- What is the difference between correlation and covariance?
- What is the difference between R and r?
- When would I want to use a nonparametric correlation over Pearson's correlation?
- What is the distinction between semi-partial and partial correlations?
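
As a starting point for the last question, the mechanics of partial and semipartial correlations can be sketched with regression residuals in base R. The variables `x`, `y`, and `z` below are simulated and hypothetical; they are not the lab's data, and this is one way to compute the quantities rather than the required method.

```r
# Hypothetical stand-in data: x and y both depend on a third variable z.
set.seed(7)
z <- rnorm(200)
x <- 0.6 * z + rnorm(200)
y <- 0.6 * z + 0.3 * x + rnorm(200)

# Residuals after removing the linear effect of z
x_res <- resid(lm(x ~ z))
y_res <- resid(lm(y ~ z))

# Partial correlation: z is removed from both x and y
partial_xy <- cor(x_res, y_res)

# Semipartial (part) correlation: z is removed from x only
semipartial_xy <- cor(x_res, y)

c(zero_order = cor(x, y), partial = partial_xy, semipartial = semipartial_xy)
```

With more than three variables, the `ppcor` package's `pcor()` and `spcor()` functions return full matrices of these values along with p-values, assuming that package is installed.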