RStudio assignment
#GIS 270
#In-class lab and assignment #2
#The first half of the script is for our in-class lab on October 12
#We will all work with the same data and conduct the same analysis
#This script serves as a key template for the assignment, so take notes/comments!!
#The second half of the script has prompts for you to complete.
#You will upload this R script, and only this R script, to receive credit for the assignment
#Upload to Canvas when complete.
#Use comments to answer open-ended questions or provide other information about your work
#The more comments, the better!!!

#########IN CLASS LAB - OCTOBER 12############

#Signpost E
tempe_data=read.csv("https://raw.githubusercontent.com/davidhondula/gis270/master/TempeHeatSurvey_Demo_Oct11_IndoorOutdoor.csv",header=T)

#Metadata from Tempe survey: question text
#Q10A_Outdoor: What was the highest temperature you felt comfortable being OUTDOORS during your normal routine?
#Q10B_Indoor: What was the highest temperature you felt comfortable being INDOORS during your normal routine?
#DE4_Gender: What do you consider your gender?
#Let's make a prediction for outdoor and indoor
#Enter hypotheses here
#Outdoor limiting temp
H1=100
#Indoor limiting temp
H2=80

head(tempe_data)
summary(tempe_data)
hist(tempe_data$Q10A_Outdoor)
hist(tempe_data$Q10B_Indoor)

#Signpost F
n1 = length(tempe_data$Q10A_Outdoor)-sum(is.na(tempe_data$Q10A_Outdoor))
test1=t.test(tempe_data$Q10A_Outdoor,mu = H1)
test1
test1$statistic
test1$p.value
t1=(mean(tempe_data$Q10A_Outdoor,na.rm=T)-H1)/(sd(tempe_data$Q10A_Outdoor,na.rm=T)/sqrt(n1))
print(t1)

#Signpost F2
pt(t1,n1-1,lower.tail=T)
pt(t1,n1-1,lower.tail=F)
abs(qt(0.025,n1-1))

#Signpost G
test2=t.test(tempe_data$Q10B_Indoor,mu = H2)
test2

#Signpost H
test2=t.test(tempe_data$Q10B_Indoor,mu=H2,alt="two.sided")
print(test2)
test3=t.test(tempe_data$Q10B_Indoor,mu=H2,alt="greater")
print(test3)
test4=t.test(tempe_data$Q10B_Indoor,mu=H2,alt="less")
print(test4)

#Signpost J
sample1=tempe_data$Q10A_Outdoor[tempe_data$DE4_Gender=="Male"]
sample2=tempe_data$Q10A_Outdoor[tempe_data$DE4_Gender=="Female"]
summary(sample1)
summary(sample2)
test8=t.test(sample1,sample2,alt="two.sided")
print(test8)
test9=t.test(sample1,sample2,alt="greater")
print(test9)
boxplot(tempe_data$Q10A_Outdoor~tempe_data$DE4_Gender,xlab="Gender",ylab="Preferred Temperature",main="Gender Differences in Preferred Temperature",col="white")

#########ASSIGNMENT 2 BEGINS HERE##################

sdata=read.csv("https://raw.githubusercontent.com/davidhondula/gis270/master/ClassSurvey_V4_Sept21.csv",header=T)
print("Assignment #2 for MY NAME HERE")
#PLEASE USE THE PRINT FUNCTION FOR YOUR ANSWERS, SUCH THAT THEY APPEAR IN THE CONSOLE
#EXAMPLE
print("The scale variable I am using is Q14_Numberofpushups")
print("I hypothesize that the class average is different than 20 pushups")
#ALL COMMANDS/FUNCTIONS SHOULD BE EXECUTED IN THE CODE AND NOT WRITTEN AS COMMENTS
#USE PRINT TO DISPLAY IMPORTANT OUTPUT IN THE CONSOLE (E.G., TEST STATISTICS)
#IF YOU HAVE TROUBLE, PLEASE USE COMMENTS TO EXPLAIN YOUR ERRORS/EFFORTS TO OVERCOME THEM

###############################
#Q1
#What day of the month does your birthday fall on? Assign that value to be x.
x=18 #replace with your birthday day of month
#Imagine you were to flip a coin 50 times. What is the probability of getting exactly x heads?
#Write code in the space below that shows how you got to your answer.
#Be sure to include a sentence where your answer is clearly stated.
n=50 #number of coin flips
m=0.5 #probability of heads on a single flip
trials=c(1:50)
CF1=dbinom(x,size=n,prob=m)
print(CF1)
print(paste("The probability of getting exactly",x,"heads is about",round(100*CF1,1),"%"))
#about 1.6%
#In the same trial of coin flips, what is the probability of getting AT LEAST x heads?
#Write code in the space below that shows how you got to your answer.
#Be sure to include a sentence where your answer is clearly stated.
CF2=dbinom(0:50,size=n,prob=m)
CF3=dbinom(x:50,size=n,prob=m)
print(sum(CF3))
print(paste("The probability of getting at least",x,"heads is about",round(100*sum(CF3),1),"%"))
#about 98.4%
#Produce a bar graph that shows the probability of getting 0 through 50 heads in a series of 50 coin flips.
barplot(CF2,names.arg=seq(0,50,1))

###############################
#Q2
#Select a scale variable from the class survey you will use for your final project
#Name of scale variable = Q14NumberofPushups
#Generate a NON-DIRECTIONAL hypothesis about the MEAN of your scale variable
#what is your hypothesis?
#The mean number of pushups is different from 15.
#Why did you make that hypothesis?
#I think many people will do about 20 pushups, so the mean should differ from 15.
#Write the null hypothesis for a statistical test:
#H0: Average number of pushups = 15.
#Write the alternative hypothesis for a statistical test:
#Ha: Average number of pushups =/= 15.
#Run a one-sample, two-tailed t-test to test your hypothesis
#Enter code below to show your work
test_q2=t.test(sdata$Q14_Pushups,mu=15,alt="two.sided")
print(test_q2)
#What was your sample mean?
#The sample mean was 22.46.
#What is the test statistic that you calculated?
#The test statistic is 1.8.
#What is the p-value of the test statistic that you calculated?
#The p-value is 0.075.
#Should you accept or reject your null hypothesis?
#I should accept (fail to reject) the null hypothesis because my p-value of 0.075 is greater than 0.05.
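#A minimal sketch of how the values asked for above can be pulled out of the object
#that t.test() returns, rather than read off the console output.
#The vector demo_pushups and the names demo_test and alpha are made up for illustration;
#they are NOT the class survey data, and alpha=0.05 is an assumed significance level.
demo_pushups=c(10,25,30,12,40,18,22,35,15,28)
alpha=0.05
demo_test=t.test(demo_pushups,mu=15,alternative="two.sided")
print(demo_test$estimate)  #sample mean
print(demo_test$statistic) #t statistic
print(demo_test$p.value)   #two-tailed p-value
#Decision rule: reject the null hypothesis only when the p-value is below alpha
if (demo_test$p.value < alpha) {
  print("Reject the null hypothesis")
} else {
  print("Fail to reject the null hypothesis")
}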
#Write a plain language sentence that describes the outcome of your statistical test.
#The class average of 22.46 pushups is not significantly different from my hypothesized mean of 15 (p = 0.075).

###############################
#Q3
#Generate a DIRECTIONAL hypothesis about the MEAN of the scale variable you examined in Q2
#what is your hypothesis?
#The mean will be less than 15.
#Why did you make that hypothesis?
#I think many people will do fewer than 15 pushups.
#Write the null hypothesis for a statistical test:
#H0: The true mean is greater than or equal to 15.
#Write the alternative hypothesis for a statistical test:
#Ha: The true mean is less than 15.
#Run a one-sample, one-tailed t-test to test your hypothesis
#Enter code below to show your work
test_q3=t.test(sdata$Q14_Pushups,mu=15,alt="less")
print(test_q3)
#What was your sample mean?
#22.46
#What is the test statistic that you calculated?
#1.85
#What is the p-value of the test statistic that you calculated?
#0.96
#Should you accept or reject your null hypothesis?
#Accept (fail to reject) the null hypothesis because the p-value is greater than 0.05.
#Write a plain language sentence that describes the outcome of your statistical test.

###############################
#Q4
#Generate basic descriptive statistics and a histogram for the scale variable you examined in Q2 & Q3.
#Show the code you used to generate those statistics and histogram in the space below.
summary(sdata$Q14_Pushups)
hist(sdata$Q14_Pushups)
#Review your descriptive statistics and histogram.
#Do the outcomes of your statistical tests make sense, given the patterns you see in your descriptive statistics and histogram?
#Write a short explanation here.

###############################
#Q5
#We're now going to look at how your scale variable might vary between different groups
#Pick a categorical variable from the survey that you think might show differences in your scale variable
print("Q36_numofpets")
#Name of categorical variable =
#Identify two, and only two, categories of your categorical variable that you would like to examine
#which two categories did you select?
#How do you think your scale variable will vary between the two groups you selected?
#You are welcome to make a DIRECTIONAL or NON-DIRECTIONAL hypothesis
#Write your hypothesis here:
#Does your hypothesis call for a one-tailed or two-tailed test? Explain below
#What is your null hypothesis?
#What is your alternative hypothesis?
#Extract data for each of your samples from the main data set using conditional logic as shown in the in-class example
#Write your code in the space below
sample1=sdata$Q14_Pushups[sdata$Q60_NumofPets==0]
sample2=sdata$Q14_Pushups[sdata$Q60_NumofPets==1]
#Which variable are you considering to be sample #1? Yes
#Which variable are you considering to be sample #2? No
#What is the sample size for each of your two samples?
#Write the code you used to calculate sample size in the space below.
#Run the appropriate two-sample t-test to test your hypothesis
#Enter code below to show your work
#What were the two sample means?
#What is the test statistic that you calculated?
#What is the p-value of the test statistic that you calculated?
#Should you accept or reject your null hypothesis?
#Write a plain language sentence that describes the outcome of your statistical test.

###############################
#Q6
#Generate basic descriptive statistics and a box-plot for the two sub-samples of the scale variable you examined in Q5.
#Show the code you used to generate those statistics and boxplot in the space below.
#It is okay if your boxplot shows more than two categories - focus only on the categories you tested
#Review your descriptive statistics and boxplot
#Do the outcomes of your statistical tests make sense, given the patterns you see in your descriptive statistics and boxplot?
#Write a short explanation here.

###############################
#Q7 - BONUS
print("Q7")
#This assignment will be scored out of 100 total points. How many do you think you earned?
print("I predict that my score on this assignment is XX out of 100.")
#How many HOURS (rounded to the nearest 0.5) would you estimate that you spent on this assignment?
print("I spent approximately XX hours on this assignment.")
print("############END###################")