advance analytics

alija

How to Use the with() Function

The with() function evaluates an expression using the columns of a data frame as variables, so you can refer to a column by name without repeating the data frame's name. Here we use it to build a new column from an existing one. For instance, the acs_ny.csv file has a column called “NumChildren” that shows how many children a family has. In question 2, however, we want to test whether family income differs between families with kids and families without kids. How many kids each family has does not matter; we only need to know whether a family has any. We therefore create a new column called “HasKidsOrNot” from the values in “NumChildren”: if the value in “NumChildren” is zero, “HasKidsOrNot” is FALSE (no kids); otherwise it is TRUE (has kids). This splits the data set into families with kids and families without kids. We use the with() function to compute the new column.
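As a minimal sketch of what with() does, the example below uses a small made-up data frame (the name toy and its values are illustrative assumptions, not taken from acs_ny.csv). It suggests that with() simply evaluates the expression with the data frame's columns visible as variables, giving the same result as the $ form.

```r
# Hypothetical toy data; the values are made up for illustration
toy <- data.frame(NumChildren = c(0, 2, 1, 0))

# with() evaluates the expression with toy's columns visible as variables
viaWith <- with(toy, NumChildren > 0)

# The same comparison written with the $ operator
viaDollar <- toy$NumChildren > 0

print(viaWith)
print(identical(viaWith, viaDollar))
```

The benefit of with() grows as the expression mentions more columns, because the data frame name is written only once.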

1) Import the data set into R.

acsData <- read.table("acs_ny.csv", sep = ",", header = TRUE)

2) Use the head() function to look at the current data frame.

head(acsData)

Notice that the last column is Language.

3) Use the with() function

acsData$HasKidsOrNot <- with(acsData, NumChildren > 0)

4) Use the head() function to look at the data frame again.

head(acsData)

Notice that the last column is now HasKidsOrNot. We attached a new column called HasKidsOrNot to the acsData data frame by assigning it the result of the with() call. The with() function evaluated the comparison NumChildren > 0 for every row, using the values in the NumChildren column. If the value is greater than 0, meaning the family has at least one kid, HasKidsOrNot is TRUE (“Yes”); otherwise, HasKidsOrNot is FALSE (“No”).
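The comparison NumChildren > 0 produces logical TRUE/FALSE values. If literal “Yes”/“No” labels are preferred, one option is to wrap the comparison in ifelse(). The following is a minimal sketch on a made-up data frame (toy and its values are assumptions for illustration only):

```r
# Hypothetical toy data; the values are made up for illustration
toy <- data.frame(NumChildren = c(0, 3, 1, 0))

# ifelse() maps each TRUE to "Yes" and each FALSE to "No"
toy$HasKidsOrNot <- with(toy, ifelse(NumChildren > 0, "Yes", "No"))

# Store the labels as a factor so they are treated as groups
toy$HasKidsOrNot <- factor(toy$HasKidsOrNot, levels = c("No", "Yes"))

print(toy)
```

Either form works as a grouping variable; the labeled factor mainly makes printed output easier to read.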

Now, you can use the ANOVA test to perform the analysis for this question.
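As a rough sketch of that last step, the example below runs a one-way ANOVA with aov() on a small made-up data frame. The column name FamilyIncome and all the values are assumptions for illustration; with the real data you would pass acsData and its actual income column instead.

```r
# Hypothetical toy data; FamilyIncome and all values are made up for illustration
toy <- data.frame(
  FamilyIncome = c(50000, 62000, 48000, 91000, 75000, 83000),
  NumChildren  = c(0, 0, 0, 2, 1, 3)
)

# Build the grouping column the same way as in step 3
toy$HasKidsOrNot <- with(toy, NumChildren > 0)

# One-way ANOVA: does mean income differ between the two groups?
fit <- aov(FamilyIncome ~ HasKidsOrNot, data = toy)
print(summary(fit))
```

With only two groups, this one-way ANOVA is equivalent to a pooled-variance two-sample t-test.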
