programming
1. Non-Parametric Percentiles
Unlike the statistics we have worked with in class (which typically assume a normal distribution), non-parametric statistics make no assumption about the underlying distribution of the data. Define a function called percentileFinder that takes two arguments: 1) a vector V, and 2) a percentile P (as a decimal from 0 to 1). It returns the number at the P percentile of the numbers in V. For example, suppose we specify P = 0.9. Then we need to find the smallest number in V that is greater than or equal to at least 90% of all numbers in V. Your method should not assume any specific statistical distribution for V (it should work for any distribution).
This function should have the following form: percentileFinder(V, P). Here are some examples of how the function will be called:
> vec <- c(5,4,3,2,1,10,9,8,7,6) # Notice this contains the numbers 1 to 10
> percentileFinder(vec, 0.9) # Should return 9
> percentileFinder(vec, 0.2) # Should return 2
Hints: 1) Common sense will serve you well on this problem, and 2) consider the sort function and how it might help. You might want to try sort(c(4,1,3,2)) and sort(c(8,2,6,10,4)) in R and see what happens.
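As a rough illustration of the sorting hint, here is a minimal sketch of one possible approach. It assumes that "the P percentile" means the smallest value in V that is greater than or equal to at least a fraction P of the values, which is consistent with the two examples above. The name percentile_sketch is hypothetical and is not the required function name.

```r
# Sorting puts the values in increasing order:
sort(c(4, 1, 3, 2))       # 1 2 3 4
sort(c(8, 2, 6, 10, 4))   # 2 4 6 8 10

# Hypothetical helper (an assumption, not the official solution).
percentile_sketch <- function(V, P) {
  sorted <- sort(V)
  # Fraction of values that are <= each position in the sorted vector
  frac <- seq_along(sorted) / length(sorted)
  # First (smallest) value whose fraction reaches P
  sorted[which(frac >= P)[1]]
}

vec <- c(5, 4, 3, 2, 1, 10, 9, 8, 7, 6)
percentile_sketch(vec, 0.9)   # 9
percentile_sketch(vec, 0.2)   # 2
```

Comparing cumulative fractions against P, rather than computing an index such as P * length(V), sidesteps some floating-point rounding surprises. No distribution is assumed anywhere; only the ordering of the values matters.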
2. Estimating a Proportion Error
In the Chapter 17 homework problems (problem 13, specifically), you were asked to estimate a population proportion, much as we did when estimating a population mean. Write a function “estimateProportionError” that takes three arguments: 1) n – the number of sample elements, 2) p – the probability of a “success”, and 3) c – the required confidence level. The function should have the following form: estimateProportionError(n, p, c). It should calculate and return the error E at the specified confidence level c, based on the probability of success p, so that the following relationship holds:
Pi = p +/- E with c confidence
Here is an example of how the function should be called:
> estimateProportionError(500, 0.6, 0.95)
# Should return this number: 0.042941448508405
# In other words, Pi = 0.6 +/- 0.042941448508405 with 0.95 confidence
Hints: 1) Use the formulas on pages 435-436, and 2) look at the estimatePopulationMean function and consider using the Zc function in the “basic_functions.r” file that we made.
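To make the calculation concrete, here is a minimal sketch that assumes the usual large-sample formula E = z * sqrt(p * (1 - p) / n), where z is the two-sided critical value for confidence c. The textbook pages and the course's Zc function are not reproduced here, so base R's qnorm() stands in for Zc, and the name proportion_error_sketch is hypothetical.

```r
# Hypothetical helper; qnorm() is an assumed stand-in for the course's Zc.
proportion_error_sketch <- function(n, p, c) {
  # Two-sided critical value: leaves (1 - c) / 2 in each tail
  z <- qnorm(1 - (1 - c) / 2)
  # Standard error of a sample proportion, scaled by the critical value
  z * sqrt(p * (1 - p) / n)
}

proportion_error_sketch(500, 0.6, 0.95)   # about 0.04294
```

The expected value 0.042941448508405 in the example equals 1.96 * sqrt(0.6 * 0.4 / 500), which suggests that Zc returns the rounded table value 1.96 for 0.95 confidence. qnorm() gives roughly 1.959964 instead, so this sketch agrees with the expected value only to about five decimal places.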
3. Number Sequences
Consider the following sequence: 1, 2, 3, 5, 8, 13, 21, … Can you guess which number comes next? It is 34, which is 13 + 21. Here, the nth number in the sequence is the sum of the two numbers before it: term n = term (n – 1) + term (n – 2). Write a function called “nseq” that takes three numbers: 1) first – the first number of the sequence, 2) second – the second number of the sequence, and 3) n – the position of the desired term in the sequence. The function should then return the nth term according to this rule. The function should have the following form: nseq(first, second, n). Here is an example of how the function should be called:
> nseq(1,2,5)
# Should return 8 - the fifth term in the sequence, where the first and second
# terms are 1 and 2, respectively
Hint: This is very easy to do using recursion and also easy using iteration ;)
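Both styles mentioned in the hint can be sketched briefly. The names nseq_iter_sketch and nseq_rec_sketch are hypothetical, and both sketches assume n is a whole number of at least 1.

```r
# Iterative sketch: carry the last two terms forward.
nseq_iter_sketch <- function(first, second, n) {
  if (n == 1) return(first)
  if (n == 2) return(second)
  prev <- first
  curr <- second
  for (i in 3:n) {
    nxt <- prev + curr   # next term is the sum of the previous two
    prev <- curr
    curr <- nxt
  }
  curr
}

# Recursive sketch: the nth term of the sequence starting (first, second)
# is the (n - 1)th term of the sequence starting (second, first + second).
nseq_rec_sketch <- function(first, second, n) {
  if (n == 1) return(first)
  if (n == 2) return(second)
  nseq_rec_sketch(second, first + second, n - 1)
}

nseq_iter_sketch(1, 2, 5)   # 8
nseq_rec_sketch(1, 2, 5)    # 8
```

The recursive version shifts the starting pair forward by one position on each call, so it makes one call per term instead of the two calls per term that a direct translation of the rule would make.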