Business assignment
BUS 354
SPRING 2021
Assignment 1 – Descriptive Analysis
DUE DATE: Sunday February 14th at 11:59PM
Email your submission as one document (word or PDF) to: [email protected]
Many of these questions ask you to create a figure and discuss. When you create tables, charts, and graphs you
need to think about formatting. Axes should be labeled, charts should have titles, and so on. You do not have to
format in one specific way, but work on making these neat and easy to read. Be careful when copying and pasting
documents into word (or similar programs). You may want to paste as a picture. Make sure I can see the
chart/table in the document. Please ask me if you have any questions.
Section 1: State-level analysis of the pandemic and related variables (28 points).
For this first set of questions, we will use the Excel file: “State data for assignment 1.xlsx”. This includes
information on the 50 states plus DC (n = 51) about vaccination rates, infection rates, unemployment, and a few
other variables. The worksheet titled “Notes” in the Excel file provides a description of each variable and the
sources of the data used.
1. (10 points) Focus on the “% Infected” variable. This shows the percentage of the population that has
tested positive for COVID-19 at any point during the pandemic.
- Create a table that shows the descriptive statistics in three columns: one for the overall sample, and
then separately for the states with a Republican governor and for the states with a Democratic
governor. In each column, you should present the full set of “summary statistics” from the Data
Analysis tool in the Data tab of Excel.
- Create a set of boxplots (on one chart) that show the distribution of % Infected by Census region
(south, west, etc.).
- Summarize what we learn about this variable from the information in the table and chart you
created. How would we describe the variable overall? What differences (if any) are noticeable by
political party or region? You do not need to comment on every statistic, but you should give a
discussion of what we learn.
2. (5 points) Create a correlation matrix that includes all four of the quantitative variables present in the
dataset. Discuss the results of the matrix. What are the strongest and weakest relationships? Do the
results (strength, direction) match your expectations?
3. (6 points) Consider the relationship between a state’s unemployment rate and the cumulative case count
(% infected).
- Create a scatter plot that shows the relationship between these two variables. Add data labels (using
the “State Label” abbreviation).
- Explain how you chose which variable to put on the x-axis.
- Interpret the results of the scatter plot in terms of strength, direction, form, and unusual (features
such as outliers).
4. (7 points) Wildcard! Create your own table, figure, or chart that examines the relationship between any
two variables (don’t do the relationship examined in question 3). This can include categorical or
quantitative variables (or even turning a categorical variable into a quantitative variable).
- Present your table/chart with titles, labels, etc.
- Explain your thought process in creating the figure (why is this interesting, for example?)
- Discuss what we learn from your table or chart. Does this match what you expected?
Section 2: Descriptive Analysis of the Stock Market (27 points).
For this set of questions, you will collect the data. This will allow you to see how easy it can be to collect some
basic market data.
To create the dataset:
(1) Go to Yahoo Finance: https://finance.yahoo.com/
(2) In the search bar: search for Amazon (or use the symbol (AMZN)
(3) Click on the “Historical Data” tab
(4) Set the time period of January 2, 2020 to February 5, 2021 and use daily prices.
(5) Click “Apply”
(6) Under the “Apply” button, there should be an option to download the data.
(7) An Excel file should download. The only columns we want are the Date and the Adjusted Closing Price
(“Adj Close”). Delete the other columns.
(8) Check that there are 278 rows in your dataset
(9) Rename the “Adj Close” column “Amazon”.
(10) Save the Excel file as whatever name you like
(11) Go through the same steps for two more sets of data: Netflix (NFLX) and the Dow Jones Industrial
Average (DJI).
(12) Take the “Adj Close” columns from each of these (they should be called “Netflix” and “Dow Jones”)
and place them next to the Amazon column in your main Excel file that you have named. Make sure
the dates align.
(13) You should have a dataset with 4 columns (Date, Amazon, Netflix, Dow Jones) and 278 rows.
Now we will use the dataset to perform some descriptive analysis:
5. (6 points) Correlations between stock prices
- Calculate the correlation between Amazon and Netflix stock prices. Discuss how we should interpret
this number.
- Calculate the correlation between Amazon’s stock price and the Dow Jones Industrial Average price.
As we have discussed in class, the Dow Jones Average is an index that looks at the price of stock for
30 large companies. This is often used to give an idea of how the financial market is doing overall.
Interpret what the correlation tells us in this case.
- Calculate the correlation between Amazon’s stock price and the Dow Jones for two separate periods:
2020 (the whole year) and 2021 (January and beginning of February). Discuss the two numbers and
what these tell us.
6. (6 points) Time plots – focus on the Amazon stock price.
- Create a time plot (line chart in Excel) that shows the Amazon stock price for the entire period.
Discuss the trends that are shown in the plot. What impact (if any) did the pandemic have on Amazon
stock?
- Create a second time plot that shows the Amazon stock price for just the 2021 time period. How
would you describe the trend over this period?
7. (6 points) Percentage changes.
- For each of Amazon and Netflix, create a new column that is the daily percentage change in stock
price. Sort the data by Date – oldest to newest (if it is not already sorted). There will be no
percentage change for the first day (Jan 2, 2020). Starting on Jan 3, 2020, calculate the percentage
change in Amazon’s closing price based on the day before ((new-old)/(old)). Drag this formula down
so it calculates for every day. Then do the same thing for Netflix.
- Create a table that shows the summary statistics for each of these. Comment on how the percentage
changes compare for each of the companies.
8. (4 points) Histogram
- Create a histogram using the percentage change in Amazon stock price variable that you created in
question 7. Comment of the shape of the distribution, and what this tells us about what we should
expect in terms of daily fluctuations in Amazon’s stock price.
9. (5 points) Predictions
- Predict what the Amazon closing stock price will be on each of the following dates:
o Friday February 19th
o Friday April 30th
- Discuss how you came up with your predictions for the price on these dates. Note: you do not need
to have a sophisticated method, just explain your reasoning. Did you base on trends in the data? Did
you use the summary statistics? Did you base on your expectations about what will happen in the
economy over the next few months?