Case study Statistics

profileSchoolDaze
week2presentationandhw354statistics2012.doc

Welcome to the presentation Grouped Frequency Tables in Excel, for the course PSYC 354 at Liberty University. This presentation is meant to be part of a series, so please be sure that you have watched all preceding presentations in order before viewing this one.

You have been learning about frequency distributions this week, including how to construct frequency tables. As you may recall from your reading, frequency tables display data values according to how often they appear in a data set. The frequency table is one of the best ways to organize and display frequencies in an easy to read format. Frequency tables include values, their frequencies, and usually their percentages.

Grouped frequency tables are a type of frequency table that represent the frequencies of groups of values, instead of each individual value. These are often used in psychology to describe various types of data. You can imagine that with a data set that contains 3,000 different scores, the grouped frequency table is a better choice to organize the data in an accessible way, because it does not list each of the 3,000 scores in the table along with its frequency. That would be a very large and useless table! Instead, the grouped frequency table lists groups, or intervals or ranges, of scores, and lists how many individual scores fall within each interval. It is the job of the researcher to decide how large or small the intervals will be, but keep in mind that all of the intervals need to be the same size.

Now that we have reviewed the purpose of frequency tables, let’s look at an example. We will be constructing a grouped frequency table in Excel, which is exactly what you will be doing for your homework assignment this week. On this slide, there is a list of scores that represent the number of hours per week spent reading in a sample of high school seniors. The data as they are arranged here do not tell us very much. We can look them over to get a general idea of the numbers, but they are not well-organized.

When these raw scores are entered into a column in an Excel spreadsheet, they create what is known as the “data array.” In other words, these are the data, or scores, that Excel will use to construct the frequency distribution. Excel can create a simple frequency distribution, which shows the amount of times that each score occurs. However, this week your homework deals with grouped frequency tables, so we will concentrate on learning how to construct these. The process for the simple and the grouped frequency tables is essentially the same, but the grouped tables require a few extra steps.

As you have learned from your text, the first step to constructing a grouped frequency table is to establish intervals for your data. These represent ranges into which scores can fall. When we examine the raw scores in our data set, we see that they range from 1 to 48, with most values falling below 40. For this example, we will choose to group scores into groups, or intervals, of about 10. The intervals are shown in the slide: 0-10, 11-20, and so on, all the way to 41 and up. At this point, it is also recommended that you set up your frequency table with labels as you would like it to appear when it is finished. This is done in the Excel file itself, often a bit to the right of the data array column. Here, we see a table with the intervals listed vertically down the left-hand side, and columns labeled across the top as Frequency and Percent. There is a row for displaying totals at the bottom of the table.

Once our table is set up and our intervals are established, it is time to set up the bins array. The bins array is a column of cells that essentially helps Excel sort data into intervals that represent the groups for your grouped frequency table. These intervals are like bins, or containers, into which Excel drops the values that fit. The bins array consists of numbers that represent the upper limit of each interval (or group), with one number in each cell of the bins array.

Here is our example. If we look at the first interval of our grouped frequency table, we see that it includes any scores equal to or greater than 0, and less than or equal to 10. The upper limit of this interval is 10, so 10 is the first value that goes into the bins array. The second interval has an upper limit of 20, so 20 goes in the next cell of the column. Continue this process until you reach your last interval. If your last interval does not have an upper limit (as seen here), the last cell of the bins array is left empty. If your last interval does have an upper limit, then enter it in the last cell of the bins array.

What is the bins array for? Well, the bins array tells Excel that for each cell containing a value, Excel should count the number of scores in the data array that are equal to or less than that value, but greater than the upper limit value in the preceding cell. So, for the first bin, Excel will count all values equal to or less than 10. For the second bin, Excel will count all values equal to or less than 20, but greater than 10 (so, in this example, between 11 and 20). The last cell in the bins array is left than blank because we are interested in listing all of the values greater than the upper limit in the previous cell, 40, but there is no upper limit that we need to designate. We are interested in listing any score greater than 40 here, whether it is 41 or 410.

(read slide). So, if used correctly, the frequency array formula will fill out the frequency column of a table in one process.

The frequency formula consists of a function and arguments just like the other formulas we’ve learned about. The function, as you may have guessed, is called Frequency. The arguments consist of two things: the cell references of the data array, and the cell references of the bins array. These are entered as ranges of cells, in the format first cell reference in range, colon, last cell reference in range. In our example, the data array (hours reading per week) ranges from cells A2 to A20. The bins array, which defines the upper limits of the bins into which Excel will sort the scores, ranges from cells C2 to C5.

Let’s go through how to enter the frequency formula into a grouped frequency table. The first step is to select the cells in which you want your results to appear. In our example, we want the frequency values to appear in cells F2 to F6, so we select all of these cells by clicking in cell F2 and dragging down to cell F6.

It is important that all of the cells remain selected while you complete step 2. With the cells still selected, click in the formula bar and type the Frequency Array formula. It will appear in the formula bar as you begin to type. Enter the correct cell ranges for the data array and the bins array. As stated before, these cell ranges are A2 to A20 for the data array and C2 to C5 for the bins array.

Our spreadsheet looks like this after entering the formula, but before pressing enter. Note that the formula appears in the formula bar just as we typed it, and it also appears in the first of the selected cells in the frequency table. The data array is outlined in blue, and the bins array is outlined in green.

The last step to entering an array formula is NOT pressing the Enter key. In order to alert Excel that we are using an array formula, there is a special hotkey sequence that we must use instead. Press and hold Control, Shift, and Enter (in that order). Do not let go of any keys until the entire sequence is complete. (In other words, you will use three fingers.) This sequence tells Excel to fill out each of the selected cells in your Frequency column, not just the first one.

After pressing control shift enter, our frequency column is complete. We can see that there are 11 scores between the values of 0 and 10, 4 scores between 11 and 20, and so on, with only one score over 40, in the 41 and up column.

Once the frequencies are filled in, we can compute the total number of scores by adding all of the frequencies together. This is accomplished by simply using the SUM function (covered in Module 1) to add cells F2 through F6 together, then enter the total in the appropriate cell, F7.

Now it’s time to work on the percent frequency column. This column describes the percentage each frequency is of the total. First, select all of the cells in the column (in our example, G2 to G7), including the Total Percent cell, and format these as percentage cells with 2 decimal places, as shown here.

Remember that to compute a percentage, we must divide each frequency by the total number of scores. The total that we are dividing by will remain the same for each percentage, so we will use a fixed cell in our Excel formula. This is accomplished by simply typing the frequency cell reference (for example, F2), then a forward slash, and then the fixed cell reference that you are dividing by (for example, “dollar sign F dollar sign 7). So, in our example, to compute the percentage of scores that fall within the interval 0 to 10, we click in cell G2 and type “equal sign, F2, forward slash, $F$7”. This will compute 11 divided by 19, as shown in the green highlighted areas. Then press enter.

Pressing enter fills in the first cell with 57.89% as shown. Once the first cell is complete, we can simply click on it again and use the Fill Handle to complete the rest of the column, including the total percent cell. (The fill handle is covered in the Module 1 Excel presentation.) Each frequency in column F will be divided by the fixed cell F7, the total number of scores equalling 19. Each value should appear as shown, since we formatted the cells to appear as percentages with 2 decimal places. You may also choose to display the values in this column without the percentage sign—in this case, just format the cells as number cells with 2 decimal places. The total percent should always add up to 100 percent.

Here we have our completed file, with the data array “hours reading per week,” the bins array, and the grouped frequency table. Now it is much easier for us to get a general idea of what our data look like. Instead of a list of raw scores, we now have a table that describes our data in an organized fashion. We can see from the outset that over half of our sample of high school seniors spend somewhere between 0-10 hours reading per week. We can also see that only one student in our sample reported reading more than 40 hours a week.

We end with a summary of the steps you have learned to create a grouped frequency table in Excel. You should now be prepared to complete this week’s Excel assignment.

image1.png

image2.png

image3.png

image4.png

image5.png

image6.png

image7.png

image8.png

image9.png

image10.png

image11.png

image12.png

image13.png

image14.png

image15.png

image16.png

image17.png

image18.png

image19.png

image20.png

image21.png

Welcome to the presentation Graphing in Excel for PSYC 354 at Liberty University.

This presentation is meant to be part of a series, so please be sure that you have watched all preceding presentations in order before viewing this one.

This week we are focusing on learning to display data using graphs and charts. Graphs and charts are a visually appealing way to summarize and display data. These can include line graphs, bar graphs, scatterplots, pie charts, and many others. Most common graph and chart formats are available in Excel.

Here are a few examples of graphs and charts made using Excel. This week we will focus on using Excel to create a histogram, which is a graph that appears similar to a bar or column graph. There are many examples of histograms in Chapter 1 of your text.

As you may recall from your text, the horizontal axis of a histogram is typically labeled with numbers that represent the midpoints of each interval. In order for this to occur in Excel, we need to create another table that lists the midpoints of the intervals (instead of the intervals themselves) next to the frequencies of each interval. In order to determine the midpoints, find the number that is exactly halfway between the lowest number of the interval and the lowest number of the interval above it. Thus, we find the midpoint for the interval 0-10 by determining the value that is halfway between 0 and 11, which is 5.5. We then go through and complete the rest of the column, filling in each midpoint. The last cell of this particular table contains the values 41 and greater. Thus, there is no real midpoint. We can keep the label “41 and up” for this particular row.

Here is our new table with the interval midpoints. To create this table, we simply click in a new column, enter the midpoints, and enter the frequencies in the column to the right, as shown. At this point, it is important that you do not try to copy and paste the frequencies from last week’s frequency table into your new one, but rather type them manually. This will avoid any fatal errors that sometimes occur when an array formula is changed in Excel. Errors like these typically freeze the program and cause you to have to shut down.

Now that we have our small table with the interval midpoints and frequencies, we can begin to create a histogram. The first step is to select the entire new table we just created, then click on the Insert tab in the Excel ribbon. This will bring up some different options, and we are interested in the “Charts” area. Click on “Column”, then click on the first 2-D Column chart, which is titled the Clustered Column chart. You will notice that Excel recommends this type of graph for constructing histograms.

This is a screenshot of exactly what Excel generates when you insert the clustered column chart. Though it is basically correct, we want to change the chart’s format and title, and we want to add axis titles and perhaps change the colors to create our finished product. We will do this using a few different tools in Excel. If you take minute to click around within the table, you can see that it is possible to select different areas and edit them—like the bars, or the titles, or the axis labels, for example.

First, we will change the format of our chart to reflect the histograms you see in your textbook. You may remember that the bars should touch one another with no gaps, and there are a few steps involved in doing this. First, click on your new chart, then click the design tab. Go to Chart Layouts, then click on the second downward arrow, which will bring up the entire Chart Layout menu. From this menu choose “Layout 8” or whichever layout in your version of Excel that shows the chart bars touching one another.

The format of our chart now looks like this. However, there are a few more things to do before this chart is finished. Let’s begin by making some tick marks that point to the center of our intervals. These will help delineate the midpoints.

To do this, right click directly on one of the numbers in the horizontal axis (like 5.5 or 15.5). Be sure to right click directly on a number—otherwise, the wrong pop-up menu will appear. Here you can see the correct menu—choose the option at the bottom that says “Format Axis.”

This brings us to a dialogue box that gives us many different options for editing the axis. For this presentation, we are interested in adding tick marks to our graph beneath each bar to represent the midpoints. To add tick marks to the middle of a bar, choose “Minor tick mark type” then chosse “Outside”. This will add a tick mark to the bottom center of the bars on the outside of the horizontal, or X, axis. If you choose Major tick marks, this will place marks on the outer edges of each bar.

In the first image, we can see that the tick marks have appeared at the midpoint of each bar. Now, we would like to change the Axis titles to be more descriptive. We can change both horizontal and vertical axis titles by clicking twice on the title, deleting the current text, and entering the desired title.

We can change the chart title in the same way. When the text is highlighted, a box pops up that may be familiar to many of you who use Microsoft Word. This box allows you to make changes to the size and color of the text, to center your text, make it bold, or fill the entire selected area with a certain color. If the box disappears during work, simply right click on the selected text to bring it back.

Once we have changed the chart and axis titles, our histogram is basically complete as you can see here. But let’s say that we want to change the color of the bars in the histogram.

If we right click directly on the bars of the histogram, the pop-up menu on the left appears. Choose “Format data series” from the bottom of the menu. A dialogue box then appears that gives several options for editing the data series, or the bars in this case. Another example of a data series would be a line in a line graph. Click on Fill at the left, then click the circle next to “Solid Fill”. This will fill all of the bars with the same solid color. Now click the color arrow to bring up the Color Menu (as marked by the words “click here”). From there, you can choose a color for the bars of your histogram.

Our histogram’s data series is now orange, and our histogram is complete. Of course, you may prefer another color, or another a different font style. The important thing to remember is to keep your histogram easy to read and understand, with clear labels and titles.

We end with a summary of the topics we’ve covered in this presentation.

image22.png

image23.png

image24.png

image25.png

image26.png

image27.png

image28.png

image29.png

image30.png

image31.png

image32.png

image33.png

image34.png

image35.png

image36.png

image37.png

image38.png

· Discussion Board Forum Due Friday November 2, 2012

· Accurate Presentation of Data

One of the goals of this course is to make students aware of the responsible and irresponsible use of statistics in everyday situations. Results reported in social science studies are often picked up by the media, and psychologists are often interviewed to briefly describe the results of potentially important studies. For these reasons, it is important for psychologists to conduct research with truthfulness and integrity, and to maintain standards of accuracy and clarity in the reporting and display of statistical data that will be disseminated to the public.

Standard 5 of the APA’s “Ethical Principles of Psychologists and Code of Conduct” provides ethical guidelines concerning public statements made by psychologists. For example, Section 5.01 states that “Psychologists do not knowingly make public statements that are false, deceptive, or fraudulent concerning their research, practice, or other work activities…” ( http://www.apa.org/ethics/code/ ). This requirement integrates well with the Christian virtues of truthfulness, conscientiousness, and integrity.

With this in mind, examine the two graphs found in the attachment below under "Discussion Board Forum 1 Graphs." A certain agency is trying to decide which graph it should include in a brochure released to the general public. Each graph represents the same set of data, but they differ in many important respects. Information concerning the “goodness” or “badness” of graphs is part of this module/week’s study (Dr. Triola’s presentation “ Critical Thinking - Bad Graphs” and Aron et al.: chapter 1).

Based on the reading and presentations from this module/week, please answer the following questions:

1. Which graph is a fair or accurate representation of the data, and which is a distorted representation? Give at least two reasons why. (Be sure to cite your textbook and/or a presentation from the Reading & Study folder in your response.)

2. Find and share at least one scriptural passage that relates to showing conscientiousness, integrity, or truthfulness, and discuss how it relates to the responsibility of Christian psychologists to report data accurately and honestly.

Your original thread should be 250 words long and is due by 11:59 p.m. (ET) on Thursday. Your replies to two classmates’ posts should be 100 words each and are due by 11:59 p.m. (ET) on Monday of the same module/week.

image39.png

· All unemployment data from the US Bureau of Labor Statistics: www.bls.gov

Graph 2

image40.png

· All unemployment data from the US Bureau of Labor Statistics: www.bls.gov

Excel Homework 2

(70 pts Possible)

The objective of your second Excel assignment is to learn to construct a grouped frequency table and histogram in order to better organize and display a given data set. First, be sure you view the PointeCast presentations that present how to construct frequency tables and histograms in Excel, found in the Reading & Study folder in Module/Week 2. These presentations go through the steps you will need to be familiar with in order to complete this assignment.

In Module/Week 2, you will be using a hypothetical data set that consists of scores on the Beck Depression Inventory (BDI) (Beck, A.T., Ward, C.H., Mendelson, M., Mock, J., &Erbaugh, J.1961) to create a frequency distribution and histogram, following the steps you learned during the presentations.

PART ONE: GROUPED FREQUENCY TABLE (30 pts total—see values below)

1. Research Question: The BDI (Beck et al., 1961) is an instrument that is widely used to assess levels of depression in individuals in a variety of settings. In our example, the scores range from 0–69 (whole numbers only—note that the actual BDI has a high score of 63, but we’ve changed it for the sake of making the intervals evenly spread out).Certain “cutoff scores” are assigned to certain levels of depression. These levels roughly include the categories of Minimal (0–9), Mild (10–19), Moderate (20–29), and Severe (30–69). A certain researcher has administered the BDI to a sample of college students, and she would like to find out how many students fall into each category based on their individual scores. She also decides to break the “Severe” category up into four different levels called Severe 1 (30–39), Severe 2 (40–49), Severe 3 (50–59), and Severe 4 (60–69), creating seven categories instead of the original four. This researcher would like to construct a grouped frequency table to visually display the number of participants in each category.

2. Open the “Data Set 2” attachment under the “Excel Homework 2” header located in the Assignment Instructions folder. This file contains:

a. The raw BDI scores of the students in our researcher’s study. The scores have already been sorted from smallest to largest. Take a minute to scroll through the data to get familiar with it.

b. An empty frequency table that you will complete as part of your assignment.

3. In Column F, enter the correct score interval for each “Severity” level in the table (column E). For example, the score interval for Minimal is already filled out for you (09). See the “Research Question” above for the other intervals. These score ranges represent the class intervals for your frequency table. Do not put anything in the “Frequency” column (G) yet—this is where you will enter your results when you use the FREQUENCY array formula. (6 pts)

4. Using the steps that you learned in the presentation “Grouped Frequency Tables in Excel,” create a Bins Array that is based on the upper limits of the intervals that you entered in Column F in the table. You may place the Bins Array anywhere on the spreadsheet between the raw data and the frequency table. (6 pts)

5. Again, using the steps that you learned in this Module/Week 2’s presentation, use the FREQUENCY array formula to finish filling in the FREQUENCY column of the table. These frequencies (the results of your formula) should appear in Column G. (6 pts)

6. Our researcher would also like to enter the total number of scores under the frequencies in Column G. Use the SUM function to compute this value and have it appear in Cell G11. (6 pts)

7. Finally, fill in the Percent column (H). First, format the cells in column H of the table as “Percent” cells with two decimal places. Then,compute the corresponding percentage for each frequency, following the steps that appear in the PointeCast presentation, and have these appear in cells H4 through H10. In cell H11, compute the total percentage as the sum of cells H4-H10 (this should equal 100.00%). (6 pts)

PART TWO: HISTOGRAM (30 pts total—see values below)

The goal of Part Two of your homework is to create a histogram using the frequency table you have constructed, using the steps you learned during the presentation.

1. Determine the interval midpoint for each interval in the table. See your text to review how to compute midpoints.

2. Set up a new table with interval midpoints in one column and frequencies in the next, as shown in Module/Week 2’s presentation. Remember, do not copy and paste the frequency column from the frequency table due to errors that may occur with the Excel array formula. Instead, enter the frequencies manually.ONE IMPORTANT NOTE CONCERNING EXCEL: For the purposes of constructing this histogram, you should only put a heading at the top of the column that contains the frequencies in your table (i.e. “Frequencies”). If you also put a heading at the top of the column containing the midpoints, a default function in Excel will cause your histogram to chart the columns as two separate data series, which is not what we’re aiming for, so leave the midpoints column unlabeled. (10 pts)

3. Highlight the area of the worksheet that contains the data you want to use to create your histogram. In this case, this will be the table with midpoints that you have just created. Select the cells containing the midpoints and frequencies—this will highlight the information needed for both the X and Y axes. Note that we are not using any percent information at this time.

4. Click on “Insert” and choose the “2-D Clustered Column” Chart format, as shown in this Module/Week 2’s presentation.

5. Click on the chart, click on the Chart Tools “Design” tab, and choose the “Chart Layout” that shows all of the bars as touching (as in this Module/Week 2’s presentation). For many of you, this will be Chart Layout 8. (10 pts for correct set-up)

6. Label the horizontal and vertical axes appropriately, and give your chart a title. Include the “Minor tick marks” we went over in the presentation as well. (10 pts for correct labeling and presentation)

The size of the histogram may need to be adjusted to fit all of the data labels neatly into the graph. If so, remember to resize carefully so that you do not change the ratio of the axes (i.e. making the chart exceptionally more tall or wide than it first appears). You may personalize your histogram by changing colors or font types, but keep in mind that your finished product should be easy to read and comprehend.

Save the completed file with frequency table and histogram to your computer as “yourname_Excel2.xls”.

PART THREE: QUESTIONS (10 pts total—see values below)

In the rows beneath your table/histogram in Excel, number and answer the following questions based on the Excel assignment:

1. Which variable represents the Data Array? (2 pts)

2. How many numbers are in your Bins Array? (2 pts)

3. Based on your frequency table, which of the seven intervals contains the largest number of scores? (2 pts)

4. What percentage of students are in the Mild AND Moderate categories? (2 pts)

5. Based on your frequency table, what is the total number of students who fit into the original BDI category of Severe (30–69)? (2 pts)

Your finished Excel Worksheet, including data and the answers to the above questions, should be submitted to Blackboard by 11:59 p.m. (ET)on Monday of Module/Week 2.