Assignment 2: Methodology - PTSD

Module 4 Overview

Before you begin the journey into the world of statistical analysis, put away all your fears and fantasies about data analysis. By becoming familiar with the two main types of statistics (descriptive and inferential), you will acquire a benevolent tolerance, and even a reverent respect, for the almost magical things statistics will enable you to do in the behavioral sciences.

You will discover that descriptive statistics (mean, median, mode, standard deviation, and frequencies) describe the characteristics of the data collected; they are not used to draw conclusions or make inferences beyond that data. Their importance is nonetheless hard to overstate because they form the basis of inferential statistical processes. The mean in particular is central: many of the inferential tests used in behavioral science work by comparing means. Although you cannot generalize to a population from descriptive statistics alone, they are essential when combined with inferential statistics.

It is important to keep in mind that the primary purpose of statistics is to make order out of chaos. By properly applying the selected statistical processes to measurement data, you, as a research investigator, can determine whether a decision can be made with definitiveness, accuracy, clarity, and representation. Without the application of a prudent statistical process, decision making is often impulsive and prone to error.

In this module, you will also learn that there are two branches of statistical analysis, parametric and nonparametric. Module 4 presents extensive details on inferential statistical processes through a discussion on parametric processes and nonparametric statistical analysis. This module also includes information on the importance of providing a solid rationale for using a specific type of statistical process to determine relationships or differences among variables needed to analyze and interpret the data collected in quantitative research.

The readings and assignments for this module are based on the following learning outcomes:

· Discuss issues related to the calculation, analysis, interpretation, and reporting of statistical output.

· Identify and describe commonly used statistical tests.

· Describe the use of descriptive statistics in forensic psychology.

· Identify and define the methodology of the research in operational terms.

· Compare and contrast characteristics of parametric and nonparametric analysis.

Descriptive Statistics

Descriptive statistics is a branch of statistics used to provide information on aspects of the specific sample being used in the study rather than the entire population. The purpose of descriptive statistics is to describe the sample in detail, which usually refers to demographic characteristics or performance.

The most common descriptive statistics are the mean, median, and mode, which are referred to as measures of central tendency because they provide information about the middle ground within a data set. The mean is the arithmetic average of a set of values, such as test scores. The test scores of 75, 83, 89, 91, and 97 have a mean of 87 (their sum, 435, divided by 5). The median is the middle number in an ordered data set, and for those same test scores, the median is 89. The mode is the most frequently occurring number in a data set. In the data set 2, 4, 4, 4, 6, 7, 7, 8, 9, 10, and 10, the mode is 4 because it is the number that occurs most often. Depending on the data set, the median and the mode might be very close to the mean or very far from it. In either case, those similarities or differences reveal information about the data set, such as whether the distribution is skewed.
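The figures above can be checked with a few lines of Python. This is a minimal sketch using the standard library's statistics module; the variable names are illustrative only.

```python
from statistics import mean, median, mode

# Data sets taken from the paragraph above
test_scores = [75, 83, 89, 91, 97]
mode_data = [2, 4, 4, 4, 6, 7, 7, 8, 9, 10, 10]

score_mean = mean(test_scores)      # arithmetic average of the scores
score_median = median(test_scores)  # middle value of the sorted scores
most_common = mode(mode_data)       # most frequently occurring value

print(score_mean, score_median, most_common)
```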

Standard deviation is also a part of descriptive statistics. Standard deviation describes how far the numbers in a data set typically fall from the mean. The larger the standard deviation, the more varied a data set is. Conversely, the smaller the standard deviation, the more tightly clustered a data set is. For example, a mean of 50 with a standard deviation of 2 means that the numbers in the data set typically fall within about two points of the mean. In a data set with a mean of 50 and a standard deviation of 19, the numbers typically fall about 19 points away from the mean, which suggests a great deal of scatter among the numbers within the data set.
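To make the contrast concrete, the sketch below compares two small data sets that share a mean of 50 but differ in spread. Both data sets are hypothetical and were invented for this illustration.

```python
from statistics import mean, pstdev

# Hypothetical data sets, both with a mean of 50
clustered = [48, 49, 50, 51, 52]
scattered = [20, 35, 50, 65, 80]

for name, data in (("clustered", clustered), ("scattered", scattered)):
    # pstdev treats the data as a whole population; statistics.stdev
    # would give the sample estimate instead
    print(name, "mean:", mean(data), "standard deviation:", round(pstdev(data), 2))
```

The means are identical, so only the standard deviation reveals how differently the two sets are distributed.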

Descriptive statistics are applied by imparting information on demographics and performance. For example, the mode can clarify which ethnicity or gender occurred most often within the sample. The mean can indicate the average scores of the sample on a measure of anxiety.

Inferential Statistics

Inferential statistics is a branch of statistics used to make inferences about the traits or characteristics of a greater population on the basis of sample measurement data. The primary goal of inferential statistics is to go beyond the measurement data at hand and make inferences about a greater population. When using inferential statistical processes to make predictions about a larger population, you should choose the sample by random selection, and in an experiment you should place participants into groups by random assignment. Without random sampling or random assignment, the values produced by the statistical analysis may be biased, and conclusions about the larger population cannot be trusted.

In this module, you will become familiar with the types of statistical processes, enabling you to draw accurate conclusions about measurement data so that you can determine relationships, differences, and effects regarding the specific phenomena, occurrences, or events under investigation. The most common inferential statistics that behavioral scientists use are statistical processes providing for the determination of relationships (correlations), differences, and effects between and among things being measured or evaluated. The specific tests used are the Pearson correlation coefficient, chi-square, Student’s t-test, analysis of variance (ANOVA), and regression. All these techniques test a null hypothesis, and most also require you to identify independent and dependent variables.

Assume that you are a forensic psychologist interested in determining whether a relationship (correlation) exists between drug use and crime. The statistic of choice will be Pearson’s r. Once a relationship between the two variables is established, or not established, you can make predictions about the crimes in areas where there is a high level of drug use. If, however, you are interested in determining whether males commit more crimes than females in urban areas, the statistic of choice will be the t-test. You might even be interested in more than one condition, for example, drug use and gender. The required statistical process for this case is ANOVA. ANOVA can be thought of as an extension of the t-test to more than two groups or more than one condition.

Reliability and Validity

Additionally, in this module, you will be introduced to reliability and validity. Reliability refers to how likely an experiment is to produce similar results when it is repeated. Obtaining the same results in similar studies helps ensure the accuracy of the results. Validity focuses on whether an experiment or a test measures what it is supposed to measure. A related concern is the generalizability of a study, that is, whether the results would extend to the population as a whole. For example, generalizability would be an issue in a study that has been done only on one ethnic group or one gender because the results might not apply to the entire population.

A measure may be reliable but not valid. The reverse is harder to achieve: a measure that produces inconsistent results cannot be consistently measuring what it is intended to measure, so reliability is generally treated as a prerequisite for validity. For example, if an extremely outgoing person takes a personality test multiple times, and each time the results indicate something different, such as introversion, extroversion, or neither, then the measure is not reliable because it produces inconsistent results. If that same extremely outgoing person takes a personality test that consistently shows he or she is introverted and shy, then the measure is reliable but not valid because the test is not measuring what it is intended to measure.

Often, validity and reliability are not all-or-nothing concepts. They are typically described in terms of being low, moderate, or high. Having high reliability or validity is desirable since it is generally not possible to have 100% of either. For example, a test of intelligence is said to be valid if it measures intelligence more so than depression or self-confidence. While individuals with depression might score slightly lower on the test and individuals with high self-confidence might score slightly higher on it, the test could still be valid if the differences in scores among those groups are minimal.

Research Design Validity

The two types of validity that need to be considered are internal validity and external validity:

· Internal validity is concerned with the extent of control that you have over extraneous variables, that is, factors other than the ones you are studying that could influence your results. You want to make sure that your results are due to your manipulation of the variables in your study and not due to some other influence.

· External validity is concerned with the extent to which you can generalize your results to a broader population or to another setting. For example, if you need to test the effectiveness of a new medication for depression, it would be unrealistic to conduct your study using every person who suffers from depression. What you need to do is to collect a sample of participants whose characteristics are representative of the larger population of people suffering from depression. Having a representative sample will raise your external validity because you can, with confidence, infer that the level of effectiveness found in the sample would be similar in the larger population.

Test Validity

The following three types of validity exist:

· Construct validity: It is the extent to which a test measures a hypothetical construct or an unobservable trait. Constructs are used to explain a behavior. For a test to have construct validity, it must also have content validity and criterion validity.

· Content validity: It is the extent to which the items on a test represent a targeted content area. Content validity is concerned with how well the test questions have been selected from the large number of possible items associated with the content area. Content validity is determined by the judgment of experts in a particular field.

· Criterion validity: It is the extent to which a person’s scores are correlated with other criteria that represent the same construct. The following are the two subcategories of criterion validity:

  · Predictive validity: It is the extent to which a test is able to predict the outcome of some future event.

  · Concurrent validity: It is the extent to which the chosen instrument correlates with the score on an established test or a measurement tool that evaluates the same construct.

Parametric Statistical Tests

Parametric tests are appropriate when the following assumptions hold:

· The two populations (groups) have the same variance, which is called homogeneity of variance.

· The populations are normally distributed.

· Each measurement value is sampled independently from every other measurement value.

Z Test and t Test

Although both Z tests and t tests are used in behavioral science hypothesis testing, each is used under its own specific set of circumstances. The primary distinction between the two lies in what is known about the population and in the sample size. Z tests assume that the population standard deviation is known and are generally reserved for larger samples, whereas t tests estimate the standard deviation from the sample and can be used with small samples. Both, however, perform the same function, namely, to determine whether there are differences between the samples being evaluated or to make comparisons between sample and population measurements. In addition, both Z tests and t tests use the mean scores of the raw measurement data when calculating differences between independent samples. The t test can also be used with paired data.
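The difference between the two tests shows up in how the standard error is obtained. The following is a minimal sketch for the one-sample case, not a full testing procedure. The anxiety scores, the comparison mean of 50, and the "known" population standard deviation of 10 are all assumptions made for illustration, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def one_sample_t(sample, mu0):
    """t statistic: the standard deviation is estimated from the sample."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))

def one_sample_z(sample, mu0, sigma):
    """Z statistic and two-tailed p value: sigma is assumed to be known."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical anxiety scores for ten participants
scores = [52, 55, 49, 58, 61, 54, 57, 53, 56, 55]

t_stat = one_sample_t(scores, mu0=50)
z_stat, z_p = one_sample_z(scores, mu0=50, sigma=10)
print(round(t_stat, 3), round(z_stat, 3), round(z_p, 3))
```

The t statistic would then be compared with a critical value from a t table for n - 1 degrees of freedom, while the Z statistic is referred to the standard normal distribution.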

Pearson Correlation Coefficient

Broadly defined, correlation coefficients are statistical measures of how each member in a pair of variables relates to the other. There are two parts to the correlation coefficient, the sign and the magnitude. The magnitude is the absolute value of the coefficient and ranges from zero to one. It indicates the strength of the relationship between the two assessed values. The sign (positive or negative) indicates the direction of the relationship. For example, if two behaviors (anxiety and anger) move up and down in unison, then the relationship between the two is positive, with +1 indicating a perfect positive correlation. If the two move in exactly opposite directions, then they have a perfect negative correlation, with a correlation coefficient of –1. However, the fact that anger and anxiety move in unison or in opposite directions does not mean that one affects the other; correlation does not establish causation.
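The sign and magnitude can be seen directly by computing the coefficient. The sketch below implements the usual Pearson formula by hand; the anxiety and anger ratings are hypothetical and were chosen to produce perfect correlations.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical ratings for five participants
anxiety = [1, 2, 3, 4, 5]
anger_in_unison = [2, 4, 6, 8, 10]   # rises whenever anxiety rises
anger_opposite = [10, 8, 6, 4, 2]    # falls whenever anxiety rises

r_positive = pearson_r(anxiety, anger_in_unison)
r_negative = pearson_r(anxiety, anger_opposite)
print(r_positive, r_negative)
```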

One-Way Analysis of Variance

As in any other inferential statistical process, the goal of the one-way analysis of variance (ANOVA) is to test for the statistical significance of a stated hypothesis. The hypothesis is a statement as to what the research investigator suspects will happen when the measurement data is analyzed, and it is based on the original research question. Remember that no data analysis is meaningful until a research question has been formulated followed by a statement of the statistical hypothesis. To test the hypothesis in an ANOVA situation, the F statistic is calculated and compared against its critical value, which is found in the F tables according to the degrees of freedom and the alpha level chosen (.05, .01, or .001).
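As a rough illustration of where the F value comes from, the sketch below computes it as the ratio of between-group variance to within-group variance. The three groups of scores are hypothetical, and the function name is made up for this example.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of individual scores around their own group mean
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical scores for three treatment groups
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value, df_between, df_within = one_way_anova_f(groups)

# 5.14 is the tabled critical value of F for (2, 6) degrees of freedom at alpha = .05
print(round(f_value, 2), df_between, df_within, f_value > 5.14)
```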

Two-Way ANOVA

A behavioral scientist is often interested in determining the effects of more than one treatment or independent variable on some measurable outcome. When this is the case, a two-way ANOVA is the statistical method of choice. In a two-way ANOVA classification, you are interested not only in the effects of each treatment variable but also in their interaction effects. However, a one-way ANOVA does not provide this type of statistical information. In psychology, it is rare for only one independent variable to be the single most important cause of a particular outcome. This is because psychological variables often involve a myriad of causes and effects.

Nonparametric Statistical Tests

Nonparametric statistics are a group of statistical processes for determining relationships, differences, and effects, and they do not make strict assumptions about the population from which the data have been sampled. They may also be used for studies with small sample sizes, nominal- or ordinal-level data, and unusually distributed variables.

Chi-Square Test of Association

The Chi-Square (pronounced kai-square) test is generally used with frequency data to compare the actual (observed) frequency of an event with its expected frequency. If, for example, a marketing manager wants to determine the best-colored packaging out of five possibilities for a new product launch, the statistical process of choice will be the Chi-Square test. The Chi-Square is considered a nonparametric statistical process because it does not rely on a normal population distribution.
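The packaging example can be sketched as a comparison of observed and expected counts. The counts below are hypothetical, and the expected frequencies assume that, if color made no difference, each of the five colors would be chosen equally often.

```python
# Hypothetical counts of shoppers choosing each of five package colors
observed = [30, 14, 34, 45, 27]
# Under the null hypothesis every color is chosen equally often
expected = [sum(observed) / len(observed)] * len(observed)

# Sum of squared differences, each scaled by the expected count
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
degrees_of_freedom = len(observed) - 1

# 9.488 is the tabled critical value for 4 degrees of freedom at alpha = .05
CRITICAL_VALUE = 9.488
print(round(chi_square, 2), degrees_of_freedom, chi_square > CRITICAL_VALUE)
```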

Wilcoxon T Test for Comparing Mean Differences

The primary purpose of the Wilcoxon T test is to perform either a one-sample or a paired two-sample signed-rank test. As you discovered previously, the t test is the standard method for testing whether two population means differ for paired samples. However, when the populations being tested are clearly not normally distributed, and particularly when the samples are small, the t test is not likely the best choice. An alternative is to use a signed-rank test, but you must remember that the signed-rank test is somewhat less powerful than the t test when the distributional assumptions of the t test are actually valid.

Note: The signed-rank test, although officially known as the Wilcoxon Signed-Rank test, is generally simply referred to as the Wilcoxon T test.
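The signed-rank idea can be sketched in a few lines: rank the absolute differences, then sum the ranks separately for positive and negative differences. The before and after scores are hypothetical, and the helper names are made up for this example.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def wilcoxon_t(before, after):
    """Wilcoxon signed-rank T: the smaller of the two signed rank sums."""
    # Pairs with no change carry no sign and are dropped
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    rank_sum_positive = sum(r for r, d in zip(ranks, diffs) if d > 0)
    rank_sum_negative = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(rank_sum_positive, rank_sum_negative)

# Hypothetical symptom scores for six participants before and after treatment
before = [20, 18, 25, 22, 30, 27]
after = [15, 19, 21, 16, 22, 24]

t_value = wilcoxon_t(before, after)
print(t_value)
```

A small T means that nearly all of the ranked differences point in the same direction; the value is then compared with a tabled critical value for the number of nonzero differences.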

Mann-Whitney U Test (Wilcoxon Rank-Sum Test)

The Mann-Whitney U test, also known as the Wilcoxon rank-sum test, is comparable to the independent t test and is used to test whether two samples are drawn from the same population. Further, this rank-sum test detects whether distribution one is shifted to the right of distribution two or vice versa. The test is based on independent random samples of n1 and n2 observations from the respective populations.
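One way to see what U measures is to count, across all pairs of observations, how often a value from one sample exceeds a value from the other. The sketch below uses that pair-counting form, which gives the same U as the rank-sum formula; the two samples are hypothetical.

```python
def mann_whitney_u(sample1, sample2):
    """Return the Mann-Whitney U statistic (the smaller of U1 and U2)."""
    # U1 counts the pairs in which the sample1 value exceeds the
    # sample2 value; tied pairs count as one half
    u1 = sum(1.0 if x > y else 0.5 if x == y else 0.0
             for x in sample1 for y in sample2)
    u2 = len(sample1) * len(sample2) - u1
    return min(u1, u2)

# Hypothetical scores from two independent groups (n1 = 4, n2 = 5)
group_one = [3, 5, 8, 9]
group_two = [1, 2, 4, 6, 7]

u_value = mann_whitney_u(group_one, group_two)
print(u_value)
```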

Spearman Rank Correlation

Spearman rank correlation is the nonparametric counterpart of the Pearson correlation parametric method. It is used when the measurement data is in ordinal (ranked) form.
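In practice, the Spearman coefficient can be obtained by converting each variable to ranks and then applying the Pearson formula to those ranks. The sketch below follows that approach; the data and helper names are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: hours of therapy and a clinician's ranking of improvement
hours = [10, 20, 30, 40, 50]
improvement_rank = [1, 3, 2, 5, 4]

rho = spearman_rho(hours, improvement_rank)
print(round(rho, 2))
```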

Kruskal-Wallis ANOVA by Ranks

Looking back at the information given on ANOVA, you learned that the F-test is an excellent statistical tool for comparing the means of k populations based on measurement data collected through a completely randomized design. Further, the ANOVA, used to test the null hypothesis that the means of the k groups are equal, is based on the assumption that the populations are normally distributed with a common variance (σ²).

On the other hand, the Kruskal-Wallis H-test is the nonparametric equivalent of ANOVA or the F-test. Rather than stating that the means are equal, this test tests the null hypothesis that all k populations possess the same probability distribution against the alternative hypothesis that the distributions differ in location or that one or more of the distributions are shifted to the right or left of each other. The most significant advantage of the Kruskal-Wallis H-test is that it does not require the sampled populations to be normally distributed.
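The H statistic is built from the rank sums of the groups after all scores are pooled and ranked together. The following sketch uses the standard formula without a correction for ties, which is an assumption that holds for the hypothetical, tie-free data shown.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction applied)."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        # Sum the pooled ranks that belong to this group
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

# Hypothetical scores for three independent groups
groups = [[12, 15, 18], [20, 22, 25], [30, 31, 35]]
h_value = kruskal_wallis_h(groups)

# 5.991 is the tabled chi-square critical value for 2 degrees of freedom at alpha = .05
print(round(h_value, 2), h_value > 5.991)
```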

Friedman Two-Way ANOVA

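The Friedman method can be used as a nonparametric alternative to the two-way ANOVA when the same participants (or matched blocks) are measured under several conditions. The method compares several related samples by ranking the scores within each participant or block, and it is recommended when the sample size is greater than five. If the sample size is less than five, the power of the test is greatly reduced.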
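A minimal sketch of the Friedman statistic is shown below. It ranks the conditions within each participant and then compares the rank sums across conditions, again without a tie correction. The six participants and their scores are hypothetical, and the function names are made up for this example.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def friedman_statistic(rows):
    """Friedman chi-square; each row holds one participant's scores."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        # Rank the k conditions within this participant only
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    sum_of_squares = sum(r ** 2 for r in rank_sums)
    return 12 / (n * k * (k + 1)) * sum_of_squares - 3 * n * (k + 1)

# Hypothetical scores for six participants under three conditions
scores = [
    [7, 9, 8],
    [6, 5, 7],
    [5, 8, 9],
    [6, 7, 9],
    [4, 6, 5],
    [8, 9, 10],
]
friedman_value = friedman_statistic(scores)
print(round(friedman_value, 2))
```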

Interpreting Statistical Output

Now that you have identified your correct scale of measurement and chosen the correct inferential statistical analysis, the results that have been obtained must be understood. Correctly interpreting the statistical output is as essential as accurately pinpointing the correct scale of measurement and selecting the appropriate statistical analysis because those important steps will be of little benefit if, in the end, the results are incorrectly interpreted.

Statistical interpretation is based upon significance levels. The most common significance level (alpha) is .05. The result of a test is reported as a p value, with p standing for probability. (Remember, statistics is built on the theory of probability.) If p is greater than .05 (written as p > .05), then the results are not significant. Conversely, if p is less than .05 (written as p < .05), then the results are significant. A significant result is usually desired because it indicates that there are differences between groups, correlations, etc. A p value below .05 means that, if the null hypothesis were true, results at least as extreme as those observed would be expected less than 5% of the time by chance alone.

If the results are significant, then the null hypothesis is rejected. If the results are not significant, then the null hypothesis is not rejected (this is often loosely described as accepting the null). The null hypothesis states that there is no effect, difference, or relationship. You can think of significant results in terms of rejecting the idea that nothing is going on, which supports your research hypothesis and is what every researcher hopes for. If the overall results turn out to be nonsignificant, then follow-up analyses of that effect should not be interpreted. Nonsignificant results do not make the data meaningless, and they do not prove that the null hypothesis is true; they mean only that the data do not provide enough evidence to reject it.

When discussing rejection or retention of the null hypothesis, it is also important to clarify that in either case, a risk for errors exists. If the null hypothesis is rejected, there is the possibility of making a Type I error, which is when a true null is incorrectly rejected. This means that the researcher has concluded that an effect exists when, in fact, it does not. If the null hypothesis is not rejected, there is the possibility of making a Type II error, which is when a false null is incorrectly retained. This means that the researcher has concluded that there is no effect when, in fact, one exists. The researcher must take care to correctly interpret significance levels to ensure that real effects are not ignored and that spurious effects do not get published.
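A small simulation can make the Type I error rate tangible. In the sketch below, every simulated study samples from a population in which the null hypothesis is true, so any significant result is a false positive. The number of studies, the sample size, and the use of a Z test with a known standard deviation of 1 are all assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(0)  # fixed seed so the run is repeatable

ALPHA = 0.05
N_STUDIES = 2000
SAMPLE_SIZE = 25
critical_z = NormalDist().inv_cdf(1 - ALPHA / 2)  # about 1.96

false_positives = 0
for _ in range(N_STUDIES):
    # The null hypothesis is true here: the population mean really is 0
    sample = [random.gauss(0, 1) for _ in range(SAMPLE_SIZE)]
    z = mean(sample) / (1 / sqrt(SAMPLE_SIZE))
    if abs(z) > critical_z:
        false_positives += 1  # a Type I error

type_i_rate = false_positives / N_STUDIES
print(type_i_rate)
```

The proportion of false positives should land close to the chosen alpha level, which is exactly the risk a researcher accepts by testing at .05.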

Conclusion

While the notion of statistics can seem somewhat daunting, understanding the type of data that one has can assist in mitigating some of the confusion. Data can be at the nominal, the ordinal, the interval, or the ratio level, and the level of data (scale of measurement) can shape both the descriptive and inferential statistics that are conducted. Descriptive statistics are an essential part of data analysis because they provide specific information about the sample for the study, including any unique aspects of it that might limit the generalizability of the results.

Inferential statistics are the crux of data analysis as they enable a researcher to discern whether relationships or differences exist among the groups within a study. Statistical processes also help the researcher evaluate the reliability and validity of the research. After the results of the study are obtained, it is ideal if the study is both reliable and valid, meaning the researcher can replicate the results and the results measure what they are intended to measure. A study can be reliable but not valid, whereas a study that is not reliable cannot be considered valid. Either shortcoming is a problem for the researcher because a well-conducted study should be both.

Additionally, since statistics and research go hand in hand, knowledge of which tests to use for specific data and for the desired comparisons can make data analysis more manageable. Parametric analysis is typically used when the data are normally distributed and the sample size is large enough. Common types of parametric tests are Pearson's correlations, t tests, and ANOVAs. When a data set is not suited for parametric tests, nonparametric analyses are needed. Common types of nonparametric tests are Chi-Square, Wilcoxon T test, Mann-Whitney U, Spearman's correlation, Kruskal-Wallis, and Friedman two-way ANOVA. Generally, parametric tests are preferred when their assumptions are met because they are more powerful than nonparametric tests. Once the appropriate tests are chosen and conducted, the results must be properly interpreted on the basis of the significance levels so that findings are reported accurately, with significant and nonsignificant results clearly distinguished.
