Stroop Task Data

Annie Qiu

25 FEB 2021

Part I: raw data

What we have vs #Goals

Original data set

Participant #1 through Participant #36 occupy 2 rows each, creating 72 rows of data

Demographics exist twice for each participant

Gender

Age

Ethnicity

Measures for Hours of Sleep, Caffeine Consumed, Learning Disorder, Words Read Correctly and Total Words exist twice for each participant.

Unclear which row means “MUSIC CONDITION” vs “NO MUSIC CONDITION”

Data structuring rules

Each row represents information for one participant.

The first row (aka ‘header’) should contain clear markers for each variable in easy-to-read categories.

We do not want to get confused about which categories to use when we input the IV vs DV into JASP.

Demographics should have clear coding for nominal data points.

For example: Does WHITE = CAUCASIAN? Do we combine these or keep them separate?
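A minimal sketch of what "clear coding" can look like, assuming we decide to treat WHITE and CAUCASIAN as one category. That decision, the mapping table, and the sample labels are all hypothetical; the point is only that the rule is written down once and applied to every row.

```python
# Hypothetical mapping: whether WHITE and CAUCASIAN are merged is a
# decision for the researcher, not something the data set dictates.
ETHNICITY_CODES = {
    "white": "WHITE",
    "caucasian": "WHITE",
    "hispanic": "HISPANIC",
}

def recode_ethnicity(raw_label):
    """Return one consistent label; unknown labels are flagged, not guessed."""
    cleaned = raw_label.strip()
    return ETHNICITY_CODES.get(cleaned.lower(), "UNMAPPED:" + cleaned)

# Made-up raw entries with inconsistent case and stray spaces.
raw_labels = ["White", "CAUCASIAN ", "hispanic", "Latinx"]
recoded = [recode_ethnicity(label) for label in raw_labels]
print(recoded)
```

Anything that comes back as UNMAPPED is a label nobody has made a decision about yet, which is exactly the kind of question to settle before the data goes into JASP.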

OG data set in JASP

What JASP sees = “FLUFFY” Data

72 separate participants

Double counts of:

Gender

Age

Ethnicity

Diagnosed learning disability

No easy way to do a side-by-side comparison of NO MUSIC vs MUSIC condition

Which ROW = music vs no music?

Implications of unstructured data

Incorrect input leads to erroneous interpretations of significance.

This can lead to a TYPE I or TYPE II error.

The value of your findings is diminished.

How can we believe your conclusion next time, when you were not able to reach a statistically sound conclusion the first time?

The reputation of the researchers is important if you are trying to impact the community with your results.

More drastically, people can get hurt/die based on what you report.

For example: suppose a COVID vaccine from PHARMACEUTICAL X is reported to have 99.99% efficacy in its Phase III trial when it is *actually* only 50% effective. People will still be susceptible to disease, and there are social implications that come with this.

New data structure in JASP

What changed?

There is *ONE* participant per row of data.

Everything that happened to PARTICIPANT 1 exists in ROW 2, everything that happened to PARTICIPANT 2 exists in ROW 3, etc.

Each variable that differs by condition now has one clearly labeled column per condition (no music vs music).

For example: WORDS_READ_CORRECTLY_NM and WORDS_READ_CORRECTLY_M

While this created twice as many columns as we had originally, it clearly allocates information to each participant without DOUBLE COUNTING the demographic variables.

For example: ROW #17 shows PARTICIPANT 16, who is a 20-year-old HISPANIC FEMALE with NO diagnosed learning disability who got 8 hours of sleep and consumed 0 ounces of caffeine and completed the Stroop task, reading 104 out of 104 words correctly, WITHOUT MUSIC…etc.
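The restructuring described above (two rows per participant turned into one row per participant) can be sketched roughly as follows. This assumes a CONDITION column that says which row is which; the original file had no such marker, so that column, the PARTICIPANT column name, and all the values below are hypothetical.

```python
# Hypothetical long-format data: two rows per participant.
# CONDITION ("NM" = no music, "M" = music) is an assumed column.
long_rows = [
    {"PARTICIPANT": 1, "GENDER": "F", "AGE": 20, "CONDITION": "NM",
     "HOURS_OF_SLEEP": 8, "WORDS_READ_CORRECTLY": 104},
    {"PARTICIPANT": 1, "GENDER": "F", "AGE": 20, "CONDITION": "M",
     "HOURS_OF_SLEEP": 8, "WORDS_READ_CORRECTLY": 101},
    {"PARTICIPANT": 2, "GENDER": "M", "AGE": 22, "CONDITION": "NM",
     "HOURS_OF_SLEEP": 6, "WORDS_READ_CORRECTLY": 99},
    {"PARTICIPANT": 2, "GENDER": "M", "AGE": 22, "CONDITION": "M",
     "HOURS_OF_SLEEP": 6, "WORDS_READ_CORRECTLY": 100},
]

DEMOGRAPHICS = ("GENDER", "AGE")                       # stored once
MEASURES = ("HOURS_OF_SLEEP", "WORDS_READ_CORRECTLY")  # stored per condition

def to_wide(rows):
    """Collapse two rows per participant into one row per participant."""
    wide = {}
    for row in rows:
        pid = row["PARTICIPANT"]
        record = wide.setdefault(pid, {"PARTICIPANT": pid})
        for name in DEMOGRAPHICS:
            # setdefault keeps the first copy, so demographics appear once.
            record.setdefault(name, row[name])
        for name in MEASURES:
            # Suffix each measure with its condition, e.g. ..._NM or ..._M.
            record[name + "_" + row["CONDITION"]] = row[name]
    return [wide[pid] for pid in sorted(wide)]

wide_rows = to_wide(long_rows)
for record in wide_rows:
    print(record)
```

The suffix convention is the same one used in the column names above (WORDS_READ_CORRECTLY_NM and WORDS_READ_CORRECTLY_M), so a side-by-side comparison of the two conditions becomes a comparison of two columns.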

TL;Dr  Data cleaning Saves Lives

A clean data set provides an optimal set-up for easy analysis.

Having structured items makes it easy to locate skewed items.

That in turn makes it easy to determine exclusion criteria.

TL;Dr  Data cleaning Saves Lives

All statistical packages will THANK YOU for organizing information.

In return, you will spend less time smashing your head into the keyboard, trying to figure out why the variables do not fit into a certain type of analysis.
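To illustrate the point about locating skewed items: once there is one row per participant, a quick scan can flag rows that deserve a second look before any exclusion decision is made. The two checks below (a missing value, and more words read correctly than words read in total) are hypothetical examples, not the exclusion criteria of this study, and TOTAL_WORDS_NM is an assumed column name.

```python
# Hypothetical one-row-per-participant records (values are made up).
wide_rows = [
    {"PARTICIPANT": 1, "WORDS_READ_CORRECTLY_NM": 104, "TOTAL_WORDS_NM": 104},
    {"PARTICIPANT": 2, "WORDS_READ_CORRECTLY_NM": 110, "TOTAL_WORDS_NM": 104},
    {"PARTICIPANT": 3, "WORDS_READ_CORRECTLY_NM": 98, "TOTAL_WORDS_NM": None},
]

def flag_for_review(row):
    """Return a reason to double-check this row, or None if it looks fine."""
    correct = row["WORDS_READ_CORRECTLY_NM"]
    total = row["TOTAL_WORDS_NM"]
    if correct is None or total is None:
        return "missing value"
    if correct > total:
        return "more words correct than words read"
    return None

# Keep only the participants that were flagged, with the reason.
flagged = {}
for row in wide_rows:
    reason = flag_for_review(row)
    if reason is not None:
        flagged[row["PARTICIPANT"]] = reason

print(flagged)
```

A flag is only a prompt to go back to the raw records; whether a participant is actually excluded should follow criteria decided before the analysis.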

Data analysis

Hypotheses and processes

What can we hypothesize?

Task: Identify the INDEPENDENT and DEPENDENT variables you want to examine.

This should be done PRIOR to data mining/data analysis.

No HARKing (Hypothesizing After the Results are Known)

This is a big no-no in research land.

Breaking down variables:

Categorical variables:

Gender

Ethnicity

Diagnosed Learning Disability

Scale (continuous) variables:

Age

Hours of sleep

Caffeine consumed

Words read correctly

Total words read

Hypothesis 1

There will be an association between words read correctly while listening to music and hours slept the night prior to completing the Stroop Task.

Nondirectional research question

IV: HOURS_OF_SLEEP_M (hours of sleep; music condition)

DV: WORDS_READ_CORRECTLY_M (words read correctly; music condition)

The types of the variables determine which test you can use.

HOURS_OF_SLEEP_M = scale; WORDS_READ_CORRECTLY_M = scale

Type of test = Pearson’s Correlation
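For anyone curious what JASP computes when it reports Pearson's r, here is a minimal sketch in plain Python. The function name and the five data points are hypothetical (they are not the study data); the formula is the standard one, the co-variation of the two variables divided by the square root of the product of their individual variations.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How the two variables move together around their means.
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own.
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    if variation_x == 0 or variation_y == 0:
        raise ValueError("r is undefined when a variable never changes")
    return co_variation / math.sqrt(variation_x * variation_y)

# Made-up example values, NOT the study data.
hours_of_sleep = [6, 7, 8, 5, 9]
words_read_correctly = [101, 99, 104, 100, 103]
r = pearson_r(hours_of_sleep, words_read_correctly)
print(round(r, 3))
```

r always falls between -1 and 1; values near 0 mean there is little linear association between the two variables.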

Pearson’s correlation in JASP

Things to check

Results

Pearson Correlations
                                      WORDS_READ_CORRECTLY_M  HOURS_OF_SLEEP_M
WORDS_READ_CORRECTLY_M  Pearson's r
                        p-value
HOURS_OF_SLEEP_M        Pearson's r   0.072
                        p-value       0.676
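The reported p-value can be sanity-checked by hand. With n pairs, r converts to a t statistic with n - 2 degrees of freedom, and the two-sided p-value is the area in both tails of the t distribution. The sketch below assumes all 36 participants contributed to the correlation (n = 36) and integrates the t density numerically instead of calling a statistics library; the function names are hypothetical. Because r is rounded to three decimals, the result should land close to the reported 0.676 rather than match it exactly.

```python
import math

def t_from_r(r, n):
    """Convert a Pearson r from n pairs into a t statistic (n - 2 df)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    scale = math.exp(log_c) / math.sqrt(df * math.pi)
    return scale * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    # The distribution is symmetric, so the centre area is twice this.
    return 1 - 2 * area_0_to_t

n = 36      # assumption: no participant is missing from this analysis
r = 0.072   # Pearson's r from the results table
t = t_from_r(r, n)
p = two_sided_p(t, n - 2)
print(round(t, 3), round(p, 3))
```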

Graphics

Hypothesis 2

There will be an association between words read correctly and caffeine consumption while completing the Stroop Task without music.

Nondirectional research question

IV: CAFFEINE_CONSUMED_OZ_NM (caffeine consumed: no music condition)

DV: WORDS_READ_CORRECTLY_NM (words read correctly; no music condition)

The types of the variables determine which test you can use.

CAFFEINE_CONSUMED_OZ_NM = scale; WORDS_READ_CORRECTLY_NM = scale

Type of test = Pearson’s Correlation

Pearson’s correlation in JASP

Things to check

Results

Pearson Correlations
                                       CAFFEINE_CONSUMED_OZ_NM  WORDS_READ_CORRECTLY_NM
CAFFEINE_CONSUMED_OZ_NM  Pearson's r
                         p-value
WORDS_READ_CORRECTLY_NM  Pearson's r   0.205
                         p-value       0.229

Graphics

Hypothesis 3

There will be a difference between the number of words read correctly in the music vs no music conditions.

Nondirectional research question

IV: CONDITION (Music vs No Music)

DV: WORDS READ CORRECTLY

The types of the variables determine which test you can use.

CONDITION = categorical; WORDS READ CORRECTLY = scale

Type of test = Paired Samples t-test (each participant completed both conditions)

Paired Samples t-test in JASP

Things to check

Results

Paired Samples T-Test
Measure 1                  Measure 2                t      df  p      Mean Difference  SE Difference  Cohen's d
WORDS_READ_CORRECTLY_NM  - WORDS_READ_CORRECTLY_M   0.244  35  0.808  0.833            3.412          0.041
Note. Student's t-test.
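To see where the numbers in this table come from, here is a minimal sketch of a paired samples t-test in plain Python. The function name and the four pairs of scores are hypothetical (not the study data). The last lines only check that the reported values hang together: t is the mean difference divided by its standard error, and for paired data Cohen's d equals t divided by the square root of the number of pairs (36 here, since df = 35).

```python
import math

def paired_t_test(first, second):
    """Paired samples t-test on two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    # One difference score per participant.
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    se_diff = sd_diff / math.sqrt(n)
    return {
        "t": mean_diff / se_diff,
        "df": n - 1,
        "mean_difference": mean_diff,
        "se_difference": se_diff,
        "cohens_d": mean_diff / sd_diff,
    }

# Made-up scores for four participants, NOT the study data.
words_nm = [104, 100, 98, 103]
words_m = [102, 99, 98, 102]
result = paired_t_test(words_nm, words_m)
print(result)

# Consistency check using only the values reported in the table.
reported_t = 0.833 / 3.412               # Mean Difference / SE Difference
reported_d = reported_t / math.sqrt(36)  # t / sqrt(number of pairs)
print(round(reported_t, 3), round(reported_d, 3))
```

Both checks reproduce the table (t of 0.244 and Cohen's d of 0.041), which is a quick way to confirm the output was copied correctly.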

Your turn!

What do you hypothesize?