Volume 14, Issue 1 August 2020

Designing and Piloting a Repeated-Measures ANOVA Study on L2 Academic Writing:

Methodology and Challenges

Larisa Nikitina

University of Malaya, Malaysia

larisa@um.edu.my

Abstract

This article highlights methodological challenges inherent in implementing a repeated-

measures ANOVA study on L2 academic writing and suggests possible solutions to these

challenges. The RM-ANOVA design is employed when the same participants are measured more

than once. Therefore, a repeated-measures ANOVA study with its several sessions of data

collection and multiple variables requires meticulous planning and careful

implementation. This article highlights important considerations that researchers might want to

pay attention to while designing and implementing a repeated-measures ANOVA study. These

considerations pertain to selecting and operationalizing the study’s variables, recruiting the

participants, selecting an appropriate research instrument and ensuring that the data are reliable

and valid.

Keywords: L2 academic writing; research methods; RM-ANOVA design; university students

Introduction

Task-based language teaching (TBLT) provides a framework for second language (L2)

researchers and practitioners to design class tasks that are meaningful for the language learners,

that promote L2 communication in the classroom and enhance the learners’ use of authentic

language while performing the tasks (Willis & Willis, 2008). A cognitive model of task-based

instruction, the Cognition Hypothesis, asserts that pedagogical tasks with increased cognitive

demands will help to achieve two pedagogical aims. Firstly, these tasks will nudge the language

learners to produce more accurate and complex language. Secondly, they will stimulate the

learners to engage in a lengthier interaction and heighten their attention to, and memory of, the

linguistic input (Robinson, 2003a, 2003b). The literature on the Cognition Hypothesis portrays

task complexity as an important cognitive factor that needs to be properly addressed while

designing a task, because doing so creates more learning opportunities and strengthens

attentional mechanisms for L2 production, development and acquisition (Robinson, 2003b, 2005,

2007).

The Cognition Hypothesis and its associated Triadic Componential Framework (TCF)

provide a taxonomic system for pedagogical task design and implementation. Three broad

classificatory categories, namely, task complexity, task condition and task difficulty, are each

guided by two continuums (Robinson, 2001a, 2003b, 2005, 2007). A key component—task

complexity and its resource-directing and resource-dispersing continuums—can be operated

according to the -/+ continua to determine the cognitive complexity of a task and its emphasis

on L2 production, development and acquisition (see footnote 31). Causal reasoning demands is one of the

resource-directing factors that promote L2 production. However, as a review of scholarly

literature shows, it is a greatly underexplored factor. Not only does the cognitive factor of causal

reasoning demands require empirical validation, but the interactive factor that refers to the

number of participants in the task (or the task condition) is also of the utmost importance.

It is possible that the scarcity of studies that address this important topic in the literature

on L2 academic writing is due to the methodological considerations and challenges that

researchers have to face both at the research design and research implementation stages. Such a

study would require a complex research design that involves meticulous planning and thorough

execution to allow for a smooth implementation and eventual success of the research project.

Therefore, the current article focuses on the methodological issues that need to be addressed

while designing a study that aims to examine the effects of -/+ causal reasoning demands across

different -/+ number of participant groupings (i.e., individual, dyadic and triadic) on the L2

written modality. The dependent variables for the writing modality in a study highlighted here

are Complexity, Accuracy and Fluency (CAF) in the L2 written production.

31 The continuum +/-, which is attached to the assessed components in a study on L2 academic writing, indicates that “there is relatively more versus relatively less” of the component (Robinson, 2001b, p. 30). The components may be task complexity, the number of participants, the allocated time, etc.

To be more specific, among the variables, Complexity refers to the number of clauses the

learner connects or includes within a sentence (Foster & Skehan, 1996). This construct in L2

production shows the development of the restructuring process within the L2 learners’

interlanguage systems (Skehan, 1996). Accuracy refers to the learner’s ability to exercise the

maximum level of control to prevent errors during a language performance (Ellis, 2003). Fluency

refers to the learner’s ability to use the language with a high number of words (Larsen-Freeman,

2006).

This article is structured as follows. Firstly, it reviews the methodology and statistical

procedures employed in the earlier studies on the effects of task complexity on the L2 individual

written production in terms of Complexity, Accuracy and Fluency. Then, it highlights important

considerations that researchers might want to attend to while designing, piloting and

implementing a repeated-measures ANOVA (RM-ANOVA) study. The article gives examples of

the challenges that could be faced during data collection and data analysis procedures. While the

actual findings are not reported here, the article gives several examples from a study that adopted

the RM-ANOVA research design. The article suggests possible actions that could

help to successfully overcome these challenges. In short, this article might offer useful

guidance to novice researchers who would like to implement their own RM-ANOVA study.

Review of Literature

Earlier research studies have employed various methodologies, instruments and statistical

procedures to investigate the effects of task features on L2 academic writing. Ishikawa (2007)

employed a one-way ANOVA to examine the effects of Here and Now on the Complexity,

Accuracy and Fluency (CAF) measures in the L2 narrative written output produced by 54

Japanese third-year high school students. The Here and Now variable was manipulated with the

availability of the cartoon strip. The findings revealed that more complex tasks resulted in higher

CAF indices. This finding was compatible with the Cognition Hypothesis that postulates that

cognitively more complex tasks would have positive effects on the quality of the L2 production.

However, it is not clear if the resource-dispersing variable—planning time—was included in the

study by Ishikawa. Therefore, it could be suggested that the results could be due to the effects of

the pre-task planning time, which might have lessened the complexity level of the resource-

directing here and now. This issue was also noted by Skehan (2009).

Kuiken and Vedder (2007) employed a repeated-measures ANOVA (RM-ANOVA)

design to investigate the effects of task complexity with -/+ number of elements, and -/+ number

of reasoning demands on the language performance. The participants were Dutch learners of

French and Italian who were instructed to write a letter about the choice of the holiday

destination. The independent variable was operationalized as 3 elements for reasoning about the choice of the holiday

destination in the non-complex writing task and 6 elements in the complex writing task. The

assessment was based on the general versus specific measures of writing proficiency. In the

study, accuracy was examined by counting the type of errors made in the L2 texts, whereas

lexical complexity was inspected by distinguishing the frequent words from infrequent ones. The

results revealed that the complex writing task led to a significant decrease of errors and yielded a

lexically more complex text. The effects of task complexity on higher accuracy could be mainly

attributed to lower ratios of lexical errors in the more complex task.

A study by Ruiz-Funes (2015) examined the effects of task complexity and several

learner-related variables in essay writing. The researcher focused on the CAF measures and the

participants were L2/foreign language (FL) groups with advanced and intermediate language

proficiency. Similar to Ishikawa’s (2007) finding, Ruiz-Funes detected a positive impact of the

increased task complexity on syntactic complexity, accuracy and fluency with the advanced level

learners. The results also revealed that the complex tasks yielded a higher syntactic complexity

but had a lower accuracy and fluency. However, it was also found that there were positive

changes in syntactic complexity, accuracy and fluency in the complex task writings of the

advanced learners.

A series of later studies by Kuiken and Vedder (2008, 2009, 2011, 2012) employed an

RM-MANOVA analysis to investigate the effects of task complexity, manipulated with ±

number of elements and ± reasoning demands, and high and low proficiency learners in the L2

written and spoken production. The findings from these studies showed positive impacts of

increasing task complexity mostly on accuracy. However, they did not indicate any effect on

syntactic complexity and lexical variation. It was found that increasing task complexity along

resource-directing variables led to higher accuracy in the L2 written output. Also, the findings

indicated that the learners performed with a higher accuracy in complex tasks and there were

decreases in the lexical errors. This result contradicted the findings regarding the lexical

variation. One of the studies indicated that the effects of task complexity on L2 academic written

production were not dependent on the oral and written production modes. As the earlier studies

indicated, positive impacts on accuracy were repeatedly identified in the written L2 productions;

however, there was no statistically significant effect on the lexical variety. Also, in the written

L2 production mode there was no effect on the syntactic complexity. These results might be due

to different task types employed in the studies, which might have affected the learners’ attention

and dispersed it to different dimensions of the L2 production (Skehan, 2009).

In a more recent study, Frear and Bitchener (2015) partially replicated Kuiken and

Vedder’s (2012) operationalized reasoning demands variable with three letter-writing tasks, each

at a different level of task complexity. They examined the effects of increasing task complexity

on the lexical and syntactic complexity in the writing by 34 non-native speakers of English. The

researchers found that the L2 production in the writing task with lower complexity had a larger

number of adverbial clauses while the medium and high complexity tasks yielded fewer adverbial

clauses. Overall, the study detected increases in the lexical complexity between low complexity

and high complexity writing tasks. However, the increase in the lexical complexity did not lead

to the increase in the syntactic complexity. As Frear and Bitchener noted, these results did not

support the Cognition Hypothesis. They suggested that these findings could be due to the nature

of the tasks, which required a different communication function. Also, there was no statistically

significant difference in the ratio of dependent clauses to t-units across all types of the dependent

clauses. When the ratio of the dependent clauses to t-units for each type of the dependent clause

was analyzed separately, there occurred a decrease in adverbial dependent clauses in the tasks

with higher complexity. Rahimi (2018) employed paired-samples t-tests and Wilcoxon

Signed Ranks tests to investigate the effects of increasing reasoning demands and the number of

elements on CAF indices. In the study, two argumentative tasks were adapted from Révész

(2011); the participants were 60 upper-intermediate FL learners of English in Iran. The findings

showed that increasing task complexity produced a larger number of subordinate clauses with a

greater lexical and syntactic complexity but also with a reduced writing accuracy.

To sum up, the earlier studies that employed statistical analyses to examine the effects of

task complexity and task condition on individual learners’ L2 academic writing were conducted

with different kinds of participants. This might have affected the findings due to the variability of

the overall mean scores that could stem from the participants’ individual differences. Therefore,

to increase the accuracy of the statistical analysis it would be advisable to conduct a study among

the same group of participants. This would require implementing an RM-ANOVA design. In

other words, an RM-ANOVA study could be a better analytical tool to examine the effects of the

task design variable (i.e., task complexity: ± causal reasoning demands) and the task

implementation variable (i.e., task condition: ± number of participants) on L2 individual

argumentative written production and measures of the CAF indices.

This article demonstrates how an RM-ANOVA study could be designed, piloted and

implemented. It also highlights the methodological challenges and possible solutions when

implementing such a study. The investigation of the effects of task complexity level (i.e., simple

versus complex) and task condition (i.e., individual, dyadic and triadic) on the L2 individual

academic writing (an argumentative essay in this particular case) was guided by the following

research question: Is there a statistically significant effect of task complexity (simple vs complex

task) and task condition (individual vs dyadic vs triadic groupings) on lexical and syntactic

Complexities, grammatical Accuracy and Fluency in L2 individual academic writing? The

following sections highlight important considerations that researchers have to address while

designing an RM-ANOVA study.

Designing an RM-ANOVA Study: Methodological Considerations

Operationalizing the Variable and Proposing the Relationship among the Variables

Studies employing an RM-ANOVA design assess relationships among several variables.

Moreover, the RM-ANOVA design can be implemented either with only the within-group

variables or with a combination of within- and between-groups variables. As advised by

Larson-Hall (2015), in order to make the research design and the study’s variables clear to the

reader, researchers might want to provide a design box that visually presents their RM-ANOVA

analysis and variables.

In the study described in this article, the RM-ANOVA investigated whether there was a

statistically significant difference in the L2 individual writing (i.e., the argumentative essays) in

three different task conditions (i.e., individual, dyadic and triadic) which were performed by the

same group of participants. Figure 1 depicts the research design and variables in the current

study.

Dependent Variables (continuous variables)

    L2 Individual Writing

    • Lexical complexity and syntactic complexity

    • Grammatical accuracy

    • Fluency

Independent Variable (categorical variable; within-groups variable)

    Task Complexity

    • Simple task with 2 causes and 2 effects

    • Complex task with 6 causes and 6 effects

Moderator Variable (categorical variable; between-groups variable)

    Task Condition

    • Individual: no peer discussion, individual writing

    • Dyadic: 15 minutes discussion, individual writing

    • Triadic: 15 minutes discussion, individual writing

Figure 1: Design box of the current 2 x 3 RM-ANOVA study

To be more specific, the independent variable—task complexity—had two levels (simple

and complex) and it was the within-groups variable. The moderator variable—task condition—

indicated that the same participants performed the task in the individual, dyadic and triadic

grouping. This variable was the between-groups variable. The independent and moderator

variables were categorical variables because the former represented the levels of task complexity

while the latter represented the conditions of the task implementation. As for the dependent

variables, the L2 individual writing was operationalized using the global measures of

Complexity, Accuracy and Fluency (CAF). Each of the dependent variables was measured on a

continuous scale.
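
The design described above can be laid out as a long-format data table, which is the shape most statistical packages expect for repeated-measures data. The following is a minimal sketch only: the field names and participant identifiers are hypothetical and are not taken from the study. It assumes that every participant contributes one essay to each of the 2 x 3 cells.

```python
from itertools import product

# Factor levels as described in the design box (labels are illustrative).
TASK_COMPLEXITY = ("simple", "complex")
TASK_CONDITION = ("individual", "dyadic", "triadic")
# Hypothetical column names for the four CAF outcome measures.
DEPENDENT_VARIABLES = (
    "lexical_complexity",
    "syntactic_complexity",
    "accuracy",
    "fluency",
)


def design_cells():
    """Return every combination of task complexity and task condition."""
    return list(product(TASK_COMPLEXITY, TASK_CONDITION))


def long_format_rows(participant_ids):
    """Build one empty row per participant per design cell."""
    rows = []
    for pid in participant_ids:
        for complexity, condition in design_cells():
            row = {
                "participant": pid,
                "task_complexity": complexity,
                "task_condition": condition,
            }
            # Scores are filled in later, once the essays have been coded.
            row.update({dv: None for dv in DEPENDENT_VARIABLES})
            rows.append(row)
    return rows


cells = design_cells()
rows = long_format_rows(["P01", "P02", "P03"])  # placeholder IDs
print(len(cells), "cells;", len(rows), "rows for 3 participants")
```

Keeping one row per essay makes it easy to check that no participant is missing a cell before the analysis is run.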

Complexity, Accuracy and Fluency Measures and their Analysis

The writing quality of the argumentative academic writing tasks is usually assessed by

the global measures of Complexity, Accuracy, and Fluency (CAF). Researchers might want

to explain how the CAF indices were measured in their study. A detailed explanation of the CAF

measures in this particular study is given in Table 1.

Table 1: Global CAF Measures of the academic writing quality

Global measure: Complexity, comprising lexical complexity and syntactic complexity (Foster & Skehan, 1996)

    Lexical complexity: Measured by a mean segmental type-token ratio.

    Example: A man walked through the underbridge. He was robbed at the underbridge.

    • Different words/Total words (10/12)

    Syntactic complexity: Measured by the number of S-nodes per T-unit in the written text.

    Example: The picture shows that a man walked through the underbridge. He was robbed at the underbridge by a group of unknown people.

    • S-nodes/T-units (3/2)

Global measure: Accuracy (Ellis, 2003)

    Accuracy: Measured by the number of error-free T-units per T-unit in the text.

    Example: A man walked through the underbridge. He was robbed at the underbridge by a group of unknown people. Then, he went to the police station to lodge a report. He tell the police that he recognized a man’s face.

    • Error-free T-units/Total T-units (3/4)

Global measure: Fluency (Larsen-Freeman, 2006)

    Fluency: Measured by the number of words per composition or per T-unit.

In the present study, two types of complexity were measured—Lexical complexity and

Syntactic complexity. As stated earlier, complexity refers to the number of clauses the learner

connects or includes within a sentence (Foster & Skehan, 1996). Accuracy is the learner’s ability

to exercise the maximum level of control to prevent errors during a language performance (Ellis,

2003) while Fluency is the learner’s ability to use the language with a high number of words

(Larsen-Freeman, 2006).
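
The ratios listed in Table 1 can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the coding procedure used in the study: the tokenizer is a simple regular expression, the segment size for the mean segmental type-token ratio is a free parameter that Table 1 does not specify, and the counts of S-nodes, T-units and error-free T-units are assumed to come from manual coding.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(tokens):
    """Different words divided by total words."""
    return len(set(tokens)) / len(tokens)


def mean_segmental_ttr(tokens, segment_size):
    """Average the type-token ratio over consecutive full segments.

    A trailing segment shorter than segment_size is discarded
    (an assumption; the segment size is not given in Table 1).
    """
    segments = [
        tokens[i:i + segment_size]
        for i in range(0, len(tokens) - segment_size + 1, segment_size)
    ]
    if not segments:
        raise ValueError("text is shorter than one segment")
    return sum(type_token_ratio(s) for s in segments) / len(segments)


def ratio(numerator, denominator):
    """Generic ratio for manually coded counts, e.g. S-nodes per T-unit."""
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return numerator / denominator


# Lexical complexity example from Table 1: 10 different words out of 12.
LEXICAL_EXAMPLE = (
    "A man walked through the underbridge. "
    "He was robbed at the underbridge."
)
tokens = tokenize(LEXICAL_EXAMPLE)
print(len(set(tokens)), "/", len(tokens))

# Manually coded counts from the other Table 1 examples.
syntactic_complexity = ratio(3, 2)  # S-nodes / T-units
accuracy = ratio(3, 4)              # error-free T-units / total T-units
print(syntactic_complexity, accuracy)
```

Only the lexical measure is computed automatically here; identifying S-nodes, T-units and errors requires a human coder or a parser, which is why those functions take counts rather than raw text.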

Steps in the Data Collection, Management and Analysis

Next, due to the complex nature of an RM-ANOVA design, which involves not only

multiple variables but also several data collection sessions, it would be helpful if researchers

could provide a graphical representation of the steps in the data collection, management and

analysis. Figure 2 offers a visual representation of the steps in the data collection, management

and analysis adopted in the current study. Moreover, as can be seen from Figure 2, the graphical

representation of the RM-ANOVA design could be integrated with the visualization of the overall

research design of the study (e.g., mono- or mixed-methods).

Figure 2: Steps in the data collection, data management and data analysis

As can be seen from Figure 2, the data were collected in three sessions and in three task

condition settings—individual, dyadic and triadic. The individual session was set as a baseline

for the further analysis; each participant wrote an essay involving one simple and one complex

task, without any peer interaction. The findings were then compared with the findings from the

L2 individual written production in the dyadic and triadic settings, which involved the

intervention or a peer discussion session prior to doing the task. In other words, in the dyadic

and triadic sessions, the participants were required to have a group discussion before writing

their individual essays. The written production (i.e., the written L2 texts) was coded to enable

the measurement of the CAF indices. The frequencies of CAF measures were tabulated and

analyzed using the Statistical Package for the Social Sciences (SPSS), Version 21.
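
To illustrate what a repeated-measures ANOVA computes, the sketch below partitions the variance for a single within-subject factor (for example, task condition with three levels) by hand. It is not the SPSS procedure used in the study, and it simplifies the 2 x 3 design to one factor. The scores are invented for illustration, and the sketch omits the p-value and the assumption checks that a statistics package reports.

```python
def rm_anova_oneway(scores):
    """One-way repeated-measures ANOVA.

    scores: list of rows, one row per participant, one column per condition.
    Returns the sums of squares, degrees of freedom and the F ratio.
    """
    n = len(scores)       # number of participants
    k = len(scores[0])    # number of conditions
    grand_mean = sum(sum(row) for row in scores) / (n * k)

    condition_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subject_means = [sum(row) / k for row in scores]

    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_conditions = n * sum((m - grand_mean) ** 2 for m in condition_means)
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    # Variability between participants is removed from the error term,
    # which is the main advantage of measuring the same people repeatedly.
    ss_error = ss_total - ss_conditions - ss_subjects

    df_conditions = k - 1
    df_error = (k - 1) * (n - 1)
    f_ratio = (ss_conditions / df_conditions) / (ss_error / df_error)
    return {
        "ss_conditions": ss_conditions,
        "ss_subjects": ss_subjects,
        "ss_error": ss_error,
        "df_conditions": df_conditions,
        "df_error": df_error,
        "F": f_ratio,
    }


# Invented scores: 3 participants x 3 conditions (individual, dyadic, triadic).
scores = [
    [1, 2, 3],
    [2, 4, 3],
    [3, 3, 6],
]
result = rm_anova_oneway(scores)
print(result)
```

Because each participant's overall level is subtracted out, differences between participants do not inflate the error term.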

To sum up, at the initial stage of developing an RM-ANOVA study, the main

considerations would be: 1) defining and operationalizing the variables in the study, 2) choosing

appropriate research instruments and analytical tools and 3) planning the data collection sessions.

Other considerations include addressing research ethics and planning the logistics (i.e., the scale

and the timing of the data collection sessions). The following section focuses on the second step,

which is choosing an appropriate research instrument.

Developing a Research Instrument

When choosing the research instrument (i.e., the type of essay), researchers may want to

consider the educational context where their study is conducted. In the Malaysian education context,

where the current study took place, it could be advisable to select an argumentative

essay for the L2 writing task. There are two main reasons for this choice. Firstly, the

argumentative writing task requires the learners to use their logic and reasoning to generate an

argument; therefore, in the Triadic Componential Framework (Robinson, 2001a, 2001b, 2007;

Robinson & Gilabert, 2007), a task that increases the resource-directing variable of reasoning demands is

considered cognitively more complex.

Secondly, the argumentative writing genre is often employed in the academic writing

courses at the tertiary level in Malaysia (e.g., Veerappan, Yusof and Aris, 2013) and other

educational contexts (e.g., Khodabandeh et al., 2013). Therefore, at the tertiary level,

participants in a study could be more familiar and comfortable with an argumentative

writing prompt than with a series of pictures as the stimulus for their writing task.

Accordingly, while designing the current RM-ANOVA study, argumentative topics were

considered the most suitable prompts for the L2 individual academic writing in

all three types of settings (i.e., individual, dyadic and triadic) and also for the peer interaction

sessions (i.e., dyadic and triadic).

Research literature offers ample support for using argumentative writing tasks. For

example, Long (2015) proposed that tasks should be analytical in nature in order to stimulate

learners’ attentional mechanisms and memory resources. Argumentative writing tasks

bring out the learners’ ability to understand, analyze, evaluate, explain and justify an issue

when they take a different position on the topic (Duff, 1985; Long, 1990). Besides, as noted in

several studies (Duff, 1985; Long, 2015), argumentative writing tasks allow learners to maintain

different positions during the interaction to reach a consensus and eventually succeed in their

writing task. Importantly, Foster and Skehan (1996) pointed out that an argumentative-based task

that incorporates ‘critical decision-making’ elements would yield the most constant

patterns of the linguistic features and CAF measures. In a similar vein, other researchers (e.g.,

Ellis, 2003; Robinson, 2001a, 2005) argued that tasks that prompt reasoning are considered

cognitively more complex than tasks with decreased reasoning demands (Halford, Cowan, &

Andrews, 2007). The chosen research instrument needs to be tested in a pilot study. The

following section addresses issues pertaining to the pilot study phase of the RM-ANOVA research

project.

Pilot Study

It is necessary to conduct a pilot study in order to identify and prevent potential problems

that might arise in the actual study (Loewen and Plonsky, 2015). It would help to avoid costly

mistakes (time-wise and resources-wise) that might arise due to deficiencies in the research

design and data elicitation devices, such as research instruments. During a pilot study various

aspects of the future study are assessed and tested, including the research settings, the potential

participants, the research instruments and the analytical tools. An RM-ANOVA study might need

more than one pilot study due to its complex research design, which includes multiple variables,

multiple data collection settings and different timings. In the current study, three rounds of pilot

studies were conducted before carrying out the actual study.

To be more specific, Pilot Study 1 tested the suitability of the intended group of

participants in terms of their English language proficiency level which is required for completing

the tasks. It also evaluated the appropriateness of the complexity level of the argumentative

writing tasks. This pilot study revealed that the participants with a low English proficiency level,

that is, those who had obtained Bands 1 and 2 of the Malaysian University English Test (MUET),

were not able to complete the simple written task within the stipulated time (1 hour). They also

struggled to understand the demands and instructions for the tasks. Therefore, it was decided to

limit the participation in the actual study to only the learners at an intermediate level (i.e., MUET

Bands 3 and 4).

During Pilot Study 2, the main focus was on the concept and design of the task

complexity as well as the implementation of the argumentative essay writing tasks. A group of

15 ESL students at an intermediate level of proficiency (MUET bands 3 and 4) completed two

argumentative writing tasks: one simple task and one complex task in three different task

conditions—individual, dyadic and triadic. The pilot study results showed that the learners were

able to complete both simple and complex tasks in all three types of settings or task conditions.

After each writing session the researcher had a casual conversation with the participants to seek

their perceptions of the complexity level of the tasks as well as their preferences for the topics of

the argumentative essays. The participants deemed the task complexity levels as appropriate,

with 2 causes and 2 effects for the simple task and 4 causes and 4 effects for the complex tasks.

To verify the appropriateness of the task complexity parameters, the researcher identified

several possible topics for the argumentative essays based on her discussion with the participants

and emailed these topics to Peter Robinson. This action aimed to check the feasibility of the task

complexity for the simple (2 causes and 2 effects) and complex (4 causes and 4 effects) L2

writing tasks. Robinson suggested increasing the complexity level for the complex tasks to 6

causes and 6 effects. As for the essay topic, the participants suggested several themes that they

considered engaging and relevant to real life. These themes included parenting, relationships,

academic achievement, freedom, technology intervention and mobile pedagogy.

Based on the findings, some amendments were made for the next round of the pilot

study. The complexity levels for the argumentative tasks were modified—2 causes and 2 effects

for the simple task and 6 causes and 6 effects for the complex task. Time for the individual

writing task was limited to 40 minutes whereas the peer discussion was limited to 15 minutes.

Both simple and complex tasks were discussed in the dyadic and triadic groupings.

Finally, Pilot Study 3 was conducted to verify the appropriateness and feasibility of the

amendments made on the basis of the two earlier pilot studies. It tested the feasibility and

suitability of the tasks, selection of the argumentative essay topics, task complexity levels, time

given to complete the tasks, settings and peer groupings arrangements. The participants were a

different group of 15 ESL learners (MUET bands 3 and 4). The findings of Pilot Study 3

showed that the participants were able to produce a complete argumentative essay (both for the

simple and complex tasks) and that the research instrument and the data collection procedures

were appropriate.

The Actual Study

Data Collection and Participants

The actual study was conducted in a private university in Malaysia. The recruitment of

participants commenced after getting the official permission from the Dean of the Faculty.

Purposive sampling was adopted to recruit the participants. The criteria for participation in the

study were as follows: participants must be L2 learners of English, must have obtained at least

MUET band 3, and must be students at a local university.

To recruit the participants, the researcher distributed photocopied forms seeking personal

particulars from potential participants and requested the interested students to return the form to

the researcher. In the form, the potential participants were asked to give their name, state their

mother tongue, indicate their age, gender, degree major and English proficiency level

(as assessed by MUET or IELTS results), and state their hometown and contact number. Also, it

was stated in the form that as a small token of appreciation, the participants in the study would

receive a certificate of participation upon the completion of all three L2 writing sessions.

Prior to each data collection session, the researcher consulted the participants about

possible dates and times via WhatsApp. The sessions were set based on the students’

availability. Figure 3 offers a detailed visual depiction of the steps involved in the data collection

procedure. Initially, 126 students expressed their interest and willingness to participate in the

study. However, out of the 126 students only 43 attended the first round of data collection.

Furthermore, 7 of these 43 participants did not appear in the second and third sessions.
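
The attrition figures above can be turned into simple rates, which may help researchers estimate how many students to approach at the recruitment stage. The sketch below uses only the counts reported in this section; the function and key names are illustrative.

```python
def retention_summary(expressed_interest, attended_first, dropped_later):
    """Summarize participant attrition across the data collection sessions."""
    completed = attended_first - dropped_later
    return {
        "completed": completed,
        # Share of interested students who attended the first session.
        "show_up_rate": attended_first / expressed_interest,
        # Share of first-session attendees who completed all sessions.
        "retention_rate": completed / attended_first,
        # Share of interested students who completed all sessions.
        "overall_rate": completed / expressed_interest,
    }


summary = retention_summary(
    expressed_interest=126,
    attended_first=43,
    dropped_later=7,
)
for key, value in summary.items():
    if key == "completed":
        print(key, value)
    else:
        print(key, f"{value:.1%}")
```

On these counts, roughly one in three interested students attended the first session, and fewer than one in three completed all three sessions.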

Pre-selection of participants and recruitment

Distributing consent forms and survey forms

(Two weeks later)

Session 1 (Individual)

    • Simple Task 1: 10 minutes preparation, 40 minutes writing

    • Break: 5-10 minutes

    • Complex Task 1: 10 minutes preparation, 40 minutes writing

(Two-week interval)

Session 2 (Dyad)

    • Simple Task 2: 10 minutes preparation, 15 minutes dyadic discussion, 40 minutes individual writing

    • Break: 5-10 minutes

    • Complex Task 2: 10 minutes preparation, 15 minutes dyadic discussion, 40 minutes individual writing

(Two-week interval)

Session 3 (Triad)

    • Simple Task 3: 10 minutes preparation, 15 minutes triadic discussion, 40 minutes individual writing

    • Break: 5-10 minutes

    • Complex Task 3: 10 minutes preparation, 15 minutes triadic discussion, 40 minutes individual writing

Figure 3: Data Collection Procedures
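
The timings shown in Figure 3 can be added up to estimate the length of each session when planning room bookings and participant schedules. This is a rough sketch that assumes one break between the two tasks and no additional time for instructions or set-up; the constant and function names are illustrative.

```python
# Durations in minutes, taken from Figure 3.
PREPARATION = 10
DISCUSSION = 15          # dyadic and triadic sessions only
WRITING = 40
BREAK_RANGE = (5, 10)    # shortest and longest break
TASKS_PER_SESSION = 2    # one simple task and one complex task


def session_minutes(has_discussion):
    """Return the (shortest, longest) session length in minutes."""
    per_task = PREPARATION + WRITING
    if has_discussion:
        per_task += DISCUSSION
    base = TASKS_PER_SESSION * per_task
    return (base + BREAK_RANGE[0], base + BREAK_RANGE[1])


individual = session_minutes(has_discussion=False)
with_peers = session_minutes(has_discussion=True)
print("Individual session:", individual)
print("Dyadic or triadic session:", with_peers)
```

Sessions with a peer discussion run about half an hour longer than the individual session, which is worth taking into account when scheduling them after lectures.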

The remaining 36 participants (N = 36) who took part in the study were all from the same

university but various academic programs, such as Civil Engineering, Materials and

Manufacturing Engineering, Mechatronics Engineering, Chemical Engineering, Broadcasting,

Graphic Design and Multimedia, Accounting, International Business and Actuarial Science.

Developing Good Rapport between the Researcher and Participants

Building trust and good rapport with the participants prior to and during the

implementation of an RM-ANOVA study is important in order to maintain the participants’

interest and good will throughout the research project and beyond (e.g., for later sharing of the

findings with the participants). For this purpose, the researcher explained the importance of the

study and highlighted how this research and its findings could contribute to developing a better

curriculum and how they could inform pedagogical decisions regarding the choice of the

teaching materials and classroom activities. In the current study, the 36 participants who stayed

throughout the several data collection sessions over a two-month period were committed due to

their sense of responsibility and their understanding of the value of the study for the betterment of the

English language course. Also, the researcher thanked the students in person for participating in

each writing session and reminded them about the upcoming data collection session via the

WhatsApp messages; she reiterated the importance and value of their presence and feedback in

the writing session.

It should be noted that, in order to retain the participants, the researcher needs to organize the data

collection sessions properly and plan them in advance. This

includes scheduling short breaks within the sessions, finding comfortable settings and

choosing appropriate timing. The next section addresses these issues.

Keeping Participants Alert through the Data Collection Session

Another challenge is keeping the participants alert throughout the writing sessions. This

is especially important given that the students might come to the data collection sessions after

their lectures and tutorials and they might be tired from their day-time activities. In the current

study, the participants were required to complete two writing tasks in one session, which took

approximately two hours. To maintain the energy level of the students, short breaks and

refreshments were provided. It cannot be stressed strongly enough that the researcher’s sincere

concern for the participants’ well-being, such as providing some light food and having a short

chat with them during the break, will go a long way in facilitating and even enabling the

implementation of an RM-ANOVA study.
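
The session length mentioned above can be checked against the timings shown in Figure 3. The following is a minimal sketch that simply adds up the stages of each session. The stage durations are transcribed from the figure; the names TASK_STAGES, BREAK_RANGE and session_length are illustrative only and are not part of the original study.

```python
# Stage durations in minutes for ONE task, transcribed from Figure 3.
# All names in this sketch are illustrative, not taken from the study.
TASK_STAGES = {
    "individual": {"preparation": 10, "writing": 40},
    "dyad": {"preparation": 10, "discussion": 15, "writing": 40},
    "triad": {"preparation": 10, "discussion": 15, "writing": 40},
}
TASKS_PER_SESSION = 2   # one simple and one complex task per session
BREAK_RANGE = (5, 10)   # a single break between the two tasks


def session_length(setting):
    """Return the (shortest, longest) session length in minutes."""
    task_minutes = sum(TASK_STAGES[setting].values())
    working = TASKS_PER_SESSION * task_minutes
    return working + BREAK_RANGE[0], working + BREAK_RANGE[1]


for setting in TASK_STAGES:
    low, high = session_length(setting)
    print(f"{setting}: {low}-{high} minutes")
```

Under these timings the individual session runs a little under two hours and the dyadic and triadic sessions a little over two hours, which is in line with the approximate figure given above.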

Performing the RM-ANOVA Test and Reporting the Results

The current article does not aim to report and discuss the statistical results from the actual

RM-ANOVA study that was carried out by the first author of this article. The main aim of this

paper is to highlight methodological challenges and issues that might arise while designing and

implementing an RM-ANOVA study on academic L2 writing. However, it is important to

remember that statistical tests rest on assumptions that are easily overlooked. These assumptions

must be checked and met before the actual statistical analysis is carried out. Only then can the

findings from the RM-ANOVA procedure be considered legitimate.

As Larson-Hall (2015) points out, besides the standard statistical assumptions, which are

a normal distribution and equal variances for all groups, there is one important additional

assumption for the RM-ANOVA test known as sphericity. This concept is complex but,

basically, sphericity “measures whether differences between the variances of a single

participant’s data are equal” (Larson-Hall, 2015, p. 326). Researchers usually employ

Mauchly’s test to assess sphericity. If this assumption is violated, then either the

Greenhouse–Geisser or the Huynh–Feldt correction can be applied to remedy the analysis.
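
In practical terms, sphericity holds when the variances of the difference scores between every pair of repeated conditions are roughly equal. The following minimal sketch computes these variances for a small data set. The scores are invented purely for illustration and do not come from the study, and the function name difference_variances is an assumption of this sketch.

```python
from itertools import combinations
from statistics import variance

# Invented scores for illustration only: one row per participant,
# one column per repeated condition (e.g., individual, dyad, triad).
scores = [
    [10, 12, 15],
    [8, 11, 13],
    [12, 13, 18],
    [9, 13, 14],
    [11, 12, 16],
]


def difference_variances(rows):
    """Sample variance of the difference scores for each pair of conditions."""
    n_conditions = len(rows[0])
    result = {}
    for a, b in combinations(range(n_conditions), 2):
        diffs = [row[a] - row[b] for row in rows]
        result[(a, b)] = variance(diffs)
    return result


for pair, var in difference_variances(scores).items():
    print(f"conditions {pair}: variance of differences = {var:.2f}")
```

If the printed variances differ widely from one pair of conditions to another, this is a sign that sphericity may not hold. A formal check would still rely on Mauchly's test or, more cautiously, on one of the corrections.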

However, statisticians and methodologists warn that Mauchly’s test is not a very

robust or powerful test (Howell, 2002, as cited in Larson-Hall, 2015). Therefore, it is

advisable that, even if Mauchly’s test indicates that the sphericity assumption is met,

researchers still use either the Greenhouse–Geisser correction or

the Huynh–Feldt correction, preferably the former as it is the more conservative. The

Greenhouse–Geisser and Huynh–Feldt correction values are available in the SPSS output for the

RM-ANOVA. Researchers might want to include these values when reporting their statistical

findings (see Larson-Hall, 2015 for more details).
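
SPSS reports the correction values directly, so researchers do not need to calculate them by hand. To show what lies behind these values, the following is a minimal sketch, in plain Python, of the standard textbook formulas for a single-group design with one within-subjects factor. The data are invented for illustration, all function names are assumptions of this sketch, and the Huynh–Feldt formula assumes more participants than conditions.

```python
# Invented scores: one row per participant, one column per condition.
scores = [
    [10, 12, 15],
    [8, 11, 13],
    [12, 13, 18],
    [9, 13, 14],
    [11, 12, 16],
]


def covariance_matrix(rows):
    """Sample covariance matrix of the conditions (columns)."""
    n, k = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(k)]
    return [
        [
            sum((row[i] - means[i]) * (row[j] - means[j]) for row in rows)
            / (n - 1)
            for j in range(k)
        ]
        for i in range(k)
    ]


def greenhouse_geisser_epsilon(cov):
    """Greenhouse-Geisser epsilon from a k x k covariance matrix."""
    k = len(cov)
    grand = sum(sum(r) for r in cov) / (k * k)
    row_means = [sum(r) / k for r in cov]
    col_means = [sum(cov[i][j] for i in range(k)) / k for j in range(k)]
    # Double-centre the covariance matrix.
    centred = [
        [cov[i][j] - row_means[i] - col_means[j] + grand for j in range(k)]
        for i in range(k)
    ]
    trace = sum(centred[i][i] for i in range(k))
    sum_sq = sum(v * v for r in centred for v in r)
    return trace ** 2 / ((k - 1) * sum_sq)


def huynh_feldt_epsilon(gg, n, k):
    """Huynh-Feldt epsilon (single group, n > k), capped at 1."""
    hf = (n * (k - 1) * gg - 2) / ((k - 1) * (n - 1 - (k - 1) * gg))
    return min(1.0, hf)


def corrected_df(epsilon, n, k):
    """Corrected degrees of freedom for the within-subjects F test."""
    return epsilon * (k - 1), epsilon * (k - 1) * (n - 1)


n, k = len(scores), len(scores[0])
gg = greenhouse_geisser_epsilon(covariance_matrix(scores))
print(f"Greenhouse-Geisser epsilon: {gg:.3f}")
print(f"Huynh-Feldt epsilon: {huynh_feldt_epsilon(gg, n, k):.3f}")
print("corrected df:", corrected_df(gg, n, k))
```

An epsilon of 1 means that no correction is needed, while the lowest possible value is 1/(k - 1). Multiplying both degrees of freedom by epsilon makes the F test more conservative, which is why the smaller Greenhouse–Geisser value is the more cautious choice.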

Finally, when reporting the statistical results, it would be good to provide graphic depictions

of the findings (see Larson-Hall, 2015). This would not only give the readers a better general

impression of the findings but also offer an effective way to summarize and present the

study’s results in a clear manner at both the group and individual levels.

Challenges while Implementing an RM-ANOVA Study

Like any research project, an RM-ANOVA study is bound to pose challenges to

researchers. These challenges include not only devising the research instrument and recruiting

the participants, as highlighted earlier in this article, but also minimizing the participants’

attrition throughout the research project. The following sections are devoted to these issues.

Recruiting Participants

Recruiting the participants for this study was the main challenge. The participants were

selected on a voluntary basis, according to their willingness and interest to be a part of the study. The

participants were from the same university but from different academic programs and courses.

As a result, they had different timetables for the lectures and tutorials. This made it quite

challenging to arrange the dyadic and triadic L2 writing sessions. Therefore, deciding whether

the participants would come from the same or different university programs and courses could be

important for a smoother implementation of the research project.

Participants’ Attrition and No-show

Attrition of the participants was another major challenge. Of the initial 126 students

who showed interest in the study, only 36 were able to provide the full set of data. Ethical

considerations require that participation in a research project be entirely voluntary

and that the participants be free to withdraw from the study at any time they wish. The

negative consequences of attrition, which might jeopardize the success of an RM-ANOVA study,

must therefore be foreseen and minimized by the

researcher. This is especially important for an RM-ANOVA study as the analysis of the within-group

data demands that the same people participate in each and every session of the data

collection.
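
The figures reported above translate into a retention rate that can guide recruitment in future studies. The following minimal sketch computes the rate and shows how only participants with complete data would be kept for the analysis. The attendance records, the participant IDs and the recruitment target are hypothetical, and the function names are assumptions of this sketch.

```python
import math

initial_volunteers = 126   # students who showed interest (reported above)
completers = 36            # students who provided the full set of data

retention_rate = completers / initial_volunteers
attrition_rate = 1 - retention_rate
print(f"retention: {retention_rate:.1%}, attrition: {attrition_rate:.1%}")


def volunteers_needed(target_completers, expected_retention):
    """Volunteers to recruit so that about `target_completers` remain."""
    return math.ceil(target_completers / expected_retention)


def complete_cases(attendance, required_sessions):
    """Keep only participants who attended every required session."""
    return sorted(
        pid
        for pid, attended in attendance.items()
        if required_sessions <= attended
    )


# Hypothetical attendance records: participant ID -> sessions attended.
attendance = {
    "P01": {1, 2, 3},
    "P02": {1, 2},
    "P03": {1, 2, 3},
    "P04": {1, 3},
}
print(complete_cases(attendance, {1, 2, 3}))

# Hypothetical planning example: 50 completers wanted, 30% retention expected.
print(volunteers_needed(50, 0.3))
```

A retention rate of this size suggests that researchers may need to recruit several times as many volunteers as the number of complete data sets they hope to obtain.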

In addition, the participants’ no-show is a serious challenge. When some participants are

absent on specific data collection days, the dyads and triads cannot be formed as planned,

because the groupings are arranged in advance. Therefore, to avoid losing much-needed

data, the session needs to be re-scheduled. To minimize this challenge it is desirable to

establish a good rapport between the researcher and the participants.
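
Because the groupings are fixed in advance, a single absence affects a whole dyad or triad. The following minimal sketch shows one way in which a researcher might identify, on the day, which planned groups need to be re-scheduled. The participant IDs and the function name groups_to_reschedule are hypothetical.

```python
def groups_to_reschedule(planned_groups, present):
    """Return the planned groups that have at least one absent member."""
    present = set(present)
    return [group for group in planned_groups if not set(group) <= present]


# Hypothetical triads planned in advance and attendance on the day.
planned_triads = [("P01", "P02", "P03"), ("P04", "P05", "P06")]
present_today = ["P01", "P02", "P03", "P04", "P06"]

print(groups_to_reschedule(planned_triads, present_today))
```

In this invented example the second triad cannot meet because one of its members is absent, so its session would have to be moved to another day.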

Preventing the Carryover Effects

Participating in a study with multiple data collection sessions, as is required in an RM-ANOVA

design, can be quite demanding for the participants. To prevent research fatigue and

loss of motivation, and to minimize the carryover effect, the data collection schedule in the

current study was planned with two-week intervals between the data collection points. However,

the caveat is that longer intervals might increase the attrition rate. To reduce this possibility, the

researcher explained to the participants the negative consequences for the research

project if they withdrew half-way through the project or did not attend the writing sessions.

This highlights the challenge of creating ‘study participants’ in the true meaning of the term

and of making the students realize that they are important and valuable stakeholders in the research

project.
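
The two-week spacing described above can be laid out as a simple calendar. The following is a minimal sketch that generates the dates of the data collection points from a starting date. The starting date is hypothetical, and the names STEPS, INTERVAL and build_schedule are assumptions of this sketch.

```python
from datetime import date, timedelta

INTERVAL = timedelta(weeks=2)   # spacing between data collection points
STEPS = [
    "Consent and survey forms",
    "Session 1 (individual)",
    "Session 2 (dyad)",
    "Session 3 (triad)",
]


def build_schedule(start):
    """Pair each step with its date, two weeks after the previous one."""
    return [(label, start + i * INTERVAL) for i, label in enumerate(STEPS)]


schedule = build_schedule(date(2024, 3, 4))   # hypothetical start date
for label, day in schedule:
    print(f"{day.isoformat()}  {label}")

total_span = schedule[-1][1] - schedule[0][1]
print(f"span from first to last step: {total_span.days} days")
```

The span computed here assumes that no session is re-scheduled. Any re-scheduling caused by absences would lengthen the overall data collection period.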

Conclusions

An RM-ANOVA study requires a meticulous design and well-planned execution. The

steps taken by the researcher, from the development of the research design through to

the data analysis and the reporting of the findings, can ‘make or break’ an RM-ANOVA

research project.

As this article has highlighted, the challenges while implementing the RM-ANOVA

study ranged from the participant recruitment phase to the research instrument development stage

and to the data collection phase. Arranging and grouping the participants for the dyadic and

triadic L2 writing sessions was one of the main challenges at the data collection phase.

Maintaining the participants’ interest throughout the study was another issue that needed to be

properly addressed by the researcher. Moreover, keeping the participants alert and keen during

the group discussions and the writing sessions that immediately followed was of paramount

importance for obtaining valid and reliable data and implementing the research project.

As Larson-Hall (2015) notes, research designs that incorporate repeated measures, such

as the RM-ANOVA design, are “quite desirable, as they increase the statistical power of a test”

(p. 323). The current article described issues and challenges in a study that adopted the principle

of natural progression of task complexity from simple to complex. Future studies that adopt an

RM-ANOVA design might want to investigate the consequences of the reverse change in task

complexity from complex to simple. It is hoped that the methodological issues and challenges

highlighted in the current article, as well as the suggestions provided, will help researchers in their

efforts to design and implement future RM-ANOVA studies.

References

Duff, P. A. (1985). Another look at interlanguage talk: Taking task to task. University of Hawai'i

Working Papers in English as a Second Language, 4(2).

Ellis, R. (2003). Task-based language teaching and learning. Oxford: Oxford University Press.

Foster, P., & Skehan, P. (1996). The influence of planning and task type on second language

performance. Studies in Second Language Acquisition, 18(3), 299-323.

Frear, M. W., & Bitchener, J. (2015). The effects of cognitive task complexity on writing

complexity. Journal of Second Language Writing, 30, 45-57.

doi:10.1016/j.jslw.2015.08.009

Halford, G. S., Cowan, N., & Andrews, G. (2007). Separating cognitive capacity from

knowledge: A new hypothesis. Trends in cognitive sciences, 11(6), 236-242.

Ishikawa, T. (2007). The effect of manipulating task complexity along the [+/-Here-and-Now]

dimension on L2 written narrative discourse. Investigating tasks in formal language

learning, 136-156.

Khodabandeh, F., Jafarigohar, M., Soleimani, H., & Hemmati, F. (2013). The impact of explicit,

implicit, and no-formal genre-based instruction on argumentative essay writing.

Linguistics Journal, 7(1).

Kuiken, F., & Vedder, I. (2007). Task complexity and measures of linguistic performance in L2

writing. International Review of Applied Linguistics in Language Teaching (IRAL), 261-

284. doi:10.1515

Kuiken, F., & Vedder, I. (2008). Cognitive task complexity and written output in Italian and

French as a foreign language. Journal of Second Language Writing, 17(1), 48-60.

doi:10.1016/j.jslw.2007.08.003

Kuiken, F., & Vedder, I. (2009). Tasks across modalities: The influence of task complexity on

linguistic performance in L2 writing and speaking. Paper presented at the colloquium

‘Tasks across modalities’, Task Based Language Teaching

Conference, Lancaster, UK.

Kuiken, F., & Vedder, I. (2011). Task performance in L2 writing and speaking: The effect of

mode. Second language task complexity: Researching the Cognition Hypothesis of

language learning and performance, 91-104.

Kuiken, F., & Vedder, I. (2012). Syntactic complexity, lexical variation and accuracy as a

function of task complexity and proficiency level in L2 writing and speaking. Dimensions

of L2 performance and proficiency: Complexity, accuracy and fluency in SLA, 143-170.

Larsen-Freeman, D. (2006). The emergence of complexity, fluency, and accuracy in the oral and

written production of five Chinese learners of English. Applied Linguistics, 27(4), 590-

619.

Larson-Hall, J. (2015). A guide to doing statistics in second language research using SPSS and

R. Routledge.

Long, M. H. (1990). Task, group, and task-group interactions.

Long, M. H. (2015). Second language acquisition and task-based language teaching. Hoboken:

Wiley Blackwell.

Loewen, S., & Plonsky, L. (2015). An A–Z of applied linguistics research methods. Macmillan

International Higher Education.

Rahimi, M. (2018). Effects of increasing the degree of reasoning and the number of elements on

L2 argumentative writing. Language Teaching Research, 1362168818761465.

Révész, A. (2011). Task complexity, focus on L2 constructions, and individual differences: A

classroom-based study. Modern Language Journal, 95, 162-181. doi:10.1111/j.1540-

4781.2011.01241.x

Robinson, P. (2001a). Task complexity, cognitive resources, and syllabus design: A triadic

framework for examining task influences on SLA. Cognition and second language

instruction, 287-318.

Robinson, P. (2001b). Task complexity, task difficulty, and task production: Exploring

interactions in a componential framework. Applied Linguistics, 22(1), 27-57.

Robinson, P. (2003a). Attention and memory during SLA. The Handbook of Second Language

Acquisition, 631-678.

Robinson, P. (2003b). The cognition hypothesis, task design, and adult task-based language

learning. Second Language Studies, 21(2), 45-105.

Robinson, P. (2005). Cognitive complexity and task sequencing: Studies in a componential

framework for second language task design. IRAL-International Review of Applied

Linguistics in Language Teaching, 43(1), 1-32.

Robinson, P. (2007). Task complexity, theory of mind, and intentional reasoning: Effects on L2

speech production, interaction, uptake and perceptions of task difficulty. IRAL-

International Review of Applied Linguistics in Language Teaching, 45(3), 193-213.

Robinson, P., & Gilabert, R. (2007). Task complexity, the Cognition Hypothesis and second

language learning and performance. IRAL, 45, 161-176.

Ruiz-Funes, M. (2015). Exploring the potential of second/foreign language writing for language

learning: The effects of task factors and learner variables. Journal of Second Language

Writing, 28, 1-19. doi:10.1016/j.jslw.2015.02.001

Skehan, P. (1996). A framework for the implementation of task-based instruction. Applied

Linguistics, 17(1), 38-62.

Skehan, P. (2009). Modelling second language performance: Integrating Complexity, Accuracy,

Fluency, and Lexis. Applied Linguistics, 30(4), 510-532. doi:10.1093/applin/amp047

Veerappan, V., Yusof, D. S. M., & Aris, A. M. (2013). Language-switching in L2 composition

among ESL and EFL undergraduate writers. Linguistics Journal, 7(1), 209-228.

Willis, D., & Willis, J. (2008). Doing task-based teaching. Oxford University Press.

Copyright of Linguistics Journal is the property of E.L.E. Publishing and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use.