
Experimental trials and ‘what works?’ in education: The case of grammar for writing

Dominic Wyse (a,*) and Carole Torgerson (b)

(a) UCL Institute of Education, London, UK; (b) Durham University, Durham, UK

The place of evidence to inform educational effectiveness has received increasing attention interna-

tionally in the last two decades. An important contribution to evidence-informed policy has been

greater attention to experimental trials including randomised controlled trials (RCTs). The aim of

this paper is to examine the use of evidence, particularly the use of evidence from experimental tri-

als, to inform national curriculum policy. To do this the teaching of grammar to help pupils’ writing

was selected as a case. Two well-regarded and influential experimental trials that had a significant

effect on policy, and that focused on the effectiveness of grammar teaching to support pupils’ writ-

ing, are examined in detail. In addition to the analysis of their methodology, the nature of the two

trials is also considered in relation to other key studies in the field of grammar teaching for writing

and a recently published robust RCT. The paper shows a significant and persistent mismatch

between national curriculum policy in England and the robust evidence that is available with regard

to the teaching of writing. It is concluded that there is a need for better evidence-informed decisions

by policy makers to ensure a national curriculum specification for writing that is more likely to have

positive impact on pupils.

Keywords: experimental trials; research evidence; grammar teaching; teaching writing

Introduction

One of the most important questions in education is: what works best to help children

and young people learn? Possible answers to this question are a daily reality for teach-

ers and their pupils, and the question is also of great concern to wider society, not

least because of governments’ significant expenditure on education, and the expecta-

tions that arise from this expenditure. Society expects schooling to enhance pupils’

learning as a result of teaching that is effective.

Over the last decade, across the world, the political impetus to examine ‘what

works?’ as part of educational effectiveness has coincided with the growth in the use

of two specific research designs to evaluate educational policy and practice: international comparative surveys using large data sets and, more recently, experiments and quasi-experiments (Connolly, 2015). International comparative work,

including the testing of representative samples of pupils, is a prominent feature in

education policy evaluation in both low-income and high-income nation states.

Examples include: the goal-driven approach of the United Nations Sustainable Devel-

opment Goals (United Nations, 2017); the test-driven comparisons of specific aspects

*Corresponding author. UCL Institute of Education, 20 Bedford Way, Bloomsbury, London

WC1H 0AL, UK. E-mail: [email protected]; Twitter: @Dominic_Wyse

© 2017 British Educational Research Association

British Educational Research Journal Vol. 43, No. 6, December 2017, pp. 1019–1047

DOI: 10.1002/berj.3315

14693518, 2017, 6, Downloaded from https://bera-journals.onlinelibrary.wiley.com/doi/10.1002/berj.3315 by California State University, Wiley Online Library on [12/11/2022]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License

of education such as literacy (UNESCO, 2006); and large-scale international surveys

such as the Programme for International Student Assessment (PISA, for secondary

schooling), the Progress in International Reading and Literacy Study (PIRLS, for pri-

mary schooling) and the Trends in International Maths and Science Study (TIMSS,

covering both primary and secondary schooling) that combine pupil testing with sur-

veys exploring some aspects of educational policies in the comparator countries. The

research designs used in these international surveys are able to establish correlations

between education policies and outcomes, but are not able to establish whether such

policies cause the observed outcomes. A causal relationship to demonstrate effective-

ness requires a design which features a control group [i.e. a ‘true’ experiment—a ran-

domised controlled trial (RCT)—or a quasi-experiment (QE)]. The extent to which

studies using a variant of RCT or QE design can establish stronger or weaker causal

inference also depends on the robustness within the design and its conduct. Although

RCTs comparing the curriculum policies of whole countries are not feasible, RCTs

of specific approaches to teaching are feasible, not least in areas such as literacy that

are included as a comparator in most international analyses of the kinds described

above.

A paramount source of information about effective teaching should be research;

however, the extent to which education research has contributed answers to the ques-

tions of teaching efficacy and effectiveness is fiercely debated. As early as 1972, in the

US Congress there was a view that education research was ‘mediocre and useless’

(Kaestle, 1993, p. 27). Thirty years later, it was observed that education in the USA

had been dragged ‘kicking and screaming, into the 20th century’ (Slavin, 2002, p. 15)

as a result of developments in education policy linked to ‘scientifically based

research’, such as the Elementary and Secondary Education Act (No Child Left Behind),

and emphasis on ‘proven, comprehensive reform models’ (Slavin, 2002, p. 15). At

the time, the US Office of Educational Research and Improvement invited nomina-

tion of programmes to be evaluated, ultimately using experimental designs, by third-

party evaluators (Slavin, 2002).

In the UK, the debate about the capacity of education research to contribute to

questions about effectiveness was reignited around 20 years ago. A trend of criticisms

of education research was typified by the Teacher Training Agency Annual Lecture in

1996 given by David H. Hargreaves who was, at the time, a Professor of Education at

the University of Cambridge. Hargreaves’ strong criticism of education research

included his opinion that it was poor value for money in relation to improving educa-

tion in schools, and that the teaching profession had been inadequately served by edu-

cation research. In a comparison with medicine, Hargreaves’ conclusion was that, ‘In

education we too need evidence about what works with whom under what conditions

and with what effects’ (Hargreaves, 1996, p. 8). More recently, the debate came to

prominence as a result of the work of the medical doctor, research fellow and journal-

ist Ben Goldacre (Goldacre, 2013). One notable aspect of Goldacre’s 2013 argument

is how similar it was to some of the points made 20 years previously by Hargreaves.

For example, the advocacy for RCT designs, the idea that research evidence about

education practice is weak, and comparisons with medicine were all addressed in

Hargreaves’ original lecture.

The aim of this paper is to examine the use of experimental trials in relation to evi-

dence about effective teaching, and to consider some links between research and

national curriculum policy. To do this we selected one important research area as a

case: the teaching of grammar to help pupils’ writing. The teaching of grammar for

writing is a useful case because the topic has attracted a fierce ideological debate as

well as a significant number of experimental and quasi-experimental trials evaluating

interventions to improve writing. Unlike previous work that has had a main focus on

the methodology of experimental trials or on the implications of evidence from experi-

mental trials for an aspect of policy and/or practice, our argument is built on an in-

depth analysis of methodology and research outcomes in a specified aspect of educa-

tion, namely the teaching of grammar to improve writing. Through examination of

methodology and a substantive topic, a stronger case can be made in relation to which

teaching method is likely to be effective.

The paper begins with a historical account of the debate about research evidence

and experimental trials in education. We then review the research evidence on gram-

mar teaching for writing. The curriculum policy context for grammar teaching is seen

in our brief description of representations of grammar in national curricula interna-

tionally, and an account of the development of England’s national curriculum of

2014. The main part of the paper is a detailed exploration of two well-regarded exper-

imental trials, published in peer-reviewed research journals, and focused on evaluat-

ing the role of grammar teaching in supporting the development of writing. The

studies examined similar approaches to teaching grammar in the same phase of edu-

cation, and both papers had a recognised impact on policy and practice. The studies

were also chosen because, although they addressed very similar teaching approaches,

they came to different conclusions about their effectiveness. One of the two papers

concluded that grammar teaching to support writing was not effective, whilst the other

paper concluded that it was effective. The important considerations for the argument

in the present paper are: (a) what the comparison of the two studies reveals about the

methodology of experimental trials; (b) the extent to which the outcomes of either of

the two studies are replicated in other experimental trials in the same field; and (c) as

a result of considering (a) and (b), whether the research evidence of grammar for writ-

ing is appropriately reflected in national curriculum policy in England, and what the

implications are for research, policy and practice.

Experimental trials in education research

Although the RCT is widely used in medical research, one of its first known uses to

investigate human activity (as opposed to RCT use in the natural sciences) in the

modern period was in the field of education. In the early 1930s in the USA, Walters

undertook two randomised experiments in the field of education (Walters, 1931,

1932). In a university setting Walters randomised the selection of members of the

freshmen class in the School of Mechanical Engineering at Purdue University. Some

of the freshmen were allocated to mentoring delivered by five seniors with a ‘good

scholarship record, pleasing personality, excellent health and fine social environment’, and some were allocated to a control condition. Academic outcomes were then measured, and Walters concluded that the students in the

mentoring condition had better outcomes than the students in the control condition

(no mentoring). Walters’ experiment is the first known use of the term ‘random sam-

pling’—or randomisation—to form equivalent groups: ‘The 220 delinquent freshmen

were divided into two groups by random sampling.’ The following year Walters under-

took a replication trial with a much larger sample size and random allocation to one

of three ‘arms’: mentoring by seniors; mentoring by Faculty members; and a control

condition (Walters, 1931, 1932) and he concluded that the senior students were more

effective in personal mentoring in reducing drop-out or exam failure than the Faculty

members.
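
As a minimal sketch of what dividing a cohort 'by random sampling' involves, the Python below shuffles 220 made-up student identifiers and deals them into trial arms. The helper name `allocate`, the identifiers, the fixed seeds and the equal split between arms are all illustrative assumptions; none of them is a detail reported by Walters.

```python
import random


def allocate(units, n_arms=2, seed=None):
    """Randomly allocate units to n_arms groups of near-equal size.

    Hypothetical helper for illustration; not taken from any trial protocol.
    """
    rng = random.Random(seed)   # local generator so the allocation is reproducible
    shuffled = list(units)      # copy, leaving the caller's sequence untouched
    rng.shuffle(shuffled)       # random permutation of the units
    # Deal the shuffled units round-robin into the arms.
    return [shuffled[i::n_arms] for i in range(n_arms)]


# 220 made-up identifiers standing in for the freshmen in the first trial.
students = [f"student_{i:03d}" for i in range(220)]
mentoring, control = allocate(students, n_arms=2, seed=1931)
print(len(mentoring), len(control))  # 110 110
```

Because chance alone decides which arm each student joins, the arms can be expected to be equivalent on both measured and unmeasured characteristics, which is what supports a causal reading of any difference in outcomes. Calling `allocate(students, n_arms=3)` would give a three-arm allocation of the kind described for the replication trial.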

Between 1900 and the 1960s many ‘explanatory’ experiments were undertaken in

the field of education, sometimes using randomisation. These tended to be con-

ducted by educational psychologists, working in psychology laboratories, investigat-

ing basic psychological processes relevant to learning. Between the 1930s and the

1970s many RCTs in education were undertaken in the USA (some large scale), but

there was a dearth of high-quality RCTs in education research in the UK. Between

the 1970s and the 2000s, there were very few large-scale RCTs in education in the

USA, as the design had largely fallen out of favour, although there were a few notable

exceptions.

After the 50-year lull in activity, greater emphasis on experimental trials to inform

education policy in the USA and the UK became evident. This step-change in the his-

tory of the use of the design was largely driven by two distinct policy initiatives on

either side of the Atlantic. In 2002, when George Bush enacted the No Child Left

Behind Act, the subsequent creation of the Institute of Education Sciences (IES) led

to public investment in the use of experimental design to evaluate education policies

and interventions. The legislation mandated the RCT as the design of choice for eval-

uating education interventions: ‘Scientifically valid educational evaluation employs

experimental designs using random assignment, when feasible, and other research

methodologies that allow for the strongest possible causal inferences when random

assignment is not feasible’ (p. 5). Since that time, over 200 experiments and quasi-

experiments have been funded by the IES and undertaken in education in the USA.

The IES does fund quasi-experiments, but only if randomisation is not thought to be

feasible, which occurs in rare circumstances where, for example, it may be deemed

unethical to undertake random allocation.

The UK counterpart to this greater use of experimental trials in the USA was the creation of the Education Endowment Foundation (EEF) in 2011, and

its requirement that ‘. . . all EEF projects will be rigorously evaluated by independent

experts in educational research according to minimum standards . . . The impact of

projects on attainment will be evaluated, where possible, using randomised controlled

trials’ (Education Endowment Foundation, 2017). The EEF has now funded over

120 RCTs and quasi-experiments, evaluating education policies and practices; simi-

lar to the IES, it only funds quasi-experiments where randomisation is not feasible.

The similarity of the criticisms of educational research made by both Hargreaves

and Goldacre, alluded to earlier in this paper, seemed to indicate that little had chan-

ged in relation to the nature of educational research; however, the evidence shows a

different picture. Between 1980 and 2015 the number of RCTs in education demon-

strated significant increases, particularly from 2006 onwards (Connolly, 2015).

According to Connolly’s analysis, although the USA and Canada are still responsible

for undertaking the majority of RCTs (approximately 375), the UK had a significant

number (approximately 80), in comparison to much larger population areas (e.g. rest

of Europe approximately 140; Australia/New Zealand 50) (Connolly, 2015). More

than 200 of the RCTs have focused on interventions taking place over a full academic

year or longer (the short duration of RCTs was a criticism made by Slavin, 2002). Approximately 540 RCTs focused on: physical health and wellbeing; behaviour and social

wellbeing; and professional training. Approximately 90 evaluated literacy/English lan-

guage interventions; and approximately 225 focused on other academic interventions

and outcomes, study-related skills and numeracy/maths interventions (Connolly,

2015). There is less evidence about the frequency of use of other experimental

designs.

During this period of growing emphasis on RCTs in education the longstanding

philosophical critique of ‘positivist’ methodologies also continued. RCTs, as part of

evidence-based education research, have been criticised because they are reduction-

ist and not appropriate for the evaluation of educational interventions which, as a

result of the complexity of the social context, are necessarily more challenging com-

pared with experiments in the natural sciences (e.g. Morrison, 2001). But others

have countered with the opinion that, for some research questions, a well-con-

ducted RCT is the strongest research design when seeking to compare effectiveness

of interventions. For example, the potential of RCTs was seen in a complex inter-

vention on sex education at secondary education level that paid careful attention to

the methodological challenges of evaluation in the real world of secondary schools

(Moore et al., 2003). Another more recent strand of the debate has linked a critique

of positivism with support for ‘realist’ approaches (e.g. as recommended by the

criminologists Pawson & Tilley, 1997), including the important idea that ‘what

works’ should have a central focus on who an intervention works for, and the con-

text in which a specific intervention can work. In an exploration of Pawson and Til-

ley’s ideas, Bonell et al. (2012) acknowledge the importance of attention to theories

of causal mechanisms but critique the realist position on the grounds of: misunder-

standing of the use of counterfactuals; the resultant limit on findings based on plau-

sibility rather than on probability (in a statistical sense); and a lack of

acknowledgement that well-conducted experiments do include attention to mecha-

nisms and context but are also able to assess causal attribution, something which

realist approaches cannot do. Stronger experimental studies have, for some time,

recognised context and methodological limitations. This recognition is evident in

the seminal book on experimental design: ‘The experiment is not a clear window

that reveals nature directly to us. To the contrary, experiments yield hypothetical

and fallible knowledge that is often dependent on context and imbued with many

unstated theoretical assumptions . . . In this sense, all scientists are epistemological

constructivists and relativists, the difference is whether they are strong or weak rela-

tivists’ (Shadish et al., 2002, p. 29). More recently, the epistemological debate has

also been informed by ongoing developments in mixed-methods design and

methodology, including the more routine use of process evaluation, or embedded

ethnography, as part of RCTs. These developments include the recognition that the

dualisms and intellectual tensions that are part of mixed-methods methodology,

and of understanding what works, are usefully framed by philosophical pragmatism

(Johnson et al., 2017).

The teaching of grammar in national curricula

Recent growth of interest in grammar for writing has been clearly evident in develop-

ments in national curricula in a range of countries with English as a main language.

For example, in the Australian Curriculum’s English learning area, the language strand

is positioned first in the curriculum structure before the strands for literature and lit-

eracy. For children aged 10 to 11 this strand includes explicit attention to ‘sentences

and clause-level grammar’ and to ‘noun groups/phrases’ and ‘adjective groups/

phrases’ (Australian Curriculum Assessment and Reporting Authority, 2017). In the

USA the Common Core State Standards text for English Language Arts for the same

age of children specifies reading, writing, speaking and listening, then language. As

part of the language specification ‘Conventions of Standard English Grammar and

Usage’ (including forming perfect verb tenses; explaining the function of preposi-

tions; etc.) is listed before ‘Knowledge of Language’ and ‘Vocabulary Acquisition and

Use’ (National Governors Association Center for Best Practices Council of Chief

State School Officers, 2010). These kinds of emphases on grammar are not only evi-

dent in high-income nations and states but also in other post-colonial countries with

historic links to the British Empire, for example in the countries of Africa (e.g. see

Wyse et al., 2014).

The emphases in New Zealand’s national curriculum appear to differ somewhat from those of the countries surveyed in this paper so far. In The New Zealand Curriculum

the focus on language is a holistic one, with an emphasis on the making and creating

of meaning (New Zealand Ministry of Education, 2007, p. 18). This holistic attention

to language is also reflected in the strong place of the indigenous language Te Reo

Māori and New Zealand Sign Language, and in the title ‘an English medium curriculum’. The emphasis on grammar also appears to be different. For example, the specification of ‘Language features’ as part of ‘Speaking, Writing and Presenting’ is

positioned last in the list of curriculum requirements, and emphasises the way that

pupils should understand grammar as follows: ‘Use a wide range of text conventions,

including grammatical and spelling conventions, appropriately, effectively, and with

accuracy’ (New Zealand Curriculum, Years and Curriculum Levels, Level Six Eng-

lish).

In the different countries of the UK the national curricula for language and English

have differed markedly since political devolutions of power, with England having

increasingly greater emphasis on discrete elements such as grammar and phonics (Wyse

et al., 2013). The importance attributed to grammar by policy makers in England

since 2011 can be seen in the intensification of the teaching of formal grammar as part

of the subject of English in England’s national curriculum. In the national curriculum

of 2014 the programmes of study for writing for 9- to 11-year-old pupils include

statutory requirements for the teaching of ‘Writing – transcription’, including spel-

ling, handwriting and presentation. These sections are followed by writing composi-

tion (planning and drafting), then vocabulary, grammar and punctuation. Increased

attention to vocabulary, grammar and punctuation is added through an appendix that

includes an emphasis on ‘explicit knowledge of grammar’ (Department for Education, 2013a, p. 75), where pupils in Year 3 (7 to 8 years old) are expected to understand terminology that includes ‘subordinate clause’ and pupils in Year 6 (10 to 11 years old) need to be introduced, for example, to the ‘use of the passive to affect the presentation of information in a sentence’ (Department for Education, 2013a, p. 79, emphasis in original).

In addition to the emphasis in the national curriculum programmes of study, the

national statutory tests for 11-year-old pupils in England included for the first time in

2011 a separate spelling, punctuation and grammar test where formal grammar was

further emphasised. In addition, the requirements for teacher assessment of writing

included a strong emphasis on grammar as part of the assessment criteria. In 2016

these emphases were still in place. For example, the national statutory test for Spel-

ling, Punctuation and Grammar included a strong emphasis on formal grammar

including questions that required knowledge of grammatical terminology (e.g. ‘27.

Underline the subordinate clause in each sentence below’; Standards and Testing

Agency, 2016, p. 17, emphasis in original). All questions in the paper attracted one

mark each. Although the 2016 criteria for statutory teacher assessment of writing,

produced by pupils in lessons, included aspects such as ‘creating atmosphere’ in their

writing, there was a strong emphasis on usage according to areas of formal grammar

such as ‘passive and modal verbs’ and ‘adverbs, preposition phrases and expanded

noun phrases’, etc. (Standards and Testing Agency, 2015).

The politics and policies that led to the emphasis on formal grammar in England’s

national curriculum implemented from 2014 onwards began with a government

White Paper in 2010 that included the commitment to ‘Review and reform the

National Curriculum so that it becomes a benchmark outlining the knowledge and

concepts pupils should be expected to master to take their place as educated members

of society’ (Department for Education, 2010, p. 41). The link between statutory

assessment, the curriculum and school accountability was also made clear: ‘The

National Curriculum will continue to inform the design and content of assessment at

the end of key stage two, which will apply to every child and which will provide a

guide to the performance of primary schools’ (Department for Education, 2010, p.

42). After publication of the White Paper the government commissioned a review of

assessment in England led by Lord Bew. Bew’s final report noted that ‘there are some

elements of writing – spelling, grammar, punctuation, vocabulary – where there are

clear “right” and “wrong” answers, which lend themselves to externally-marked test-

ing . . . Internationally a number of jurisdictions conduct externally-marked tests of

spelling, punctuation and grammar . . . These are essential skills and we recommend

that externally-marked tests of spelling, punctuation, grammar and vocabu-

lary should be developed’ (Bew, 2011, p. 60, emphasis in original).

A public consultation on the proposals for the new national curriculum was held

between February and April 2013. It attracted 17,312 respondents with 4,576

described as ‘non-campaign respondents’ and 12,736 described as ‘campaign

respondents’ (i.e. organisations devoted to a particular issue; the report of the consul-

tation made clear that campaign responses were not included in the percentages of

answers to questions but were reflected in the commentaries about the answers).

3,682 respondents addressed the question ‘Do you have any comments on the

content set out in the draft programmes of study?’ With regard to the teaching of the

subject English, and the teaching of grammar within that subject, ‘There was recogni-

tion that the teaching of phonics, punctuation, spelling and grammar was necessary,

but some felt that there was an over-emphasis on these aspects’ (Department for

Education, 2013b, p. 7). It is disappointing that the number of respondents who

replied about grammar was not specified in the report as this would have provided

some further evidence relevant to the strength of opinion on this issue.

There was also a follow-up consultation, open from July to August 2013, on the

draft legislative order, which attracted further comment about English and grammar.

Although 21 respondents (11%) supported the greater focus on spelling, grammar

and punctuation,

a total of 36 respondents (19%) however expressed concern in relation to the more

demanding grammatical content included for years 2 and 4 . . . 52 respondents (28%) said

the English primary curriculum was too prescriptive, in particular in reference to the level

of specification in the appendices [where the grammatical knowledge to be learned by

pupils is specified]. These respondents argued that this undermined the aims of the new

national curriculum in relation to greater professional freedom and were concerned that

this may have implications for the provision of a balanced and broadly based school cur-

riculum. (Department for Education, 2013c, p. 6)

One interpretation of these data in the second consultation is that 47% of respon-

dents were critical of the grammar specified in the national curriculum and its appen-

dices, but 11% thought the emphasis on correct use of Standard English was

commendable. An overall negative response to the proposed attention to grammar

did not result in changes to this element of the national curriculum.
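
The arithmetic behind this interpretation can be checked with a short sketch. The counts and percentages are those quoted from the consultation report above. The dictionary labels are hypothetical, and the total number of respondents is not given in the quoted passage, so it is back-calculated here from rounded percentages and should be read as an estimate only.

```python
# Counts and rounded percentages as quoted from the follow-up consultation report.
reported = {
    "supported": (21, 11),
    "concerned": (36, 19),
    "too_prescriptive": (52, 28),
}

# Total number of respondents implied by each (count, percentage) pair.
implied_totals = {
    label: round(count * 100 / pct) for label, (count, pct) in reported.items()
}
print(implied_totals)

# The 47% figure is the sum of the two critical groups; this assumes that
# no respondent was counted in both groups.
critical_n = reported["concerned"][0] + reported["too_prescriptive"][0]
critical_pct = reported["concerned"][1] + reported["too_prescriptive"][1]
print(critical_n, critical_pct)  # 88 47
```

The implied totals fall between 186 and 191, which is the kind of spread rounded percentages produce. The 47% figure holds only if the 36 and 52 respondents do not overlap; if some respondents raised both concerns, the share of critical respondents would be lower.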

Reservations about the nature of the specifications for grammar teaching in the

national curriculum and its associated statutory testing continued to cause disagree-

ment. The main government advisor for grammar in the statutory assessment system

described the process of determining the curriculum for grammar as ‘chaotic’ and

said that ‘We started off with the primary curriculum, which we were a bit unconfi-

dent about as none of us had much experience of primary education’ (Mansell,

2017). In April 2017 a House of Commons Education Select Committee report on

assessment in primary schools concluded that:

One issue with the writing assessment is the focus on technical aspects, like grammar and

spelling, over creativity and composition. We are not convinced that this leads directly to

improved writing and urge the Government to reconsider this balance and make spelling,

punctuation and grammar tests non-statutory at Key Stage 2 (House of Commons Educa-

tion Committee, 2017, p. 3)

This brief account of some of the work that led to greater emphasis on grammar in England’s national curriculum, and subsequent implications, shows that research evidence, of any kind, had insufficient consideration and influence on the national curriculum of 2014. Further corroboration of problems with attention to research evidence was detailed by BERA President Mary James (British Educational Research Association, 2012), one of the expert group advising on the national curriculum. In addition, reflecting on his time as Minister for Schools under Secretary of State for Education Michael Gove, David Laws claims that decisions were made ‘not based on evidence but on hunch’ (Wilby, 2017) and that Gove had a particular weakness for basing decisions on ‘ideology and personal experience’ (Wilby, 2017).

© 2017 British Educational Research Association

14693518, 2017, 6, Downloaded from https://bera-journals.onlinelibrary.wiley.com/doi/10.1002/berj.3315 by California State University, Wiley Online Library on [12/11/2022]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License

Research evidence on grammar for writing

The place of grammar in education has been a point of debate for at least 200 years, in part because it has been repeatedly linked with the development of the concept of ‘standard’ English (Crystal, 2004). In the twenty-first century general interest in grammar teaching as an element in the teaching of writing continued (Wyse, 2001; Andrews et al., 2004a,b; Myhill & Watson, 2014). In 2001, as a result of a comprehensive narrative review of empirical studies, it was concluded that:

The findings from international research clearly indicate that the teaching of grammar (using a range of models) has negligible positive effects on improving secondary pupils’ writing. Of further concern is the negative impact on pupils’ motivation. In the [National Literacy Strategy] Framework for Teaching the move towards the teaching of grammatical ‘technical vocabulary’ such as adjective; noun: collective, common, proper; pronoun: personal, possessive; verb, and verb tense to six and seven year-old children in England is highly questionable. It is regrettable that there is not more evidence about primary pupils; however, the developmental arguments that such teaching is inappropriate at primary level are persuasive. (Wyse, 2001, p. 422, emphasis added)

This finding was subsequently supported in two systematic reviews (SRs) undertaken by one of the authors of this paper and colleagues (Andrews et al., 2004a,b). In the first systematic review evaluating the effect of grammar teaching (syntax) in English on 5- to 16-year-olds’ accuracy and quality in written composition, Andrews et al. (2004a) concluded there was insufficient high-quality evidence to ‘counter the prevailing belief that the teaching of the principles underlying and informing word order or “syntax” has virtually no influence on the writing quality or accuracy of 5 to 16 year-olds’ (Andrews et al., 2004a). This conclusion applied to both the ‘traditional’ approach of emphasising word order and parts of speech and the ‘transformational’ approach, based on transformational-generative grammar. The current picture of robust research in relation to grammar teaching to support pupils’ writing is shown in Tables 1 and 2.

As the evidence summarised in Tables 1 and 2 shows, as far as primary/elementary education is concerned there is strong evidence that grammar teaching of a range of types, but particularly traditional grammar teaching, is not effective for improving pupils’ writing. There is evidence that sentence-combining is effective but no experimental studies have been carried out in the UK. At secondary education level there is a slightly more mixed picture. The majority of the evidence suggests that, apart from sentence-combining, grammar teaching is not effective for improving pupils’ writing. However, one robust study, Myhill et al. (2011), showed that contextualised grammar teaching was effective for improving secondary pupils’ writing, although the approach was more effective for higher-attaining pupils.

Table 1. A selection of key meta-analyses and influential single experimental studies (primary/elementary education)1

Citation: Andrews et al. (2004a)
UK only or other countries: Other countries: worldwide
Type of pupils: Multiple types of pupils
Sample; age of pupils: Multiple samples; varied
Design: Systematic review
Duration of intervention: Multiple
Intervention summary: Multiple intervention types (e.g., generative grammar, exposure to story and standard English features, transformational grammar, traditional grammar, contextualised grammar)
Control summary: Multiple but not specified
Summary of main outcome and conclusion: Grammar teaching has virtually no impact on pupils’ writing. Teaching of syntax in English should cease to be part of the curriculum

Citation: Andrews et al. (2004b)
UK only or other countries: Other countries: worldwide
Type of pupils: Multiple types of pupils
Sample; age of pupils: Multiple types of pupils
Design: Meta-analysis
Duration of intervention: Multiple
Intervention summary: Sentence-combining
Control summary: Multiple but not specified
Summary of main outcome and conclusion: The National Curriculum in England should be revised to take into account that the teaching of sentence-combining is effective

Citation: Graham et al. (2012)
UK only or other countries: Other countries: worldwide
Type of pupils: Multiple types of pupils
Sample; age of pupils: Multiple samples; varied
Design: Meta-analysis
Duration of intervention: Multiple
Intervention summary: Multiple intervention types (‘e.g., comparisons were made to process writing, strategy instruction, and typical language arts instruction’, p. 887)
Control summary: Grammar instruction (‘e.g., students systematically studied parts of speech, diagrammed sentences, and so forth’, p. 881)
Summary of main outcome and conclusion: Teaching grammar does not improve pupils’ writing. A focus on a range of evidence-based approaches to teaching writing is more beneficial than grammar teaching for writing

Citation: Graham and Harris (2017)
UK only or other countries: Other countries: worldwide
Type of pupils: Multiple types of pupils
Sample; age of pupils: Multiple samples; varied
Design: Meta-analysis of meta analyses
Duration of intervention: Multiple
Intervention summary: Multiple intervention types (‘e.g., comparisons were made to process writing, strategy instruction, and typical language arts instruction’, p. 887)
Control summary: Grammar instruction
Summary of main outcome and conclusion: Apart from the sentence combining approach, grammar teaching does not have a positive effect on pupils’ writing. There are a range of evidence-based approaches and strategies that can have a positive effect on pupils’ writing

Citation: Fogel and Ehri (2000)
UK only or other countries: Other countries: USA
Type of pupils: Two North-eastern US cities with sizeable populations of African-American residents
Sample; age of pupils: 89 African-American BEV-speaking 3rd- and 4th-grade elementary school students; age 8–10. Twelve intact 3rd- and 4th-grade elementary school classes
Design: RCT
Duration of intervention: Two sessions of about 60 minutes each
Intervention summary: Intervention group 1 (Exposure E): Focus on correcting non-standard forms of English common to Black English Vernacular. Intervention group 2 (Exposure plus strategy instruction ES): same as group 1 plus strategy instruction. Intervention group 3 (Exposure, strategy instruction plus practice ESP): same as group 2 plus practice
Control summary: Three experimental conditions
Summary of main outcome and conclusion: The combination of exposure to correcting non-standard grammar, strategy instruction and practice, at least at paragraph level, is beneficial for writing

Citation: Saddler and Graham (2005)
UK only or other countries: Other countries: USA
Type of pupils: Nine classrooms/three schools. More skilled writers vs. less skilled writers
Sample; age of pupils: 44 pupils; 9–10. 4th grade
Design: RCT
Duration of intervention: 30 lessons
Intervention summary: Sentence-combining
Control summary: Traditional grammar. Grammar skills: parts of speech. Precision of student vocabulary in their writing. The instructor modelled, while thinking aloud, how to apply the target part of speech for that unit. This involved showing the students a sentence with the target part of speech missing and reading the sentence aloud
Summary of main outcome and conclusion: Sentence-combining with peer assistance had a positive effect on pupils’ writing

Citation: Torgerson et al. (2014)
UK only or other countries: UK
Type of pupils: 53 primary schools from four geographical regions across England
Sample; age of pupils: Estimated 2549–2649; age 10–11. Year 6
Design: RCT
Duration of intervention: 15 lessons over four weeks
Intervention summary: Contextualised grammar
Control summary: Pupils randomised to the ‘business-as-usual’ group received their usual literacy lesson as planned by their teacher
Summary of main outcome and conclusion: Contextualised grammar was not effective in improving pupils’ writing as a whole-class intervention

1 The two meta-analyses by Andrews et al. address primary and secondary education.

Table 2. A selection of key meta-analyses and influential single experimental studies (secondary education)

Citation: Graham and Perin (2007a)
UK only or other countries: Other countries: worldwide
Type of pupils: Multiple types of pupils
Sample; age of pupils: Multiple samples; varied
Design: Meta-analysis
Duration of intervention: Multiple
Intervention summary: Multiple intervention types: sentence-combining; process writing; expository skills; creative thinking and sentence-combining
Control summary: Grammar instruction, traditional grammar instruction and one study of grammar instruction in context
Summary of main outcome and conclusion: Teaching grammar does not improve pupils’ writing. There are a variety of instructional procedures that improve the quality of the writing of adolescent students

Citation: Graham and Perin (2007b)
UK only or other countries: Other countries: worldwide
Type of pupils: Multiple types of pupils
Sample; age of pupils: Multiple samples; varied
Design: Meta-analysis
Duration of intervention: Multiple
Intervention summary: Multiple intervention types including: strategy instruction; process approach; group work; grammar instruction in context; reading and writing
Control summary: Traditional grammar instruction (no further specification)
Summary of main outcome and conclusion: Grammar teaching can have a negative effect on pupils’ writing. A focus on a range of evidence-based approaches to teaching writing is more beneficial than grammar teaching for writing

Citation: Graham and Perin (2007c)
UK only or other countries: Other countries: USA
Type of pupils: Multiple types of pupils
Sample; age of pupils: Multiple samples; varied
Design: Meta-analysis. Includes review of single subject designs (i.e., not RCTs)
Duration of intervention: Multiple
Intervention summary: Multiple intervention types: free time; mini-lessons; direct method training on errors; goal-setting and attributional feedback
Control summary: Traditional grammar [‘This approach involved the explicit and systematic teaching of grammar (e.g., the study of parts of speech and sentences)’, p. 319]
Summary of main outcome and conclusion: There are serious questions about the value of traditional school grammar

Citation: Bateman and Zidonis (1966)
UK only or other countries: Other countries: USA
Type of pupils: University school of the University of Ohio
Sample; age of pupils: 50 pupils; age 14–16. 9th and 10th grade
Design: RCT
Duration of intervention: Two years; experimental condition introduced into normal teaching
Intervention summary: Teaching not specified very clearly, but the Chomsky transformational approach appears to be sentence-combining: ‘Forty-six transformational rules served to identify the grammatical operations that each sentence in the sample reflected. These rules are of four types: Embedding, Conjoining, Deleting, and Simple’ (p. 8)
Control summary: ‘Each class [intervention and control] studied what would be considered the regular curriculum at the school with this exception: the experimental class studied materials specially adapted by the investigators from the area of generative grammar’ (p. 7)
Summary of main outcome and conclusion: Sentence-combining based on Chomsky’s ‘generative grammar’ can help formation of well-formed sentences

Citation: Myhill et al. (2011)
UK only or other countries: UK
Type of pupils: Standard classes
Sample; age of pupils: 32 Year 8 mixed-ability classes in comprehensive schools with between 24 and 30 students in each class; 744 total (412 pupils in intervention classes; 332 in control classes); age 12–13
Design: RCT
Duration of intervention: One year, three-week period once per term
Intervention summary: Contextualised grammar: ‘comprised detailed teaching schemes of work in which grammar was embedded where a meaningful connection could be made between the grammar point and writing’ (p. 146)
Control summary: Normal teaching that included attention to ‘using grammar accurately and appropriately...’ (p. 146)
Summary of main outcome and conclusion: Contextualised grammar teaching has a positive effect on pupils’ writing

Citation: O’Hare (1973)
UK only or other countries: Other countries: USA
Type of pupils: Florida State University High School
Sample; age of pupils: 83 pupils (41 in intervention groups; 42 in control groups); age 12–13. 7th grade
Design: RCT
Duration of intervention: 19 lessons
Intervention summary: Sentence-combining
Control summary: Regular curriculum in English
Summary of main outcome and conclusion: Sentence-combining, not formal knowledge of grammar, has a favourable effect on pupils’ writing

Citation: Elley et al. (1976)
UK only or other countries: Other countries: Australia
Type of pupils: Outskirts of Auckland city; one large co-educational high school
Sample; age of pupils: 248 pupils; age 13–17. 3rd form to 6th form
Design: Quasi-experiment
Duration of intervention: Approximately 574 periods of English in three years
Intervention summary: Contextualised grammar
Control summary: Group 1: ‘Reading-Writing’. Group 2: ‘Let’s Learn English’
Summary of main outcome and conclusion: A range of types of grammar teaching showed no overall benefits for pupils’ writing

Citation: Fearn and Farnan (2007)
UK only or other countries: Other countries: USA
Type of pupils: Urban high school
Sample; age of pupils: All three classes contained 24 to 26 10th-graders; age 15–16
Design: Quasi-experiment
Duration of intervention: Five weeks, 10 to 12 minutes twice per week
Intervention summary: Not completely clear; relatively formal grammar teaching and some contextualised work
Control summary: Traditional grammar
Summary of main outcome and conclusion: Mixture of formal grammar and contextualised grammar can enhance writing. In the context of high-stakes grammar tests, teachers’ focus should be a linguistic one on the organisation and reorganisation of words as part of writing

Citation: Harris (1962)
UK only or other countries: UK
Type of pupils: Five schools (two grammar schools, one secondary modern and one comprehensive/technical)
Sample; age of pupils: 109 pupils in non-grammar classes; 119 in grammar classes; age 11–13
Design: Quasi-experiment
Duration of intervention: Two years; experimental condition introduced into normal teaching
Intervention summary: Formal grammar: explicit learning and naming of parts; corrections referred to using technical vocabulary such as failure of verb and subject agreement. Included the contextualised element of practicing the use of the form in whole-text context
Control summary: Extension of normal composition practice enabling the completion of longer projects such as diary, newspaper, story, etc.
Summary of main outcome and conclusion: Formal grammar teaching does not have a positive effect on pupils’ writing

In about 2010, a challenge to the longstanding view that grammar teaching was not the most effective way to improve writing emerged from researchers in the UK. For example, in an interview, it was stated: ‘. . . what we have for the first time ever, internationally, is research evidence that shows that the teaching of grammar can have an impact on children’s writing skills. But the way that we taught it was completely unique’ (Education Arena, n.d.). More specifically, it was claimed in relation to a RCT evaluating grammar teaching that, ‘the strong positive effect of the intervention signals for the first time the potentiality of grammar as an enabling element in writing development and evidences a clearly theorised role for grammar in writing pedagogy’ (Myhill et al., 2011, p. 162). The overall claim that an experiment had demonstrated that grammar teaching could have a positive impact on secondary pupils’ writing skills (Myhill et al., 2011)1 was in opposition to the conclusions of a seminal experiment published in 1976 that had evaluated a similar grammar intervention to improve writing (Elley et al., 1976). This experiment had concluded that grammar teaching did not have a positive effect on writing.

The Elley and Myhill studies have been selected for detailed comparison in this paper because both were experimental trials (one a RCT, the other a QED), both were regarded as having significant wider impact, including political and professional impact, and both were published in peer-reviewed research journals. Elley et al.’s (1976) quasi-experiment has been regarded as one of the most rigorous in the field, having been reprinted by the US National Council for Teachers of English (NCTE) because it was ‘so important [and] a model of evaluation’ (Elley et al., 1976, p. 5). The impact of Myhill et al.’s (2010) randomised experiment was recognised by the UK Economic and Social Research Council because it ‘shaped policy and curriculum development in England - including the first author leading the advisory group of four writing the Grammar Annex of the Primary English curriculum; participation in the KS2 English Test team; and providing expert testimony in discussions of the English curriculum revision with the Minister of State for Schools (2012) . . . Professor Myhill also provided evidence for the new Australian curriculum’ (Economic and Social Research Council, 2016). As will be demonstrated in detail below, each of these studies had relative strengths and limitations, not least in their basic design; however, due to the lack of availability of any other experimental research addressing the same teaching approaches, and having had the same reach and significance as these two studies, we consider such a comparison relevant. We do acknowledge, however, the challenges and limitations in making the comparison, given the differences between the two studies, in particular, in terms of the countries and years in which they were undertaken and published.

Below we discuss the two studies in detail, in terms of the intervention and control conditions, design and features and components of design, and assess the main methodological strengths and limitations.

Teaching methods for the control and intervention groups

An important consideration for any experimental trial, or a systematic review of trials (and for our comparison in this paper), is that the nature of the teaching methods is clearly specified in the publication, and is a suitable comparison, including a comparison with at least one appropriate control group. In both the Elley and Myhill studies a form of contextualised teaching of grammar was one of the interventions evaluated.2


For the Elley study one of the intervention groups used an approach called the Transformational Grammar (TG) course, based on Jerome Bruner’s concept of the spiral curriculum. In this intervention group all the activities ‘were related to the central core of each strand of the curriculum, thus giving it [the teaching approach] a clear and consistent unity of purpose’ (p. 8). The TG intervention included the three strands of (a) Grammar (Transformational), (b) Rhetoric and (c) Literature.

One control group in the Elley study used an approach called ‘Reading–Writing’, which included rhetoric and literature (as did the TG intervention) but replaced transformational grammar with extra reading and creative writing. The other control group used an approach called ‘Let’s Learn English’: a traditional approach to grammar including the learning of parts of speech and some applications of them.

The Elley intervention and control groups ‘had approximately 574 periods of English in the three years, distributed such that each class had similar proportions of morning and afternoon periods, and of time spent on literature, on composition work, and evaluation exercises’ (p. 10). Although it was claimed that ‘no detectable bias was apparent in their approach to their teaching of any of the [grammar] courses’ (p. 10), there was no attempt to establish fidelity to the interventions, which is a significant limitation of this study.

In the Myhill study, the intervention took place over three weeks per term for one school year: ‘for both the intervention and comparison groups, the learning focus, the period of study, the learning objectives and the assessed written outcomes were the same’ (p. 147). For the intervention group the teaching designed by the project team ‘explicitly sought to introduce grammatical constructions and terminology at a point in the teaching sequence which was relevant to the genre being studied’ (p. 148). The intervention and control groups were both taught the same writing genre over a three-week period once per term of the year of study. The teaching in both groups also addressed the same learning objectives from England’s national framework for English that was being implemented at the time. The intervention in the Myhill study ‘comprised detailed teaching schemes of work in which grammar was embedded where a meaningful connection could be made between the grammar point and writing’ (p. 146). The Myhill intervention was based on the following principles:

• The grammatical meta-language is used but it is always explained through examples and patterns.

• Links are always made between the feature introduced and how it might enhance the writing being tackled.

• The use of ‘imitation’: offering model patterns for students to play with and then use in their own writing.

• The inclusion of activities which encourage talking about language and effects.

• The use of authentic examples from authentic texts.

• The use of activities which support students in making choices and being designers of writing.

• The encouragement of language play, experimentation and games.

(Myhill et al., p. 148)

There are two issues with the specification of teaching approaches in the interven-

tion and control groups in the Myhill study. Firstly, in each term of delivery both

What works in education 1037

© 2017 British Educational Research Association

14693518, 2017, 6, Downloaded from https://bera-journals.onlinelibrary.wiley.com/doi/10.1002/berj.3315 by California State University, Wiley Online Library on [12/11/2022]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License

intervention and control groups experienced teaching where ‘. . . using grammar accu-

rately and appropriately. . .’ was a pre-planned objective in the scheme of work. In the

intervention groups: ‘The intervention comprised detailed teaching schemes of work

in which grammar was embedded where a meaningful connection could be made

between the grammar point and writing’ (p. 7). The control groups did receive some

grammar teaching, as the teaching objectives used by both intervention and control

groups specify: ‘Autumn Term/Narrative Fiction/Using grammar accurately and

appropriately’ (p. 7, emphasis added). Secondly, as in the Elley study, there were no

checks for fidelity in either condition: ‘Fidelity is a problematic concept in a naturalis-

tic educational setting such as this, as identical implementation of the intervention

teaching materials is neither possible nor desirable. Teachers [in the intervention]

were not asked to follow the lesson plans rigidly; they were allowed to adapt materials

to suit the needs of their students, but were also asked to remain as close as possi-

ble to the materials’ (p. 9). So it is possible that the grammar teaching delivered by

the teachers in the control condition included contextualised teaching of the Myhill

kind; or that they used formal grammar; or more probably that there was a mixture of

approaches. As a result the specific role of grammar was not isolated in the trial. It

cannot be definitively claimed that it was the grammar that was effective, or not effec-

tive, in either the Myhill or the Elley studies because it could have been a range of fac-

tors, including simply better teaching as a result of the training (i.e. the Hawthorne

effect).

Site, sampling, design and allocation to groups

The Elley study took place in one large co-educational high school on the outskirts of

Auckland city. At the start it involved 248 pupils in eight matched classes of average

ability who were taught, observed and regularly assessed from the beginning of third-

form year in February 1970 to the latter part of fifth-form year in November 1972.

The results of the reading test, the assessment of the distribution of fathers’ incomes,

the secondary certificate of education exam results and the inclusion of 15% Polyne-

sian pupils indicated a so-called ‘normal’ sample. Elley noted that, ‘At the outset, one

bright and three slow-learning classes were deliberately excluded from the total third-

form intake of 380 pupils, thus rendering it more homogeneous, and increasing the

chance of identifying systematic differences between groups’ (p. 7). The experimental

pupils ‘were classified into eight matched classes of 31 pupils’ on the basis of a num-

ber of tests, and additional matching criteria were ‘ethnic group, sex, contributing

school, and subject options’ (p. 7). Although the pupils were allocated as individuals

to the eight classes, the study—after this allocation—works as a cluster trial as the

pupils in the eight classes were taught together. The three experimental groups con-

tained three, three and two classes respectively, and the pupils were tested during the

intervention period and at the end.

Limitations of the sampling and grouping in the Elley study include: the lack of

random allocation to groups; the small sample size of eight classes or clusters in total

split between three groups (statistical methodologists state that, as a minimum, there

should be four clusters per group in a cluster randomised trial; Donner & Klar,

2000); and the fact that it was undertaken in only one school, thereby reducing

external validity. The single-school setting also raises the possibility of ‘contamination’

or ‘spill over’ of the intervention and control conditions between the groups; it is not

clear whether this occurred. The lack of random allocation is important because random

allocation minimises any selection bias at the start of the experiment.

In a quasi-experiment matching is sometimes used to ensure baseline

equivalence, as in this case. The classes were matched on a number of variables

including performance on a number of pre-tests. However, Elley et al. did not report

the results of the matching and, therefore, we have to take on trust that the classes

were, in fact, matched on the observed variables. Also, matching cannot account for

imbalance on unknown variables, which can in turn introduce a potential source of

bias which could affect outcome. Furthermore, Elley and colleagues did not adjust

for the clustering in their analysis and instead analysed their data as though this was

an individually allocated quasi-experiment; although they made some attempts to

control for teacher effect, given the small sample size (see above), this would not have

been possible. This study also suffered from high attrition of pupils—over 30% by the

final follow-up in Year 3.
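
The cost of ignoring clustering can be illustrated with the standard design-effect formula, DEFF = 1 + (m − 1)ρ, where m is the cluster size and ρ is the intra-cluster correlation. The sketch below applies it to the figures reported for the Elley study (248 pupils in eight classes of 31). The values of ρ are assumptions chosen purely for illustration, since the study reports none, and the function names are hypothetical.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation caused by clustering: 1 + (m - 1) * rho."""
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n_pupils: int, cluster_size: float, icc: float) -> float:
    """Number of independent observations the clustered sample is worth."""
    return n_pupils / design_effect(cluster_size, icc)


# Figures reported for the Elley study: 248 pupils in eight classes of 31.
N_PUPILS = 248
CLASS_SIZE = 31

# ASSUMPTION: illustrative intra-cluster correlations; none is reported.
for icc in (0.0, 0.05, 0.10, 0.20):
    deff = design_effect(CLASS_SIZE, icc)
    ess = effective_sample_size(N_PUPILS, CLASS_SIZE, icc)
    print(f"ICC={icc:.2f}  design effect={deff:.2f}  effective n={ess:.1f}")
```

Under an assumed ρ of 0.10, for example, the design effect is 4, so the 248 pupils carry roughly the information of 62 independent observations. An analysis that treats pupils as individually allocated would overstate the precision of the results accordingly.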

In the Myhill study the authors identified a sample of 32 mixed comprehensive

schools from the South West and Midlands areas of England. Lists of schools from

local authorities were randomly sampled until the desired sample size was achieved.

Once the schools had been recruited, a Year 8 class was selected (with children aged

12 to 13 years) and the classes were stratified according to the teachers’ ‘Grammar

Subject Knowledge’ (GSK); the classes were then randomised using a random num-

ber generator. In these respects, the Myhill study is of higher design quality than the

Elley study: a random sample of schools in two geographical areas in the UK was used

to form the intervention and control groups, thereby increasing external validity. The

design was a large-cluster RCT, with school as the cluster, thereby minimising the

potential for contamination between groups.
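
The allocation procedure described above (stratify by teacher grammar subject knowledge, then randomise with a random number generator) can be sketched as follows. This is a minimal illustration and not the procedure the Myhill team actually ran: the school identifiers, the two stratum labels, the seed and the function name are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_cluster_randomisation(clusters, seed):
    """Allocate clusters to two arms separately within each stratum.

    clusters: list of (cluster_id, stratum) pairs.
    Returns a dict mapping cluster_id to 'intervention' or 'control'.
    """
    rng = random.Random(seed)  # a recorded seed makes the allocation auditable
    by_stratum = defaultdict(list)
    for cluster_id, stratum in clusters:
        by_stratum[stratum].append(cluster_id)

    allocation = {}
    for stratum in sorted(by_stratum):
        ids = sorted(by_stratum[stratum])
        rng.shuffle(ids)
        half = len(ids) // 2  # with an odd stratum, the extra cluster is a control
        for cluster_id in ids[:half]:
            allocation[cluster_id] = "intervention"
        for cluster_id in ids[half:]:
            allocation[cluster_id] = "control"
    return allocation


# HYPOTHETICAL data: 32 schools, one class each, in two invented GSK strata.
schools = [(f"school_{i:02d}", "high_gsk" if i % 2 == 0 else "low_gsk")
           for i in range(32)]
allocation = stratified_cluster_randomisation(schools, seed=2017)
```

Randomising within strata keeps the arms balanced on the stratifying variable, which simple randomisation of 32 clusters would not guarantee. The sketch does not address who performs the allocation or whether it is concealed from the investigators.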

Tests and measures

Data for the Elley study were collected in the form of a series of set essays at the end

of each year, marked by teachers from neighbouring schools, plus a battery of stan-

dardised tests. The essays were assessed by carefully briefed panels of English teach-

ers from neighbouring secondary schools. In the first year each pupil wrote four

essays which were assessed by four markers, working independently using a 16-point

scale that included criteria for content, organisation, style and mechanics. In subse-

quent years the number of essays was reduced to three essays and two markers, appar-

ently with no loss of reliability. The battery of tests included: ‘PAT’ reading

comprehension and vocabulary tests (NZCER, 1969, Elley et al., 1976); sentence-

combining; error-correction tests; literature-appreciation tests; and anonymous ques-

tionnaires to assess attitudes to work.

In the Myhill study a pre-test was administered to the pupils, and at the end of the

study a post-test was given. The test was a piece of first-person narrative ‘written

under controlled conditions’, encouraging the pupils to draw on their personal experi-

ences. The test design and marking ‘were led by Cambridge Assessment’. Each test

was marked by two people, and a third marker resolved any differences. The markers

did not know from which pupil group the pieces had originated (blinded assessment

of outcome). The marking was based on the mark scheme format used by secondary

schools at the time. The outcome was the change in the test scores using an ordinary

least-squares regression approach with pupil-level data. One control class did not

adhere to the intervention and was removed from the analysis.

Results and conclusions

The Elley intervention TG and ‘Let’s Learn English’ (LLE) grammar groups found

English more ‘repetitive’ and ‘useless’ than the control group did. The reading/writ-

ing (RW) group showed more positive attitudes to reading. The TG group was par-

ticularly negative about ‘sentence-study’. In the fourth year (14 to 15 year olds),

only one comparison (from 30 possible) showed significant differences (on essay

content). In the School Certificate Examination there were no significant differ-

ences between the three programmes. In the fifth year (15 to 16 year olds), only 2

of the 12 variables listed showed any significant differences (sentence-combining

test and English usage test). Again, in the School Certificate Examination there

were no significant differences between the three groups. Overall, TG and tradi-

tional grammar teaching showed no measurable benefits. Participants in the RW

group, who studied no formal grammar for three years, demonstrated competence

in writing and related language skills fully equal to that shown by the two grammar

groups. Elley et al. concluded that ‘English grammar, whether traditional or trans-

formational, has virtually no influence on the language growth of typical secondary

school students’ (p. 18). Elley et al. dismissed the idea of the introduction of gram-

mar at primary level mainly based on developmental theory: ‘it seems most unlikely

that such training would be readily applied by children in their own writing. Fur-

thermore, the researchers’ empirical findings do not support the early introduction

of grammar’ (p. 18).

In addition to a wide range of findings that included analysis of teacher subject

knowledge, the Myhill study found a ‘highly significant’ positive difference in marks

in favour of the intervention groups, and concluded that ‘this represents the first

robust statistical evidence for a beneficial impact of the teaching of grammar in stu-

dents’ writing attainment’ (p. 151). The authors also concluded that:

the study represents the first large-scale study in any country of the benefits or otherwise of

teaching grammar within a purposeful context in writing. It stands in contrast to previous

studies which were either small-scale (Bateman & Zidonis, 1966; Fogel & Ehri, 2000) or

which investigated whether discrete grammar instruction improved writing outcomes

(Elley et al., 1975, 1979), and is the only study of its kind conducted in England. (p. 161)

As we demonstrated earlier, it was not strictly accurate to claim that the Elley study

used ‘discrete grammar instruction’ as its comparator. The issue of scale is also inter-

esting. It is true that the number of students involved in the Myhill study was the lar-

gest to date, but what is important is not the scale per se but the quality, and power, of

the design of any study. Scale is also implicated in the consideration of the results of

just one study versus the combined results of many studies, an approach that is at the

heart of systematic review and meta-analysis.

Quality of the methodology of the studies

It is important to note that both studies were undertaken in secondary schools, not in

primary/elementary schools, therefore their findings could not reliably be generalised

to defend any decisions made for primary/elementary education.

The Elley study, as reported, had a number of limitations. Its design was a quasi-

experiment as it did not use random allocation to assign students to classes. In addi-

tion, the sample size of this cluster trial was small and underpowered. It was also lim-

ited by the fact that it was undertaken in only one school. Other issues that

undermine the validity of the study include not stating how the students were allo-

cated into the groups and not stating whether the tests were administered and marked

blind to allocation to minimise potential bias.

The Myhill study, as reported, also had a number of limitations. The authors

did not use an intention to treat (ITT) method of analysis. Removal of a non-

compliant class from the analysis potentially biased the results. This is because

that particular teacher and class were likely to be systematically different from

those who remained in the study. Randomisation ensures differences are balanced

between the two groups at baseline. Removal of a class from one group, post-ran-

domisation, reintroduces the potential for the selection bias that the randomisation

had previously dealt with by ensuring classes were similar between the two

groups at randomisation. The second limitation is the bias in the standard errors.

As the authors acknowledge, their study was a cluster randomised trial and,

although they mention the need to adjust for clustering, they argue that, because

there was only one cluster per school, this was not necessary. Consequently, they

treated the sample as having several hundred independent observations rather than

32 (or 31 after removal of the non-compliant teacher) clustered observations.

Other issues that potentially threaten the validity of the study include: not

describing who did the randomisation and not stating whether this was done

independently of the investigators (developers); and not stating whether the pre-

tests were done before random allocation to minimise potential bias from the par-

ticipants having knowledge of the allocation before undertaking the pre-test.

However, all of the limitations observed in the Myhill study were also possibly

present in the Elley study but, due to some limitations in the reporting of that

study, it is not possible to make a judgement about, for example, whether or not

ITT analysis was used.
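
The bias in the standard errors can be made concrete with a small simulation. The sketch below generates clustered scores for two arms with no true difference, then compares a standard error that treats every pupil as independent with one based on a single summary score per class. All numbers are assumptions for illustration (16 classes per arm, 25 pupils per class, and the two standard deviations). They are not estimates from the Myhill data, and a cluster-level summary analysis is only one of several valid ways to account for clustering.

```python
import random
import statistics


def simulate_arm(rng, n_clusters, pupils_per_cluster, between_sd, within_sd):
    """Return a list of clusters, each a list of pupil scores."""
    arm = []
    for _ in range(n_clusters):
        cluster_mean = rng.gauss(0.0, between_sd)  # shared class/teacher effect
        arm.append([rng.gauss(cluster_mean, within_sd)
                    for _ in range(pupils_per_cluster)])
    return arm


def naive_se(arm_a, arm_b):
    """SE of the difference in means, treating every pupil as independent."""
    a = [score for cluster in arm_a for score in cluster]
    b = [score for cluster in arm_b for score in cluster]
    return (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5


def cluster_level_se(arm_a, arm_b):
    """SE of the difference in means, using one summary score per cluster."""
    a = [statistics.mean(cluster) for cluster in arm_a]
    b = [statistics.mean(cluster) for cluster in arm_b]
    return (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5


# ASSUMPTION: hypothetical trial dimensions and variance components.
rng = random.Random(1)
intervention = simulate_arm(rng, 16, 25, between_sd=0.5, within_sd=1.0)
control = simulate_arm(rng, 16, 25, between_sd=0.5, within_sd=1.0)

se_naive = naive_se(intervention, control)
se_cluster = cluster_level_se(intervention, control)
print(f"naive SE={se_naive:.3f}  cluster-level SE={se_cluster:.3f}")
```

Because pupils in the same class share a class-level component, the pupil-level standard error is too small, which makes a chance difference between arms more likely to appear statistically significant.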

To conclude our in-depth analysis of the methodology of these single trials we look

finally at a more recent study addressing the question of whether the Myhill et al. con-

textualised approach was effective and generalisable to the oldest children in primary

schools. The study by Torgerson et al. (2014) was carried out as an independent fol-

low-up trial funded by the EEF. This trial was aimed at pupils in the ‘transition per-

iod’ between primary and secondary school [last term of Year 6 (age 10–11) and first

term of Year 7 (age 11–12)]. The Torgerson et al. study has not yet demonstrated

similar levels of impact and significance as the studies by Elley et al. and Myhill et al.,

but the comparison is relevant because the nature of the grammar intervention evalu-

ated in this RCT was the Myhill approach. The inclusion of primary age pupils in the

Torgerson study is also important for our argument in this paper, although the

comparison with the Myhill study needs to be treated with caution due to the differ-

ence in participant characteristics.

The design of the Torgerson study was a pragmatic ‘partial split plot’ RCT.

Schools were randomised at the cluster level (similar to the original Myhill design). In

the intervention schools, children were additionally randomised as individuals to

receive the grammar teaching as a whole class or in small groups. This allowed the

evaluators to test whether small-group teaching was effective, as well as whether

grammar teaching per se was effective. Unlike in the Myhill study, the evaluators took

the clustered nature of the data into account in the analysis and also undertook an

ITT analysis, whereby all schools and pupils were included in the analysis, irrespec-

tive of their level of intervention compliance. The results showed that there was a

small, statistically non-significant effect of grammar teaching on literacy outcomes. In

contrast, the small-group teaching delivered a modest, statistically significant effect

on literacy outcomes. Indeed, when the small-group effect was removed from the

grammar teaching by comparing the whole classes in the intervention against the

whole-class control group, the small difference declined from 0.10 of a standard devi-

ation to 0.06. Therefore, the results of this trial suggest, at best, only a very small

effect of grammar teaching on literacy outcomes. However, although the study did

use an ITT analytical strategy, and the correct statistical approach, this was imple-

mented among children during the ‘transition’ from primary to secondary school,

which could have led to an underestimation of the teaching effectiveness, due to the

summer break from attendance at school.

Discussion and conclusions

The last 30 years have shown a gradual increase in the use of experimental trials in education

research. Greater understanding of the strengths and weaknesses of research

designs is evident in more recent research studies. This greater understanding is

reflected, for example, in the combining of experimental trials with qualitative meth-

ods including implementation process evaluations or embedded ethnography. In gen-

eral, these developments reflect growing sophistication in education research and

social-science research more generally.

Although the numbers of robust experimental trials relevant to effective teaching in

schools have increased, our analysis of trials in relation to the teaching of writing sug-

gests that there are still too many studies that are not of sufficient methodological

quality. In particular, too many studies are weak in relation to allocation of pupils to

groups, and the measures for writing remain a challenge. Randomisation, to form two

or more intervention and control groups, is essential to ensure that the groups are bal-

anced in known and unknown factors that may affect writing outcomes. Randomisa-

tion could be by pupil, by class, by school year or by school. The higher the unit of

allocation (e.g. school versus pupil), the lower the efficiency (in statistical terms) of

the design. In other words, all things being equal, it is necessary to have more children

in a design that randomises at a level above the pupil to see a given difference (if one

exists) that would be statistically significant. The main weakness of randomisation at

the level of the child is contamination or spill-over effects, and the logistics of allocat-

ing pupils in ways that are different from the normal ways that schools allocate pupils

to classes, hence the use of group or cluster randomisation of schools. A ‘business-as-

usual’ control group is often appropriate in a pragmatic trial; however, it is useful to

consider additional interventions leading to three or more arms to the trial if there

were other competing interventions that could potentially improve writing skills.

Writing outcome measures ideally need to include robust measures for improve-

ments in writing composition even if the focus is, for example, development of gram-

mar. There is a need to know that the holistic aspects of writing are being enhanced,

not just the key components. Such an outcome measure should be administered and

marked by independent assessors for whom the allocation of teaching approaches to

groups is not known (the markers are ‘masked’). This prevents either conscious or

unconscious marking bias of the outcomes.

When the data are all collected and collated it is important to analyse the data as if

all the pupils had received the intervention to which they were allocated, whether they

did or did not indeed receive the intervention (adopting ‘intention to treat’ or ‘inten-

tion to teach’ analysis; Torgerson & Torgerson, 2008). If schools that comply weakly

with the intervention are excluded from the analysis, this introduces the potential for

selection bias which the original randomisation minimised. There are statistical tech-

niques for looking at the effect of low compliance but removing weakly or non-com-

pliant classes or schools is not one of them.
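
The selection bias introduced by excluding non-compliant classes can be shown with a simulated trial in which the intervention has no effect at all. In the sketch below, each class has a latent characteristic that influences its outcome, and weaker classes in the intervention arm are less likely to comply. Both the compliance rule and every numerical value are assumptions made for illustration only.

```python
import random
import statistics

TRUE_EFFECT = 0.0  # ASSUMPTION: the intervention has no real effect


def simulate_trial(seed, n_per_arm=500):
    """Return {arm: [(outcome, complied), ...]} for a two-arm cluster trial."""
    rng = random.Random(seed)
    arms = {"intervention": [], "control": []}
    for arm in arms:
        for _ in range(n_per_arm):
            quality = rng.gauss(0.0, 1.0)  # latent class/teacher characteristic
            outcome = quality + rng.gauss(0.0, 0.5)
            if arm == "intervention":
                outcome += TRUE_EFFECT
            # HYPOTHETICAL rule: weaker intervention classes do not comply.
            complied = arm == "control" or quality >= -0.5
            arms[arm].append((outcome, complied))
    return arms


def mean_outcome(records):
    return statistics.mean([outcome for outcome, _ in records])


trial = simulate_trial(seed=42)
control_mean = mean_outcome(trial["control"])

# Intention to treat: analyse every class in the arm it was allocated to.
itt_estimate = mean_outcome(trial["intervention"]) - control_mean

# Exclusion analysis: drop non-compliant intervention classes.
compliers = [record for record in trial["intervention"] if record[1]]
exclusion_estimate = mean_outcome(compliers) - control_mean

print(f"ITT estimate={itt_estimate:.2f}  exclusion estimate={exclusion_estimate:.2f}")
```

The intention-to-treat estimate stays close to the true effect of zero, whereas dropping the non-compliant classes leaves a stronger-than-average intervention arm and produces an apparent benefit that the intervention did not cause.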

With regard to our substantive case of grammar, the current evidence from ran-

domised controlled trials does not support the widespread use of grammar teaching

for improving writing among native English-speaking children. Based on the experi-

mental trial and meta-analysis evidence about writing teaching more generally (e.g. in

Tables 1 and 2), our hypotheses are that supporting primary/elementary pupils’

grammar is most likely to require teachers intervening during the writing process, and

interacting to discuss the use of grammar in relation to the overall purpose of the writ-

ing task and the purpose of the writing. The necessity to use technical terms with

pupils, such as subordinate clause or subjunctive, remains a question open to

research, but it is doubtful that attention to such terms is beneficial. It is probable that

adopting everyday language to discuss improvements in the use of grammar in writing

will be more beneficial. Small-group and whole-class teaching that includes a focus

on the actual use of grammar in real examples of writing (including professionally

produced pieces, realistic examples produced by teachers including ‘think aloud’ live

drafting of text and drafts of pupils’ writing) may also be more effective.

When the decisions taken by, and for, schools and teachers about what approaches

to adopt are informed by research, there are important choices to be made. Although

grammar for writing has been a main focus of this paper, if the overall goal is to

improve pupils’ writing then a much wider set of research evidence about writing

needs to be considered. Improvements in pupils’ writing have to be achieved across

many different dimensions. For example, robust evidence has shown that an

approach with primary age pupils that used strategy instruction (itself an approach

backed by robust multiple trial evidence), combined with pupils’ experience of offsite

visits to places of educational interest, had powerful effects. This work had its origins

in the USA, but an evaluation using RCT design undertaken in England confirmed

its transferability to a different national context, although the trial was relatively small

and the results need to be confirmed in a larger effectiveness trial (Torgerson &

Torgerson, 2014). However, once again this work is but one study and one approach.

The most recent meta-analyses of high-quality research studies on writing suggest

that, rather than emphasise grammar, the following practices could be selected as a

priority for teaching writing in primary/elementary education: (a) an increase in the

amount of time that pupils have for writing; (b) adoption of a process approach to

writing; (c) creation of a classroom environment that is appropriately supportive of

pupils’ attempts at learning to write better; (d) development of pupils’ writing skills,

strategies and knowledge, including ways of planning writing; (e) use of assessment

for learning techniques; (f) use of computers as part of the process of writing; (g)

meaningful use of writing across different subject areas (Graham et al., 2016; Wyse,

2017). The robustness of the evidence underpinning these practices is built not on

single studies but on multiple RCTs and experimental trials.

The mismatch between curriculum policy for the subject English and the research

evidence base is particularly pronounced at primary/elementary level in England. The

national curriculum in England and its associated national statutory tests include a

heavy emphasis on formal grammar teaching, and to varying degrees the national cur-

ricula in other English-speaking countries also have an emphasis on formal grammar

teaching. Sentence-combining remains the only approach to grammar for writing that

is supported by robust research evidence from experimental trials, although there are

no RCTs that have been undertaken in the UK. The use of sentence-combining as

part of the process of writing would be a good area for new research.

In relation to the use of evidence to guide policy, a key risk is for policy makers and

their advisors to attend too closely to single studies, within a field of interest, that

might support a preferred policy direction rather than take due account of multiple

studies published over many years. The problems of attending to a single study have

been seen in relation to the teaching of reading in the UK (Wyse & Goswami, 2008;

Ellis & Moss, 2013); this, in addition to ideological belief, appears to be a reason for

the dramatic emphasis on grammar in England’s primary national curriculum that

was implemented from 2014 onwards, a trend that is counter to the research evidence

overall, and one that risks having a negative impact on children’s literacy learning and

hence life chances. The outcomes of reviews of multiple studies, including systematic

review and meta-analysis and high-quality narrative reviews, are a much more reliable

evidence-base for policy decisions than single studies. But this kind of evidence also

requires mediation by experts who possess substantive, methodological and

practical knowledge and experience.

Although policy makers and politicians around the world have engaged with the

importance of research evidence, for example in the prioritisation of evidence-based

practices based on RCTs, there is a resulting need for policy to accurately reflect the

outcomes of robust reviews of multiple sets of evidence. Such reviews may indicate

that a policy should be in a direction that is contrary to a minister’s ideology and per-

sonal beliefs. At other times there may not be sufficient research evidence to warrant

a particular policy decision in any direction: in these cases there is the option to fur-

ther prioritise schools’ autonomy and teachers’ professional judgement. Better poli-

cies are likely to be made in future if policy decisions are informed by expert critical

synthesis of multiple robust research studies, including systematic reviews and meta-

analyses, relevant to the contexts of implementation. Finally, a necessary

consequence of the kind of attention to research evidence that we advocate may be

that curriculum policy should change more slowly and more incrementally because

accumulation of the multiple studies that are required to warrant decisions in impor-

tant areas such as the teaching of writing takes many years.

NOTES

1 For brevity, in the rest of the paper we refer to this study as the ‘Myhill study’ and the Elley et al. study as the ‘Elley study’.

2 Contrary to Myhill et al.’s claim that the Elley study did not include contextualised grammar teaching as one of the interventions, it is evident from the description in the Elley paper of the TG approach, inspired by Bruner’s spiral curriculum, that it did, as we show in this paper (also confirmed in a personal communication with Warwick Elley in 2013).

References

Andrews, R., Torgerson, C., Beverton, S., Locke, T., Low, G., Robinson, A. & Zhu, D. (2004a)

The effect of grammar teaching (syntax) in English on 5 to 16 year olds’ accuracy and quality

in written composition, in: Research evidence in education library (London, Institute of Educa-

tion).

Andrews, R., Torgerson, C., Beverton, S., Freeman, A., Locke, T. & Low, G. et al. (2004b) The

effect of grammar teaching (sentence combining) in English on 5 to 16 year olds’ accuracy and

quality in written composition, in: Research evidence in education library (London, Institute of

Education).

Australian Curriculum Assessment and Reporting Authority (2017) Australian Curriculum: English.

Available online at: www.australiancurriculum.edu.au/english/structure (accessed 27 January

2017).

Bateman, D. R. & Zidonis, F. J. (1966) The effect of a study of transformational grammar on the writing

of ninth and tenth graders (Champaign, IL, National Council of Teachers of English).

Bew, P. (2011) Independent review of Key Stage 2 testing, assessment and accountability. Final Report

(London, Department of Education).

Bonell, C., Fletcher, A., Morton, M., Lorenc, T. & Moore, L. (2012) Realist randomised con-

trolled trials: A new approach to evaluating complex public health interventions, Social Science

and Medicine, 75, 2299–2306.

British Educational Research Association (2012) Background to Michael Gove’s response to the Report

of the Expert Panel for the National Curriculum Review in England. Available online at:

www.bera.ac.uk/promoting-educational-research/issues/background-to-michael-goves-response-to-the-report-of-the-expert-panel-for-the-national-curriculum-review-in-england

(accessed 27 January 2017).

Connolly, P. (2015, September) The trials of evidence-based practice in education. Keynote address at the British Educational Research Association Annual Conference, Queen’s University Belfast.

Crystal, D. (2004) The stories of English (London, Penguin/Allen Lane).

Department for Education (2010) The importance of teaching: The schools White Paper 2010 (Norwich, The Stationery Office).

Department for Education (2013a) The national curriculum in England: Framework document. December 2014 (London, Department for Education).

Department for Education (2013b) Reform of the national curriculum in England. Report of the consultation conducted February–April 2013 (London, Department for Education).

Department for Education (2013c) Reforming the national curriculum in England. Summary report of the July to August 2013 consultation on the new programmes of study and attainment targets from September 2014 (London, Department for Education).

Donner, A. & Klar, N. (2000) Design and analysis of cluster randomization trials in health research (London, Arnold).


© 2017 British Educational Research Association

14693518, 2017, 6, Downloaded from https://bera-journals.onlinelibrary.wiley.com/doi/10.1002/berj.3315 by California State University, Wiley Online Library on [12/11/2022]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License

Economic and Social Research Council (2014) Improving literacy with grammar methods. Available online at: www.esrc.ac.uk/news-events-and-publications/impact-case-studies/improving-literacy-with-grammar-methods/ (accessed 27 January 2017).

Education Arena (n.d.) Expert interview with Debra Myhill. Available online at: www.educationarena.com/searchResults/index.asp?cx=006001425124917276529%3A7n-kwidpf_c&cof=FORID%3A11&ie=UTF-8&q=hot+topic+debbie+myhill&sa=GO&siteurl=www.educationarena.com%2FeducContact.asp&ref=www.google.co.uk%2F&ss=5817j2852731j23 (accessed 27 January 2017).

Education Endowment Foundation (2017) The EEF’s approach to evaluation. Available online at: educationendowmentfoundation.org.uk/our-work/the-eefs-approach-to-evaluation/ (accessed 27 January 2017).

Elley, W. B., Barham, I. H., Lamb, H. & Wyllie, M. (1976) The role of grammar in a secondary school English curriculum, Research in the Teaching of English, 10, 5–21.

Ellis, S. & Moss, G. (2013) Ethics, education policy and research: The phonics question reconsidered, British Educational Research Journal, 40(2), 241–260.

Fearn, L. & Farnan, N. (2007) When is a verb? Using functional grammar to teach writing, Journal of Basic Writing, 26(1), 63–87.

Fogel, H. & Ehri, L. C. (2000) Teaching elementary students who speak black English vernacular to write in standard English: Effects of dialect transformation practice, Contemporary Educational Psychology, 25, 212–235.

Goldacre, B. (2013) Building evidence into education (London, Department for Education).

Graham, S. & Harris, K. (2017) Evidence-based writing practices: A meta-analysis of existing meta-analyses, in: R. Redondo & K. Harris (Eds) Design principles for teaching effective writing (Leiden, BRILL).

Graham, S. & Perin, D. (2007a) A meta-analysis of writing instruction for adolescent students, Journal of Educational Psychology, 99(3), 445–476.

Graham, S. & Perin, D. (2007b) Writing next: Effective strategies to improve writing of adolescents in middle and high schools. Report to Carnegie Corporation of New York (Washington, D.C., Alliance for Excellent Education).

Graham, S. & Perin, D. (2007c) What we know, what we still need to know: Teaching adolescents to write, Scientific Studies of Reading, 11(4), 313–335.

Graham, S., McKeown, D., Kiuhara, S. & Harris, K. R. (2012) A meta-analysis of writing instruction for students in the elementary grades, Journal of Educational Psychology, 104(4), 879–896.

Graham, S., Harris, K. & Chambers, A. (2016) Evidence-based practice and writing instruction: A review of reviews, in: C. MacArthur, S. Graham & J. Fitzgerald (Eds) Handbook of writing research (2nd edn) (New York, Guilford Press).

Hargreaves, D. (1996) Teaching as a research-based profession: Possibilities and prospects. The Teacher Training Agency Annual Lecture, April 1996. Available online at: eppi.ioe.ac.uk/cms/Portals/0/ (accessed 4 April 2016).

Harris, R. J. (1962) An experimental inquiry into the functions and value of formal grammar in the teaching of English, with special reference to the teaching of correct written English to children aged twelve to fourteen. Ph.D. thesis, University of London.

House of Commons Education Committee (2017) Primary assessment. Eleventh Report of Session 2016–17. Report, together with formal minutes relating to the report (London, House of Commons).

Johnson, B., Onwuegbuzie, A., de Waal, C., Stefurak, T. & Hildebrand, D. (2017) Unpacking pragmatism for mixed methods research, in: D. Wyse, N. Selwyn, E. Smith & L. E. Suter (Eds) The BERA/SAGE handbook of educational research (London, SAGE).

Kaestle, C. (1993) The awful reputation of education research, Educational Researcher, 22(1), 26–31.

Mansell, W. (2017, May 9) Battle on the adverbials front: Grammar advisers raise worries about Sats tests and teaching, The Guardian. Available online at: www.theguardian.com/education/2017/may/09/fronted-adverbials-sats-grammar-test-primary.

Moore, L., Graham, A. & Diamond, I. (2003) On the feasibility of conducting randomised trials in education: Case study of a sex education intervention, British Educational Research Journal, 29(5), 673–689.


Morrison, K. (2001) Randomised controlled trials for evidence-based education: Some problems in judging ‘what works’, Evaluation & Research in Education, 15(2), 69–83.

Myhill, D. & Watson, A. (2014) The role of grammar in the writing curriculum: A review, Journal of Child Language Teaching and Therapy, 30(1), 41–62.

Myhill, D., Jones, S., Lines, H. & Watson, A. (2011) Re-thinking grammar: The impact of embedded grammar teaching on students’ writing and students’ metalinguistic understanding, Research Papers in Education, 27(2), 139–166.

National Governors Association Center for Best Practices Council of Chief State School Officers (2010) Common Core State Standards (English Language Arts Standards). Available online at: www.corestandards.org/ELA-Literacy/L/5/ (accessed 27 January 2017).

New Zealand Ministry of Education (2007) The New Zealand Curriculum: For English-medium teaching and learning in years 1–13.

O’Hare, F. (1973) Sentence combining: Improving student writing without formal grammar instruction. No. 15 in a series of research reports sponsored by the NCTE Committee on Research (Urbana, IL, National Council of Teachers of English).

Pawson, R. & Tilley, N. (1997) Realistic evaluation (London, SAGE).

Saddler, B. & Graham, S. (2005) The effects of peer-assisted sentence-combining instruction on the writing performance of more and less skilled young writers, Journal of Educational Psychology, 97(1), 43–54.

Shadish, W., Cook, T. & Campbell, D. (2002) Experimental and quasi-experimental designs for generalized causal inference (Belmont, CA, Wadsworth).

Slavin, R. (2002) Evidence-based education policies: Transforming educational practice and research, Educational Researcher, 31(7), 15–21.

Standards and Testing Agency (2015) 2016 national curriculum assessments. Interim teacher assessment frameworks at the end of key stage 2 (London, Standards and Testing Agency).

Standards and Testing Agency (2016) Key stage 2 English grammar, punctuation and spelling Paper 1: questions (London, Standards and Testing Agency).

Torgerson, D. & Torgerson, C. (2008) Designing and running randomised trials in health, education and the social sciences (Basingstoke, Palgrave Macmillan).

Torgerson, D. J., Torgerson, C. J., Mitchell, N., Buckley, H., Ainsworth, H., Heaps, C. & Jefferson, L. (2014) Grammar for writing: Evaluation report and executive summary (London, Education Endowment Foundation).

UNESCO (2006) EFA global monitoring report 2006: Education for all (Paris, UNESCO).

United Nations (2017) Sustainable development goals. Available online at: www.un.org/sustainabledevelopment/sustainable-development-goals/ (accessed 27 January 2017).

Walters, J. E. (1931) Seniors as counsellors, Journal of Higher Education, 2(8), 446–448.

Walters, J. E. (1932) Measuring effectiveness of personnel counselling, Personnel Journal, 11, 227–236.

Wilby, P. (2017, August 1) The quality of education policymaking is poor, The Guardian. Available online at: www.theguardian.com/education/2017/aug/01/david-laws-education-policy-schools-minister-thinktank-epi?CMP=Share_iOSApp_Other.

Wyse, D. (2001) Grammar. For writing?: A critical review of empirical evidence, British Journal of Educational Studies, 49(4), 411–427.

Wyse, D. & Goswami, U. (2008) Synthetic phonics and the teaching of reading, British Educational Research Journal, 34(6), 691–710.

Wyse, D., Baumfield, V., Egan, D., Gallagher, C., Hayward, L., Hulme, M., et al. (2013) Creating the curriculum (London, Routledge).

Wyse, D., Fentiman, A., Sugrue, C. & Moon, S. (2014) English language teaching and whole school professional development in Tanzania, International Journal of Educational Development, 38, 59–68.

Wyse, D. (2017) How Writing Works: From the Birth of the Alphabet to the Rise of Social Media (Cambridge, Cambridge University Press).
