Experimental trials and ‘what works?’ in
education: The case of grammar for writing
Dominic Wyse (a,*) and Carole Torgerson (b)
(a) UCL Institute of Education, London, UK; (b) Durham University, Durham, UK
The place of evidence to inform educational effectiveness has received increasing attention interna-
tionally in the last two decades. An important contribution to evidence-informed policy has been
greater attention to experimental trials including randomised controlled trials (RCTs). The aim of
this paper is to examine the use of evidence, particularly the use of evidence from experimental tri-
als, to inform national curriculum policy. To do this the teaching of grammar to help pupils’ writing
was selected as a case. Two well-regarded and influential experimental trials that had a significant
effect on policy, and that focused on the effectiveness of grammar teaching to support pupils’ writ-
ing, are examined in detail. In addition to the analysis of their methodology, the nature of the two
trials is also considered in relation to other key studies in the field of grammar teaching for writing
and a recently published robust RCT. The paper shows a significant and persistent mismatch
between national curriculum policy in England and the robust evidence that is available with regard
to the teaching of writing. It is concluded that there is a need for better evidence-informed decisions
by policy makers to ensure a national curriculum specification for writing that is more likely to have
positive impact on pupils.
Keywords: experimental trials; research evidence; grammar teaching; teaching writing
Introduction
One of the most important questions in education is: what works best to help children
and young people learn? Possible answers to this question are a daily reality for teach-
ers and their pupils, and the question is also of great concern to wider society, not
least because of governments’ significant expenditure on education, and the expecta-
tions that arise from this expenditure. Society expects schooling to enhance pupils’
learning as a result of teaching that is effective.
Over the last decade, across the world, the political impetus to examine ‘what
works?’ as part of educational effectiveness has coincided with the growth in the use
of two specific research designs to evaluate educational policy and practice: interna-
tional comparative surveys using large data sets, and more recently a growth in experi-
ments and quasi-experiments (Connolly, 2015). International comparative work,
including the testing of representative samples of pupils, is a prominent feature in
education policy evaluation in both low-income and high-income nation states.
Examples include: the goal-driven approach of the United Nations Sustainable Devel-
opment Goals (United Nations, 2017); the test-driven comparisons of specific aspects
*Corresponding author. UCL Institute of Education, 20 Bedford Way, Bloomsbury, London
WC1H 0AL, UK. E-mail: d.wyse@ucl.ac.uk; Twitter: @Dominic_Wyse
© 2017 British Educational Research Association
British Educational Research Journal Vol. 43, No. 6, December 2017, pp. 1019–1047
DOI: 10.1002/berj.3315
of education such as literacy (UNESCO, 2006); and large-scale international surveys
such as the Programme for International Student Assessment (PISA, for secondary
schooling), the Progress in International Reading Literacy Study (PIRLS, for primary
schooling) and the Trends in International Mathematics and Science Study (TIMSS,
covering both primary and secondary schooling) that combine pupil testing with sur-
veys exploring some aspects of educational policies in the comparator countries. The
research designs used in these international surveys are able to establish correlations
between education policies and outcomes, but are not able to establish whether such
policies cause the observed outcomes. A causal relationship to demonstrate effective-
ness requires a design which features a control group [i.e. a ‘true’ experiment—a ran-
domised controlled trial (RCT)—or a quasi-experiment (QE)]. The extent to which
studies using a variant of RCT or QE design can establish stronger or weaker causal
inference also depends on the robustness within the design and its conduct. Although
RCTs comparing the curriculum policies of whole countries are not feasible, RCTs
of specific approaches to teaching are feasible, not least in areas such as literacy that
are included as a comparator in most international analyses of the kinds described
above.
A paramount source of information about effective teaching should be research;
however, the extent to which education research has contributed answers to the ques-
tions of teaching efficacy and effectiveness is fiercely debated. As early as 1972, in the
US Congress there was a view that education research was ‘mediocre and useless’
(Kaestle, 1993, p. 27). Thirty years later, it was observed that education in the USA
had been dragged ‘kicking and screaming, into the 20th century’ (Slavin, 2002, p. 15)
as a result of developments in education policy linked to ‘scientifically based
research’, such as the Elementary and Secondary Education Act (No Child Left Behind),
and emphasis on ‘proven, comprehensive reform models’ (Slavin, 2002, p. 15). At
the time, the US Office of Educational Research and Improvement invited nomina-
tion of programmes to be evaluated, ultimately using experimental designs, by third-
party evaluators (Slavin, 2002).
In the UK, the debate about the capacity of education research to contribute to
questions about effectiveness was reignited around 20 years ago. A trend of criticisms
of education research was typified by the Teacher Training Agency Annual Lecture in
1996 given by David H. Hargreaves who was, at the time, a Professor of Education at
the University of Cambridge. Hargreaves’ strong criticism of education research
included his opinion that it was poor value for money in relation to improving educa-
tion in schools, and that the teaching profession had been inadequately served by edu-
cation research. In a comparison with medicine, Hargreaves’ conclusion was that, ‘In
education we too need evidence about what works with whom under what conditions
and with what effects’ (Hargreaves, 1996, p. 8). More recently, the debate came to
prominence as a result of the work of the medical doctor, research fellow and journal-
ist Ben Goldacre (Goldacre, 2013). One notable aspect of Goldacre’s 2013 argument
is how similar it was to some of the points made 20 years previously by Hargreaves.
For example, the advocacy for RCT designs, the idea that research evidence about
education practice is weak, and comparisons with medicine were all addressed in
Hargreaves’ original lecture.
The aim of this paper is to examine the use of experimental trials in relation to evi-
dence about effective teaching, and to consider some links between research and
national curriculum policy. To do this we selected one important research area as a
case: the teaching of grammar to help pupils’ writing. The teaching of grammar for
writing is a useful case because the topic has attracted a fierce ideological debate as
well as a significant number of experimental and quasi-experimental trials evaluating
interventions to improve writing. Unlike previous work that has had a main focus on
the methodology of experimental trials or on the implications of evidence from experi-
mental trials for an aspect of policy and/or practice, our argument is built on an in-
depth analysis of methodology and research outcomes in a specified aspect of educa-
tion, namely the teaching of grammar to improve writing. Through examination of
methodology and a substantive topic, a stronger case can be made in relation to which
teaching method is likely to be effective.
The paper begins with a historical account of the debate about research evidence
and experimental trials in education. We then review the research evidence on gram-
mar teaching for writing. The curriculum policy context for grammar teaching is seen
in our brief description of representations of grammar in national curricula interna-
tionally, and an account of the development of England’s national curriculum of
2014. The main part of the paper is a detailed exploration of two well-regarded exper-
imental trials, published in peer-reviewed research journals, and focused on evaluat-
ing the role of grammar teaching in supporting the development of writing. The
studies examined similar approaches to teaching grammar in the same phase of edu-
cation, and both papers had a recognised impact on policy and practice. The studies
were also chosen because, although they addressed very similar teaching approaches,
they came to different conclusions about their effectiveness. One of the two papers
concluded that grammar teaching to support writing was not effective, whilst the other
paper concluded that it was effective. The important considerations for the argument
in the present paper are: (a) what the comparison of the two studies reveals about the
methodology of experimental trials; (b) the extent to which the outcomes of either of
the two studies are replicated in other experimental trials in the same field; and (c) as
a result of considering (a) and (b), whether the research evidence of grammar for writ-
ing is appropriately reflected in national curriculum policy in England, and what the
implications are for research, policy and practice.
Experimental trials in education research
Although the RCT is widely used in medical research, one of its first known uses to
investigate human activity (as opposed to RCT use in the natural sciences) in the
modern period was in the field of education. In the early 1930s in the USA, Walters
undertook two randomised experiments in the field of education (Walters, 1931,
1932). In a university setting Walters randomised the selection of members of the
freshmen class in the School of Mechanical Engineering at Purdue University. Some
of the freshmen were allocated to mentoring delivered by five seniors with a ‘good
scholarship record, pleasing personality, excellent health and fine social environment’,
and some were allocated to a control condition. Academic outcomes
were then measured, and Walters concluded that the students in the
mentoring condition had better outcomes than the students in the control condition
(no mentoring). Walters’ experiment is the first known use of the term ‘random sam-
pling’—or randomisation—to form equivalent groups: ‘The 220 delinquent freshmen
were divided into two groups by random sampling.’ The following year Walters under-
took a replication trial with a much larger sample size and random allocation to one
of three ‘arms’: mentoring by seniors; mentoring by Faculty members; and a control
condition (Walters, 1931, 1932) and he concluded that the senior students were more
effective in personal mentoring in reducing drop-out or exam failure than the Faculty
members.
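As a rough illustration of the allocation step described above, the following Python sketch shuffles a list of participants and deals them into trial arms. It is not Walters' own procedure, which is not described in detail here: the function name `allocate`, the participant identifiers, the equal group sizes and the seed are all assumptions made for the example.

```python
import random


def allocate(participants, arms, seed=None):
    """Shuffle participants, then deal them round-robin into the named arms."""
    rng = random.Random(seed)  # a seeded generator makes the allocation reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {arm: [] for arm in arms}
    for i, person in enumerate(shuffled):
        groups[arms[i % len(arms)]].append(person)
    return groups


# Hypothetical identifiers standing in for the 220 freshmen mentioned in the text.
freshmen = [f"student_{n:03d}" for n in range(1, 221)]

# Assumption: two equally sized arms; the source does not state the group sizes.
groups = allocate(freshmen, ["mentoring", "control"], seed=1931)
print({arm: len(members) for arm, members in groups.items()})
```

Passing three arm names instead of two would mimic the structure of the replication trial, again only as a sketch.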
Between 1900 and the 1960s many ‘explanatory’ experiments were undertaken in
the field of education, sometimes using randomisation. These tended to be con-
ducted by educational psychologists, working in psychology laboratories, investigat-
ing basic psychological processes relevant to learning. Between the 1930s and the
1970s many RCTs in education were undertaken in the USA (some large scale), but
there was a dearth of high-quality RCTs in education research in the UK. Between
the 1970s and the 2000s, there were very few large-scale RCTs in education in the
USA, as the design had largely fallen out of favour, although there were a few notable
exceptions.
After the 50-year lull in activity, greater emphasis on experimental trials to inform
education policy in the USA and the UK became evident. This step-change in the his-
tory of the use of the design was largely driven by two distinct policy initiatives on
either side of the Atlantic. In 2002, George W. Bush signed the No Child Left
Behind Act, and the subsequent creation of the Institute of Education Sciences (IES) led
to public investment in the use of experimental design to evaluate education policies
and interventions. The legislation mandated the RCT as the design of choice for eval-
uating education interventions: ‘Scientifically valid educational evaluation employs
experimental designs using random assignment, when feasible, and other research
methodologies that allow for the strongest possible causal inferences when random
assignment is not feasible’ (p. 5). Since that time, over 200 experiments and quasi-
experiments have been funded by the IES and undertaken in education in the USA.
The IES does fund quasi-experiments, but only if randomisation is not thought to be
feasible, which occurs in rare circumstances where, for example, it may be deemed
unethical to undertake random allocation.
The UK equivalent to the greater use of experimental trials in education in the
USA was the creation of the Education Endowment Foundation (EEF) in 2011, and
its requirement that ‘. . . all EEF projects will be rigorously evaluated by independent
experts in educational research according to minimum standards . . . The impact of
projects on attainment will be evaluated, where possible, using randomised controlled
trials’ (Education Endowment Foundation, 2017). The EEF has now funded over
120 RCTs and quasi-experiments, evaluating education policies and practices; simi-
lar to the IES, it only funds quasi-experiments where randomisation is not feasible.
The similarity of the criticisms of educational research made by both Hargreaves
and Goldacre, alluded to earlier in this paper, seemed to indicate that little had chan-
ged in relation to the nature of educational research; however, the evidence shows a
different picture. Between 1980 and 2015 the number of RCTs in education demon-
strated significant increases, particularly from 2006 onwards (Connolly, 2015).
According to Connolly’s analysis, although the USA and Canada are still responsible
for undertaking the majority of RCTs (approximately 375), the UK had a significant
number (approximately 80), in comparison to much larger population areas (e.g. rest
of Europe approximately 140; Australia/New Zealand 50) (Connolly, 2015). More
than 200 of the RCTs have focused on interventions taking place over a full academic
year or longer (the short duration of RCTs was a criticism made by Slavin, 2002).
Approximately 540 RCTs focused on: physical health and wellbeing; behaviour and social
wellbeing; and professional training. Approximately 90 evaluated literacy/English lan-
guage interventions; and approximately 225 focused on other academic interventions
and outcomes, study-related skills and numeracy/maths interventions (Connolly,
2015). There is less evidence about the frequency of use of other experimental
designs.
During this period of growing emphasis on RCTs in education the longstanding
philosophical critique of ‘positivist’ methodologies also continued. RCTs, as part of
evidence-based education research, have been criticised because they are reduction-
ist and not appropriate for the evaluation of educational interventions which, as a
result of the complexity of the social context, are necessarily more challenging com-
pared with experiments in the natural sciences (e.g. Morrison, 2001). But others
have countered with the opinion that, for some research questions, a well-con-
ducted RCT is the strongest research design when seeking to compare effectiveness
of interventions. For example, the potential of RCTs was seen in a complex inter-
vention on sex education at secondary education level that paid careful attention to
the methodological challenges of evaluation in the real world of secondary schools
(Moore et al., 2003). Another more recent strand of the debate has linked a critique
of positivism with support for ‘realist’ approaches (e.g. as recommended by the
criminologists Pawson & Tilley, 1997), including the important idea that ‘what
works’ should have a central focus on who an intervention works for, and the con-
text in which a specific intervention can work. In an exploration of Pawson and Til-
ley’s ideas, Bonell et al. (2012) acknowledge the importance of attention to theories
of causal mechanisms but critique the realist position on the grounds of: misunder-
standing of the use of counterfactuals; the resultant limit on findings based on plau-
sibility rather than on probability (in a statistical sense); and a lack of
acknowledgement that well-conducted experiments do include attention to mecha-
nisms and context but are also able to assess causal attribution, something which
realist approaches cannot do. Stronger experimental studies have, for some time,
recognised context and methodological limitations. This recognition is evident in
the seminal book on experimental design: ‘The experiment is not a clear window
that reveals nature directly to us. To the contrary, experiments yield hypothetical
and fallible knowledge that is often dependent on context and imbued with many
unstated theoretical assumptions . . . In this sense, all scientists are epistemological
constructivists and relativists, the difference is whether they are strong or weak rela-
tivists’ (Shadish et al., 2002, p. 29). More recently, the epistemological debate has
also been informed by ongoing developments in mixed-methods design and
methodology, including the more routine use of process evaluation, or embedded
ethnography, as part of RCTs. These developments include the recognition that the
dualisms and intellectual tensions that are part of mixed-methods methodology,
and of understanding what works, are usefully framed by philosophical pragmatism
(Johnson et al., 2017).
The teaching of grammar in national curricula
Recent growth of interest in grammar for writing has been clearly evident in develop-
ments in national curricula in a range of countries with English as a main language.
For example, in the Australian Curriculum’s English learning area, the language strand
is positioned first in the curriculum structure before the strands for literature and lit-
eracy. For children aged 10 to 11 this strand includes explicit attention to ‘sentences
and clause-level grammar’ and to ‘noun groups/phrases’ and ‘adjective groups/
phrases’ (Australian Curriculum Assessment and Reporting Authority, 2017). In the
USA the Common Core State Standards text for English Language Arts for the same
age of children specifies reading, writing, speaking and listening, then language. As
part of the language specification ‘Conventions of Standard English Grammar and
Usage’ (including forming perfect verb tenses; explaining the function of preposi-
tions; etc.) is listed before ‘Knowledge of Language’ and ‘Vocabulary Acquisition and
Use’ (National Governors Association Center for Best Practices Council of Chief
State School Officers, 2010). These kinds of emphases on grammar are not only evi-
dent in high-income nations and states but also in other post-colonial countries with
historic links to the British Empire, for example in the countries of Africa (e.g. see
Wyse et al., 2014).
The emphases in New Zealand’s national curriculum appear to have some differ-
ences from the countries surveyed in this paper so far. In The New Zealand Curriculum
the focus on language is a holistic one, with an emphasis on the making and creating
of meaning (New Zealand Ministry of Education, 2007, p. 18). This holistic attention
to language is also reflected in the strong place of the indigenous language Te Reo
Māori and New Zealand Sign Language, and in the title ‘an English medium curriculum’.
The emphasis on grammar also appears to be different. For example, the
specification of ‘Language features’ as part of ‘Speaking, Writing and Presenting’ is
positioned last in the list of curriculum requirements, and emphasises the way that
pupils should understand grammar as follows: ‘Use a wide range of text conventions,
including grammatical and spelling conventions, appropriately, effectively, and with
accuracy’ (New Zealand Curriculum, Years and Curriculum Levels, Level Six Eng-
lish).
In the different countries of the UK the national curricula for language and English
have differed markedly since political devolutions of power, with England having
increasing emphasis on discrete elements such as grammar and phonics (Wyse
et al., 2013). The importance attributed to grammar by policy makers in England
since 2011 can be seen in the intensification of the teaching of formal grammar as part
of the subject of English in England’s national curriculum. In the national curriculum
of 2014 the programmes of study for writing for 9- to 11-year-old pupils include
statutory requirements for the teaching of ‘Writing – transcription’, including spel-
ling, handwriting and presentation. These sections are followed by writing composi-
tion (planning and drafting), then vocabulary, grammar and punctuation. Increased
attention to vocabulary, grammar and punctuation is added through an appendix that
includes an emphasis on ‘explicit knowledge of grammar’ (Department for Educa-
tion, 2013a, p. 75), where pupils in Year 3 (7 to 8 years old) are expected to under-
stand terminology that includes ‘subordinate clause’ and for Year 6 (10 to 11 years
old) the need to be introduced, for example, to the ‘use of the passive to affect the pre-
sentation of information in a sentence’ (Department for Education, 2013a, p. 79,
emphasis in original).
In addition to the emphasis in the national curriculum programmes of study, the
national statutory tests for 11-year-old pupils in England included for the first time in
2011 a separate spelling, punctuation and grammar test where formal grammar was
further emphasised. In addition, the requirements for teacher assessment of writing
included a strong emphasis on grammar as part of the assessment criteria. In 2016
these emphases were still in place. For example, the national statutory test for Spel-
ling, Punctuation and Grammar included a strong emphasis on formal grammar
including questions that required knowledge of grammatical terminology (e.g. ‘27.
Underline the subordinate clause in each sentence below’; Standards and Testing
Agency, 2016, p. 17, emphasis in original). All questions in the paper attracted one
mark each. Although the 2016 criteria for statutory teacher assessment of writing,
produced by pupils in lessons, included aspects such as ‘creating atmosphere’ in their
writing, there was a strong emphasis on usage according to areas of formal grammar
such as ‘passive and modal verbs’ and ‘adverbs, preposition phrases and expanded
noun phrases’, etc. (Standards and Testing Agency, 2015).
The politics and policies that led to the emphasis on formal grammar in England’s
national curriculum implemented from 2014 onwards began with a government
White Paper in 2010 that included the commitment to ‘Review and reform the
National Curriculum so that it becomes a benchmark outlining the knowledge and
concepts pupils should be expected to master to take their place as educated members
of society’ (Department for Education, 2010, p. 41). The link between statutory
assessment, the curriculum and school accountability was also made clear: ‘The
National Curriculum will continue to inform the design and content of assessment at
the end of key stage two, which will apply to every child and which will provide a
guide to the performance of primary schools’ (Department for Education, 2010, p.
42). After publication of the White Paper the government commissioned a review of
assessment in England led by Lord Bew. Bew’s final report noted that ‘there are some
elements of writing – spelling, grammar, punctuation, vocabulary – where there are
clear “right” and “wrong” answers, which lend themselves to externally-marked test-
ing . . . Internationally a number of jurisdictions conduct externally-marked tests of
spelling, punctuation and grammar . . . These are essential skills and we recommend
that externally-marked tests of spelling, punctuation, grammar and vocabu-
lary should be developed’ (Bew, 2011, p. 60, emphasis in original).
A public consultation on the proposals for the new national curriculum was held
between February and April 2013. It attracted 17,312 respondents with 4,576
described as ‘non-campaign respondents’ and 12,736 described as ‘campaign
respondents’ (i.e. organisations devoted to a particular issue; the report of the consul-
tation made clear that campaign responses were not included in the percentages of
answers to questions but were reflected in the commentaries about the answers).
3,682 respondents addressed the question ‘Do you have any comments on the
content set out in the draft programmes of study?’ With regard to the teaching of the
subject English, and the teaching of grammar within that subject, ‘There was recogni-
tion that the teaching of phonics, punctuation, spelling and grammar was necessary,
but some felt that there was an over-emphasis on these aspects’ (Department for
Education, 2013b, p. 7). It is disappointing that the number of respondents who
replied about grammar was not specified in the report as this would have provided
some further evidence relevant to the strength of opinion on this issue.
There was also a follow-up consultation, open from July to August 2013, on the
draft legislative order, which attracted further comment about English and grammar.
Although 21 respondents (11%) supported the greater focus on spelling, grammar
and punctuation,
a total of 36 respondents (19%) however expressed concern in relation to the more
demanding grammatical content included for years 2 and 4 . . . 52 respondents (28%) said
the English primary curriculum was too prescriptive, in particular in reference to the level
of specification in the appendices [where the grammatical knowledge to be learned by
pupils is specified]. These respondents argued that this undermined the aims of the new
national curriculum in relation to greater professional freedom and were concerned that
this may have implications for the provision of a balanced and broadly based school cur-
riculum. (Department for Education, 2013c, p. 6)
One interpretation of these data in the second consultation is that 47% of respon-
dents were critical of the grammar specified in the national curriculum and its appen-
dices, but 11% thought the emphasis on correct use of Standard English was
commendable. An overall negative response to the proposed attention to grammar
did not result in changes to this element of the national curriculum.
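The arithmetic behind this interpretation can be checked with a short Python sketch. The counts and percentages are those quoted from the consultation report; the labels and variable names are illustrative. The sum of 47% holds only on the assumption that the two critical groups do not overlap, which the quoted report does not confirm.

```python
# Counts and percentages as quoted from the follow-up consultation report.
responses = {
    "supported greater focus": (21, 11),
    "concerned about grammatical content": (36, 19),
    "curriculum too prescriptive": (52, 28),
}

# Each count/percentage pair implies a total number of respondents.
implied_totals = {
    label: count / (pct / 100) for label, (count, pct) in responses.items()
}
for label, total in implied_totals.items():
    print(f"{label}: implies about {total:.0f} respondents in total")

# Assumption: the two critical groups are disjoint, so their shares can be added.
# If some respondents raised both concerns, the true share would be lower.
critical_pct = sum(
    pct
    for label, (count, pct) in responses.items()
    if label != "supported greater focus"
)
print(f"Critical of the grammar specification (upper bound): {critical_pct}%")
```

The three implied totals agree to within rounding, which suggests the percentages share a common base of a little under 200 respondents.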
Reservations about the nature of the specifications for grammar teaching in the
national curriculum and its associated statutory testing continued to cause disagree-
ment. The main government advisor for grammar in the statutory assessment system
described the process of determining the curriculum for grammar as ‘chaotic’ and
said that ‘We started off with the primary curriculum, which we were a bit unconfi-
dent about as none of us had much experience of primary education’ (Mansell,
2017). In April 2017 a House of Commons Education Select Committee report on
assessment in primary schools concluded that:
One issue with the writing assessment is the focus on technical aspects, like grammar and
spelling, over creativity and composition. We are not convinced that this leads directly to
improved writing and urge the Government to reconsider this balance and make spelling,
punctuation and grammar tests non-statutory at Key Stage 2 (House of Commons Educa-
tion Committee, 2017, p. 3)
This brief account of some of the work that led to greater emphasis on grammar in
England’s national curriculum, and subsequent implications, shows that research evi-
dence, of any kind, had insufficient consideration and influence on the national cur-
riculum of 2014. Further corroboration of problems with attention to research
evidence was detailed by BERA President Mary James (British Educational Research
Association, 2012), one of the expert group advising on the national curriculum. In
addition, reflecting on his time as Minister for Schools under Secretary of State for
Education Michael Gove, David Laws claims that decisions were made ‘not based on
evidence but on hunch’ (Wilby, 2017) and that Gove had a particular weakness for
basing decisions on ‘ideology and personal experience’ (Wilby, 2017).
Research evidence on grammar for writing
The place of grammar in education has been a point of debate for at least 200 years,
in part because it has been repeatedly linked with the development of the concept of
‘standard’ English (Crystal, 2004). In the twenty-first century general interest in
grammar teaching as an element in the teaching of writing continued (Wyse, 2001;
Andrews et al., 2004a,b; Myhill & Watson, 2014). In 2001, as a result of a compre-
hensive narrative review of empirical studies, it was concluded that:
The findings from international research clearly indicate that the teaching of grammar (us-
ing a range of models) has negligible positive effects on improving secondary pupils’ writ-
ing. Of further concern is the negative impact on pupils’ motivation. In the [National
Literacy Strategy] Framework for Teaching the move towards the teaching of grammatical
‘technical vocabulary’ such as adjective; noun: collective, common, proper; pronoun: per-
sonal, possessive; verb, and verb tense to six and seven year-old children in England is
highly questionable. It is regrettable that there is not more evidence about primary pupils;
however, the developmental arguments that such teaching is inappropriate at primary level are
persuasive. (Wyse, 2001, p. 422, emphasis added)
This finding was subsequently supported in two systematic reviews (SRs) undertaken
by one of the authors of this paper and colleagues (Andrews et al., 2004a,b). In the
first systematic review evaluating the effect of grammar teaching (syntax) in English
on 5- to 16-year-olds’ accuracy and quality in written composition, Andrews et al.
(2004a) concluded there was insufficient high-quality evidence to ‘counter the pre-
vailing belief that the teaching of the principles underlying and informing word order
or “syntax” has virtually no influence on the writing quality or accuracy of 5 to 16
year-olds’ (Andrews et al., 2004a). This conclusion applied to both the ‘traditional’
approach of emphasising word order and parts of speech and the ‘transformational’
approach, based on transformational-generative grammar. The current picture of
robust research in relation to grammar teaching to support pupils’ writing is shown in
Tables 1 and 2.
As the evidence summarised in Tables 1 and 2 shows, as far as primary/elementary
education is concerned there is strong evidence that grammar teaching of a range of
types, but particularly traditional grammar teaching, is not effective for improving
pupils’ writing. There is evidence that sentence-combining is effective, but no experimental
studies of it have been carried out in the UK. At secondary education level there is
a slightly more mixed picture. The majority of the evidence suggests that, apart from
sentence-combining, grammar teaching is not effective for improving pupils’ writing.
However, one robust study, Myhill et al. (2011), showed that contextualised grammar
teaching was effective for improving secondary pupils’ writing, although the approach
was more effective for higher-attaining pupils.
In about 2010, a challenge to the longstanding view that grammar teaching was not
the most effective way to improve writing emerged from researchers in the UK. For
example, in an interview, it was stated: ‘. . . what we have for the first time ever,
internationally, is research evidence that shows that the teaching of grammar can have an
impact on children’s writing skills. But the way that we taught it was completely
unique’ (Education Arena, n.d.). More specifically, it was claimed in relation to a
RCT evaluating grammar teaching that, ‘the strong positive effect of the intervention
signals for the first time the potentiality of grammar as an enabling element in writing
development and evidences a clearly theorised role for grammar in writing pedagogy’
(Myhill et al., 2011, p. 162). The overall claim that an experiment had demonstrated
that grammar teaching could have a positive impact on secondary pupils’ writing
skills (Myhill et al., 2011)1 was in opposition to the conclusions of a seminal
experiment published in 1976 that had evaluated a similar grammar intervention to improve
writing (Elley et al., 1976). This experiment had concluded that grammar teaching
did not have a positive effect on writing.

Table 1. A selection of key meta-analyses and influential single experimental studies (primary/elementary education)

Citation: Andrews et al. (2004a)
UK only or other countries: Other countries: worldwide
Type of pupils: Multiple types of pupils
Sample; age of pupils: Multiple samples; varied
Design: Systematic review
Duration of intervention: Multiple
Intervention summary: Multiple intervention types (e.g., generative grammar, exposure to story and standard English features, transformational grammar, traditional grammar, contextualised grammar)
Control summary: Multiple but not specified
Summary of main outcome and conclusion: Grammar teaching has virtually no impact on pupils’ writing. Teaching of syntax in English should cease to be part of the curriculum

Citation: Andrews et al. (2004b)
UK only or other countries: Other countries: worldwide
Type of pupils: Multiple types of pupils
Sample; age of pupils: Multiple types of pupils
Design: Meta-analysis
Duration of intervention: Multiple
Intervention summary: Sentence-combining
Control summary: Multiple but not specified
Summary of main outcome and conclusion: The National Curriculum in England should be revised to take into account that the teaching of sentence-combining is effective

Citation: Graham et al. (2012)
UK only or other countries: Other countries: worldwide
Type of pupils: Multiple types of pupils
Sample; age of pupils: Multiple samples; varied
Design: Meta-analysis
Duration of intervention: Multiple
Intervention summary: Multiple intervention types (‘e.g., comparisons were made to process writing, strategy instruction, and typical language arts instruction’, p. 887)
Control summary: Grammar instruction (‘e.g., students systematically studied parts of speech, diagrammed sentences, and so forth’, p. 881)
Summary of main outcome and conclusion: Teaching grammar does not improve pupils’ writing. A focus on a range of evidence-based approaches to teaching writing is more beneficial than grammar teaching for writing

Citation: Graham and Harris (2017)
UK only or other countries: Other countries: worldwide
Type of pupils: Multiple types of pupils
Sample; age of pupils: Multiple samples; varied
Design: Meta-analysis of meta-analyses
Duration of intervention: Multiple
Intervention summary: Multiple intervention types (‘e.g., comparisons were made to process writing, strategy instruction, and typical language arts instruction’, p. 887)
Control summary: Grammar instruction
Summary of main outcome and conclusion: Apart from the sentence combining approach, grammar teaching does not have a positive effect on pupils’ writing. There are a range of evidence-based approaches and strategies that can have a positive effect on pupils’ writing

Citation: Fogel and Ehri (2000)
UK only or other countries: Other countries: USA
Type of pupils: Two North-eastern US cities with sizeable populations of African-American residents
Sample; age of pupils: 89 African-American BEV-speaking 3rd- and 4th-grade elementary school students; age 8–10. Twelve intact 3rd- and 4th-grade elementary school classes
Design: RCT
Duration of intervention: Two sessions of about 60 minutes each
Intervention summary: Intervention group 1 (Exposure E): focus on correcting non-standard forms of English common to Black English Vernacular. Intervention group 2 (Exposure plus strategy instruction ES): same as group 1 plus strategy instruction. Intervention group 3 (Exposure, strategy instruction plus practice ESP): same as group 2 plus practice
Control summary: Three experimental conditions
Summary of main outcome and conclusion: The combination of exposure to correcting non-standard grammar, strategy instruction and practice, at least at paragraph level, is beneficial for writing

Citation: Saddler and Graham (2005)
UK only or other countries: Other countries: USA
Type of pupils: Nine classrooms/three schools. More skilled writers vs. less skilled writers
Sample; age of pupils: 44 pupils; 9–10. 4th grade
Design: RCT
Duration of intervention: 30 lessons
Intervention summary: Sentence-combining
Control summary: Traditional grammar. Grammar skills: parts of speech. Precision of student vocabulary in their writing. The instructor modelled, while thinking aloud, how to apply the target part of speech for that unit. This involved showing the students a sentence with the target part of speech missing and reading the sentence aloud
Summary of main outcome and conclusion: Sentence-combining with peer assistance had a positive effect on pupils’ writing

Citation: Torgerson et al. (2014)
UK only or other countries: UK
Type of pupils: 53 primary schools from four geographical regions across England
Sample; age of pupils: Estimated 2549–2649; age 10–11. Year 6
Design: RCT
Duration of intervention: 15 lessons over four weeks
Intervention summary: Contextualised grammar
Control summary: Pupils randomised to the ‘business-as-usual’ group received their usual literacy lesson as planned by their teacher
Summary of main outcome and conclusion: Contextualised grammar was not effective in improving pupils’ writing as a whole-class intervention

Note to Table 1: The two meta-analyses by Andrews et al. address primary and secondary education.

Table 2. A selection of key meta-analyses and influential single experimental studies (secondary education)

Citation: Graham and Perin (2007a)
UK only or other countries: Other countries: worldwide
Type of pupils: Multiple types of pupils
Sample; age of pupils: Multiple samples; varied
Design: Meta-analysis
Duration of intervention: Multiple
Intervention summary: Multiple intervention types: sentence-combining; process writing; expository skills; creative thinking and sentence-combining
Control summary: Grammar instruction, traditional grammar instruction and one study of grammar instruction in context
Summary of main outcome and conclusion: Teaching grammar does not improve pupils’ writing. There are a variety of instructional procedures that improve the quality of the writing of adolescent students

Citation: Graham and Perin (2007b)
UK only or other countries: Other countries: worldwide
Type of pupils: Multiple types of pupils
Sample; age of pupils: Multiple samples; varied
Design: Meta-analysis
Duration of intervention: Multiple
Intervention summary: Multiple intervention types including: strategy instruction; process approach; group work; grammar instruction in context; reading and writing
Control summary: Traditional grammar instruction (no further specification)
Summary of main outcome and conclusion: Grammar teaching can have a negative effect on pupils’ writing. A focus on a range of evidence-based approaches to teaching writing is more beneficial than grammar teaching for writing

Citation: Graham and Perin (2007c)
UK only or other countries: Other countries: USA
Type of pupils: Multiple types of pupils
Sample; age of pupils: Multiple samples; varied
Design: Meta-analysis. Includes review of single subject designs (i.e., not RCTs)
Duration of intervention: Multiple
Intervention summary: Multiple intervention types: free time; mini-lessons; direct method training on errors; goal-setting and attributional feedback
Control summary: Traditional grammar [‘This approach involved the explicit and systematic teaching of grammar (e.g., the study of parts of speech and sentences)’, p. 319]
Summary of main outcome and conclusion: There are serious questions about the value of traditional school grammar

Citation: Bateman and Zidonis (1966)
UK only or other countries: Other countries: USA
Type of pupils: University school of the University of Ohio
Sample; age of pupils: 50 pupils; age 14–16. 9th and 10th grade
Design: RCT
Duration of intervention: Two years; experimental condition introduced into normal teaching
Intervention summary: Teaching not specified very clearly, but the Chomsky transformational approach appears to be sentence-combining: ‘Forty-six transformational rules served to identify the grammatical operations that each sentence in the sample reflected. These rules are of four types: Embedding, Conjoining, Deleting, and Simple’ (p. 8)
Control summary: ‘Each class [intervention and control] studied what would be considered the regular curriculum at the school with this exception: the experimental class studied materials specially adapted by the investigators from the area of generative grammar’ (p. 7)
Summary of main outcome and conclusion: Sentence-combining based on Chomsky’s ‘generative grammar’ can help formation of well-formed sentences

Citation: Myhill et al. (2011)
UK only or other countries: UK
Type of pupils: Standard classes
Sample; age of pupils: 32 Year 8 mixed-ability classes in comprehensive schools with between 24 and 30 students in each class; 744 total (412 pupils in intervention classes; 332 in control classes); age 12–13
Design: RCT
Duration of intervention: One year, three-week period once per term
Intervention summary: Contextualised grammar: ‘comprised detailed teaching schemes of work in which grammar was embedded where a meaningful connection could be made between the grammar point and writing’ (p. 146)
Control summary: Normal teaching that included attention to ‘using grammar accurately and appropriately...’ (p. 146)
Summary of main outcome and conclusion: Contextualised grammar teaching has a positive effect on pupils’ writing

Citation: O’Hare (1973)
UK only or other countries: Other countries: USA
Type of pupils: Florida State University High School
Sample; age of pupils: 83 pupils (41 in intervention groups; 42 in control groups); age 12–13. 7th grade
Design: RCT
Duration of intervention: 19 lessons
Intervention summary: Sentence-combining
Control summary: Regular curriculum in English
Summary of main outcome and conclusion: Sentence-combining, not formal knowledge of grammar, has a favourable effect on pupils’ writing

Citation: Elley et al. (1976)
UK only or other countries: Other countries: New Zealand
Type of pupils: Outskirts of Auckland city; one large co-educational high school
Sample; age of pupils: 248 pupils; age 13–17. 3rd form to 6th form
Design: Quasi-experiment
Duration of intervention: Approximately 574 periods of English in three years
Intervention summary: Contextualised grammar
Control summary: Group 1: ‘Reading–Writing’. Group 2: ‘Let’s Learn English’
Summary of main outcome and conclusion: A range of types of grammar teaching showed no overall benefits for pupils’ writing

Citation: Fearn and Farnan (2007)
UK only or other countries: Other countries: USA
Type of pupils: Urban high school
Sample; age of pupils: All three classes contained 24 to 26 10th-graders; age 15–16
Design: Quasi-experiment
Duration of intervention: Five weeks, 10 to 12 minutes twice per week
Intervention summary: Not completely clear; relatively formal grammar teaching and some contextualised work
Control summary: Traditional grammar
Summary of main outcome and conclusion: Mixture of formal grammar and contextualised grammar can enhance writing. In the context of high-stakes grammar tests, teachers’ focus should be a linguistic one on the organisation and reorganisation of words as part of writing

Citation: Harris (1962)
UK only or other countries: UK
Type of pupils: Five schools (two grammar schools, one secondary modern and one comprehensive/technical)
Sample; age of pupils: 109 pupils in non-grammar classes; 119 in grammar classes; age 11–13
Design: Quasi-experiment
Duration of intervention: Two years; experimental condition introduced into normal teaching
Intervention summary: Formal grammar: explicit learning and naming of parts; corrections referred to using technical vocabulary such as failure of verb and subject agreement. Included the contextualised element of practicing the use of the form in whole-text context
Control summary: Extension of normal composition practice enabling the completion of longer projects such as diary, newspaper, story, etc.
Summary of main outcome and conclusion: Formal grammar teaching does not have a positive effect on pupils’ writing
The Elley and Myhill studies have been selected for detailed comparison in this
paper because both were experimental trials (one a RCT, the other a QED), both
were regarded as having significant wider impact, including political and profes-
sional impact, and both were published in peer-reviewed research journals. Elley
et al.’s (1976) quasi-experiment has been regarded as one of the most rigorous in
the field, having been reprinted by the US National Council of Teachers of English
(NCTE) because it was ‘so important [and] a model of evaluation’ (Elley
et al., 1976, p. 5). The impact of Myhill et al.’s (2011) randomised experiment
was recognised by the UK Economic and Social Research Council because it
‘shaped policy and curriculum development in England - including the first author
leading the advisory group of four writing the Grammar Annex of the Primary
English curriculum; participation in the KS2 English Test team; and providing
expert testimony in discussions of the English curriculum revision with the Minis-
ter of State for Schools (2012) . . . Professor Myhill also provided evidence for the
new Australian curriculum’ (Economic and Social Research Council, 2016). As
will be demonstrated in detail below, each of these studies had relative strengths
and limitations, not least in their basic design; however, given the lack of
any other experimental research addressing the same teaching approaches and
having had the same reach and significance as these two studies, we consider such
a comparison relevant. We do acknowledge, however, the challenges and limita-
tions in making the comparison, given the differences between the two studies, in
particular, in terms of the countries and years in which they were undertaken and
published.
Below we discuss the two studies in detail, in terms of their intervention and control
conditions and the features and components of their design, and assess their main
methodological strengths and limitations.
Teaching methods for the control and intervention groups
An important consideration for any experimental trial, or a systematic review of trials
(and for our comparison in this paper) is that the nature of the teaching methods is
clearly specified in the publication, and is a suitable comparison, including a compar-
ison with at least one appropriate control group. In both the Elley and Myhill studies
a form of contextualised teaching of grammar was one of the interventions evalu-
ated.2
For the Elley study one of the intervention groups used an approach called the
Transformational Grammar (TG) course, based on Jerome Bruner’s concept of the
spiral curriculum. In this intervention group all the activities ‘were related to the cen-
tral core of each strand of the curriculum, thus giving it [the teaching approach] a
clear and consistent unity of purpose’ (p. 8). The TG intervention included the three
strands of (a) Grammar (Transformational), (b) Rhetoric and (c) Literature.
One control group in the Elley study used an approach called ‘Reading–Writing’,
which included rhetoric and literature (as did the TG intervention) but replaced
transformational grammar with extra reading and creative writing. The other
control group used an approach called ‘Let’s Learn English’: a traditional approach
to grammar including the learning of parts of speech and some applications of them.
The Elley intervention and control groups ‘had approximately 574 periods of Eng-
lish in the three years, distributed such that each class had similar proportions of
morning and afternoon periods, and of time spent on literature, on composition work,
and evaluation exercises’ (p. 10). Although it was claimed that ‘no detectable bias
was apparent in their approach to their teaching of any of the [grammar] courses’ (p.
10), there was no attempt to establish fidelity to the interventions, which is a signifi-
cant limitation of this study.
In the Myhill study, the intervention took place over three weeks per term for one
school year: ‘for both the intervention and comparison groups, the learning focus, the
period of study, the learning objectives and the assessed written outcomes were the
same’ (p. 147). For the intervention group the teaching designed by the project team
‘explicitly sought to introduce grammatical constructions and terminology at a point
in the teaching sequence which was relevant to the genre being studied’ (p. 148). The
intervention and control groups were both taught the same writing genre over a three-
week period once per term of the year of study. The teaching in both groups also
addressed the same learning objectives from England’s national framework for Eng-
lish that was being implemented at the time. The intervention in the Myhill study
‘comprised detailed teaching schemes of work in which grammar was embedded
where a meaningful connection could be made between the grammar point and
writing’ (p. 146). The Myhill intervention was based on the following principles:
• The grammatical meta-language is used but it is always explained through exam-
ples and patterns.
• Links are always made between the feature introduced and how it might enhance
the writing being tackled.
• The use of ‘imitation’: offering model patterns for students to play with and then
use in their own writing.
• The inclusion of activities which encourage talking about language and effects.
• The use of authentic examples from authentic texts.
• The use of activities which support students in making choices and being designers
of writing.
• The encouragement of language play, experimentation and games.
(Myhill et al., 2011, p. 148)
There are two issues with the specification of teaching approaches in the interven-
tion and control groups in the Myhill study. Firstly, in each term of delivery both
intervention and control groups experienced teaching where ‘. . . using grammar accu-
rately and appropriately. . .’ was a pre-planned objective in the scheme of work. In the
intervention groups: ‘The intervention comprised detailed teaching schemes of work
in which grammar was embedded where a meaningful connection could be made
between the grammar point and writing’ (p. 7). The control groups did receive some
grammar teaching, as the teaching objectives used by both intervention and control
groups specify: ‘Autumn Term/Narrative Fiction/Using grammar accurately and
appropriately’ (p. 7, emphasis added). Secondly, as in the Elley study, there were no
checks for fidelity in either condition: ‘Fidelity is a problematic concept in a naturalis-
tic educational setting such as this, as identical implementation of the intervention
teaching materials is neither possible nor desirable. Teachers [in the intervention]
were not asked to follow the lesson plans rigidly; they were allowed to adapt materials
to suit the needs of their students, but were also asked to remain as close as possi-
ble to the materials’ (p. 9). So it is possible that the grammar teaching delivered by
the teachers in the control condition included contextualised teaching of the Myhill
kind; or that they used formal grammar; or more probably that there was a mixture of
approaches. As a result the specific role of grammar was not isolated in the trial. It
cannot be definitively claimed that it was the grammar that was effective, or not effec-
tive, in either the Myhill or the Elley studies because it could have been a range of fac-
tors, including simply better teaching as a result of the training (i.e. the Hawthorne
effect).
Site, sampling, design and allocation to groups
The Elley study took place in one large co-educational high school on the outskirts of
Auckland city. At the start it involved 248 pupils in eight matched classes of average
ability who were taught, observed and regularly assessed from the beginning of third-
form year in February 1970 to the latter part of fifth-form year in November 1972.
The results of the reading test, the assessment of the distribution of fathers’ incomes,
the secondary certificate of education exam results and the inclusion of 15% Polyne-
sian pupils indicated a so-called ‘normal’ sample. Elley noted that, ‘At the outset, one
bright and three slow-learning classes were deliberately excluded from the total third-
form intake of 380 pupils, thus rendering it more homogeneous, and increasing the
chance of identifying systematic differences between groups’ (p. 7). The experimental
pupils ‘were classified into eight matched classes of 31 pupils’ on the basis of a num-
ber of tests, and additional matching criteria were ‘ethnic group, sex, contributing
school, and subject options’ (p. 7). Although the pupils were allocated as individuals
to the eight classes, the study—after this allocation—works as a cluster trial as the
pupils in the eight classes were taught together. The three experimental groups con-
tained three, three and two classes respectively, and the pupils were tested during the
intervention period and at the end.
Limitations of the sampling and grouping in the Elley study include: the lack of
random allocation to groups; the small sample size of eight classes or clusters in total
split between three groups (statistical methodologists state that, as a minimum, there
should be four clusters per group in a cluster randomised trial; Donner & Klar,
2000); and the fact that it was undertaken in only one school, thereby reducing
external validity. This latter issue introduces the possibility of ‘contamination’
or ‘spill over’ of the intervention and control conditions between the groups; whether
this occurred is not clear. The lack of random allocation is important because random
allocation minimises any selection bias at the start of the experiment. In a
quasi-experiment matching is sometimes used to ensure baseline
equivalence, as in this case. The classes were matched on a number of variables
including performance on a number of pre-tests. However, Elley et al. did not report
the results of the matching and, therefore, we have to take on trust that the classes
were, in fact, matched on the observed variables. Also, matching cannot account for
imbalance on unknown variables, which can in turn introduce a potential source of
bias which could affect outcome. Furthermore, Elley and colleagues did not adjust
for the clustering in their analysis and instead analysed their data as though this was
an individually allocated quasi-experiment; although they made some attempts to
control for teacher effect, given the small sample size (see above), this would not have
been possible. This study also suffered from high attrition of pupils—over 30% by the
final follow-up in Year 3.
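To illustrate why an unadjusted analysis of clustered data overstates precision, the following minimal sketch applies the standard design-effect formula, 1 + (m - 1) x ICC, to a trial shaped like the Elley study (eight classes of 31 pupils). The intra-cluster correlation (ICC) values are assumptions chosen purely for illustration, since none was reported, and the function names are hypothetical.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation caused by clustering: 1 + (m - 1) * ICC."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n_clusters: int, cluster_size: int, icc: float) -> float:
    """Number of independent observations the clustered sample is worth."""
    total = n_clusters * cluster_size
    return total / design_effect(cluster_size, icc)


if __name__ == "__main__":
    # Eight classes of 31 pupils (248 in total), as described for the Elley study.
    # The ICC values below are ASSUMED for illustration; none was reported.
    for icc in (0.0, 0.05, 0.10, 0.20):
        n_eff = effective_sample_size(8, 31, icc)
        print(f"ICC={icc:.2f}  design effect={design_effect(31, icc):.2f}  "
              f"effective n={n_eff:.0f}")
```

Under an assumed ICC of 0.10, the 248 pupils carry roughly the information of 62 independent pupils, which is why treating every pupil as an independent observation understates the uncertainty.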
In the Myhill study the authors identified a sample of 32 mixed comprehensive
schools from the South West and Midlands areas of England. Lists of schools from
local authorities were randomly sampled until the desired sample size was achieved.
Once the schools had been recruited, a Year 8 class was selected (with children aged
12 to 13 years) and the classes were stratified according to the teachers’ ‘Grammar
Subject Knowledge’ (GSK); the classes were then randomised using a random num-
ber generator. In these respects, the Myhill study is of higher design quality than the
Elley study: a random sample of schools in two geographical areas in the UK was used
to form the intervention and control groups, thereby increasing external validity. The
design was a large-cluster RCT, with school as the cluster, thereby minimising the
potential for contamination between groups.
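The general idea of stratifying classes and then randomising them can be sketched as follows. This is an illustration only: the source does not describe the exact allocation procedure, so the two strata, the alternating assignment within shuffled strata, the seed and all names here are assumptions.

```python
import random
from collections import defaultdict


def stratified_allocation(classes, seed):
    """Randomly allocate classes to two arms within each stratum.

    `classes` is a list of (class_id, stratum) pairs. Within each stratum the
    classes are shuffled and then assigned alternately, so the arms stay
    balanced on the stratifying variable.
    """
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    by_stratum = defaultdict(list)
    for class_id, stratum in classes:
        by_stratum[stratum].append(class_id)

    allocation = {}
    for stratum in sorted(by_stratum):
        members = sorted(by_stratum[stratum])
        rng.shuffle(members)
        for position, class_id in enumerate(members):
            allocation[class_id] = "intervention" if position % 2 == 0 else "control"
    return allocation


if __name__ == "__main__":
    # HYPOTHETICAL data: 32 classes, half taught by teachers with 'high' and
    # half with 'low' grammar subject knowledge. Not the real Myhill sample.
    classes = [(f"class_{i:02d}", "high" if i < 16 else "low") for i in range(32)]
    allocation = stratified_allocation(classes, seed=2017)
    for arm in ("intervention", "control"):
        print(arm, sum(1 for a in allocation.values() if a == arm))
```

Stratifying before randomising guards against chance imbalance on a variable thought to influence the outcome, while the allocation within each stratum remains random.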
Tests and measures
Data for the Elley study were collected in the form of a series of set essays at the end
of each year plus a battery of standardised tests. The essays were assessed by carefully
briefed panels of English teachers from neighbouring secondary schools. In the first
year each pupil wrote four essays, which were assessed by four markers working
independently and using a 16-point scale that included criteria for content,
organisation, style and mechanics. In subsequent years this was reduced to three essays
and two markers, apparently with no loss of reliability. The battery of tests included:
‘PAT’ reading comprehension and vocabulary tests (NZCER, 1969, Elley et al., 1976);
sentence-combining tests; error-correction tests; literature-appreciation tests; and
anonymous questionnaires to assess attitudes to work.
In the Myhill study a pre-test was administered to the pupils, and at the end of the
study a post-test was given. The test was a piece of first-person narrative ‘written
under controlled conditions’, encouraging the pupils to draw on their personal experi-
ences. The test design and marking ‘were led by Cambridge Assessment’. Each test
was marked by two people, and a third marker resolved any differences. The markers
did not know from which pupil group the pieces had originated (blinded assessment
of outcome). The marking was based on the mark scheme format used by secondary
schools at the time. The outcome was the change in the test scores using an ordinary
least-squares regression approach with pupil-level data. One control class did not
adhere to the intervention and was removed from the analysis.
Results and conclusions
The Elley intervention TG and ‘Let’s Learn English’ (LLE) grammar groups found
English more ‘repetitive’ and ‘useless’ than the control group did. The reading/writ-
ing (RW) group showed more positive attitudes to reading. The TG group was par-
ticularly negative about ‘sentence-study’. In the fourth year (14 to 15 year olds),
only one comparison (from 30 possible) showed significant differences (on essay
content). In the School Certificate Examination there were no significant differ-
ences between the three programmes. In the fifth year (15 to 16 year olds), only 2
of the 12 variables listed showed any significant differences (sentence-combining
test and English usage test). Again, in the School Certificate Examination there
were no significant differences between the three groups. Overall, TG and tradi-
tional grammar teaching showed no measurable benefits. Participants in the RW
group, who studied no formal grammar for three years, demonstrated competence
in writing and related language skills fully equal to that shown by the two grammar
groups. Elley et al. concluded that ‘English grammar, whether traditional or trans-
formational, has virtually no influence on the language growth of typical secondary
school students’ (p. 18). Elley et al. dismissed the idea of the introduction of gram-
mar at primary level mainly based on developmental theory: ‘it seems most unlikely
that such training would be readily applied by children in their own writing. Fur-
thermore, the researchers’ empirical findings do not support the early introduction
of grammar’ (p. 18).
In addition to a wide range of findings that included analysis of teacher subject
knowledge, the Myhill study found a ‘highly significant’ positive difference in marks
in favour of the intervention groups, and concluded that ‘this represents the first
robust statistical evidence for a beneficial impact of the teaching of grammar in stu-
dents’ writing attainment’ (p. 151). The authors also concluded that:
the study represents the first large-scale study in any country of the benefits or otherwise of
teaching grammar within a purposeful context in writing. It stands in contrast to previous
studies which were either small-scale (Bateman & Zidonis, 1966; Fogel & Ehri, 2000) or
which investigated whether discrete grammar instruction improved writing outcomes
(Elley et al., 1975, 1979), and is the only study of its kind conducted in England. (p. 161)
As we demonstrated earlier, it was not strictly accurate to claim that the Elley study
used ‘discrete grammar instruction’ as its comparator. The issue of scale is also inter-
esting. It is true that the number of students involved in the Myhill study was the lar-
gest to date, but what is important is not the scale per se but the quality, and power, of
the design of any study. Scale is also implicated in the consideration of the results of
just one study versus the combined results of many studies, an approach that is at the
heart of systematic review and meta-analysis.
Quality of the methodology of the studies
It is important to note that both studies were undertaken in secondary schools, not in
primary/elementary schools; their findings therefore cannot reliably be generalised
to defend any decisions made for primary/elementary education.
The Elley study, as reported, had a number of limitations. Its design was a quasi-
experiment as it did not use random allocation to assign students to classes. In
addition, the sample of this cluster trial was small, leaving the study underpowered. It
was also limited by the fact that it was undertaken in only one school. Other issues that
undermine the validity of the study include not stating how the students were allo-
cated into the groups and not stating whether the tests were administered and marked
blind to allocation to minimise potential bias.
The Myhill study, as reported, also had a number of limitations. The authors
did not use an intention to treat (ITT) method of analysis. Removal of a non-
compliant class from the analysis potentially biased the results. This is because
that particular teacher and class were likely to be systematically different from
those who remained in the study. Randomisation ensures differences are balanced
between the two groups at baseline. Removal of a class from one group, post-ran-
domisation, reintroduces the potential for the selection bias that the randomisation
had previously dealt with by ensuring classes were similar between the two
groups at randomisation. The second limitation is the bias in the standard errors.
As the authors acknowledge, their study was a cluster randomised trial and,
although they mention the need to adjust for clustering, they argue that, because
there was only one class per school, this was not necessary. Consequently, they
treated the sample as having several hundred independent observations rather than
32 (or 31 after removal of the non-compliant teacher) clustered observations.
Other issues that potentially threaten the validity of the study include: not
describing who did the randomisation and not stating whether this was done
independently of the investigators (developers); and not stating whether the pre-
tests were done before random allocation to minimise potential bias from the par-
ticipants having knowledge of the allocation before undertaking the pre-test.
However, all of the limitations observed in the Myhill study were also possibly
present in the Elley study but, due to some limitations in the reporting of that
study, it is not possible to make a judgement about, for example, whether or not
ITT analysis was used.
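The consequence of treating pupils rather than classes as independent observations can be sketched with a simple cluster-level summary analysis, in which each class contributes one mean score. This is one accepted way of respecting clustering (multilevel models are another) and is not the analysis used in either study; the scores below are entirely hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# HYPOTHETICAL writing scores: three classes per arm, four pupils per class.
INTERVENTION_CLASSES = [[12, 14, 16, 14], [10, 12, 11, 11], [15, 17, 16, 16]]
CONTROL_CLASSES = [[11, 13, 12, 12], [9, 11, 10, 10], [13, 15, 14, 14]]


def pooled_t(group_a, group_b):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


def flatten(classes):
    """Pupil-level view: ignores which class each pupil belongs to."""
    return [score for scores in classes for score in scores]


def class_means(classes):
    """Cluster-level view: one observation per randomised unit."""
    return [mean(scores) for scores in classes]


if __name__ == "__main__":
    naive = pooled_t(flatten(INTERVENTION_CLASSES), flatten(CONTROL_CLASSES))
    clustered = pooled_t(class_means(INTERVENTION_CLASSES), class_means(CONTROL_CLASSES))
    print(f"pupil-level analysis:   t={naive[0]:.2f}, df={naive[1]}")
    print(f"cluster-level analysis: t={clustered[0]:.2f}, df={clustered[1]}")
```

With these made-up numbers the same mean difference yields a t statistic of about 1.91 on 22 degrees of freedom when pupils are treated as independent, but about 0.90 on 4 degrees of freedom when classes are the unit of analysis.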
To conclude our in-depth analysis of the methodology of these single trials we look
finally at a more recent study addressing the question of whether the Myhill et al. con-
textualised approach was effective and generalisable to the oldest children in primary
schools. The study by Torgerson et al. (2014) was carried out as an independent fol-
low-up trial funded by the EEF. This trial was aimed at pupils in the ‘transition per-
iod’ between primary and secondary school [last term of Year 6 (age 10–11) and first
term of Year 7 (age 11–12)]. The Torgerson et al. study has not yet demonstrated levels
of impact and significance similar to those of the studies by Elley et al. and Myhill et al.,
but the comparison is relevant because the nature of the grammar intervention evalu-
ated in this RCT was the Myhill approach. The inclusion of primary age pupils in the
Torgerson study is also important for our argument in this paper, although the
comparison with the Myhill study needs to be treated with caution due to the differ-
ence in participant characteristics.
The design of the Torgerson study was a pragmatic ‘partial split plot’ RCT.
Schools were randomised at the cluster level (similar to the original Myhill design). In
the intervention schools, children were additionally randomised as individuals to
receive the grammar teaching as a whole class or in small groups. This allowed the
evaluators to test whether small-group teaching was effective, as well as whether
grammar teaching per se was effective. Unlike in the Myhill study, the evaluators took
the clustered nature of the data into account in the analysis and also undertook an
ITT analysis, whereby all schools and pupils were included in the analysis, irrespec-
tive of their level of intervention compliance. The results showed that there was a
small, statistically non-significant effect of grammar teaching on literacy outcomes. In
contrast, the small-group teaching delivered a modest, statistically significant effect
on literacy outcomes. Indeed, when the small-group effect was removed from the
grammar teaching by comparing the whole classes in the intervention against the
whole-class control group, the small difference declined from 0.10 of a standard devi-
ation to 0.06. Therefore, the results of this trial suggest, at best, only a very small
effect of grammar teaching on literacy outcomes. However, although the study did
use an ITT analytical strategy, and the correct statistical approach, this was imple-
mented among children during the ‘transition’ from primary to secondary school,
which could have led to an underestimation of the teaching effectiveness, due to the
summer break from attendance at school.
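For readers less familiar with effects expressed as fractions of a standard deviation, the sketch below computes a simple standardised mean difference (Cohen's d with a pooled standard deviation). It is a simplification: the trial's own estimates came from an analysis that accounted for clustering, and the scores used here are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def standardised_mean_difference(intervention, control):
    """Difference in means divided by the pooled standard deviation (Cohen's d)."""
    n_i, n_c = len(intervention), len(control)
    pooled_var = ((n_i - 1) * variance(intervention)
                  + (n_c - 1) * variance(control)) / (n_i + n_c - 2)
    return (mean(intervention) - mean(control)) / sqrt(pooled_var)


if __name__ == "__main__":
    # HYPOTHETICAL literacy scores, not data from any of the trials discussed.
    intervention = [52, 48, 55, 61, 44, 50]
    control = [51, 47, 54, 60, 43, 49]
    d = standardised_mean_difference(intervention, control)
    print(f"standardised mean difference = {d:.2f}")
```

On this scale, differences of 0.10 and 0.06 correspond to a tenth or less of one standard deviation of pupil scores.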
Discussion and conclusions
The last 30 years have shown a gradual increase in the use of experimental trials in
education research. Greater understanding of the strengths and weaknesses of research
designs is evident in more recent research studies. This greater understanding is
reflected, for example, in the combining of experimental trials with qualitative meth-
ods including implementation process evaluations or embedded ethnography. In gen-
eral, these developments reflect growing sophistication in education research and
social-science research more generally.
Although the numbers of robust experimental trials relevant to effective teaching in
schools have increased, our analysis of trials in relation to the teaching of writing sug-
gests that there are still too many studies that are not of sufficient methodological
quality. In particular, too many studies are weak in relation to allocation of pupils to
groups, and the measures for writing remain a challenge. Randomisation, to form two
or more intervention and control groups, is essential to ensure that the groups are bal-
anced in known and unknown factors that may affect writing outcomes. Randomisa-
tion could be by pupil, by class, by school year or by school. The higher the unit of
allocation (e.g. school versus pupil), the lower the efficiency (in statistical terms) of
the design. In other words, all things being equal, it is necessary to have more children
in a design that randomises at a level above the pupil to see a given difference (if one
exists) that would be statistically significant. The main weakness of randomisation at
the level of the child is contamination or spill-over effects, and the logistics of allocat-
ing pupils in ways that are different from the normal ways that schools allocate pupils
to classes, hence the use of group or cluster randomisation of schools. A ‘business-as-
usual’ control group is often appropriate in a pragmatic trial; however, it is useful to
consider additional interventions leading to three or more arms to the trial if there
were other competing interventions that could potentially improve writing skills.
Writing outcome measures ideally need to include robust measures for improve-
ments in writing composition even if the focus is, for example, development of gram-
mar. There is a need to know that the holistic aspects of writing are being enhanced,
not just the key components. Such an outcome measure should be administered and
marked by independent assessors for whom the allocation of teaching approaches to
groups is not known (the markers are ‘masked’). This prevents either conscious or
unconscious marking bias of the outcomes.
When the data are all collected and collated it is important to analyse the data as if
all the pupils had received the intervention to which they were allocated, whether they
did or did not indeed receive the intervention (adopting ‘intention to treat’ or ‘inten-
tion to teach’ analysis; Torgerson & Torgerson, 2008). If schools that comply weakly
with the intervention are excluded from the analysis, this introduces the potential for
selection bias which the original randomisation minimised. There are statistical tech-
niques for looking at the effect of low compliance but removing weakly or non-com-
pliant classes or schools is not one of them.
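The difference between an intention-to-treat analysis and one that excludes non-compliers can be sketched with a toy example. All records below are hypothetical, and the simple difference in arm means stands in for whatever model a real trial would use.

```python
from statistics import mean

# HYPOTHETICAL records: (school, allocated arm, complied with allocation?, outcome score)
RECORDS = [
    ("A", "intervention", True, 56),
    ("B", "intervention", True, 54),
    ("C", "intervention", False, 44),
    ("D", "intervention", True, 55),
    ("E", "control", True, 50),
    ("F", "control", True, 52),
    ("G", "control", True, 47),
    ("H", "control", True, 49),
]


def arm_difference(rows):
    """Mean outcome in the intervention arm minus mean outcome in the control arm."""
    intervention = [score for _, arm, _, score in rows if arm == "intervention"]
    control = [score for _, arm, _, score in rows if arm == "control"]
    return mean(intervention) - mean(control)


def intention_to_treat(rows):
    """Analyse every unit in the arm to which it was randomised."""
    return arm_difference(rows)


def excluding_non_compliers(rows):
    """Drop non-compliant units first: the approach the text warns against."""
    return arm_difference([row for row in rows if row[2]])


if __name__ == "__main__":
    print("intention to treat:     ", intention_to_treat(RECORDS))
    print("excluding non-compliers:", excluding_non_compliers(RECORDS))
```

In these made-up data the non-compliant school also has the lowest score, so excluding it doubles the apparent effect (from 2.75 to 5.5 points), which illustrates how post-randomisation exclusions can reintroduce selection bias.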
With regard to our substantive case of grammar, the current evidence from ran-
domised controlled trials does not support the widespread use of grammar teaching
for improving writing among native English-speaking children. Based on the experi-
mental trial and meta-analysis evidence about writing teaching more generally (e.g. in
Tables 1 and 2), our hypotheses are that supporting primary/elementary pupils’
grammar is most likely to require teachers intervening during the writing process, and
interacting to discuss the use of grammar in relation to the overall purpose of the writ-
ing task and the purpose of the writing. The necessity to use technical terms with
pupils, such as subordinate clause or subjunctive, remains a question open to
research, but it is doubtful that attention to such terms is beneficial. It is probable that
adopting everyday language to discuss improvements in the use of grammar in writing
will be more beneficial. Small-group and whole-class teaching that includes a focus
on the actual use of grammar in real examples of writing (including professionally
produced pieces, realistic examples produced by teachers including ‘think aloud’ live
drafting of text and drafts of pupils’ writing) may also be more effective.
When the decisions taken by, and for, schools and teachers about what approaches
to adopt are informed by research, there are important choices to be made. Although
grammar for writing has been a main focus of this paper, if the overall goal is to
improve pupils’ writing then a much wider set of research evidence about writing
needs to be considered. Improvements in pupils’ writing have to be achieved across
many different dimensions. For example, robust evidence has shown that an
approach with primary age pupils that used strategy instruction (itself an approach
backed by robust multiple trial evidence), combined with pupils’ experience of offsite
visits to places of educational interest, had powerful effects. This work had its origins
in the USA, but an evaluation using RCT design undertaken in England confirmed
its transferability to a different national context, although the trial was relatively small
and the results need to be confirmed in a larger effectiveness trial (Torgerson &
Torgerson, 2014). However, once again this work is but one study and one approach.
The most recent meta-analyses of high-quality research studies on writing suggest
that, rather than emphasise grammar, the following practices could be selected as a
priority for teaching writing in primary/elementary education: (a) an increase in the
amount of time that pupils have for writing; (b) adoption of a process approach to
writing; (c) creation of a classroom environment that is appropriately supportive of
pupils’ attempts at learning to write better; (d) development of pupils’ writing skills,
strategies and knowledge, including ways of planning writing; (e) use of assessment
for learning techniques; (f) use of computers as part of the process of writing; (g)
meaningful use of writing across different subject areas (Graham et al., 2016; Wyse,
2017). The robustness of the evidence underpinning these practices is built not on
single studies but on multiple RCTs and experimental trials.
The mismatch between curriculum policy for the subject English and the research
evidence base is particularly pronounced at primary/elementary level in England. The
national curriculum in England and its associated national statutory tests include a
heavy emphasis on formal grammar teaching, and to varying degrees the national cur-
ricula in other English-speaking countries also have an emphasis on formal grammar
teaching. Sentence-combining remains the only approach to grammar for writing that
is supported by robust research evidence from experimental trials, although there are
no RCTs that have been undertaken in the UK. The use of sentence-combining as
part of the process of writing would be a good area for new research.
In relation to the use of evidence to guide policy, a key risk is for policy makers and
their advisors to attend too closely to single studies, within a field of interest, that
might support a preferred policy direction rather than take due account of multiple
studies published over many years. The problems of attending to a single study have
been seen in relation to the teaching of reading in the UK (Wyse & Goswami, 2008;
Ellis & Moss, 2013); this, in addition to ideological belief, appears to be a reason for
the dramatic emphasis on grammar in England’s primary national curriculum that
was implemented from 2014 onwards, a trend that is counter to the research evidence
overall, and one that risks having a negative impact on children’s literacy learning and
hence life chances. The outcomes of reviews of multiple studies, including systematic
review and meta-analysis and high-quality narrative reviews, are a much more reliable
evidence-base for policy decisions than single studies. But this kind of evidence also
requires mediation by experts who possess substantive, methodological and
practical knowledge and experience.
Although policy makers and politicians around the world have engaged with the
importance of research evidence, for example in the prioritisation of evidence-based
practices based on RCTs, there is a resulting need for policy to accurately reflect the
outcomes of robust reviews of multiple sets of evidence. Such reviews may indicate
that a policy should be in a direction that is contrary to a minister’s ideology and per-
sonal beliefs. At other times there may not be sufficient research evidence to warrant
a particular policy decision in any direction: in these cases there is the option to fur-
ther prioritise schools’ autonomy and teachers’ professional judgement. Better poli-
cies are likely to be made in future if policy decisions are informed by expert critical
synthesis of multiple robust research studies, including systematic reviews and meta-
analyses, relevant to the contexts of implementation. Finally, a necessary
consequence of the kind of attention to research evidence that we advocate may mean
that curriculum policy should change more slowly and more incrementally because
accumulation of the multiple studies that are required to warrant decisions in impor-
tant areas such as the teaching of writing takes many years.
NOTES
1 For brevity, in the rest of the paper we refer to this study as the ‘Myhill study’ and the Elley et al. study as the ‘Elley study’.
2 Contrary to Myhill et al.’s claim that the Elley study did not include contextualised grammar teaching as one of the interventions, it is evident from the description in the Elley paper of the TG approach, inspired by Bruner’s spiral curriculum, that it did, as we show in this paper (also confirmed in a personal communication with Warwick Elley in 2013).
References
Andrews, R., Torgerson, C., Beverton, S., Locke, T., Low, G., Robinson, A. & Zhu, D. (2004a)
The effect of grammar teaching (syntax) in English on 5 to 16 year olds’ accuracy and quality
in written composition, in: Research evidence in education library (London, Institute of Educa-
tion).
Andrews, R., Torgerson, C., Beverton, S., Freeman, A., Locke, T. & Low, G. et al. (2004b) The
effect of grammar teaching (sentence combining) in English on 5 to 16 year olds’ accuracy and
quality in written composition, in: Research evidence in education library (London, Institute of
Education).
Australian Curriculum Assessment and Reporting Authority (2017) Australian Curriculum: English.
Available online at: www.australiancurriculum.edu.au/english/structure (accessed 27 January
2017).
Bateman, D. R. & Zidonis, F. J. (1966) The effect of a study of transformational grammar on the writing
of ninth and tenth graders (Champaign, IL, National Council of Teachers of English).
Bew, P. (2011) Independent review of Key Stage 2 testing, assessment and accountability. Final Report
(London, Department for Education).
Bonell, C., Fletcher, A., Morton, M., Lorenc, T. & Moore, L. (2012) Realist randomised
controlled trials: A new approach to evaluating complex public health interventions, Social Science
and Medicine, 75, 2299–2306.
British Educational Research Association (2012) Background to Michael Gove’s response to the Report
of the Expert Panel for the National Curriculum Review in England. Available online at:
www.bera.ac.uk/promoting-educational-research/issues/background-to-michael-goves-response-to-the-report-of-the-expert-panel-for-the-national-curriculum-review-in-england
(accessed 27 January 2017).
Connolly, P. (2015, September) The trials of evidence-based practice in education. Keynote address at
the British Educational Research Association Annual Conference, Queen’s University Belfast.
Crystal, D. (2004) The stories of English (London, Penguin/Allen Lane).
Department for Education (2010) The importance of teaching: The schools White Paper 2010 (Nor-
wich, The Stationery Office).
Department for Education (2013a) The national curriculum in England: Framework document.
December 2014 (London, Department for Education).
Department for Education (2013b) Reform of the national curriculum in England. Report of the consul-
tation conducted February–April 2013 (London, Department for Education).
Department for Education (2013c) Reforming the national curriculum in England. Summary report of
the July to August 2013 consultation on the new programmes of study and attainment targets from
September 2014 (London, Department for Education).
Donner, A. & Klar, N. (2000) Design and analysis of cluster randomization trials in health research
(London, Arnold).
Economic and Social Research Council (2014) Improving literacy with grammar methods. Available
online at: www.esrc.ac.uk/news-events-and-publications/impact-case-studies/improving-literacy-with-grammar-methods/
(accessed 27 January 2017).
Education Arena (n.d.) Expert interview with Debra Myhill. Available online at:
www.educationarena.com/searchResults/index.asp?cx=006001425124917276529%3A7n-kwidpf_c&cof=FORID%3A11&ie=UTF-8&q=hot+topic+debbie+myhill&sa=GO&siteurl=www.educationarena.com%2FeducContact.asp&ref=www.google.co.uk%2F&ss=5817j2852731j23
(accessed 27 January 2017).
Education Endowment Foundation (2017) The EEF’s approach to evaluation. Available online at:
educationendowmentfoundation.org.uk/our-work/the-eefs-approach-to-evaluation/ (accessed
27 January 2017).
Elley, W. B., Barham, I. H., Lamb, H. & Wyllie, M. (1976) The role of grammar in a secondary
school English curriculum, Research in the Teaching of English, 10, 5–21.
Ellis, S. & Moss, G. (2013) Ethics, education policy and research: The phonics question reconsidered,
British Educational Research Journal, 40(2), 241–260.
Fearn, L. & Farnan, N. (2007) When is a verb?, Using functional grammar to teach writing, Journal of
Basic Writing, 26(1), 63–87.
Fogel, H. & Ehri, L. C. (2000) Teaching elementary students who speak black English vernacular
to write in standard English: Effects of dialect transformation practice, Contemporary Educational
Psychology, 25, 212–235.
Goldacre, B. (2013) Building evidence into education (London, Department for Education).
Graham, S. & Harris, K. (2017) Evidence-based writing practices: A meta-analysis of existing
meta-analyses, in: R. Redondo & K. Harris (Eds) Design principles for teaching effective writing
(Leiden, BRILL).
Graham, S. & Perin, D. (2007a) A meta-analysis of writing instruction for adolescent students,
Journal of Educational Psychology, 99(3), 445–476.
Graham, S. & Perin, D. (2007b) Writing next: Effective strategies to improve writing of adolescents in
middle and high schools. Report to Carnegie Corporation of New York (Washington, D.C., Alliance
for Excellent Education).
Graham, S. & Perin, D. (2007c) What we know, what we still need to know: Teaching adolescents
to write, Scientific Studies of Reading, 11(4), 313–335.
Graham, S., McKeown, D., Kiuhara, S. & Harris, K. R. (2012) A meta-analysis of writing instruction
for students in the elementary grades, Journal of Educational Psychology, 104(4), 879–896.
Graham, S., Harris, K. & Chambers, A. (2016) Evidence-based practice and writing instruction: A
review of reviews, in: C. MacArthur, S. Graham & J. Fitzgerald (Eds) Handbook of writing
research (2nd edn) (New York, Guilford Press).
Hargreaves, D. (1996) Teaching as a research-based profession: Possibilities and prospects. The Teacher
Training Agency Annual Lecture, April 1996. Available online at: eppi.ioe.ac.uk/cms/Portals/0/
(accessed 4 April 2016).
Harris, R. J. (1962) An experimental inquiry into the functions and value of formal grammar in the
teaching of English, with special reference to the teaching of correct written English to children aged twelve to
fourteen. Ph.D. thesis, University of London.
House of Commons Education Committee (2017) Primary assessment. Eleventh Report of Session 2016–17.
Report, together with formal minutes relating to the report (London, House of Commons).
Johnson, B., Onwuegbuzie, A., de Waal, C., Stefurak, T. & Hildebrand, D. (2017) Unpacking
pragmatism for mixed methods research, in: D. Wyse, N. Selwyn, E. Smith & L. Suter (Eds)
The BERA/SAGE handbook of educational research (London, SAGE).
Kaestle, C. (1993) The awful reputation of education research, Educational Researcher, 22(1), 26–31.
Mansell, W. (2017, May 9) Battle on the adverbials front: Grammar advisers raise worries about
Sats tests and teaching, The Guardian. Available online at:
www.theguardian.com/education/2017/may/09/fronted-adverbials-sats-grammar-test-primary.
Moore, L., Graham, A. & Diamond, I. (2003) On the feasibility of conducting randomised trials in
education: Case study of a sex education intervention, British Educational Research Journal,
29(5), 673–689.
Morrison, K. (2001) Randomised controlled trials for evidence-based education: Some problems in
judging ‘what works’, Evaluation & Research in Education, 15(2), 69–83.
Myhill, D. & Watson, A. (2014) The role of grammar in the writing curriculum: A review, Journal
of Child Language Teaching and Therapy, 30(1), 41–62.
Myhill, D., Jones, S., Lines, H. & Watson, A. (2011) Re-thinking grammar: The impact of embedded
grammar teaching on students’ writing and students’ metalinguistic understanding,
Research Papers in Education, 27(2), 139–166.
National Governors Association Center for Best Practices Council of Chief State School Officers
(2010) Common Core State Standards (English Language Arts Standards). Available online at:
www.corestandards.org/ELA-Literacy/L/5/ (accessed 27 January 2017).
New Zealand Ministry of Education (2007) The New Zealand Curriculum: For English-medium teaching
and learning in years 1–13.
O’Hare, F. (1973) Sentence combining: Improving student writing without formal grammar instruction.
No. 15 in a series of research reports sponsored by the NCTE Committee on Research
(Urbana, IL, National Council of Teachers of English).
Pawson, R. & Tilley, N. (1997) Realistic evaluation (London, SAGE).
Saddler, B. & Graham, S. (2005) The effects of peer-assisted sentence-combining instruction on
the writing performance of more and less skilled young writers, Journal of Educational Psychology,
97(1), 43–54.
Shadish, W., Cook, T. & Campbell, D. (2002) Experimental and quasi-experimental designs for
generalized causal inference (Belmont, CA, Wadsworth).
Slavin, R. (2002) Evidence-based education policies: Transforming educational practice and
research, Educational Researcher, 31(7), 15–21.
Standards and Testing Agency (2015) 2016 national curriculum assessments. Interim teacher assessment
frameworks at the end of key stage 2 (London, Standards and Testing Agency).
Standards and Testing Agency (2016) Key stage 2 English grammar, punctuation and spelling Paper 1:
questions (London, Standards and Testing Agency).
Torgerson, D. & Torgerson, C. (2008) Designing and running randomised trials in health, education
and the social sciences (Basingstoke, Palgrave Macmillan).
Torgerson, D. J., Torgerson, C. J., Mitchell, N., Buckley, H., Ainsworth, H., Heaps, C. & Jefferson,
L. (2014) Grammar for writing: Evaluation report and executive summary (London, Education
Endowment Foundation).
UNESCO (2006) EFA global monitoring report 2006: Education for all (Paris, UNESCO).
United Nations (2017) Sustainable development goals. Available online at:
www.un.org/sustainabledevelopment/sustainable-development-goals/ (accessed 27 January 2017).
Walters, J. E. (1931) Seniors as counsellors, Journal of Higher Education, 2(8), 446–448.
Walters, J. E. (1932) Measuring effectiveness of personnel counselling, Personnel Journal, 11, 227–236.
Wilby, P. (2017, August 1) The quality of education policymaking is poor, The Guardian. Available
online at: www.theguardian.com/education/2017/aug/01/david-laws-education-policy-schools-minister-thinktank-epi?CMP=Share_iOSApp_Other.
Wyse, D. (2001) Grammar. For writing?: A critical review of empirical evidence, British Journal of
Educational Studies, 49(4), 411–427.
Wyse, D. & Goswami, U. (2008) Synthetic phonics and the teaching of reading, British Educational
Research Journal, 34(6), 691–710.
Wyse, D., Baumfield, V., Egan, D., Gallagher, C., Hayward, L., Hulme, M., et al. (2013) Creating
the curriculum (London, Routledge).
Wyse, D., Fentiman, A., Sugrue, C. & Moon, S. (2014) English language teaching and whole school
professional development in Tanzania, International Journal of Educational Development, 38, 59–68.
Wyse, D. (2017) How Writing Works: From the Birth of the Alphabet to the Rise of Social Media
(Cambridge, Cambridge University Press).