
RESEARCH ARTICLE

Child development assessment: Practitioner input in the revision for Griffiths III

Elizabeth M. Green1,2 | Louise Stroud1,2 | Candice Marx2 |

Johan Cronje2

1Association for Research in Infant and Child

Development, Birmingham, UK

2Department of Psychology, Nelson Mandela

University, Port Elizabeth, South Africa

Correspondence

Elizabeth M. Green, Association for Research

in Infant and Child Development,

Birmingham, UK.

Email: [email protected]

Funding information

Association for Research in Infant and Child

Development, Grant/Award Number:

1161043

Abstract

Introduction: The input from practitioners in developmental assessment test revision

is a crucial and leading component of the project. This paper highlights six key phases

of the Griffiths III revision process and the value of having a guiding plan that

includes test practitioner input.

Methods: The revision of the Griffiths III consisted of six separate phases that were

supported by practitioner and user input and feedback. These six phases and practi-

tioner views ensured that the necessary core constructs and new areas for item devel-

opment were included in the revised version. These processes also underscored the

construct development and task review, item design, piloting and standardization of

the revised version, as well as its production, release and subsequent training methods.

Results: The six guiding phases provided a methodologically robust frame to the revi-

sion process. Practitioners valued an overall developmental measure with discrete

data about and within the ‘avenues of learning’ allowing them to analyse a child's

strengths and weaknesses. Communication with practitioners across the world dem-

onstrated the wide disparity of culture and environments that the Griffiths Scales are

deployed in. It is not possible to design a revised scale that is appropriate for all areas

of use, so in this revision process, it was decided to design the scales as culturally fair

as possible and support practitioners in other countries to translate and validate the

scales for use.

Conclusions: The revision of the Griffiths III found test users to be valuable sources

of information on the basis of their experiences with the test and professional knowl-

edge. Creating a continuous feedback mechanism within a phased process provided

opportunities for the revision team to engage meaningfully with the data being

obtained as well as test users to advance the scope and quality of the test. Revision

teams are encouraged to consider the process and engagement methods explored in

this study during their projects.

KEYWORDS

child development, early assessment, measurement

Received: 16 March 2018 Revised: 26 June 2020 Accepted: 7 July 2020

DOI: 10.1111/cch.12796

© 2020 John Wiley & Sons Ltd. Child Care Health Dev. 2020;46:682–691. wileyonlinelibrary.com/journal/cch

1 | INTRODUCTION

According to the International Test Commission (ITC) (2013, 2015,

2013, 2017), test developers should have a guiding plan during test

development. For test revision, test developers are able to draw on

the knowledge and experience of practitioners and to develop and

revise tests that meet the needs of those who employ them in daily practice (Adams, 2000). If the test has previous versions, a body of

research evidence and feedback from registered practitioners of the

test as well as the test market in general is available to provide insight

into areas of the test that may need amendment.

Tests require revision for a number of reasons, including outdated key test components (Adams, 2000); advances in measurement theory, psychometric practice and norm development (King, 2006); and changes in test performance, such as the Flynn effect (Flynn, 1984; Trahan, Stuebing, Fletcher, & Hiscock, 2014), which may reduce the overall difficulty of tests. In addition, concern about the effects of time on the validity of interpretations of test results is evident in industry publications. The European Federation of Psychologists' Associations (EFPA) (2013a, b) labels a test as inadequate if the normative and standardization information is 20 years old or older.

Tests of child development require timely revision. Improved nutri-

tion, health care, child-rearing practices and education are cited as pos-

sible causes for the Flynn effect (Strauss, Spreen, & Hunter, 2000).

Child development is a dynamic, moving target, so a critical look at the

underpinning theories, philosophies and principles of development

ensures that these stay relevant. One challenge when revising or

updating a test is the balance between modernization and retaining the

original spirit of the measure. Another challenge is ensuring that the

test remains fit for purpose. Tests of child development are often stan-

dardized on a population of typically developing children, yet they are

used mostly to assess children whose development is thought to be

atypical in order that the test may discriminate between typical and

atypical development. The shape of the normal distribution curve

(i.e. the bell curve) provides sparse comparison data from typically

developing children at the lower end of the curve (−2 SD to −3 SD). This

means that for the 2.5% of children whose performance falls in the

lower tail of the bell curve, when compared against a norm group of

typically developing children, the specific degree of impairment cannot

be determined with any degree of confidence from the normed data for

the test floor. Including some atypically developing children in the sam-

ple in order to improve the test floor is not appropriate. Leaders Project

(2013), in their test review of the Bayley (2006), concluded that the

inclusion of children with clinical diagnoses in the main standardization

sample had not been helpful, limiting the test's ability to diagnose chil-

dren with mild impairments. It may take years after publication for

research to be conducted with a revised test on specific and non-

heterogeneous clinical populations. Research should draw on samples

where true comparison may be made, and efforts should be made to

produce research that will maximize the usability and generalizability of

findings (Oliveri, Lawless, & Young, 2015).
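The sparseness of comparison data at the test floor can be illustrated with a short calculation. The sketch below assumes an exactly normal score distribution and a hypothetical norm group of 100 children in one age band; neither assumption is taken from the Griffiths III data.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative probability of a standard normal variable at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Proportion of a normal distribution below -2 SD, and between -3 SD and -2 SD.
below_minus_2sd = normal_cdf(-2.0)
between_minus_3_and_minus_2sd = normal_cdf(-2.0) - normal_cdf(-3.0)

# Hypothetical norm group size for one age band (an assumption for illustration).
norm_group_size = 100
expected_below_minus_2sd = norm_group_size * below_minus_2sd

print(f"Below -2 SD: {below_minus_2sd:.1%}")
print(f"Between -3 SD and -2 SD: {between_minus_3_and_minus_2sd:.1%}")
print(f"Expected children below -2 SD in a group of {norm_group_size}: "
      f"{expected_below_minus_2sd:.1f}")
```

Under these assumptions only two or three children per hundred would be expected to score below −2 SD, which is why the norms offer so little resolution in that region. The exact proportion below −2 SD is about 2.3%; the 2.5% quoted above corresponds to a cut-off nearer −1.96 SD.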

Test scales are a manifestation of latent constructs and are typi-

cally used to capture a behaviour, a feeling or an action that cannot be

captured in a single variable or item (Boateng, Neilands, Frongillo,

Melgar-Quinonez, & Young, 2018, p. 148). This means that, in terms of

validity, the highest level of evidence of the coverage of a construct is

likely to be offered by experts and practitioners working in the field of

the test. However, in spite of the ITC's (2017) comment on the recipro-

cal relationship called for between test developers and test users, there

is scant evidence in the literature about how test developers can draw

on the knowledge and experience of practitioners. Adams (2000)

described the challenges that can emerge on the basis of an incomplete

understanding of each other's needs and a failure to fully appreciate

the potential contributions of the other. Butcher (2000) saw one of the

greatest values of user feedback as providing awareness of practical

issues to the test developer. Gregory (2015) noted that feedback from

examiners is a potentially valuable source of information that is nor-

mally overlooked by test developers.

In 2011, the Association for Research in Infant and Child

Development (ARICD) started the revision process of the Griffiths

Scales (Griffiths, 1970; 1986). Previous restandardizations had

included clarifications and amendments without modernization. The

two Griffiths Scales, Birth to Two Years (Huntley, 1996) and the

Griffiths Mental Development Scales—Extended Revised (GMDS-ER)

(Luiz et al., 2006) for children 2 to 8 years, had differences in test

organization (e.g. number of subscales) and in scoring. Although

there was a need for a continuous version, it was not clear what

changes would be needed to update the scales to meet current

needs, good test specifications and modern developmental research

findings.

Developmental concerns about a child can arise by a number of

different routes, and further evaluation is often required to identify

potential difficulties that may necessitate intervention or special edu-

cation services (Marlow, 2018; Sharma, 2011). Four comprehensive

standardized assessment measures with different theoretical back-

grounds were in use in 2011 (Bedford, Walton, & Ahn, 2013). The

Battelle Developmental Inventory, Second Edition (BDI-2) (Newborg,

2006) measures a child's progress sequentially along a developmental

continuum of critical skills and behaviours from simple to complex

through both global domains and discrete skill sets. The Bayley Scales

of Infant and Toddler Development, Third Edition (Bayley-III) (2006) is

eclectic. It has been developed from a variety of different scales of

infant development and infant and toddler research (Bayley, 2006)

and was formulated on the principle that it measures underlying traits

or latent factors. Confirmatory factor analysis (CFA) demonstrated

construct validity by evaluating relationships between test scores and

different underlying traits/factors. The authors concluded that the

test scores best modelled three underlying traits: motor, language and

cognitive factors (Sun et al., 2019). The Mullen Scales of Early Learn-

ing (MSEL) (1995) have a theoretical foundation based on the con-

cepts of neurodevelopment and intrasensory and intersensory

learning. The Griffiths Scales has five avenues of learning: locomotor,

personal–social, hearing and speech, eye and hand coordination, and

performance. As well as normed comparison against a standardized

population, a child's developmental profile can be produced for dis-

tinct avenues of learning.


This paper highlights six key phases of the Griffiths III revision

process and the value of having a guiding plan that includes test prac-

titioner input.

2 | METHODS

2.1 | Phase 1: Literature review, stakeholder feedback and market research

A comprehensive literature review from the start of the millennium

revealed little focused research on the assessment of children's devel-

opment. Research focused mainly on cognitive development, with

particular emphasis on memory and working memory capacity, speed

of information processing, logical reasoning and attention within this

developmental domain (Best & Miller, 2010; Fuchs et al., 2010; Haden

et al., 2011; Pellegrini, 2009). In addition, there has been an extensive

focus on the social development of children, specifically on parent–

child relationships, attachment theories and children's social judge-

ment. Other important areas are neurodevelopmental milestone

achievement (especially the visual and hearing systems); cognitive

development; working and incidental memory; attention and reason-

ing skills; behavioural manifestations of development; movement and

development; comprehensive understanding of the development of

attention; development; and multimedia literacy (Best & Miller, 2010;

Case, Demetriou, Platsidou, & Kazi, 2001; Demetriou & Kazi, 2006).

The six areas underpinning the original Griffiths Scales (Griffiths,

1954) (viz. locomotor, personal–social, language, eye and hand coordi-

nation, performance and practical reasoning) remain important. The

literature review indicated limited literature, particularly from the

United Kingdom, in relation to the need, cost, efficacy and benefits of

developmental testing including the content of such tests. Nor was

there literature on what is needed to update a developmental test

(Aylward, 2009) (Figure 1).

The GMDS-ER has been translated into Italian, French,

Portuguese, Chinese and Russian since its publication in English in

2006. Research studies since 2010 confirm the use of the Griffiths

Scales in assessing children in special populations such as HIV-

exposed or infected children (Lowick, Sawry, & Meyers, 2012; Perez

et al., 2015; Springer, Laughton, Tomlinson, Harvey, & Esser, 2012),

aboriginal infants (McDonald, Comino, Knight, & Webster, 2012;

McDonald, Webster, Knight, & Comino, 2014), measuring the effects

of various surgical procedures or noxious environments or treatments

(Battaglia et al., 2012; Ebbink, Aarsen, Van Gelder, & Van Den

Hout, 2013; Hemels et al., 2012; Laughton et al., 2012; Ostrea

et al., 2012; Rahkonen et al., 2012; Van Der Aa et al., 2013; Peroni

et al., 2014; Van Dyk, Ramanjam, Church, Koren, & Donald, 2014),

genetic groups such as Duchenne muscular dystrophy (Bargagna,

Bozza, Purpura, & Luongo, 2012; Colombo et al., 2014; Chieffo

et al., 2015; Pane et al., 2012, 2013) and multiple births (Tsekoura,

Beli, Boutopoulou, & Orfanidou, 2012).

The revised Griffiths Scales are likely to continue to be used widely internationally. Consequently, consideration needed to be given to cultural issues. These may be country specific or related

to low income economies. Leung and Barnett (2008) stated that there

is a great need for culturally sensitive and appropriate psychological

assessment where relevant issues include competence of administra-

tors, test selection, contextual relevance of item content, adaption

and translation, administration, and assessment and interpretation of

performance. It was decided to concentrate on the production of a

developmental test that focused on core, universal aspects of child

development with an initial standardization in the United Kingdom

and Republic of Ireland on children who spoke English at home but

with sufficient cross-cultural sensitivity built in. The Griffiths III can

then be adapted as necessary in different countries and contexts

where it is used. The Guidelines for Translating and Adapting Tests

(2nd ed.) of the ITC (2017) could serve as a valuable resource in the

adaptation of the Griffiths III in various countries and contexts.

FIGURE 1 The GMDS Revision—setting the landscape summarizes the six phases of the revision process. Reproduced with permission from ARICD from Stroud et al. (2016)


A qualitative, exploratory, descriptive research approach was used

to explore the opinions and attitudes of child development specialists in

order to develop a practitioner perspective on the structure and content

of the next version. Descriptive research can be an effective analysis of

nonquantified topics and issues. In this research approach, qualitative

data take the form of text, written words, phrases or symbols describing

or representing people, their experience, actions, thoughts, knowledge

and opinions. In contrast to quantitative research that relies on the use

of statistics and measurements, qualitative research is naturalistic, par-

ticipatory and interpretative (Kerlinger & Lee, 2000). An exploratory

study is relevant because it serves as an exploration of a relatively

unknown research area, that is, the revision of the Griffiths Scales.

Therefore, in order to clarify what practitioners think the Griffiths

Scales should include in the 21st century, an exploratory qualitative

descriptive approach was chosen with the following aims:

• to establish current thinking regarding child development and its

assessment;

• to consider new constructs needed for developmental assessment

in addition to existing constructs;

• to agree on the geographical area for initial standardization of the

revised version of the Griffiths Scales;

• and to establish a detailed description of the structure and content

of the new revised version of the Griffiths Scales.

There were several stages in the present research. Findings from

each sequential stage influenced the design of subsequent stages, as

shown in Figure 2.

2.1.1 | Stage 1: Avenues of learning workshop

Nineteen experienced practitioners (nine paediatricians, eight psychol-

ogists and two allied health professionals) attended a workshop. The

discussion resulted in qualitative ‘sticky note’ data in response to the

following questions:

FIGURE 2 Flow chart showing sequence of research stages in Phase 1


What are the ideal components of a developmental test for

children?

What is the new child development knowledge that is not

in current tests?

What new advances/knowledge do we not want in devel-

opmental assessment?

What areas of emotional/social development would test

users like included in developmental assessment?

2.1.2 | Stage 2: Qualitative interviews

The revision team used critical analysis of the literature review and

the sticky note workshop data to guide the identification of both

open-ended and individually specific interview questions for nine

expert practitioners, two of whom had attended an earlier workshop.

Two thirds of this expert group had not used the existing Griffiths

Scales and were chosen to provide an opportunity to supplement the

views of regular Griffiths users. The expert practitioners worked in a

range of clinical and research environments and had varied psycholog-

ical, medical and allied health professional training. Open-ended ques-

tions posed to these experts by two of the authors included the

following: why they use or do not use the Griffiths Scales and

whether they identified key constructs or areas in child development

that the Griffiths Scales or other developmental scales are not ade-

quately tapping as important aspects for developmental assessment.

Thematic analysis of interview scripts and synthesis of thematic data

from earlier phases led to the description of core guiding principles to

underlie the development of the new edition of the Griffiths Scales.

2.1.3 | Stage 3: Questionnaire

A practitioner questionnaire was developed with an open-ended

question format to test the core guiding principles of the Griffiths

Scales and to obtain data on the requirements of these practitioners.

Questionnaires were sent to 432 registered practitioners of the

Griffiths Scales in 17 countries for whom there was an available email

address. The questionnaire consisted of three sections:

Section A required biographical information (name, qualifi-

cation and address) from the respondents and a yes/no

response to a question about current use of the Griffiths

Scales.

Section B listed 20 questions for practitioners currently

using the Griffiths Scales, testing their opinions on the core

guiding principles of the Scales.

Section C listed 5 questions for Griffiths registered

practitioners not currently using the Griffiths Scales, testing

reasons for their non-use of the measure, in particular, and

their opinion on the future role of developmental testing in

general.

With the use of the script completed by the practitioners, itera-

tive analysis was performed for identifiable patterns or themes. The

qualitative data were coded separately per question for the users and

nonusers (Braun & Clarke, 2006). Similar responses were grouped

together into categories. This was done by making use of direct

quotes or interpreting common ideas (Aronson, 1994; Nowell, Norris,

White, & Moules, 2017).
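The grouping step described above, in which coded responses are tallied separately per question for users and nonusers, can be sketched as follows. The question identifiers, category labels and responses are hypothetical examples, not data from the study, and the sketch covers only the counting that follows manual coding, not the interpretive coding itself.

```python
from collections import Counter, defaultdict

# Hypothetical coded responses: (question, respondent group, assigned category).
coded_responses = [
    ("B1", "user", "developmental assessment"),
    ("B1", "user", "assessment of special groups"),
    ("B1", "user", "developmental assessment"),
    ("C1", "nonuser", "insufficient time"),
    ("C1", "nonuser", "wrong age range"),
    ("C1", "nonuser", "insufficient time"),
]


def tally_categories(responses):
    """Count categories separately for each (question, group) pair."""
    counts = defaultdict(Counter)
    for question, group, category in responses:
        counts[(question, group)][category] += 1
    return counts


counts = tally_categories(coded_responses)
for (question, group), category_counts in sorted(counts.items()):
    # most_common() lists categories from most to least frequent.
    print(question, group, category_counts.most_common())
```

Keeping the tallies keyed by question and respondent group mirrors the separate coding of users and nonusers, so that themes from one group are not diluted by the other.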

2.2 | Phase 2: Construct development and task review

A team of seven practitioners (four psychologists and three paediatricians) was established to provide both individual subscale leadership for the ‘avenues of learning’ (that is, the domains that would be the focus of the new subscales of the revised measure) and mechanisms to provide continuity across the process and maintain financial integrity.

Updated subscale definitions were agreed to inform task development.

Current items were examined according to updated theory and current

clinical practice. Fine-grained analysis of the existing items used under-

pinning constructs derived from the literature review and Phase 1. Gaps

were identified using the constructs and a construct map.

2.3 | Phase 3: Item design

Subscale leaders identified constructs relevant to their subscale,

designed a statement of purpose for that subscale, analysed existing

Griffiths Scale tasks to identify subscale suitability, identified

gaps/overlaps and developed new tasks. Critical markers were

established. Cross-subscale cohesion was checked by the full practitioner team. Percentages of achievement for each age year were calculated for every task using data from the GMDS-ER. Normally, items with a difficulty level of 20% to 80% are included in a test (Boopathiraj & Chellamani, 2013, p. 191), but a cut-off of 10% was used to ensure that an adequate ceiling was achieved at 6 years.
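The item selection rule can be sketched in a few lines. The example below assumes that item difficulty is expressed as the percentage of children in an age year who pass a task, and that the 10% cut-off replaces the conventional lower bound of 20% while the upper bound stays at 80%; the paper does not spell out either detail. The task names and pass/fail records are hypothetical, not GMDS-ER data.

```python
def pass_percentage(responses: list[int]) -> float:
    """Percentage of children passing an item (1 = pass, 0 = fail)."""
    return 100.0 * sum(responses) / len(responses)


def select_items(pass_rates: dict[str, float], lower: float, upper: float) -> list[str]:
    """Return names of items whose pass percentage lies within [lower, upper]."""
    return [name for name, pct in pass_rates.items() if lower <= pct <= upper]


# Hypothetical pass/fail records for four tasks in one age year.
records = {
    "task_a": [1] * 18 + [0] * 2,   # 90% pass
    "task_b": [1] * 10 + [0] * 10,  # 50% pass
    "task_c": [1] * 3 + [0] * 17,   # 15% pass
    "task_d": [1] * 1 + [0] * 19,   # 5% pass
}
pass_rates = {name: pass_percentage(r) for name, r in records.items()}

conventional = select_items(pass_rates, lower=20.0, upper=80.0)
relaxed = select_items(pass_rates, lower=10.0, upper=80.0)  # assumed reading of the 10% cut-off

print("Conventional 20-80% rule:", conventional)
print("With 10% lower cut-off:", relaxed)
```

In this sketch, lowering the bound retains a harder task (passed by 15% of children) that the conventional rule would drop, which is the kind of item needed to give the oldest children room at the ceiling.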

2.4 | Phase 4: Piloting and standardization

Pilot testing was arranged to check task constructs, gather statistical

information and make further refinements in order to produce a final

standardization version. The test was piloted in South Africa with a

culturally diverse team of practitioners who provided feedback that

allowed for refinement of item instructions and scoring.


For the final standardization sample, a strategy was developed to

accommodate a continuous norming solution. The key descriptive

indicators used to select and classify children were geographic loca-

tion, gender, age, urban–rural and socio-economic status by using the

Indices of Deprivation (Department for Communities and Local

Government, 2014). Targets were set for geographic locations, with

suggested month age targets for each area.

2.5 | Phase 5: Production and release of Griffiths III

The test kit was produced, the record book finalized and three

parts of the manual were drafted and finalized with publication in

July 2016.

2.6 | Phase 6: Training

E-learning modules were set up for a Griffiths III conversion course

for registered Griffiths practitioners and Griffiths III Part I for new

users. A 3-day face-to-face course was designed and refined to pro-

duce a final recommended model. Additional training was provided

(both e-learning modules and face to face) for Griffiths tutors with a

registration process for Griffiths III tutors.

3 | RESULTS

3.1 | Phase 1 results

Stage 1. Nineteen experienced practitioners produced sticky note

data as shown in Table S1. Questions that arose during the

workshop included the following: are feelings and emotions

part of a cognitive scale? should self-help items be included

in a Developmental Quotient? and do we need preterm

scales?

Stage 2. Themes emerging from the expert qualitative interviews are

shown in Table S2.

Table 1 provides details of the Core Guiding Principles identified

with supportive evidence from the literature review, from thematic

analyses of both sticky note workshop data and the qualitative interviews with the expert practitioners, and from the revision team's expert clinical knowledge.

TABLE 1 Identified core guiding principles for development of new Griffiths Scales

Identified principle | Supportive evidence
1. The core of the GMDS should remain the core: in other words, it needs to answer the clinician's question of ‘is this child developing like other children?’ | Literature review (areas are still the main areas); Thematic Analysis
2. The underlying premise of the test should remain the structured observation of children using play. | Ruth Griffiths (free behaviour of children in a semistructured way but with rigorous control of conditions); Thematic Analysis
3. The purpose of the GMDS is to measure general development. | ARICD revision team; Expert Interviews; The Market Gap; Thematic Analysis
4. The breadth of the GMDS remains important, as one test cannot measure everything. | ARICD revision team; Expert Interviews; Thematic Analysis
5. Specialists for particular contexts can adapt the test, such as in practice and for research. | Expert Interviews; Thematic Analysis
6. The GMDS must be able to identify ‘flags’ in development, which could be analysed further. | Sticky Note data; ARICD revision team; Thematic Analysis
7. The GMDS must be usable by both the practitioner and the researcher for their respective purposes. | Literature Review; Expert Interviews; Thematic Analysis
8. The main structure of the test should remain with the possibility of developing a supplementary set of GMDS scales that cover second order factors such as working memory, processing speed, attention, and socio-emotional and behavioural aspects of development. | Expert Interviews; Other Psychometric Measures; Thematic Analysis

Abbreviations: ARICD, Association for Research in Infant and Child Development; GMDS, Griffiths Mental Development Scales.


Stage 3. Completed questionnaires were received from 85 respon-

dents (a 20% response rate). All respondents were paediatri-

cians or psychologists: 52 used the Griffiths Scales regularly

(regular users), and 33 did not use the scales in their current

work (nonusers). Respondents worked in 15 different coun-

tries in Europe, Australasia, Africa and Asia.

Regular users listed the following uses of the scales: developmental assessment; assessment of special groups; diagnosis, intervention, planning and monitoring; assessment linked to the school context; and training, research,

supervision and clinical trials. There was a balance of opinion (yes = 19,

no = 16, no clear response = 15) from regular users of the Griffiths

Scales on the need to incorporate more screening elements. Nonusers

cited practical reasons such as insufficient time allocated by managers, work in an inappropriate service and the wrong age range of the scales.

Both regular users and nonusers were asked: ‘Will there be a

place in your professional work for developmental testing (as distinct

from cognitive or physical testing) in future?’ A large proportion of

both regular users of the scales (46 of the 51; 90%), and nonusers

(21 of the 29; 72%) stated that a place remained for developmental

testing in their work. The overarching themes from the questionnaire

analysis were as follows:

To revise the Griffiths Scales according to a criterion

referenced (CRT) construction process with a subsequent

normative approach.

To reduce the test ceiling from 8 to 6 years.

To retain the traditional Griffiths Scales assessment for-

mat of structured observation of children at play.

To return to the original Griffiths Scales age group

structured format.

To merge two subscales and create a new subscale of

cognitive functioning.

To incorporate memory tasks across all subscales.

3.2 | Phase 3 results

An experimental version of the new scales was constructed.

3.3 | Phase 4 results

The final normative and standardization sample comprised 426 chil-

dren from the United Kingdom and Republic of Ireland. Further details

are included in Stroud et al. (2016). Raw scores obtained on Griffiths

III were transposed to scaled scores, developmental quotients (per

subscale as well as a general developmental quotient), percentiles, sta-

nines and developmental age equivalent (per raw score).
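The relationship between these derived scores can be sketched generically. The example below is not the Griffiths III norming procedure, which used a continuous norming solution described in Stroud et al. (2016). It assumes a simple age-band mean and standard deviation (both hypothetical), a quotient metric with mean 100 and SD 15, and a scaled score metric with mean 10 and SD 3; these are common conventions rather than values stated in this paper. Developmental age equivalents are omitted because they require the published norm tables.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative probability of a standard normal variable at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def derived_scores(raw: float, band_mean: float, band_sd: float) -> dict:
    """Convert a raw score to common derived scores via a z-score.

    band_mean and band_sd are hypothetical age-band norms; the score
    metrics (10/3 and 100/15) are assumed conventions.
    """
    z = (raw - band_mean) / band_sd
    # Stanines are half-SD-wide bands centred on 5, clamped to the range 1..9.
    stanine = min(9, max(1, math.floor(2.0 * z + 5.5)))
    return {
        "z": z,
        "scaled_score": 10.0 + 3.0 * z,
        "quotient": 100.0 + 15.0 * z,
        "percentile": 100.0 * normal_cdf(z),
        "stanine": stanine,
    }


# Hypothetical child scoring 30 where the age-band mean is 40 and the SD is 5.
result = derived_scores(raw=30.0, band_mean=40.0, band_sd=5.0)
for name, value in result.items():
    print(f"{name}: {value:.1f}")
```

Expressing the same performance on several metrics lets practitioners choose the form best suited to a report, while every metric in this sketch carries the same information as the underlying z-score.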

4 | DISCUSSION

Phase 1 of this revision was designed so that the test users could

have the opportunity to have a reciprocal relationship with the test

development team, as recommended by the ITC (2017). By fostering

this relationship, test developers are able to draw on the knowledge

and experience of users and to develop and revise tests that meet

the needs of those who employ them in daily practice (Adams, 2000). In

later phases, the test development team members were both practi-

tioners and previous tutors of the Griffiths Scales. The results from

the Avenues of Learning practitioner workshop and the qualitative

interviews with the expert practitioners provided a wide range of

possible future inclusions for the Griffiths III. Some of these data

assisted in the clarification of the necessary core constructs, as well

as new areas for item development. Practitioners valued an overall

developmental measure with discrete data about and within the

‘avenues of learning’, allowing them to analyse a child's strengths

and weaknesses.

A number of broad, fundamental questions were examined by

the practitioners in the research. Do the Griffiths Scales tap into

the ‘right’ child development areas? How ‘big’ (out of the box) should

the thinking be in restandardizing the Scales? The inclusion of prac-

titioners in all parts of a scale revision is unusual and is likely to

add to the test's validity. Practitioners clarified the need for for-

mal assessment of social and emotional development and also a

reduction as much as possible of scoring relying on behaviour

which was reported rather than observed. The revision should set

the Griffiths Scales apart from other developmental assessment

measures and retain its unique quality valued by practitioners as

they have been involved throughout the revision process. Practi-

tioners offered valuable input on the sensitivity and specificity to

identify where development deviates from the norm. It is impor-

tant to recognize that once developmental tasks have been identi-

fied and established, and once sensitive specificity has been built

into the Griffiths Scales, a balance between these two variables

has to be achieved. This ensures that the developmental nature

of the Griffiths Scales is retained. The Griffiths Scales is in

essence a ‘child-friendly’ developmental measure, based on the

skill and value of observing children, and is playful in nature. It is

these attributes that are likely to have rendered it the choice of

developmental assessment for children, especially those with clini-

cal diagnoses who are threatened by a formal, rigid testing situa-

tion (Ebbink et al., 2013).

Communication with practitioners internationally demonstrated

the wide disparity of culture and environments that the Griffiths

Scales are deployed in. It was not possible to design a revised scale

appropriate for all areas of use, so it was decided to design the scales

as culturally fair as possible and support practitioners in other coun-

tries to translate and validate the scales for use. As well as continuing

statistical work such as test/retest stability and comparisons with

other scales of development and early child cognition, follow-up

feedback on the Griffiths III by practitioners is necessary to analyse

the extent to which the Core Guiding Principles delineated in Phase


One have been upheld in the design and production of the

Griffiths III.

The revision of the Griffiths III demonstrated the usefulness of

having six guiding phases and the repeated involvement of practi-

tioners. These phases provided opportunities for meaningful engage-

ment to advance the scope and clinical quality of the test. Revision

teams are encouraged to consider the engagement methods explored

in this study during their projects, as there is little evidence in the lit-

erature of practitioner involvement in this way.

Of the four comprehensive standardized assessment measures

in use in 2011 (Bedford et al., 2013), the BDI-2, the MSEL and the Bayley-III all have single authorship cited, and details of the development team are not readily available. The Griffiths Scales, however,

have a history of multiple authorship with the copyright owned by a

charitable learned society (ARICD) and revisions in conjunction with,

rather than by, a publishing company. The members of the core development team were all experienced practitioners and trainers of the scales. The

usual practice in test revision is to have a mixed skill technical group

managing the revision with input from practitioners taken at an

earlier stage.

4.1 | Strengths and limitations

A major strength of the phased development process of the

Griffiths III was the time spent in Phase 1 to clarify what practitioners

thought the Griffiths Scales should include in the 21st century. The

exploratory qualitative descriptive approach provided effective

analysis of the nonquantified opinions and attitudes of child

development specialists and was an excellent basis for the subsequent

phases. The questionnaire sent to current users and those no longer

using the scales guided the decisions that the core

development team made in subsequent phases, including the retention of

five separate subscales, which led into both clinical and education

programme planning.

Five limitations of this approach are noted: weaknesses in the design

of the questionnaire (e.g. two of the questions were not well understood

by most of the participants and were left unanswered); the questionnaire

not allowing for clarification when participants made comments that were

vague or unclear; the lack of a facility to attract input from

practitioners who are already busy in their professional fields; the

questionnaire sample size, which may have affected generalizability; and

the limited availability of time from expert technical support. In

addition, pilot testing in other countries may have added further information.

4.2 | Summary

The design of the Griffiths III has been guided by the six phases and

involved practitioners internationally throughout the revision process.

The authors are confident that it will continue to retain its unique fea-

tures while also proving fit for the 21st century.

Key messages

• Practitioners' views were incorporated into the design of

the Griffiths III to increase applicability, usability and

validity.

• The child-friendly developmental measure remains

focused on the structured observation of children.

• Modern knowledge of child development and its assess-

ment has been incorporated.

• Sensitive specificity to pick up deviation from the norm is

essential.

• Six guiding phases provide a useful plan for test revision

projects.

CONFLICT OF INTERESTS

None.

ACKNOWLEDGEMENTS

We thank the Association for Research in Infant and Child Develop-

ment (charity no. 1161043; which owns the copyright of the Griffiths

Scales) for the free use of expert time in all phases of this research

and for funding the expenses of the study. The external experts in

Phases 1 and 2 played a vital role in this research. We also thank Professor

Mark Watson for his help with this draft and Professor Cheryl

Foxcroft for her major role in the revision for Griffiths III.

ORCID

Elizabeth M. Green https://orcid.org/0000-0003-2326-073X

Louise Stroud https://orcid.org/0000-0002-3627-4121

Johan Cronje https://orcid.org/0000-0003-0662-7384

REFERENCES

Adams, K. M. (2000). Practical and ethical issues pertaining to test

revisions. Psychological Assessment, 12, 281–286. https://doi.org/10.1037/1040-3590.12.3.281

Aronson, J. (1994). A pragmatic view of thematic analysis. The Qualitative

Report, 2(1), 1–3. Retrieved from http://www.nova.edu/ssss/QR/BackIssues/QR2-1/aronso.html

Aylward, G. P. (2009). Developmental screening and assessment: What are

we thinking? Journal of Developmental and Behavioral Pediatrics, 30,

169–173. https://doi.org/10.1097/DBP.0b013e31819f1c3e

Bargagna, S., Bozza, M., Purpura, G., & Luongo, T. (2012). Effect of early

intervention in Down syndrome: A pilot study in young infants.

Archives of Disease in Childhood, 97(Suppl. 2), A490–A491. https://doi.org/10.1136/archdischild-2012-302724.1736

Battaglia, D., Massimi, L., Brogna, C., Losito, E., Pravatà, E., Capone, F., … Rocco, C. (2012). Hemispherotomy in patients with epileptic encepha-

lopathy with continuous spike-waves during sleep and prenatal post-

haemorrhagic hydrocephalus: A pre-and postsurgical neuro-functional

study. Epileptic Disorders, 14, 205–206.

Bayley, N. (2006). Bayley scales of infant and toddler development (3rd ed.,

Administrative manual). San Antonio, TX: Pearson.

Bedford, H., Walton, S., & Ahn, J. (2013). Measures of child development: A

review. London: UCL Institute of Child Health.

Best, J. R., & Miller, P. H. (2010). A developmental perspective on

executive function. Child Development, 81, 1641–1660. https://doi.org/10.1111/j.1467-8624.2010.01499.x

Boateng, G. O., Neilands, T. B., Frongillo, E. A., Melgar-Quinonez, H. R., &

Young, S. L. (2018). Best practices for developing and validating scales

for health, social and behavioural research: A primer. Frontiers in Public

Health, 6(149), 1–18. https://doi.org/10.3389/fpubh.2018.00149

Boopathiraj, C., & Chellamani, K. (2013). Analysis of test items on difficulty

level and discrimination index in the test for research in education.

International Journal of Social Science & Interdisciplinary Research, 2(2),

189–193.

Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology.

Qualitative Research in Psychology, 3, 77–101. https://doi.org/10.1191/1478088706qp063oa

Butcher, J. N. (2000). Revising psychological tests: Lessons learned from

the revisions of the MMPI. Psychological Assessment, 12, 263–271. https://doi.org/10.1037/1040-3590.12.3.263

Case, R., Demetriou, A., Platsidou, M., & Kazi, S. (2001). Integrating con-

cepts and tests of intelligence from the differential and developmental

traditions. Intelligence, 29, 307–336. https://doi.org/10.1016/S0160-2896(00)00057-X

Chieffo, D., Brogna, C., Berardinelli, A., D'Angelo, G., Mallardi, M.,

D'Amico, A., … Pane, M. (2015). Early neurodevelopmental findings predict school age cognitive abilities in Duchenne muscular dystrophy:

A longitudinal study. PLoS ONE, 10(8), 1–7. https://doi.org/10.1371/journal.pone.0133214

Colombo, P., Civati, F., Mani, E., Gandossini, S., Brighina, E., Comi, G. P.,

Bresolin, N., Turconi, A. C., Molteni, M., Nobile, M., D'Angelo, M. G.,

Medea, E. & Bosisio, P. (2014). Behavioral and neurocognitive profile

in Duchenne Muscular Dystrophy. Neuromuscular Disorders, 24(9-10),

858–858.

Demetriou, A., & Kazi, S. (2006). Self-awareness in g (with processing

efficiency and reasoning). Intelligence, 34, 297–317. https://doi.org/10.1016/j.intell.2005.10.002

Department for Communities and Local Government. (2014). English indi-

ces of deprivation. London: Department of Communities and Local

Government.

Ebbink, B., Aarsen, F., Van Gelder, C., & Van Den Hout, J. (2013). Cognitive

outcome of classic infantile pompe patients receiving enzyme therapy.

BMC Musculoskeletal Disorders, 14(suppl. 2), 14. https://doi.org/10.1186/1471-2474-14-S2-P14

European Federation of Psychologists' Associations (EFPA). (2013a).

EFPA review model for the description and evaluation of

psychological and educational tests: Test review form and notes

for reviewers (Version 4.2.6). Retrieved from http://www.efpa.eu/download/650d0d4ecd407a51139ca44ee704fda4

European Federation of Psychologists' Associations (EFPA). (2013b).

Performance requirements, context definitions and knowledge &

skill specifications for the three EFPA levels of qualifications in

psychological assessment. Retrieved from http://www.efpa.eu/download/1b272a998e297c248413fbb761134697

Flynn, J. R. (1984). The mean IQ of Americans: Massive gains 1932–1978. Psychological Bulletin, 95, 29–51. https://doi.org/10.1037/0033-2909.95.1.29

Fuchs, L., Geary, D., Compton, D., Fuchs, D., Hamlett, C., & Bryant, J.

(2010). The contributions of numerosity and domain-general abilities

to school readiness. Child Development, 81, 1520–1533. https://doi.org/10.1111/j.1467-8624.2010.01489.x

Gregory, R. J. (2015). Psychological testing: History, principles, and applica-

tions (7th ed.). Harlow, UK: Pearson.

Griffiths, R. (1986). The abilities of babies. A study in mental measurement.

High Wycombe, Bucks: The Test Agency Ltd.

Griffiths, R. (1970). The abilities of young children. London, England: Child

Development Research Centre.

Griffiths, R. (1954). The Abilities of Babies. New York, NY: McGraw-Hill.

Haden, C., Ornstein, P., O'Brien, B., Elischberger, H., Tyler, C., &

Burchinal, M. (2011). The development of children's early memory

skills. Journal of Experimental Child Psychology, 108, 44–60. https://doi.org/10.1016/j.jecp.2010.06.007

Hemels, M., Nijman, J., Leemans, A., van Kooij, B., van den Hoogen, A.,

Benders, M., … Groenendaal, F. (2012). Cerebral white matter and neurodevelopment of preterm infants after coagulase-negative staph-

ylococcal sepsis. Archives of Disease in Childhood, 97(suppl. 2),

678–684. https://doi.org/10.1097/PCC.0b013e3182455778

Huntley, M. (1996). Griffiths Mental Development Scales from birth to

2 years – Manual. Oxford, UK: ARICD. https://doi.org/10.1037/t03301-000

International Test Commission. (2015). Guidelines for practitioner use of

test revisions, obsolete tests, and test disposal. Retrieved from

https://www.intestcom.org/files/guideline_test_disposal.pdf

International Test Commission. (2017). The ITC Guidelines for Translating

and Adapting tests (2nd ed.). Retrieved from https://www.InTest.org

Kerlinger, F. N., & Lee, H. B. (2000). Foundations of Behavioral Research.

Orlando, FL: Harcourt College Publishers.

King, M. C. (2006). Adopting revised versions of psychological tests. The

CAP Monitor, 23, 6–7.

Laughton, B., Cornell, M., Grove, D., Kidd, M., Springer, P., Dobbels, E., …

Cotton, M. (2012). Early antiretroviral therapy improves neu-

rodevelopmental outcomes in infants. AIDS (London, England), 26,

1685–1690. https://doi.org/10.1097/QAD.0b013e328355d0ce

Leaders Project. (2013). Test Review: Bayley III. New York, NY: Teachers

College, Columbia University.

Leung, C. V. V., & Barnett, J. E. (2008). Multicultural assessment and

ethical practice. The Colorado Psychologist. Retrieved from http://www.geocities.ws/dr_charlton/MulticulturalAssessmentandEthicalPractice.pdf

Lowick, S., Sawry, S., & Meyers, T. (2012). Neurodevelopmental delay

among HIV-infected preschool children receiving antiretroviral therapy

and healthy preschool children in Soweto, South Africa. Psychology,

Health & Medicine, 17, 599–610. https://doi.org/10.1080/13548506.2011.648201

Luiz, D. M., Faragher, B., Barnard, A., Knoesen, N., Kotras, N.,

Burns, L. E., & Challis, D. (2006). GMDS-ER: Griffiths mental development

scales – Extended revised analysis manual. Oxford, UK: Hogrefe – The Test Agency Ltd.

Marlow, N. (2018). Outcomes of preterm birth and evidence synthesis.

Developmental Medicine and Child Neurology, 60, 330–330. https://doi.org/10.1111/dmcn.13672

McDonald, J., Comino, E., Knight, J., & Webster, V. (2012). Developmental

progress in urban Aboriginal infants: A cohort study. Journal of

Paediatrics and Child Health, 48, 114–121. https://doi.org/10.1111/j.1440-1754.2011.02067.x

McDonald, J., Webster, V., Knight, J., & Comino, E. (2014). The Gudaga

study: Development in 3-year-old urban Aboriginal children. Journal of

Paediatrics and Child Health, 50, 100–106. https://doi.org/10.1111/jpc.12476

Mullen, E. M. (1995). Mullen Scales of Early Learning. San Antonio, TX:

Pearson.

Newborg, J. (2006). Battelle Developmental Inventory, Second Edition (BDI-2).

Boston: Riverside Publishing.

Nowell, L. S., Norris, J. M., White, D. E., & Moules, N. J. (2017). Thematic

analysis: Striving to meet the trustworthiness criteria. International

Journal of Qualitative Methods, 16, 1–13. https://doi.org/10.1177/1609406917733847

Oliveri, M. E., Lawless, R., & Young, J. W. (2015). A validity framework for

the use and development of exported assessments. Princeton, NJ: Educa-

tional Testing Service.

Ostrea, E., Reyes, A., Villanueva-Uy, E., Pacifico, R., Benitez, B., Ramos, E.,

… Ager, J. (2012). Fetal exposure to propoxur and abnormal child neurodevelopment at 2 years of age. Neurotoxicology, 33, 669–675. https://doi.org/10.1016/j.neuro.2011.11.006

Pane, M., Berardinelli, A., D'Angelo, G., Ricotti, V., Baranello, G.,

Morandi, L., … Mercuri, E. (2012). Early neurodevelopmental findings in young children with Duchenne muscular dystrophy. Neuromuscular

Disorders, 22, 886–886. https://doi.org/10.1016/j.nmd.2012.06.274

Pane, M., Scalise, R., Berardinelli, A., D'Angelo, G., Ricotti, V., Alfieri, P., …

Mercuri, E. (2013). Early neurodevelopmental assessment in Duchenne

muscular dystrophy. Neuromuscular Disorders, 23, 451–455. https://doi.org/10.1016/j.nmd.2013.02.012

Pellegrini, A. (2009). Research and policy on children's play. Child

Development Perspectives, 3, 131–136. https://doi.org/10.1111/j.1750-8606.2009.00092.x

Perez, E., Carrara, H., Bourne, L., Berg, A., Swanevelder, S., &

Hendricks, M. (2015). Massage therapy improves the development of

HIV-exposed infants living in a low socio-economic, peri-urban com-

munity of South Africa. Infant Behavior & Development, 38, 135–146. https://doi.org/10.1016/j.infbeh.2014.12.011

Peroni, E., Vigone, M. C., Mora, S., Bassi, L. A., Pozzi, C., Passoni, A., &

Weber, G. (2014). Congenital hypothyroid treatment in infants: A

comparative study between liquid and tablet formulations of

levothyroxine. Hormone Research in Paediatrics, 81(1), 50–54.

Rahkonen, P., Heinonen, K., Lano, A., Räikkönen, K., Metsäranta, M.,

Andersson, S., & Pesonen, A. (2012). Mother–child interaction is associated with developmental outcome in extremely low gestational age

children. Archives of Disease in Childhood, 97(suppl. 2), A361–A361. https://doi.org/10.1136/archdischild-2012-302724.1264

Sharma, A. (2011). Developmental examination: Birth to 5 years. Archives

of Disease in Childhood. Education and Practice Edition, 96, 162–175. https://doi.org/10.1136/adc.2009.175901

Springer, P., Laughton, B., Tomlinson, M., Harvey, J., & Esser, M. (2012).

Neurodevelopmental status of HIV-exposed but uninfected children:

A pilot study. South African Journal of Child Health, 6(2), 51–55.

Strauss, E., Spreen, O., & Hunter, M. (2000). Implications of test revisions

for research. Psychological Assessment, 12, 237–244. https://doi.org/10.1037/1040-3590.12.3.237

Stroud, L., Foxcroft, C., Green, E., Bloomfield, S., Cronje, J., Hurter, K., … Venter, D. (2016). Griffiths scales of child development 3rd ed. Part I:

Overview, development and psychometric properties. Oxford, UK:

Hogrefe.

Sun, L., Sabanthan, S., Thanh, P. N., Kim, A., Doa, T. T. M., Thwaites, C. L.,

… Wills, B. (2019). Bayley III in Vietnamese children: Lessons for cross-cultural comparisons. Wellcome Open Research, 4, 1–13. https://doi.org/10.12688/wellcomeopenres.15282.1

Trahan, L. H., Stuebing, K. K., Fletcher, J. M., & Hiscock, M. (2014). The

Flynn effect: A meta-analysis. Psychological Bulletin, 140, 1332–1360. https://doi.org/10.1037/a0037173

Tsekoura, E., Beli, A., Boutopoulou, B., & Orfanidou, I. (2012).

Neurodevelopmental outcome of triplets after in vitro fertilization

or natural conception. Archives of Disease in Childhood, 97(suppl. 2),

A356–A356. https://doi.org/10.1136/archdischild-2012-302724.1246

Van Der Aa, N., Van Buuren, L., Dekker, H., Vermeulen, R., Van

Nieuwenhuizen, O., Van Schooneveld, M., & De Vries, L. (2013). Cog-

nitive outcome in childhood following unilateral perinatal brain injury.

European Journal of Paediatric Neurology, 17(suppl. 1), S25–S25. https://doi.org/10.1016/S1090-3798(13)70082-5

Van Dyk, J., Ramanjam, V., Church, P., Koren, G., & Donald, K. (2014).

Maternal methamphetamine use in pregnancy and long-term

neurodevelopmental and behavioral deficits in children. Journal of

Population Therapeutics and Clinical Pharmacology, 21, e185–e196.

SUPPORTING INFORMATION

Additional supporting information may be found online in the

Supporting Information section at the end of this article.

How to cite this article: Green EM, Stroud L, Marx C, Cronje J.

Child development assessment: Practitioner input in the

revision for Griffiths III. Child Care Health Dev. 2020;46:

682–691. https://doi.org/10.1111/cch.12796
