Evaluation Research Report

Assignment Cover Sheet

Program: BSSAS - BSocSc Applied Sociology

Course Code: SS3422

Course Title: Programme Evaluation

(Student Number) (51588988)

Lecturer: CHAN Wing Tai JP

Submission Date: 30 April, 2010

A Critical Review on “An Evaluation Report on Third Path Man Services Batterer Intervention Program”

Logic Model

An Evaluation Report on Third Path Man Services Batterers Intervention Programme

Inputs
Staff and social workers: from Harmony House
Funding: from Hong Kong Jockey Club
Time: 14 months
Partner: The Hong Kong Polytechnic University, for the evaluation section

Activities
Individual meeting: a social worker met with male batterers to discuss their problems.
Counseling service: according to each batterer’s particular need.
Group meeting: participants were invited to join voluntarily.
Contents of the individual and group meetings include:
1. Knowledge of domestic violence
2. Beliefs and myths contributing to domestic violence
3. Awareness of gender beliefs and stereotypes
4. Anger management skills
5. Effective communication skills
6. Problem-solving skills

Outputs
49 male batterers joined the programme.
28 of them stayed for less than 3 months in the programme.
21 stayed in the programme for more than 3 months.
5 participants of the remaining 21 stayed for 6 months.
2 of the remaining 21 stayed for more than 10 months.
Average duration of intervention for the 21 participants was 4.8 months.

Outcomes (Immediate)
No one showed regression at the stage of measurement.
19 of the 21 remaining participants showed progression.
14 among the 19 participants who showed progression reached the highest stage of URICA-DV*.
Average number of stages the 21 participants progressed was 2.72.

*URICA-DV: University of Rhode Island Change Assessment – Domestic Violence, which was developed to assess the readiness of batterers to end their violent behaviors.

Chan, Y. C. (2007). An evaluation report on third path man services batterers intervention program. Hong Kong: Harmony House Limited.

A critique on “An Evaluation Report on Third Path Man Services Batterers Intervention Program”

Abstract

The evaluation reported in the Evaluation Report on Third Path Man Services Batterers Intervention Programme was conducted from 2004 to 2005 and published in 2007. Several problems with the evaluation report were identified: ambiguous objectives and intended use of the evaluation; a lack of indicators or standards for the participants’ level of success; and a high dropout rate. A needs assessment, process evaluation, factorial evaluation design, survey and in-depth interviews are suggested to improve the programme. The reliability of the evaluation report can be strengthened by reporting the alpha value or test-retest result, and threats to its internal validity can be addressed by having the participants keep a logbook or journal.

Introduction

Programme details

The Batterers Intervention Programme aimed to stop male batterers’ violent behaviors towards their partners. The programme was launched in 2000 as a three-year pilot project running from 2000 to 2003. An evaluation of the pilot project was conducted in 2004, and its results showed that the pilot was successful and useful. The project was therefore extended for another two years, from 2004 to 2005. The data in this report come mainly from 2004 to 2005.

Male batterers who came to the service first met with a social worker individually to discuss their problems. Counseling was then provided according to each batterer’s needs. Lastly, participants were invited to join group meetings voluntarily. The contents of the individual and group meetings included (1) knowledge of domestic violence, (2) beliefs and myths contributing to domestic violence, (3) awareness of gender beliefs and stereotypes, (4) anger management skills, (5) effective communication skills and (6) problem-solving skills.

Evaluation Background

Dr. Chan Yuk Chung, an assistant professor in the Department of Applied Social Sciences, The Hong Kong Polytechnic University, conducted the evaluation of “The Third Path - Man Services – Batterers Intervention Programme”. The evaluation aimed to assess the changes made by the participants before, during and at the end of the programme. All data collected and evaluated were from November 2004 to December 2005.

Evaluation target

Male batterers who joined the Batterer Intervention Program between November 2004 and December 2005 could take part in the evaluation voluntarily. No decision makers, social workers or staff involved in the intervention were interviewed or evaluated.

Measuring instrument

The Transtheoretical Model of Change provided the framework for the evaluation. Because the model captures the direction of change in service users, it gives a clear picture of the programme’s effect. Based on this model, the 20-item University of Rhode Island Change Assessment – Domestic Violence (URICA-DV), developed by Levesque, Gelles & Velicer, was used. It distinguishes seven stages of progression: (1) reluctant, (2) immotive, (3) non-reflective action, (4) unprepared action, (5) pre-participation, (6) decision-making and (7) participation. Stages (1) to (3) are grouped as the earlier stage and stages (5) to (7) as the advanced stage.

The baseline was set by two raters, who rated every participant’s score before the treatment started. The inter-rater reliability was 80%.
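The report does not say how the 80% figure was computed. As a minimal sketch, assuming it is a simple percent agreement between the two raters’ stage ratings, the calculation could look like the following. The ratings are made-up illustrative values, not data from the report, and the function names are hypothetical. Cohen's kappa is shown alongside because it corrects raw agreement for the agreement expected by chance.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Share of participants to whom both raters gave the same stage."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Observed agreement corrected for the agreement expected by chance."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions per stage.
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical baseline stages (1-7) for five participants (assumed values).
rater_a = [1, 1, 2, 2, 3]
rater_b = [1, 1, 2, 3, 3]
print(percent_agreement(rater_a, rater_b))
print(round(cohens_kappa(rater_a, rater_b), 3))
```

With these assumed ratings the two raters agree on four of five participants, so the raw agreement is 80% while the chance-corrected value is lower.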

Data collection

Before joining the treatment, participants had to complete the 20-item Chinese version of the URICA-DV. At the end of the 3rd and 6th months, participants completed the same questionnaire for the second and last assessments. 49 participants completed the URICA-DV voluntarily. 28 of them completed the scale once before leaving the programme. The 21 who stayed for more than 3 months completed the scale at least once.

Output

49 male batterers joined the programme. 28 of them stayed in the programme for less than 3 months and the remaining 21 stayed for more than 3 months. Of these 21 participants, 5 stayed for 6 months and 2 stayed for more than 10 months. The average duration of intervention for the 21 participants was 4.8 months.

Outcomes

No one showed regression in the measured stage of change. 19 out of 21 participants showed progression. 14 of them even reached the highest stage of the URICA-DV, which indicates that participants are actively working on ending their violent actions. The average number of stages by which the 21 participants progressed was 2.72.

(Logic Model to be inserted here)

Critique

Objective of the evaluation report

This was an outcome evaluation. It focused on the extent to which the outcomes had been achieved and reported them in detail. It also emphasized the effectiveness of the programme for its service users. Although the evaluator did not state any reasons for conducting the evaluation, there seem to be two latent reasons: to fulfill an accreditation requirement and to account for funds.

This Batterers Intervention Programme had been carried out from 2000 to 2004. As it came to its second phase (from 2004 to 2005), a demonstration of its effectiveness could serve as an indicator for decision makers and sponsors to continue offering support and funds to the programme. The original objective of the evaluation report was to show the improvement the intervention programme brought to its service users. Yet when the latent reasons outweigh the main objective, the evaluation report becomes the means of determining the survival of the programme. The effectiveness and usefulness of the evaluation then become doubtful, and its findings may be biased.

Stufflebeam advocated that the purpose of evaluation is not to prove the effectiveness of a programme, but to improve it (陳永泰, 1991, p.44). This evaluation focused too much on proving the effectiveness of the programme for its service users and ignored other problems that needed improvement. Although the evaluation demonstrated the changes in the service users after joining the programme, it did not collect comments or opinions on the contents of the programme from the decision makers, staff, social workers or even the service users. The evaluation thus only proved the effectiveness of the programme and failed to identify any problems with its content or procedures.

Also, the intended use of the evaluation is not clearly defined. The evaluation seems to have been conducted according to the needs of interested groups such as the sponsors, not the main service users: it showed the effectiveness of the programme so that the programme could gain support or funding again and be extended in the future. This could make the evaluation inadequate, because it would favor information that reflects well on the programme and hide its weaknesses. The objectives of the evaluation and the main users of the evaluation report should be clearly defined.

Measuring instrument

The reliability of the Chinese version of the URICA-DV was not reported. Without it, the internal consistency of the scale is unknown and it is hard to determine whether the items in the scale are reliable. The alpha value or the test-retest value should be included.
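The “alpha value” is taken here to mean Cronbach’s alpha, the usual index of internal consistency; that reading is an assumption, since the report itself gives no reliability statistic. A minimal sketch of how it could be computed from item-level responses follows. The function name and the sample responses are hypothetical and are not data from the evaluation.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows, one row of item scores per respondent."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical answers of four respondents to three 5-point items (assumed values).
sample = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
]
print(round(cronbach_alpha(sample), 3))
```

For the 20-item URICA-DV each row would hold 20 scores. A test-retest value would instead correlate the same participants’ scores from two administrations.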

The baseline set by two raters was important and useful for comparing the participants’ results at later stages. However, no standards or indicators were set for the level of success. For a better evaluation, there should be at least one indicator stating what should be achieved by when. Applying SMART goals helps determine whether the participants’ progress is above or below standard. The SMART principle states that any goal or objective should be specific, measurable, achievable, relevant and time-bound. Without it, the level of success is as ambiguous as the objective itself: participants would show progression and no regression. Because neither the time frame nor the expected extent of progression is stated, the evaluator is free to give flexible explanations of the evaluation outcomes, but the evaluation fails to give a deeper understanding of the participants’ progress.

Data collection

This is a quasi-experimental, Single Group Time Series Design, in which data were collected at different points (before, during and at the end of the programme) so that the progress of the participants could be observed and compared easily. However, with reference to the MAXMINCON principle, this evaluation design failed to minimize the error variance and to control extraneous variance. Without a control group, the power to detect the real impact of the programme on its service users is weak, and extraneous variables, such as the religious background and personality of the participants, cannot be ruled out. Individual differences or third variables, rather than the programme, may have changed the participants’ attitudes or behaviors. It is therefore difficult to determine whether the participants’ improvement is really due to the programme. The small sample size and high mortality rate (57%) also make the conclusion less convincing and sound.
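The percentages quoted in this critique can be reproduced from the counts given in the Output and Outcomes sections. The short sketch below does the arithmetic; the variable and function names are illustrative only.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal place."""
    return round(100 * part / whole, 1)

joined = 49                # male batterers who joined the programme
left_within_3_months = 28  # stayed for less than 3 months
stayed = joined - left_within_3_months  # participants who stayed longer
progressed = 19            # of those who stayed, showed progression
reached_highest = 14       # of those who stayed, reached the highest stage

mortality_rate = pct(left_within_3_months, joined)
progression_rate = pct(progressed, stayed)
highest_stage_rate = pct(reached_highest, stayed)

print(stayed, mortality_rate, progression_rate, highest_stage_rate)
```

The mortality (dropout) rate is computed on all 49 entrants, whereas the progression figures are computed on the participants who remained.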

Besides, there are other threats to its internal validity, such as repeated testing, history, maturation and regression to the mean.

Some participants completed the scale three or even four times in this evaluation. It is possible that, in the later assessments, participants gave the answers they thought were expected after becoming familiar with the test or aware of the purpose of the evaluation. Biased answers may therefore have been obtained.

History refers to the possibility that the changes in the participants were caused by other concurrent events rather than by the intervention itself. Since the participants were not isolated in a testing environment, other events, such as encouragement from family members or recently joining a religious group, could have caused the participants’ improvement instead of the programme. An effective use of a logbook or journal, such as asking the participants to write down once a week how their relationship with their partners is going and to record any special events between them, can reduce this threat.

In addition, the participants’ changes in behavior or attitude may have been caused by their normal development, for example because they had learned their lesson and were determined to change. This is maturation, which would make the results in the evaluation report unrelated to the effectiveness of the programme.

Last but not least, regression to the mean affects the significance of the evaluation. Since all 49 participants had a record of battering before joining the programme, it is not surprising to see them change. The intervention may thus play only one part in improving the participants’ attitudes and behaviors, and giving all the credit to the programme makes the evaluation doubtful. Random assignment of the participants to an experimental group and a control group can help rule out the effects of maturation and regression to the mean. However, the small sample size may make this difficult to carry out.

Outputs and Outcomes

Among the 21 participants who stayed for more than 3 months, 5 stayed for more than 6 months and 2 stayed for more than 10 months. The evaluation offers no explanation for this. Since no standard was set, the appropriate treatment period was not defined. Thus, it cannot be concluded whether a longer stay in the programme is better or worse.

The high dropout rate (57%) remains the most critical problem in this evaluation. It may indicate a weakness of the programme and imply a need for amendment. Most of the participants left the programme after reaching the highest stage, “participation”. It is logical that participants leave the programme when they perceive themselves as “recovered” or “cured”. These participants progressed faster than others and could complete the programme within 3 months. On the other hand, 5 participants stayed in the programme for more than 6 months. The great variation among the participants makes it hard to look for patterns or characteristics.

The usefulness of the contents of the individual and group meetings is also questionable. The topics included (1) knowledge of domestic violence, (2) beliefs and myths contributing to domestic violence, (3) awareness of gender beliefs and stereotypes, (4) anger management skills, (5) effective communication skills and (6) problem-solving skills. However, the evaluation report did not specify the sequence in which the topics were covered, so it is possible that different participants covered different topics at different phases of the treatment. If, on the other hand, the topics were covered in a systematic and logical sequence, then a process evaluation may be needed to identify the advantages and disadvantages of the current treatment process and to make amendments. Some topics may have been quite effective for certain groups of people, who therefore completed the treatment earlier, while other topics may have been less effective, so that some participants had to stay for longer treatment. The 49 cases would then become very useful in determining the effectiveness of particular topics.

Moreover, this reveals another weakness of the evaluation: the lack of a needs assessment. Without a proper assessment of the service users’ needs, their deep-rooted problems or obstacles are not known, and the planning of the programme may be too disorganized to be effective. A needs assessment may even help reduce the high dropout rate and set an achievable standard for the participants’ completion time.

If resources are available and the sample size is large enough, a factorial design can determine which kind of programme suits which kind of participants, as well as the best sequence of treatment:

Group      Sub-group    Time
Group 1    Group 1a     O1    T1 + T2    O2    O3
           Group 1b     O1    T2 + T3    O2    O3
           Group 1c     O1    T1 + T3    O2    O3
Group 2    Group 2a     O1    T1 + T2    O2    O3
           Group 2b     O1    T2 + T3    O2    O3
           Group 2c     O1    T1 + T3    O2    O3

Participants are assigned to two different groups according to their stage before joining the evaluation. Six of the stages are already divided into two groups: stages (1) to (3) form the earlier stage and stages (5) to (7) form the advanced stage. The participants in each group can be further divided into sub-groups, either randomly or according to their actual stage. Different combinations of the treatment can then be applied to different sub-groups. A pretest, posttest and maintenance test can be used to assess the progress of the participants. In this way, the contents and sequence of the topics covered can be assessed, and the pattern of change of the participants in different groups can be observed. A more flexible, useful and suitable treatment combination can then be planned in the future according to the needs and stage of the participants.
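The assignment step of the proposed design can be sketched as follows. This is only an illustration under stated assumptions: participants are grouped by baseline stage, shuffled with a fixed seed and dealt in turn to sub-groups a, b and c; participants at stage (4), which the grouping above does not place, are simply left unassigned here. The function names and participant identifiers are hypothetical.

```python
import random

# Treatment combination for each sub-group, copied from the design table.
TREATMENTS = {"a": ("T1", "T2"), "b": ("T2", "T3"), "c": ("T1", "T3")}

def stage_group(stage):
    """Map a URICA-DV stage (1-7) to Group 1 (earlier) or Group 2 (advanced)."""
    if 1 <= stage <= 3:
        return 1
    if 5 <= stage <= 7:
        return 2
    return None  # assumption: stage 4 is not placed in either group

def assign(participants, seed=0):
    """Randomly assign participants (id -> baseline stage) to sub-groups.

    Returns a dict of id -> (sub-group label, treatment combination).
    """
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    allocation = {}
    for group in (1, 2):
        ids = sorted(p for p, s in participants.items() if stage_group(s) == group)
        rng.shuffle(ids)
        # Deal shuffled participants to a, b, c in turn to keep sizes balanced.
        for i, pid in enumerate(ids):
            letter = "abc"[i % 3]
            allocation[pid] = (f"{group}{letter}", TREATMENTS[letter])
    return allocation

# Hypothetical baseline stages for thirteen participants (assumed values).
stages = [1, 2, 3, 1, 2, 3, 5, 6, 7, 5, 6, 7, 4]
participants = {f"P{i}": s for i, s in enumerate(stages)}
allocation = assign(participants)
for pid in sorted(allocation):
    print(pid, allocation[pid])
```

Each sub-group would then be measured at O1, O2 and O3 so that the treatment combinations can be compared within and between the two stage groups.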

Discussion

The report suggested that the high dropout rate was due to an inappropriate intake procedure in which some cases were selected inappropriately. However, it neither explained what “inappropriate cases” meant nor stated the criteria for selecting a case. Moreover, there is no standard for how long participants should stay in the programme. It is possible that completion within 3 months is the standard and that the 21 participants who spent more than 3 months were below standard. Without this crucial indicator, the meaning of the high dropout rate remains undefined: whether it is a major weakness of the programme or a satisfactory result is still unknown.

Second, the report also attributed the failure to keep batterers in treatment to the staff’s ineffective engagement skills. However, the cases showed that 66.7% of the participants who left the treatment had already reached the highest stage, “participation”. Thus, it is possible that the high dropout rate is due to the participants’ own perception that they had completed the treatment, rather than to a lack of effective engagement skills among the staff.

The report also suggested “court-ordered treatment”, so that participants would be compelled to join the programme and the number of dropout cases would be reduced. Since the evaluator put much emphasis on improving the high dropout rate, I would suggest conducting a service users’ opinion survey to collect participants’ opinions on the programme. The data collected would reflect the real needs of the participants and their reasons for early completion, and would be useful for a needs assessment. Decision makers could then plan a more effective and efficient programme and evaluation to suit the needs of the participants.

If resources allow, in-depth interviews with the participants can reveal their reasons for staying in the treatment or dropping out early. The interviews can also be used to check whether the participants’ progression is real or just an effect of repeated testing.

Conclusion

As an outcome evaluation, the report gave a very detailed assessment of the outcomes and effectiveness of the programme, the level of progression each participant achieved and the time taken to achieve it. The evaluation was also well constructed, with a sound theoretical framework. The evaluator could therefore stay focused on the theoretical model and avoid distraction. The framework also makes the participants’ changes observable, so that their progress can be followed over time.

However, the objectives and the main users of the evaluation report are not clear. This casts doubt on the favorable results obtained, and the report may even fail to reflect the effectiveness of the whole programme for its service users.

Furthermore, this evaluation emphasized the effectiveness of the intervention programme and so relied heavily on the assessment results before and after the programme. It overlooked the contents of the intervention programme. The great variance among the 21 cases makes it hard to identify the characteristics and patterns of the participants, such as which groups of people are more likely to complete the programme within three months and which groups will stay for six months or even longer. The lack of indicators and standards is a failure of the report: there is no way to decide on an acceptable length of time for participants to stay in the programme. With this piece of information, a more meaningful and useful result could be drawn, instead of the ambiguous conclusion that 66.7% of the participants reached the highest stage.

The evaluator perceived the high dropout rate as a problem for the evaluation. Yet the meaning of the high dropout rate is undefined, because there is no indicator for how long participants should stay in the programme. It is possible that the dropout rate is within an acceptable range; the question remains open. On the one hand, a needs assessment may be conducted beforehand to evaluate the need for this kind of programme and the needs of the participants, so that a more flexible and useful programme can be planned for the service users. On the other hand, a process evaluation is another option. It reveals the strengths and weaknesses of the programme process, and decision makers can use its results to amend and improve the programme. Surveys or in-depth interviews with the participants, decision makers, staff and social workers could also help collect opinions about the process of the programme so that appropriate amendments can be made.

Moreover, if more resources are available and the sample size is large enough, a factorial evaluation design can be used to test and find the most appropriate sequence of treatment. Patterns among the participants can also be found, such as which group of participants suits which type of treatment, or which type of treatment works most effectively for which type of participants. This can increase the usefulness and flexibility of the programme.

Other recommendations, such as reporting the reliability of the scale and asking the participants to write a journal once a week, can help increase the reliability and internal validity of the evaluation, so that threats to internal validity such as history and maturation can be better controlled.

Yet implementing the above suggestions would require considerable resources, and it may not be possible to carry them all out at once. They can be introduced step by step. First, the objectives and intended use of the report, the standards or indicators, and the reliability and internal validity must be stated and carefully considered in future evaluation reports. A needs assessment can be conducted before planning any batterers intervention programme, and a process evaluation can be conducted instead of an outcome evaluation. Surveys or in-depth interviews can provide decision makers with additional information for improving the programme. Lastly, a factorial design can be considered if the sample size is large enough and the resources are available. The evaluation can then be more convincing and sound.

References:

Chan, Y. C. (2007). An evaluation report on third path man services batterers intervention program. Hong Kong: Harmony House Limited.

陳永泰 (1991). 社會服務評估法 [Social service evaluation methods]. Hong Kong: Hong Kong Christian Service.