Evaluation Research Report
Assignment Cover Sheet
Program: BSSAS - BSocSc Applied Sociology
Course Code: SS3422
Course Title: Programme Evaluation
(Student Number) (51588988)
Lecturer: CHAN Wing Tai JP
Submission Date: 30 April, 2010
A Critical Review on “An Evaluation Report on Third
Path Man Services Batterer Intervention Program”
Logic Model
An Evaluation Report on Third Path Man Services Batterers Intervention Programme

Inputs
Staff and social workers: from Harmony House
Funding: from Hong Kong Jockey Club
Time: 14 months
Partner: The Hong Kong Polytechnic University, for the evaluation section

Activities
Individual meeting: Social worker met with male batterers to discuss their problems.
Counseling service: according to each batterer’s particular need.
Group meeting: participants were invited to join voluntarily.
Contents of the individual and group meeting include:
1. Knowledge of domestic violence
2. Beliefs and myths contributing to domestic violence
3. Awareness of gender beliefs and stereotypes
4. Anger management skills
5. Effective communication skills
6. Problem-solving skills

Outputs
49 male batterers joined the programme.
28 of them stayed for less than 3 months in the programme.
21 stayed in the programme for more than 3 months.
5 of the remaining 21 stayed for 6 months.
2 of the remaining 21 stayed for more than 10 months.
Average duration of intervention for the 21 participants was 4.8 months.

Outcomes (Immediate)
No one showed regression at the stage of measurement.
19 of the 21 remaining participants showed progression.
14 of the 19 participants who showed progression reached the highest stage of URICA-DV*.
Average number of stages the 21 participants progressed was 2.72.

*URICA-DV: University of Rhode Island Change Assessment – Domestic Violence, which was
developed to assess the readiness of batterers to end their violent behaviors.
Chan, Y. C. (2007). An evaluation report on third path man services batterers intervention
program. Hong Kong: Harmony House Limited.
A critique on “An Evaluation Report on Third Path Man Services Batterers
Intervention Program”
Abstract
The evaluation of the Third Path Man Services Batterers Intervention Programme was conducted
from 2004 to 2005, and the report was published in 2007. Several problems with the evaluation
report were identified: ambiguous objectives and intended use of the evaluation; a lack of
indicators or standards for the participants' level of success; and a high dropout rate. A needs
assessment, process evaluation, a factorial evaluation design, a survey and in-depth interviews
are suggested to improve the programme. The reliability of the evaluation report can be
strengthened by reporting the alpha value or test-retest result, and its internal validity can be
protected by having the participants keep a logbook or journal.
Introduction
Programme details
The Batterers Intervention Programme aimed at stopping male batterers’ violent behavior
towards their partners. The programme was launched in 2000 as a three-year pilot project
running from 2000 to 2003. An evaluation of the pilot project was conducted in 2004, and its
results showed that the pilot project was successful and useful. The project was therefore
extended for another two years, from 2004 to 2005. The data in this report are mainly from 2004
to 2005.
Male batterers who came to the service would first meet with the social worker individually to
discuss their problems. Counseling would then be provided according to each batterer’s needs.
Lastly, participants were invited to join group meetings voluntarily. The contents of the
individual and group meetings included (1) Knowledge of domestic violence, (2) Beliefs and
myths contributing to domestic violence, (3) Awareness of gender beliefs and stereotypes,
(4) Anger management skills, (5) Effective communication skills and (6) Problem-solving skills.
Evaluation Background
Dr. Chan Yuk Chung, an assistant professor in the Department of Applied Social Sciences, The
Hong Kong Polytechnic University, conducted the evaluation of “The Third Path - Man
Services – Batterers Intervention Programme”. The evaluation aimed at assessing the changes
made by the participants before, during and at the end of the programme. All data collected
and evaluated were from November 2004 to December 2005.
Evaluation target
Male batterers who joined the Batterers Intervention Programme between November 2004 and
December 2005 could take part in the evaluation voluntarily. No decision makers, social workers
or staff involved in the intervention were interviewed or evaluated.
Measuring instrument
The Transtheoretical Model of Change was used as the framework for the evaluation. This model
shows the direction of change in the service users and thus gives a clear picture of the
programme's effect. Based on this model, the 20-item University of Rhode Island Change
Assessment – Domestic Violence (URICA-DV), developed by Levesque, Gelles & Velicer, was
used. Seven stages of progression were recognized: (1) reluctant, (2) immotive,
(3) non-reflective action, (4) unprepared action, (5) pre-participation, (6) decision-making and
(7) participation. Stages (1) to (3) are grouped as the earlier stage and stages (5) to (7) as
the advanced stage.
The baseline was set by two raters, who rated every participant before the treatment started.
The inter-rater reliability was 80%.
Data collection
Before joining the treatment, participants had to complete the 20-item Chinese version of the
URICA-DV. At the end of the 3rd and 6th months, participants had to complete the same
questionnaire for the second and last assessments. 49 participants completed the URICA-DV
voluntarily. 28 of them completed the scale once before leaving the programme. The 21 who
stayed for more than 3 months completed the scale at least once.
Output
49 male batterers joined the programme. 28 of them stayed in the programme for less than 3
months and the remaining 21 stayed for more than 3 months. Of these 21 participants, 5 stayed
for 6 months and 2 stayed for more than 10 months. The average duration of intervention for
the 21 participants was 4.8 months.
Outcomes
No one showed regression at the stage of measurement. 19 out of 21 participants showed
progression. 14 of them even reached the highest stage of the URICA-DV, which indicates that
participants are actively working on ending their violent behavior. The average number of
stages the 21 participants progressed was 2.72.
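The output and outcome figures above can be cross-checked with simple arithmetic. The sketch
below is only an illustration that reuses the counts quoted in this section; the variable names
and the helper function are assumptions and do not come from the original report.

```python
# Counts quoted in the Output and Outcomes sections of this critique.
joined = 49                # male batterers who joined the programme
left_before_3_months = 28  # stayed for less than 3 months
progressed = 19            # showed progression among those who stayed longer
reached_highest = 14       # reached the highest URICA-DV stage

stayed_over_3_months = joined - left_before_3_months


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


dropout_rate = percent(left_before_3_months, joined)
progression_rate = percent(progressed, stayed_over_3_months)
highest_stage_rate = percent(reached_highest, stayed_over_3_months)

print(f"Stayed for more than 3 months: {stayed_over_3_months}")
print(f"Left within 3 months: {dropout_rate}% of all who joined")
print(f"Showed progression: {progression_rate}% of those who stayed")
print(f"Reached the highest stage: {highest_stage_rate}% of those who stayed")
```

Note that the two outcome rates use the 21 participants who stayed for more than 3 months as the
denominator, whereas the dropout rate uses all 49 who joined.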
(Logic Model to be inserted here)
Critique
Objective of the evaluation report
This was an outcome evaluation. It focused on the extent to which the outcomes had been
achieved and reported them in detail. It also emphasized the effectiveness of the programme
for its service users. Although the evaluator did not state any reason for conducting the
evaluation, there seem to be two latent reasons: to fulfill accreditation requirements and to
account for funds.
The Batterers Intervention Programme had been carried out from 2000 to 2004. As it entered its
second phase (from 2004 to 2005), a demonstration of its effectiveness could serve as an
indicator for decision makers and sponsors to continue offering support and funds. The
original objective of the evaluation report was to show the improvement that the intervention
programme brought to its service users. Yet when the latent reasons outweigh the main
objective, the evaluation report becomes a means of determining the survival of the
programme. The effectiveness and usefulness of the evaluation then become doubtful, and its
findings may be biased.
Stufflebeam advocated that the purpose of evaluation is not to prove the effectiveness of a
programme, but to improve it (陳永泰, 1991, p.44). This evaluation focused too much on
proving the effectiveness of the programme for its service users and ignored other problems
that call for improvement. Although the evaluation demonstrated the changes in the service
users after they joined the programme, it did not collect comments or opinions on the contents
of the programme from the decision makers, staff, social workers or even the service users.
The evaluation therefore only demonstrated the effectiveness of the programme and failed to
identify any problem with its contents or procedures.
Also, the intended use of the evaluation is not clearly defined. The evaluation seems to have
been conducted to meet the needs of interested groups such as the sponsors, not those of the
main service users, since showing the effectiveness of the programme could help it gain
support or funding again and be extended in the future. This could make the evaluation
inadequate, because it favors positive information and hides the programme's weaknesses. The
objectives of the evaluation and the main users of the evaluation report should be clearly
defined.
Measuring instrument
The reliability of the Chinese version of the URICA-DV was not reported. Without it, the
internal consistency of the scale is unknown, and it is hard to determine whether the items in
the scale are reliable. The alpha value or the test-retest value should be included.
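As an illustration of what reporting the alpha value involves, the sketch below computes
Cronbach's alpha with the standard formula, alpha = k / (k - 1) * (1 - sum of item variances /
variance of total scores), where k is the number of items. The function name and the response
data are hypothetical and are not taken from the evaluation report.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(r) != n_items for r in responses):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item across respondents.
    item_variances = [variance([r[i] for r in responses]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(r) for r in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of four respondents to a three-item scale (not report data).
hypothetical_responses = [
    [1, 2, 1],
    [2, 3, 3],
    [4, 4, 5],
    [5, 5, 4],
]
print(round(cronbach_alpha(hypothetical_responses), 2))
```

For the 20-item Chinese URICA-DV, each inner list would hold one participant's 20 item scores;
with only 49 respondents the estimate would itself be imprecise.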
The baseline was set by two raters, which was important and useful for comparing the
participants' results at later stages. However, no standards or indicators were set for the
level of success. For a better evaluation, there should be at least one indicator stating what
should be achieved by when. Applying SMART goals helps determine whether the participants'
progress is above or below the standard. The SMART criteria state that any goal or objective
should be specific, measurable, achievable, relevant and time-bound. Without them, the level
of success is as ambiguous as the objective itself: participants would show progression and no
regression. Because the time frame and the expected extent of progression are not stated, the
evaluator is free to explain the outcomes flexibly, but the evaluation fails to give a deeper
understanding of the participants' progress.
Data collection
This is a quasi-experimental, single-group time-series design, in which data were collected at
different points in time so that the progress of the participants could be observed and
compared easily. However, with reference to the MAXMINCON principle, this evaluation design
failed to minimize the error variance and to control extraneous variance. Without a control
group, the power to detect the real impact of the programme on its service users is weak, and
extraneous variables, such as the religious background and personality of the participants,
cannot be ruled out. Individual differences or third variables, rather than the programme, may
have changed the participants’ attitudes or behaviors. It is therefore difficult to determine
whether the participants' improvement is really due to the programme. The small sample size
and high mortality rate (57%) also make the conclusion less convincing.
Besides, there are other threats to its internal validity, such as repeated testing, history,
maturation and regression to the mean.
Some participants completed the scale three or even four times in this evaluation. Participants
may have given the expected answers in later assessments after becoming familiar with the
test or aware of the purpose of the evaluation, so biased answers may have been obtained.
History refers to the possibility that the changes in the participants were caused by other
concurrent events rather than by the intervention itself. Since the participants were not
isolated in a testing environment, other events, such as encouragement from family members or
recently joining a religious group, rather than the programme, could have caused the
participants' improvement. Effective use of a logbook or journal, such as asking the
participants to write about their relationship with their partners once a week and to record
any special events between them, can help to identify and reduce this threat.
In addition, participants’ changes in behavior or attitude may be caused by their normal
development; for example, they may have learned their lesson and become determined to change.
This is the maturation of the participants, which may make the results in the evaluation report
unrelated to the effectiveness of the programme.
Last but not least, regression to the mean affects the significance of the evaluation. Since all
49 participants had a record of battering before joining the programme, it is not surprising to
see them make changes. The intervention may therefore be only one factor in improving the
participants’ attitudes and behaviors, and giving all the credit to the programme makes the
evaluation doubtful. Random assignment of the participants to an experimental group and a
control group can help rule out the effects of maturation and regression to the mean. However,
the small sample size may make this more difficult to carry out.
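Regression to the mean can be illustrated with a small simulation. The sketch below is a
hypothetical example, not a model of the report's data: each simulated person has a stable
"true" score, every measurement adds random noise, and no intervention takes place. The sample
size, the noise level and the function name are all assumptions.

```python
import random
from statistics import mean


def simulate_regression_to_mean(n=10_000, top_fraction=0.1, seed=0):
    """Return mean first and second scores of those most extreme at first measurement."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    true_scores = [rng.gauss(0, 1) for _ in range(n)]
    # Two independent noisy measurements of the same unchanged true score.
    first = [t + rng.gauss(0, 1) for t in true_scores]
    second = [t + rng.gauss(0, 1) for t in true_scores]
    # Select the people with the highest scores at the first measurement.
    selected = sorted(range(n), key=lambda i: first[i], reverse=True)
    selected = selected[: int(n * top_fraction)]
    return mean(first[i] for i in selected), mean(second[i] for i in selected)


first_mean, second_mean = simulate_regression_to_mean()
print(f"Selected group, first measurement: {first_mean:.2f}")
print(f"Selected group, second measurement: {second_mean:.2f}")
```

The selected group scores closer to the overall average on the second measurement although
nothing was done to them, which is why a control group is needed to separate this effect from
a real programme effect.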
Outputs and Outcomes
Among the 21 participants who stayed for more than 3 months, 5 stayed for more than 6 months
and 2 stayed for more than 10 months. The evaluation offers no explanation for this variation.
Since no standard was set, the appropriate treatment period was not defined. Thus, it cannot
be concluded that the longer the participants stayed in the programme, the better, or vice versa.
The high dropout rate (57%) remains the most critical problem in this evaluation. It may
indicate weaknesses in the programme and imply a need for amendment. Most of the participants
left the programme after reaching the highest stage, “participation”. It is logical that
participants would leave the programme when they perceived themselves as “recovered” or
“cured”. These participants progressed faster than others and could complete the programme
within 3 months. On the other hand, 5 participants stayed in the programme for more than 6
months. The great variation among participants makes it hard to identify patterns or
characteristics.
The usefulness of the contents of the individual and group meetings is also open to question.
The topics included: (1) Knowledge of domestic violence, (2) Beliefs and myths contributing to
domestic violence, (3) Awareness of gender beliefs and stereotypes, (4) Anger management
skills, (5) Effective communication skills and (6) Problem-solving skills. However, the
evaluation report did not specify the sequence in which the topics were covered, so it is
possible that different participants covered different topics at different phases of the
treatment. Alternatively, if the topics were covered in a systematic and logical sequence, a
process evaluation may be needed to identify the advantages and disadvantages of the current
treatment process and to guide amendments. Some topics may have been quite effective for
certain groups of people, who then completed the treatment earlier, while other topics may have
been less effective, so that some participants had to stay for longer treatment. The 49 cases
would then become very useful in determining the effectiveness of particular topics.
Moreover, this reveals another weakness of the evaluation: the lack of a needs assessment.
Without a proper assessment of the needs of the service users, their deep-rooted problems or
obstacles are not known, and programme planning may be too disorganized to be effective. A
needs assessment may even help reduce the high dropout rate and set an achievable standard for
the participants' completion time.
If resources are available and the sample size is large enough, a factorial design can determine
which kind of programme suits which kind of participants, as well as the best sequence of the
treatment:
Group                 Time
Group 1   Group 1a    O1   T1 + T2   O2   O3
          Group 1b    O1   T2 + T3   O2   O3
          Group 1c    O1   T1 + T3   O2   O3
Group 2   Group 2a    O1   T1 + T2   O2   O3
          Group 2b    O1   T2 + T3   O2   O3
          Group 2c    O1   T1 + T3   O2   O3
Participants are assigned to two different groups according to their stage before joining the
evaluation. In the original scale, stages (1) to (3) are grouped as the earlier stage and
stages (5) to (7) as the advanced stage. The participants in each group can be further divided
into three sub-groups, either randomly or according to their actual stage. Different
combinations of treatment can be applied to different sub-groups. A pretest, a posttest and a
maintenance test can be used to assess the progress of the participants. Thus, the contents
and sequence of the topics covered can be assessed, and the pattern of change of the
participants in different groups can be observed. A more flexible, useful and suitable
treatment combination can then be planned according to the needs and stage of the participants
in the future.
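The random allocation described above could be sketched as follows. This is a minimal
illustration under stated assumptions: the participant IDs and baseline stages are hypothetical,
the function names are invented for this example, and participants at stage (4), which falls in
neither group in the text, are simply left unassigned.

```python
import random
from collections import Counter

# Treatment combinations taken from the design table; T1 to T3 are unspecified treatments.
TREATMENT_ARMS = ["T1 + T2", "T2 + T3", "T1 + T3"]


def stage_group(stage):
    """Map a baseline URICA-DV stage to Group 1 (earlier) or Group 2 (advanced)."""
    if 1 <= stage <= 3:
        return "Group 1"
    if 5 <= stage <= 7:
        return "Group 2"
    return None  # assumption: stage 4 belongs to neither group


def assign_treatments(baseline_stages, seed=0):
    """Randomly spread the members of each stage group over the treatment arms."""
    rng = random.Random(seed)  # fixed seed so the allocation can be reproduced
    groups = {"Group 1": [], "Group 2": []}
    unassigned = []
    for participant, stage in baseline_stages.items():
        group = stage_group(stage)
        if group is None:
            unassigned.append(participant)
        else:
            groups[group].append(participant)
    allocation = {}
    for group, members in groups.items():
        rng.shuffle(members)
        # Dealing shuffled members round-robin keeps arm sizes within one of each other.
        for index, participant in enumerate(members):
            arm = TREATMENT_ARMS[index % len(TREATMENT_ARMS)]
            allocation[participant] = (group, arm)
    return allocation, unassigned


# Hypothetical participant IDs and baseline stages, for illustration only.
hypothetical_baseline = {
    f"P{number:02d}": stage
    for number, stage in enumerate([1, 2, 3, 1, 2, 3, 5, 6, 7, 5, 6, 7, 4], start=1)
}
allocation, unassigned = assign_treatments(hypothetical_baseline)
arm_sizes = Counter(allocation.values())
for (group, arm), size in sorted(arm_sizes.items()):
    print(group, arm, size)
print("Not assigned (stage 4):", unassigned)
```

Shuffling within each stage group before dealing participants into the arms keeps the sub-groups
balanced in size, which matters when the sample is as small as in this programme.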
Discussion
The report suggested that the high dropout rate was due to an inappropriate intake procedure in
which some cases were selected inappropriately. However, it neither explained what
“inappropriate cases” meant nor stated the criteria for selecting a case. Moreover, there is no
standard for the length of time that participants should stay in the programme. It is possible
that completion within 3 months is the standard and that the 21 participants who spent more
than 3 months were below standard. Without this crucial indicator, the meaning of the high
dropout rate remains undefined: it is not yet known whether it is a major weakness of the
programme or a satisfactory result.
Second, the report attributed the failure to keep batterers in treatment to the staff's
ineffective engagement skills. However, the cases showed that 66.7% of the participants who
left the treatment had already reached the highest stage, “participation”. Thus, it is possible
that the high dropout rate is due to the participants' self-perceived completion of treatment
rather than to a lack of effective engagement skills among the staff.
The report also suggested introducing “court-ordered treatment” so that participants would have
to join the programme compulsorily, thus reducing the number of dropout cases. Since the
evaluator has put much emphasis on reducing the high dropout rate, I would suggest conducting
a service users' opinion survey to collect the participants’ opinions on the programme. The
data collected would reflect the real needs of the participants and their reasons for early
completion, and would be useful for a needs assessment. Decision makers could then plan a more
effective and efficient programme and evaluation to suit the needs of the participants.
If resources allow, in-depth interviews with the participants can uncover their reasons for
staying in the treatment or dropping out early. They can also be used to check whether the
participants' progression is real or just an effect of repeated testing.
Conclusion
As an outcome evaluation, the report gives a very detailed assessment of the outcomes and
effectiveness of the programme, the level of progression each participant achieved and the
time taken to achieve it. The evaluation was also well constructed, with a sound theoretical
framework, so the evaluator could stay focused on the theoretical model and avoid distraction.
The framework also makes the participants' changes observable, so that their progress over
time can be tracked.
However, the objectives and the main users of the evaluation report are not clear, which casts
doubt on the favorable results obtained; the report may even fail to reflect the effectiveness
of the whole programme for its service users.
Furthermore, this evaluation emphasized the effectiveness of the intervention programme and so
relied heavily on the assessment results before and after the programme. It overlooked the
contents of the intervention programme. The great variance among the 21 cases makes it hard to
identify the characteristics and patterns of the participants, such as which groups of people
are more likely to complete the programme within three months and which groups will stay for
six months or more. The lack of indicators and standards is a failure of the report: there is
no way to decide an acceptable length of time for participants to stay in the programme. With
this piece of information, a more meaningful and useful result could be drawn, instead of the
ambiguous conclusion that 66.7% of the participants reached the highest stage.
The evaluator perceived the high dropout rate as a problem for the evaluation. Yet the meaning
of the high dropout rate is undefined because there is no indicator for the length of time that
participants should stay in the programme. It is possible that the dropout rate is within an
acceptable range; the question remains open. On one hand, a needs assessment may be conducted
beforehand to evaluate the need for this kind of programme and the needs of the participants,
so that a more flexible and useful programme can be planned for the service users. On the
other hand, process evaluation is another option. It reveals the strengths and weaknesses of
the programme process, and decision makers can use its results to amend and improve the
programme. Surveys or in-depth interviews with the participants, decision makers, staff and
social workers could also help collect opinions about the process of the programme so that
appropriate amendments can be made.
Moreover, if more resources are available and the sample size is large enough, a factorial
evaluation design can be used to test and find the most appropriate sequence of treatment.
Patterns among the participants can also be found, such as which type of treatment works most
effectively for which group of participants. This can increase the usefulness and flexibility
of the programme.
Other recommendations, such as reporting the reliability of the scale and asking the
participants to write a journal once a week, can help increase the reliability and internal
validity of the evaluation and help address threats to internal validity such as history and
maturation.
Yet implementing the above suggestions would require considerable resources and may not be
possible all at once. They can be taken step by step. The objectives and intended use of the
report, the standards or indicators, and the reliability and internal validity must be stated
and well considered in future evaluation reports. A needs assessment can be conducted before
planning any batterers intervention programme. A process evaluation can be conducted instead
of an outcome evaluation. Surveys or in-depth interviews can provide supplementary information
to decision makers for improving the programme. Lastly, a factorial design can be considered
if the sample size is large enough and the resources are available. The evaluation would then
be more convincing and sound.
References:
Chan, Y. C. (2007). An evaluation report on third path man services batterers intervention
program. Hong Kong: Harmony House Limited.
陳永泰 (1991)。社會服務評估法。香港:香港基督教服務處。