psyc164

Module4Fall2020OperantConditioningandReinforcement.pdf

Introduction to Operant Conditioning and Reinforcement

This week we’re going to begin looking at some of the foundational procedures used in behavioural psychology. These procedures are all aimed at increasing behaviour, by changing the consequence that follows a particular behaviour. The three we will explore for this module are:

• Positive Reinforcement

• Escape

• Avoidance

We’ll also discuss the Sick Social Cycle, which happens when two of these procedures are combined (more on that later!). First, I need to acknowledge something…sometimes, learning behaviour analytic terminology can feel like learning a new language. If at first you’re confused, rest assured that this is normal. Like any new language, it takes practice with the material to become fluent. You’ll be given lots of opportunity to think about these procedures in practice examples and to relate them to your own lives, and in no time you’ll find yourself using this terminology with ease.

First things First: What is Operant Conditioning?

Operant Conditioning: conditioning that influences behaviour change through the consequences that follow the behaviour

• E.g. you press a button on a beverage machine (behaviour) with the immediate consequence that your favourite drink appears (consequence).

• E.g. you run out of the room (behaviour) when you see a spider, with the result that the spider is no longer in sight (consequence).

• E.g. you press your alarm button (behaviour) a minute before it’s scheduled to go off, and as a result it doesn’t ring in that annoying way wake-up alarms ring (consequence).

Operant conditioning can occur as a result of a specific program set up to create behaviour change (remember with CJ? If he at least attempted to play for a little bit outside each day, he would be able to bring indoor items outside), but often also happens as a natural part of learning.

• Imagine you’re a baby, around 6-8 months old. You’re hanging out, swatting at a mobile, enjoying life, and making lots of sounds – new and interesting sounds, too, because your muscles and brain are working hard to make that happen (although in all fairness, you don’t know any of this because you’re a baby). Anyway, one of the adults happens to be in the room while you’re trying out all these new sounds. You happen upon “ma-ma-ma” and this adult suddenly gets very excited! Smiling at you, kissing you, repeating your sound – it’s all very exciting! You try it again – this person is ECSTATIC! Wow…this set of sounds must be special because when you hit her with “wbbgffff” earlier she just wasn’t that impressed! So, when this adult is around you start to use “ma-ma-ma” a lot more often. Now, it’s unlikely you’ve gone through this whole thought process of analytical thinking, but even as babies, we are quick to repeat behaviours that get certain consequences, and less likely to repeat others. Through this process, a lot of learning occurs as a result of operant conditioning in our natural settings.

So, there are three important questions when we’re analyzing operant conditioning…

1. What is the behaviour of interest? (This is the one we decide to focus on in the moment)

2. What is the immediate consequence of that behaviour? (What change happened directly after the behaviour?)

3. Is the behaviour more or less likely to happen in future as a result of the relationship between 1 and 2?

This week, we’re going to discuss three types of operant conditioning. All three of these INCREASE the frequency of future behaviour: Positive Reinforcement, Escape and Avoidance.

Positive Reinforcement

Positive Reinforcement is one of the better known operant conditioning procedures. It involves the addition of a consequence contingent on a behaviour with the result that the future frequency of behaviour increases.

Three rules?

1. Behaviour is followed by addition of a consequence

2. There is a contingency between the two

3. As a result of that contingency, the likelihood of the behaviour happening again in future increases.

Contingency is simply the degree to which the consequence is more likely to happen following the behaviour than at any other time. For example, if I kick a can, the can moves, makes a sound on the ground, etc. The can is not likely to do any of those things if I don’t kick it. So there is a contingency between me kicking the can (behaviour) and the can moving (consequence).

All three of these rules must be in place in order for us to have an example of positive reinforcement. So in the example above, we’ve met two of the rules for positive reinforcement, but we’d need to know if I was now more likely to kick the can in future as a result of this contingency. 
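If it helps to see contingency as a number, here is a rough Python sketch. It assumes one simple way of measuring it that is not part of this module: the proportion of times the consequence follows the behaviour, minus the proportion of times it shows up without the behaviour. The function name `contingency` and the kick-the-can observations are made up for illustration.

```python
from typing import List, Tuple

# Each observation is (behaviour_occurred, consequence_followed).
Observation = Tuple[bool, bool]


def contingency(observations: List[Observation]) -> float:
    """Return P(consequence | behaviour) - P(consequence | no behaviour).

    This difference-of-proportions measure is an illustrative assumption,
    not a formula given in the module.
    """
    with_behaviour = [c for b, c in observations if b]
    without_behaviour = [c for b, c in observations if not b]
    if not with_behaviour or not without_behaviour:
        raise ValueError("need observations both with and without the behaviour")
    p_with = sum(with_behaviour) / len(with_behaviour)
    p_without = sum(without_behaviour) / len(without_behaviour)
    return p_with - p_without


# Hypothetical kick-the-can data: the can moves on every one of 10 kicks,
# and on 1 of 10 moments without a kick (say, a gust of wind).
kick_the_can = [(True, True)] * 10 + [(False, True)] * 1 + [(False, False)] * 9

print(contingency(kick_the_can))
```

A value near 1 means the consequence depends almost entirely on the behaviour; a value near 0 means it happens about as often without the behaviour. Remember that contingency only covers rule 2. We would still need to see the behaviour increase in future before calling it positive reinforcement.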

Here is a small video example of a K-9 trainer with an eventual member of the canine police force:

https://www.youtube.com/watch?v=Rcjn8v5nfDU

Notice, in the example, that she has chosen a food reward that is highly reinforcing for the dog.

Behaviour of interest: Dog sits

Consequence: Treat added

Future Frequency? Dog more likely to sit when asked

In this case we’ve met all of the criteria for positive reinforcement. Try this one:

https://www.youtube.com/watch?v=JA96Fba-WHk

Take a second to try and isolate the three rules of positive reinforcement….

Okay, so in this one you might have said…

Behaviour of interest: Penny uses lower voice

Consequence: Chocolate added

Future Frequency? Penny uses lower voice more often

One more example:

Adam is eighteen. He asks his dad if he can borrow the car for the night to go out with friends. His dad says “no” because he needs it for a meeting. Adam becomes enraged, grabs his dad’s shirt and throws him up against the wall, demanding the keys. Shaking, Adam’s dad hands him the keys and Adam lets his dad down from the wall. Over time we notice that Adam is more and more likely to use threats of violence when requesting items from his dad.

Behaviour of Interest: throwing Dad against the wall, demanding keys

Consequence: keys provided to Adam (added)

Future Frequency? As a result of that dependent relationship (contingency), Adam is now more likely to throw his dad against the wall and demand things.

From this example we can see that positive reinforcement isn’t always all that “positive.” Which leads me to an important point: In behavioural psychology, “positive” and “negative” are NOT equivalent to “good” and “bad.” Rather, they can be thought of as “addition” and “subtraction.” So when we look at the consequence, if the immediate change following the behaviour is that something new was ADDED to the environment (e.g. the keys, the chocolate or the dog treat), then we use the term “positive.” If something is subtracted from the person’s environment immediately following the behaviour, then we use the term “negative.”

Some tips to increase the effectiveness of Positive Reinforcement:

1. Make sure you choose the right reinforcer: Reinforcers are highly individual. What is important to one person will be meaningless for another. It makes sense to take the time to get to know the client’s reinforcer preferences. If your use of positive reinforcement isn’t actually making a change in your behaviour or the behaviour of the client, then it isn’t positive reinforcement. The most likely culprit is the choice of reinforcer. We need something that is “worth” it and that you genuinely enjoy.

Let’s say reading is a huge reinforcer for you. You make a plan that if you do an exercise program every morning for 2 weeks, you get to buy a new book. You find yourself not following through. It might be that there is way too much behaviour expected for the pay-off of the book at the end of two weeks. It may be that reading just isn’t the right reinforcer for this particular set of behaviours, or it may mean that you need to adjust the plan to make the reinforcer more immediate. For example, you could buy the new book up front, and then for each day you exercise you get to read a couple of chapters from your book. That way, the reinforcer is immediate, potent (strong) and equal to the effort you’re putting in on a daily basis. It’s sometimes helpful to have a variety of reinforcers to use when first establishing a behaviour. So, reinforcing behaviour immediately, especially when it is first being established, is really important, as is making sure the potency of the reinforcer is balanced with the effort of the behaviour.

2. Try to avoid the pitfall of accidentally reinforcing undesirable behaviours in others: Our brains and sensory systems are wired to spot discrepancies (differences) in the environment. We’re tuned into breaks in routine, challenges, potential threats, things that are going wrong. As a result, we tend to under-reinforce great behaviour in other people and in ourselves, and we sometimes accidentally reinforce undesirable behaviours in others instead. Some examples?

• Your child has been playing well while you work. After about 20 minutes, they come over and start to whine. You give attention.

Here, you’ve just ignored 20 minutes of really awesome, helpful behaviour! And reinforced a less awesome behaviour. One thing you could do is get up at the ten minute mark and say, “Wow! You’ve been playing so nicely over here, and as a result, I’m getting my work done so quickly! I’m going to take a little break…can I play with you for a bit?” And then after the timer goes off, back to work…reiterating to the child how much you appreciated their quiet play.

• Partner, family member or roommate has been great about doing dishes, keeping things generally tidy, etc. all week. It’s their turn to take out recycling and they forget – you get irritated with them.

In this case, it’s just an accidental omission of reinforcing a great set of behaviours. If you had been reinforcing that person for all of the cleaning and dishes they’d been doing throughout the week, they would have been more likely to take out the recycling spontaneously, and less likely to feel irritated when the recycling was all you concentrated on. We miss out on opportunities to reinforce each other all the time. And it’s important to be acknowledged on a regular basis. Just something to think about.

Let’s think about a couple of real life applications…

On what basis are you paid? Many students are paid by the hour…but what is this really reinforcing? In many cases, you’re being reinforced with money for being at your place of work for an hour-long period. But is that what we really want to reinforce? This kind of pay system often creates interpersonal conflict because inevitably there are the genuinely hard workers, who come in, do a lot of work and make sure the place is running well. Then there are the “others.” They come in, wander around, do as little work as possible, and yet still get the same pay. I can understand why the hard workers get pretty irritated, but really, it’s the system that is letting you down. If the employer changed things so that employees were paid by item completed (number of sales, number of items inventoried/folded/shelved, etc.), the hard workers would naturally be paid more than the slackers. This system decreases interpersonal conflict because the reinforcer is now in better alignment with the behaviour completed.

Why do you watch Netflix rather than study? If we look at the potential reinforcers, it makes sense. The minute you click on the icon for Netflix you are bombarded with endless entertainment possibilities. The reinforcer is immediate, powerful and equal to your behaviour. Now look at studying for a test. The ultimate reinforcer is likely grades. So you study for quite a long period of time, then take the test, and then wait for the test to be graded. The reinforcer is delayed, may or may not be potent, and the act of studying may seem out of sync with the grade you receive at the end. So…all other things being equal, it makes sense you’d watch Netflix rather than study, doesn’t it? (We’ll talk about ways to change that in future classes.)

Unconditioned vs. Conditioned Reinforcers

Unconditioned: things that reinforce us with no prior learning (food when we’re hungry, water when we’re thirsty, warmth when we’re cold)

Conditioned: things that reinforce us due to an association or pairing with another reinforcer (e.g. a movie ticket or money). These are just rectangular pieces of paper and not particularly reinforcing on their own, but because you can use them to access a favourite activity or buy other reinforcers, they become reinforcing due to that pairing. (Just think, if tomorrow money was declared useless, it would be unlikely that you would work an entire week just to receive a bunch of useless rectangles of paper.)

Types of Conditioned Reinforcers

Simple versus Generalized Conditioned Reinforcers

Simple: paired with just one back-up reinforcer (e.g. our movie ticket)

Generalized: paired with many different back-up reinforcers (e.g. our money)

Practice Examples

For each of the following, identify the behaviour of interest, the consequence and the future frequency of the behaviour. Then determine if it’s an example of positive reinforcement.

1. Movie theater patrons are given coupons for the snack bar for every piece of trash they deposit in the proper receptacles. This has increased the number of trash items deposited.

2. Young Hailey is whining “I’m huuunngggry” while her dad tries to finish a reading for a class. He tries to ignore her, but she whines louder and he says “Fine, here” and gives her a granola bar. Hailey is now more likely to whine loudly in the future when her dad is studying.

3. Travis spray paints a building window while his friends watch, smiling and laughing. He becomes more and more likely to engage in similar behaviours in future.

Answers

1. Behaviour of Interest: depositing trash

Consequence: coupons for snack bar (added)

Future Frequency? More likely to deposit trash in future

= Positive Reinforcement

2. Behaviour of Interest: whining “I’m hungry”

Consequence: attention and granola bar (added)

Future Frequency? More likely to whine in future

= Positive Reinforcement

3. Behaviour of Interest: spray painting a building window

Consequence: friends laugh and smile (added)

Future Frequency? More likely to spray paint in future

= Positive Reinforcement

Okay, so that covers positive reinforcement…now we head over to Negative Reinforcement. This is essentially broken down into two types: Escape and Avoidance.

Escape

Escape is another operant conditioning procedure in behavioural psychology, but in escape, a stimulus is SUBTRACTED contingent on the behaviour, with the result that the behaviour’s frequency increases in future. Escape, like positive reinforcement, has three rules:

• A stimulus is removed (subtracted) from the environment following the organism’s behaviour

• The stimulus removal is contingent on the behaviour

• The behaviour increases in frequency as a result of that contingency

Some examples:

https://www.youtube.com/watch?v=x3hcBHmi1lM

Behaviour of Interest: taking pain killers

Consequence: headache (subtracted)

Future Frequency? More likely to take a pain killer when you have a headache in future

Another example: You have a mosquito bite that itches (A LOT)…especially the bites on the ankle bone…what IS that?!? You scratch the mosquito bite and the itching stops (for now!). Because of that consequence, in future when your bite is itchy, you are more likely to scratch it.

Behaviour of Interest: scratch bite

Consequence: itching (subtracted)

Future Frequency? Now more likely to scratch the bites next time they itch.

One more: You have extreme exam anxiety. As you sit down to write you start to feel hot and sweaty and your heartbeat gets really fast, making you feel nauseous. You get up and leave the exam. You no longer feel hot/sweaty/nauseous. In future exams, you are now more likely to get up and leave the room if you are feeling this way.

Behaviour of Interest: get up and leave

Consequence: feeling sweaty and nauseous (subtracted)

Future Frequency? Now more likely to get up and leave if feeling this way in an exam.

So…in each case, the behaviour results in the removal of something from the environment. In the first, the action removes a headache, in the second, an itch, and in the third, some anxiety and nausea. So negative reinforcement increases our behaviour, like positive reinforcement. Unlike positive reinforcement, though, it accomplishes this through subtraction, rather than addition, of a stimulus.

Let’s think of a couple of real life applications…

Think of something you might be slightly fearful of. Might be a spider, seeing your ex, clowns….or your ex dressed as a clown holding a spider…anything, really. Our behaviours around things we find unpleasant or fearful are often maintained by escape. We leave the room if we see a spider, or if we see our ex at a friend’s party, or a clown approaching us during a parade – and the consequence that is maintaining that behaviour is that in each case, it allows us to successfully remove that individual from our immediate environment, and as a result, the discomfort or fear associated with that individual.

Ever hear of misophonia? Here is a short video about it…

https://www.youtube.com/watch?v=1LFBQ3EP3Pg

Since last year when this video was made, further research has been done and there are now practitioners across North America who can help those who suffer. Misophonia appears to fall under the larger umbrella of sensory integration dysfunction. When the sounds trigger, they trigger a rage and stress reaction that much of the time can only be dealt with by leaving the source of the trigger. While sound therapy can help reduce a person’s sensitivity over time, much of therapy involves learning how to cope with these triggers in a functional way. Leaving the scene is an example of escape.

One other example: Think of the last time you were in an argument with a friend, loved one or family member. The argument has been going on a long time, the other person is getting visibly angry, and there is no end in sight. You decide to give up your position, and the fight is immediately over. As a result of this consequence, you might be more likely to give up your position when arguing with this person in future. This is another example of escape. (We’ll revisit this example in the Sick Social Cycle section.)

So, in summary, escape works to maintain or increase our behaviours because it allows us to successfully remove unpleasant stimuli from our environments. Let’s now look at our third operant conditioning procedure….Avoidance.

Avoidance

The avoidance procedure allows us to use our behaviour to prevent or delay a stimulus from being added or removed in the first place. Its rules:

• A stimulus WILL be added or removed if the organism does not engage in the behaviour.

• By engaging in the behaviour, the organism prevents or delays the stimulus from being added/removed.

• The organism’s behaviour increases as a result.

Okay….WHAT?!? If you’re confused, it’s okay…this procedure makes more sense when we think of an example.

Example: Your ex works at Starbucks. You love a particular Starbucks and used to go there all the time, but now you go to a different one because you don’t want to run into your ex. This is avoidance. Because…

1. If you GO to the ex-Starbucks, you WILL see your ex.

2. Because you go to the NEW Starbucks, you prevent yourself from seeing your ex.

3. Now you are more likely to go to the NEW Starbucks because you successfully avoided seeing the ex.

Another example: I have a brother who is two years older than me. Generally we got along fine, but for a period of time when I was about five, he was a royal pain in my life. He went through this phase where he would wrestle with me, then fold my legs so I was in a ball, and then sit on me to watch TV. I was pretty tough as a little kid, but it would eventually start to hurt and then I would yell my lungs out for my mom. Now, if she came down and saw me crying and upset and heard the story, my brother would get in trouble and get sent to his room, leaving me to watch TV in peace and quiet. He knew this. So, when he’d hear her coming down the stairs he would jump off of me, slam his body into a wall, wail “Stop it! That hurts!” and then drop to the floor rolling around, as if in pain. My mom would arrive to two children rolling around on the floor accusing each other of wrong-doing. Having no other evidence to go by, she would turn off the TV and both of us would be told to be more careful with each other, but no one was sent to his/her room. So, my brother successfully avoided getting sent to his room by pretending he was injured. He did this for months until one day he miscalculated and didn’t realize mom was right outside the room and she saw his whole act. Vindi-CATION!!!!!

Behaviour of interest: pretending to have an injury caused by me

Consequence: going to his room (prevented)

Future Frequency? More likely to do it in future because it was super successful (for a while!)

One more: Erin had a really obnoxious psychology stats professor. On the first day he sat all the students in alphabetical order. He would then teach a difficult concept, put up a question for them to try, call on a specific person to give a detailed answer after 2 minutes, and then proceed to ridicule them in front of the whole class for not getting it right.
Erin dreaded going to that class because each class she got called on at least once and was made to feel stupid for not getting the question right. The thought of even going to class was really stressful. So Erin started skipping class. She met with a study group of her peers, instead, and only went on exam days. Interestingly, this class started off with 110 members, which by midterm was down to 40, and by the end down to 12. Almost the entire class was learning the material from each other, outside of class.

So…what’s happening here? If Erin goes to class she is almost certainly ridiculed in front of her peers. So, by skipping class, she prevents the ridicule from occurring. It’s so successful that she ends up skipping the rest of the term and only going for exams.

Behaviour of Interest: skipping class

Consequence: ridicule (prevented)

Future Frequency? More likely to skip because it successfully prevents the ridicule.

Okay, so at this point, it would be good to talk about the difference between escape and avoidance because sometimes that can be confusing. With escape, the unpleasant stimulus is already occurring and your behaviour subtracts it from the environment. With avoidance, the unpleasant stimulus is prevented from occurring at all due to your behaviour. So in the above example, if Erin was already in class and being ridiculed and then she walked out of class, that would be an example of escape.

So…those are the three operant conditioning procedures that increase behaviour. Sometimes, in our interactions with others, different types of reinforcers are maintaining each person’s behaviour. We’ll talk about this next…it’s called the Sick Social Cycle.

Sick Social Cycle

You may have noticed, as you worked through examples of positive reinforcement and negative reinforcement, that the other person’s behaviour in the example wasn’t analyzed. The reason? I wanted to keep things simple as you learned the concepts. In many examples where there is some kind of conflict (for example, Adam and his father; you arguing with your friend) one person’s behaviour is reinforced by positive reinforcement and the other person’s behaviour is reinforced by escape. This particular combination sets up a rather unfortunate set of interactions that are difficult to get out of, because both parties are reinforced for their behaviours in the interaction. This pattern is known as the Sick Social Cycle.

Let’s revisit Adam and his Dad…

In our previous analysis we established that Adam’s behaviour of throwing his Dad against the wall was positively reinforced by the addition of the keys contingent on that behaviour. What about Dad’s behaviour?

Behaviour of Interest: handing over keys

Consequence: being held painfully against wall (subtracted)

Future Frequency? More likely to hand over keys in that situation in future

= Escape

Notice how in the interaction, one person’s behaviour becomes the stimulus for the other:

Adam: B (threaten) ------------------------- C(+) (keys)

Dad: B (keys) ------------------------- C(-) (threaten)

Let’s look at another example: You were arguing with a family member and eventually gave in, right? Let’s look at both parties’ behaviour:

You: B (“you’re right”) ------------------------- C(-) (arguing)

Family Member: B (arguing) ------------------------- C(+) (“you’re right”)

So, when you are arguing, you giving in and saying “you’re right” results in the argument stopping. It’s removed, so if you use this strategy again in the future, it fits the definition of escape. For your family member, continuing to argue is followed by hearing you say “you’re right.” This is added to their environment and so if they use this strategy again in future, it fits the definition of positive reinforcement. Together, this interaction is a sick social cycle.

One more example…

We have two very cute but sassy dogs named Nula and Balla. Balla is a big belly rub fan. When she was younger, she would flip on her back and my daughter would give her a long, happy belly rub. But when my daughter would stop, Balla would jump up and “hit” my daughter with her paw. She would continue to “hit” with increasing force, until my daughter would start giving her belly rubs again, at which point she’d lie down and calmly receive the belly rub.

Daughter

Behaviour of interest: start belly rubbing again

Consequence: hitting (stopped)

Future Frequency? More likely to do it in future to stop Balla’s incessant hitting

= escape

Balla

Behaviour of interest: hitting

Consequence: belly rubs (added)

Future Frequency? More likely to do it in future because it leads to belly rubs!

= pawsitive reinforcement (Ah? Ah? See what I did there?)

So, for a sick social cycle, there is an interaction between two organisms. The behaviour of one is the stimulus for the other. One is reinforced through positive reinforcement and the other through escape. We’ll talk about how to break the sick social cycle next time.