Sniffy Assignments - 2) Shaping


Introduction
You can probably skip or skim over this introduction if you already understand shaping.

The purpose of this assignment is to give you some experience shaping (reinforcing successive approximations to a target response).  The following is Catania's (1998) definition of shaping from his book Learning:

Shaping: gradually modifying some property of responding (often but not necessarily topography) by differentially reinforcing successive approximations to a target operant class.  Shaping is used to produce responses that, because of low operant levels and / or complexity, might not otherwise be emitted or might be emitted only after a considerable time.  The variability of the responding that follows reinforcement usually provides opportunities for reinforcing further responses that still more closely approximate the criteria that define the target  operant class. (p. 410)

For this assignment, you are going to teach Sniffy to press the lever.  So, using Catania's terminology, we would say that the target response you're going to shape up is lever pressing. If you've played around with the Sniffy software a bit, you may have noticed that he can already press the lever and does so about once every five minutes.  There might be occasions when you want a particular organism to emit a response but you don't feel like waiting around forever.  One solution is to use shaping, a process of reinforcing closer and closer approximations of the target response.

In order to better understand shaping, it might be important to define what I mean by reinforcement.

Reinforcement:  (this definition contains three important components)
  • an event or change in the environment
  • that occurs following a response
  • that also increases the future probability of similar responses

That's a lot of information to digest.  You might ask, why does reinforcement work?  That's not important; well, it's not important to me at this moment, so don't worry about it.  You might also ask, how do you know if something is a reinforcer?  To answer that question, we would have to deliver the candidate event following a response (so that it changes the organism's environment) and then check whether that response occurs more often in the future.  The definition does NOT specify in advance which things can or cannot be reinforcers.
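As a rough illustration of that test, here is a minimal Python sketch.  It is not part of the Sniffy software; the function names, the made-up response times, and the simple before/after rate comparison are all assumptions chosen for illustration.

```python
def response_rate(response_times, start, end):
    """Responses per minute within the window [start, end); times in minutes."""
    count = sum(1 for t in response_times if start <= t < end)
    return count / (end - start)


def looks_like_reinforcer(response_times, contingency_start, session_end):
    """Compare the response rate before and after the event began
    following each response (a deliberately simple criterion)."""
    baseline = response_rate(response_times, 0, contingency_start)
    after = response_rate(response_times, contingency_start, session_end)
    return after > baseline


# Hypothetical data: minutes at which the response occurred.
# The candidate event starts following each response at minute 10.
times = [4.0, 9.0, 11.0, 12.5, 13.0, 14.2, 15.1, 16.0, 17.3, 18.8]
print(looks_like_reinforcer(times, contingency_start=10, session_end=20))
```

In this made-up session the rate goes from 2 responses in the first 10 minutes to 8 in the next 10, so the event would count as a reinforcer for this response.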

Now back to shaping.  In general, shaping involves at least two phases.  The first phase is magazine training.  The purpose of the first phase is to ensure that when we deliver reinforcement, we make it as close in time to the response that we're trying to reinforce as possible.  For Sniffy, and most rats that are first placed in operant chambers, this process involves simply delivering food over and over again until the organism learns to approach and consume reinforcers as soon as they are delivered.  Once this happens reliably, you can move on to phase two.

Rule number one:  always work with what the organism gives you.  Now, you could just wait until Sniffy presses the lever, but he might never press it, or he might press it so infrequently that lever pressing never successfully competes with his exploratory or grooming behavior.  So instead, you've got to pick something that Sniffy does without your help.  In addition to occasionally pressing the lever, Sniffy grooms, rears on his hind legs, explores the cage, and generally wanders about aimlessly.

We'll start out with a simple description of a dependency between responses and subsequent reinforcement.  For example, every time Sniffy rears on his hind legs we'll deliver reinforcement (by clicking on the lever ourselves with the mouse button).  Notice that we didn't say anything about where Sniffy needs to rear.  In fact, he can rear anywhere and receive reinforcement.

Once you have Sniffy rearing fairly often, you can change the dependency to now include a location.  Changing the dependency, or better put, the descriptive operant, introduces another very important term, Extinction.

Extinction:  (this definition usually refers to one of two things)
  • a change in the dependency between responses and reinforcement such that at least some of the responses that were previously reinforced no longer produce reinforcement
  • a reduction in response rate due to the above change in dependency

When you change the descriptive operant from rearing anywhere in the cage to rearing that occurs near the lever, you're implementing extinction for any rearing that occurs away from the lever.  So, because those responses are contacting extinction, two things should happen.  One, those responses should extinguish (the second part of our definition of extinction; they should occur less often).  Two, the organism's behavior should become more variable.  Now, inevitably, while the organism is emitting these 'newer' responses, one will occur that is a little closer to lever pressing than the behavior you were just reinforcing.  In this example, Sniffy should be more likely to rear near the lever.

Shaping relies on the interactions between the effects of reinforcement (strengthening behavior) and extinction (increasing the variability of behavior).  You might be able to guess how one might get Sniffy to press the lever.  Once you've got him rearing toward the front of the cage consistently, you could reinforce only rearing that occurs directly over the lever.  Soon, Sniffy should start to press the lever on his own, quite frequently.

So the general idea is to reinforce something that looks like what we want the organism to eventually do and then when we see it occurring often, we pick something else that looks even more like the target response, until the organism eventually engages in the target response.
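The general idea above can be sketched in Python.  This is only a toy model, not how Sniffy works internally: the criteria list, the (action, near_lever) encoding of behavior, and the rule "tighten the criterion after three reinforcers" are all hypothetical choices made for illustration.

```python
# Successive approximations, loosest first.  Each is a hypothetical
# predicate over an observed behavior recorded as (action, near_lever).
CRITERIA = [
    ("rear anywhere", lambda action, near: action == "rear"),
    ("rear near the lever", lambda action, near: action == "rear" and near),
    ("press the lever", lambda action, near: action == "press"),
]


def shape(observations, advance_after=3):
    """Reinforce behavior that meets the current criterion, and tighten the
    criterion once `advance_after` reinforcers have been earned under it."""
    step = 0      # index of the criterion currently in force
    earned = 0    # reinforcers earned under the current criterion
    log = []
    for action, near in observations:
        name, meets = CRITERIA[step]
        reinforced = meets(action, near)
        log.append((action, near, name, reinforced))
        if reinforced:
            earned += 1
            if earned >= advance_after and step < len(CRITERIA) - 1:
                step += 1
                earned = 0
    return log, CRITERIA[step][0]


# Made-up sequence of behavior, in the order it was observed.
observations = [
    ("groom", False), ("rear", False), ("rear", True), ("rear", False),
    ("rear", False), ("rear", True), ("rear", True), ("rear", True),
    ("press", True),
]
log, final_criterion = shape(observations)
for entry in log:
    print(entry)
print(final_criterion)
```

Notice the fifth observation: rearing away from the lever was reinforced earlier, but once the criterion tightens to "rear near the lever" the same response no longer produces reinforcement.  That is the extinction half of shaping.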

To get a better idea of how to specifically shape Sniffy's behavior you can read pages 20-26 in Sniffy's book.  This assignment is similar, though not identical, to Exercises 1 and 2 from that section.

Instructions
1.  Start Sniffy Lite.

2.  Close the "Operant Associations" window.  It's the one residing in the upper right-hand corner of the screen and is highlighted in the picture below.  Email your TA if you want to know why he thinks you should close this window.

Figure 1.  The Operant Associations window.

3.  Sit back, relax, watch Sniffy scurry about, and get some ideas about responses that you could reinforce that also look something like our target response (lever pressing).  The instructions that came with Sniffy suggest that you choose rearing (p. 24) but, if you're feeling adventurous, feel free to pick something else.  It probably doesn't matter what you choose, as long as it resembles the target response (or a part of the target response).  For example, you might try to reinforce movement toward the lever.

4.  Reinforce those responses as soon as you see them occur.  If you've chosen to reinforce rearing, as soon as you see Sniffy rear up, deliver a reinforcer by clicking on the lever/bar in the center of the chamber wall.  There's a diagram of the chamber on page 12 if you get stuck and can't find the lever.  It's okay if Sniffy doesn't run to the hopper when you first deliver a reinforcer; he will after several tries.

5.  Continue to deliver reinforcers when you see Sniffy engage in your chosen 'approximation' until it begins to occur frequently.  How frequent is frequently?  There is no straight answer.  For a discussion of this topic, check out the section on shaping found on page 24 of the Sniffy book.  Generally, I keep the same criterion and at the same time I keep an eye open for occurrences of what could be the next criterion.  For example, if I am currently reinforcing rearing that happens anywhere in the chamber, and I see Sniffy rear up in the vicinity of the lever a few times, I might then start reinforcing only rearing that happens near the lever.  It's all pretty subjective.  Just pick something, and stick with it.  If you picked something but you haven't seen it happen in the last 10 minutes, go ahead and change it to something that happens more often.

6.  Repeat step 5 until Sniffy is pressing the lever consistently.  How consistent is consistent?  I'm glad you asked!  If you haven't noticed, there's a window near the bottom of the screen labeled "Cumulative Record : 1" (check out the picture below).  This is a cumulative record.  Check out page 27 in Sniffy's book for a better description than the one I'm about to give you.  Basically, each time Sniffy presses the lever, the line moves up.  If you just let Sniffy play by himself without shaping up any lever pressing, you'll get a pretty flat line (indicating low rates of responding).  Reinforcer deliveries are marked on the cumulative record by what's called a pip (a slash).  When you start Sniffy's program, it's set up to deliver reinforcers whenever Sniffy presses the lever (that's good if we're trying to teach him to press the lever).  Figure 3 (below) shows what the cumulative record might look like after a lot of training.  When Sniffy has emitted 75 lever presses, the cumulative record resets and the line moves back to the bottom of the record.  Three resets can be seen in Figure 4.  Figure 4 also shows what the cumulative record will look like when you're finished.

A screenshot of Sniffy
Figure 2.  The Cumulative Record window.

This is what it looks like when you're almost done.
Figure 3.  After a lot of training.

Figure 4.  It should look something like this when you're done.

7.  You're done when Sniffy has emitted enough responses for the record to have reset at least twice.

8.  Save your data file!!! You'll need to use it for just about every future Sniffy assignment.
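If it helps to see the arithmetic behind steps 6 and 7, here is a small Python sketch of the cumulative record's reset rule.  The function names are made up, and it assumes the line returns to the bottom exactly on the 75th press; it is not taken from the Sniffy program itself.

```python
RESET_AT = 75  # step 6: the record resets after 75 lever presses


def cumulative_record(total_presses, reset_at=RESET_AT):
    """Return (resets, line_height) after `total_presses` lever presses."""
    return divmod(total_presses, reset_at)


def finished(total_presses, required_resets=2):
    """Step 7: done once the record has reset at least twice."""
    resets, _ = cumulative_record(total_presses)
    return resets >= required_resets


print(cumulative_record(149))  # (1, 74)
print(finished(149))           # False
print(finished(150))           # True
```

Under that assumption, two resets means Sniffy has pressed the lever at least 150 times.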

Assessment
Print out your cumulative record:
1.  Click on the title of the cumulative record window.
2.  Click once on "File" in the menu at the top of the screen.
3.  Then click on "Print Window."
4.  Write your name and the date on the printout.
5.  Turn in your cumulative record at the beginning of class.