Shaping: Gradual Graduation!
Shaping is used to some degree whenever you apply operant conditioning, and learning how shaping works will give you a wonderful tool in your training tool belt.
In this article, we’re going to review what shaping is, and how to use shaping in your training sessions.
What is Shaping?
The exact technical definition of shaping is “the reinforcement of successive approximations toward a target behavior.”
While that sounds complex, let’s translate it into a more digestible form. Shaping happens when we reward behaviors that come close to what we want. We then reward the behavior for getting closer and closer to what we want until it becomes what we want.
Alright, I will admit, that still sounds a little complex. But this entire idea works off of Successive Approximations. Successive Approximations are behaviors that are close or similar to the behavior we’re looking for, but are still not exactly it. Think of Successive Approximations as the steps that lead to a behavior. All we do in shaping is provide a reinforcer for each of these steps.
To put it simply, Shaping with Successive Approximations is the scientific version of the “Hot/Cold” game – you’re telling your subject “hot” whenever they get closer to the behavior you want, and “cold” when they are not getting closer.
I think an example will make all of this a little bit more clear. But before we get to that…
Breaking Down the Behavior
This is personal experience speaking, but before you use shaping to teach a behavior, I recommend creating a breakdown of the behavior.
So let’s say we’re teaching a child to brush their teeth. Ask yourself, what are the different parts to the behavior of “brushing your teeth?”
An analysis in this scenario would look something like this:
- Pick up toothbrush
- Wet brush (if you don’t, then you’re a monster)
- Put toothbrush in mouth
- Move toothbrush in gentle circles around teeth and gums
- Remove toothbrush from mouth
- Rinse mouth and spit
Breaking down behaviors into small steps like this will give you a checklist of all the successive approximations you need your subject to do before they perform the behavior you want.
If there is ever a time where your subject doesn’t understand what you are reinforcing for, then try further breaking down each task. In this brushing example, if your child is having difficulty picking up the brush, then you can break the steps into moving their hand toward the brush, then holding the brush in their hands, etc.
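If it helps to see the breakdown written out, here is a rough Python sketch of the checklist idea. The step names come from the brushing list above; the `break_down` helper and the wording of the smaller steps are my own hypothetical names for illustration, not part of any formal method.

```python
# Checklist for "brushing your teeth", copied from the list above.
steps = [
    "Pick up toothbrush",
    "Wet brush",
    "Put toothbrush in mouth",
    "Move toothbrush in gentle circles around teeth and gums",
    "Remove toothbrush from mouth",
    "Rinse mouth and spit",
]


def break_down(checklist, hard_step, smaller_steps):
    """Return a new checklist with hard_step replaced by smaller_steps.

    Hypothetical helper: it only models "break the task down further".
    """
    result = []
    for step in checklist:
        if step == hard_step:
            result.extend(smaller_steps)  # swap in the finer-grained steps
        else:
            result.append(step)
    return result


# The child struggles with picking up the brush, so split that one step.
finer_steps = break_down(
    steps,
    "Pick up toothbrush",
    ["Move hand toward toothbrush", "Hold toothbrush in hand"],
)

for number, step in enumerate(finer_steps, start=1):
    print(number, step)
```

The original list stays untouched, so you can always go back to the coarser version once the smaller steps are no longer needed.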
Now that we understand this kind of behavior analysis, we can move on to the example of Shaping.
Let’s say we’re training a dog to pick up their water bowl and place it in the dishwasher.
To use shaping, we first start by placing the adorable furball in the middle of the kitchen and watching how they react.
Now, in this scenario, most dogs are going to pace around or start sniffing everything. The next step is, whenever the dog goes anywhere near or even looks at the bowl, then we click and give a reinforcer. Most methods I’ve learned will have you reset your dog by putting them in the original position and then watching how they react. This technique ensures that your animal understands what exactly is giving them a reinforcer. I’ll go more into this in a moment.
Looking at the bowl, or even taking a step toward the bowl in this example would be known as a successive approximation – it’s close to the behavior we want, but it’s not exactly it since we want the dog to pick up the bowl. Once the dog learns that every time it goes near or toward the dog bowl, it gets a reward, then next time, the dog will immediately go toward the dog bowl for a reinforcer.
After each time you give a reinforcer, you want to make sure you reward the next step in the behavior. The first few rewards will be for walking up to the bowl and getting closer to the bowl. The next step would be rewarding each time the dog either gets closer to the bowl or when it lowers its head to the bowl.
Returning to resetting your animal’s position: I know several trainers who, once the dog has been treated for a successive approximation, would stop the dog and place them in the middle of the room again. Why do this? Because it’s easier for your dog to understand that it’s the act of moving toward the bowl AND NOTHING ELSE that will get a reward. If you’ve owned a dog before or tried dog training yourself, you’ll know that most dogs will throw a million behaviors at you, and when you reinforce behaviors, you need to make it as clear as possible to your subject which behavior you want.
Once the dog reaches the bowl, the slightly tricky part is going to be reinforcing it when the dog puts the bowl in its mouth. This is where an analysis of the behavior comes in handy. Break down the behavior your dog has to do in order to put the bowl in their mouth or, if your dog figured out what you want fairly quickly, what behavior your dog has to do to put the bowl in the dishwasher. This is also where you reward for each of those gradual steps being completed.
In this exact case, it’s been my experience that most dogs will meander over to the bowl (since they learned that we’re reinforcing for walking to the bowl) and then look back at us, unsure what to do. When they realize we’re not clicking and treating, then they will be confused and try new behaviors in an attempt to get us to click.
When your dog is throwing out new behaviors, make sure you click for each successive approximation your dog performs. Ask yourself questions like, “Does my dog need to learn that every time it lowers its head toward the bowl, it’ll get a treat?” Or, “Does my dog need to learn that every time it touches the bowl, it will result in a treat?”
Keep an eye on your dog in this scenario, and go at the pace your dog is learning at. But remember, if your dog isn’t getting it, break down the behavior more or go back to a step your dog already knows.
In this case, you would reward once your dog touches the bowl. Then, reward again when it opens its mouth near the bowl, and finally, reward when your dog picks the bowl up in its mouth.
If you wanted to get your dog to put the bowl into the dishwasher, then you would just continue reinforcing the successive approximations (carrying the bowl, walking to the dishwasher, and so on) until your dog has reached the final behavior of placing the bowl inside.
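For readers who think in code, the bowl example can be sketched as a small loop: click anything that meets the current step or goes beyond it, ignore everything else, and raise the bar after a few clicks. This is only a toy model under my own assumptions. The `APPROXIMATIONS` list, the `run_session` function, and especially the `clicks_needed` number are hypothetical; real dogs don’t move on after a fixed count.

```python
# Ordered steps toward "pick up the bowl", taken from the example above.
APPROXIMATIONS = [
    "looks at bowl",
    "steps toward bowl",
    "lowers head to bowl",
    "touches bowl",
    "opens mouth near bowl",
    "picks up bowl",
]


def run_session(observed, clicks_needed=2):
    """Return (clicks, criterion) after a sequence of offered behaviors.

    observed: behavior names in the order the dog offered them.
    clicks_needed: assumed number of clicks at one step before we
        start asking for the next step (a made-up training parameter).
    """
    rank = {name: i for i, name in enumerate(APPROXIMATIONS)}
    criterion = 0        # index of the step we currently require
    clicks_at_step = 0   # clicks given since the bar was last raised
    clicks = []
    for behavior in observed:
        # Click only for behaviors at or beyond the current step.
        if behavior in rank and rank[behavior] >= criterion:
            clicks.append(behavior)  # click, and always follow with a treat
            clicks_at_step += 1
            last_step = len(APPROXIMATIONS) - 1
            if clicks_at_step == clicks_needed and criterion < last_step:
                criterion += 1       # raise the bar to the next step
                clicks_at_step = 0
        # Anything else (barking, twirling, old steps) gets no click.
    return clicks, criterion


offered = [
    "sniffs floor", "looks at bowl", "barks", "looks at bowl",
    "looks at bowl", "steps toward bowl", "steps toward bowl",
    "lowers head to bowl",
]
clicks, criterion = run_session(offered)
print(clicks)
print("Now waiting for:", APPROXIMATIONS[criterion])
```

Notice that the third “looks at bowl” earns nothing: once the bar has moved to stepping toward the bowl, the earlier step is no longer reinforced, which is exactly the moment the dog looks back at you and starts trying new things.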
I love shaping, but in practice, shaping is not as easy as it sounds. This is just as much of an art form as it is a science.
When you’re training dogs, you’ll notice that dogs throw out a million and one behaviors for you to try and click. During this time, you need to click at just the right time for the right behavior so your dog understands what he or she needs to be doing.
However, when dogs throw out a million and one behaviors, you’re going to find that it’s rather difficult to time the click for that one behavior you want. Back in our example, when we put our dog into the middle of the kitchen, we’re waiting to click when our dog steps toward its bowl. Totally reasonable. But what is more likely to happen is that our dog is going to step toward the bowl and then bark, or step toward the bowl and then sit, or step toward the bowl and twirl. If you click during any of those times, then your dog has to figure out what is being reinforced, the step or the other behavior. For all our dog knows, we just want them to stand in one spot and turn into a canine hurricane (this has happened to me once).
During this time, I find it easier to have done some prep work first. If I trained my dog to move toward something I point at, then it’ll be easier for me to train them into walking without other behaviors getting in the way. For example, if I was trying to shape a new behavior, and I had trained my dog beforehand to walk toward where I’m pointing, then I could just point at the bowl and my dog would walk up to it. I would still have to be patient and follow the normal ideas behind shaping (rewarding approximate behaviors, etc.) but this makes the entire process a bit easier.
However, for this scenario where I hadn’t done any other training, the best I can do is gradually teach my dog which specific behavior is going to result in a reward. The most important part here is that you, as a dog owner, must be patient and consistent. You’re not always going to click at the right time, and you’re not always going to reinforce the right behavior. Shaping takes a lot of practice for both you and the subject you’re training. Be patient with yourself as you learn how to make this process work for you, and be patient with your subject if they don’t get it, or get taught the wrong thing during the training session.
I mentioned this earlier, but I think it’s worth talking about again. Misclicks are going to happen. Your dog looks at the bowl or starts walking toward it, and just as you start your click, they turn around and get interested in something else. The click goes off, and your dog thinks that “something else” is what got them their click.
During these times, be aware that you’ve made a mistake, but never break the click-treat rule – you clicked, your animal gets a treat. This rule must be maintained at all times.
After a misclick, let your animal figure out that they’re on the wrong path, and then reward each time they go near your target. If you’ve made too many misclicks and your target is movable, I’ve found that moving to a new (empty) setting and bringing the target with you can make this a bit easier.
Shaping is going to be one of the most useful tools you can ever have in your training tool belt. Teaching both people and animals through successive approximations will force you to stay focused on every behavior your subject is doing.
Most importantly, shaping will teach you how best to analyze behaviors and how best to teach behaviors. Being able to break down complex behaviors and figuring out when to reward your subject is a skill that can apply to both animals and potential consumers.
Thanks for reading, and thank you again for being patient with these blog posts.