Golden Rule Training: Correct Use of Positive Reinforcement

There is a misconception that positive reinforcement is simply using food in training. “My dog isn’t food motivated” or “What if he isn’t hungry?” are common remarks when discussing positive training. Some people believe that a dog cannot be trained to recall off chasing a squirrel through positive-only methods because chasing that squirrel is the most reinforcing thing for the dog, so he will ignore the chance to earn a mere food reward. When a dog fails to heed a command for a food reward, owners often resort to punishment such as shock collars, choke collars or other aversive methods. Many trainers feel that moving to punishment tools is justified since the positive method has ‘failed’.

I pride myself on being a ‘positive trainer’ in that I will not use shocks, prong collars, chokes or physical force in my training. There is a myth that positive training does not get reliable results or that some dogs cannot be trained with it. Positive methods are sound and will work with any dog, but you must deploy them skillfully.

Full dog approach

Knowing what behavior change you want in a dog isn’t enough. A full history of the dog and details of his living conditions are required so you can understand what is influencing the behavior. A dog that is jumping or barking needs to be trained to be calm and quiet. However, if the dog lacks exercise, is isolated or excessively confined, or is subjected to inconsistent training (maybe a doggie daycare he goes to encourages jumping and barking!), then the picture starts to get clearer as to why this dog is always hyperactive. All private training should start with a full profile of the dog’s routine.

Rather than force, shock or punish the dog for jumping, I’ll look at ways to improve his living conditions so hyper behavior decreases. Stuffing food in a chew toy for example reinforces calm behavior. While punishment may reduce the behavior, the underlying cause of the behavior is still there. Without addressing the influences, the animal may manifest those emotions in other undesirable behaviors later on. Now we are ready to train.

Classical Conditioning is on your side

Classical conditioning is happening all the time. By pairing positive reinforcement with learning, the dog will enjoy training instead of simply tolerating it to avoid punishments. When you are consistently pairing training with reward, the dog starts to see you, the environment and training as enjoyable. Training is then always viewed as a fun and positive experience. By using positive reinforcement you can use classical conditioning to make things your dog may fear or dislike into predictors of reward. This will change the dog’s emotional state and have an impact on his behavior. Punishment training can classically condition your dog in the opposite direction, to dislike whatever was around him at the time of the punishment – and prompt further aggressive or fearful behavior later.

It’s not all about food

Positive reinforcement is not just about feeding. It’s critical to vary your rewards and use what is reinforcing for that dog in that moment: an open door, a ball toss, a sniff of a pole. Positive training is about putting what the dog wants under your control. A dog that is rewarded with liver treats every time will blow that reward off should he favor something else. Positive trainers know the overall message that needs to be clear is “I control what you want, listen to me first”. When the dog understands this message, he is happy to perform the behavior, provided he has actually been ‘trained’ to a high enough level.

Positive training involves keeping the environment under temporary control until we build a strong habit so the dog will listen to our cue because he has been conditioned to do so in order to access environmental reinforcement.

Training for success

Learning follows 4 distinct phases:

Acquisition: During this phase, dogs are given continuous reinforcement in order to build a new behavior. We also control the environment by starting training in a distraction-free setting so the dog will focus on training. It’s also important that the dog has been trained in basics such as attention before moving on to a difficult behavior like heeling.

Fluency: The duration of a behavior is increased, and small distractions are slowly added. The distractions remain under our control. Trainer Jean Donaldson popularized a method called “push, drop, stick”. Five repetitions of a behavior are performed at a given level of distraction and difficulty. If the dog performs the behavior correctly all 5 times, difficulty is increased (push). If the dog performs the behavior only 3-4 times, the exercise is repeated at the same level (stick). If the dog fails 3 or more times, you drop the difficulty to a level at which he will succeed (drop). The behavior-to-reward ratio should also be raised to 2:1 and then slowly increased until the dog is performing the behavior several times for 1 reward.

Generalization: The behavior is practiced in different areas with increasing levels of distraction. Build the behavior slowly and with only one increase in difficulty at a time. You would not build a 2 minute stay in a house, then expect a 5 minute stay at a dog park. This is just as unreasonable as expecting your son to make a professional sports league after a couple of practices at the local arena. During this phase rewards are greatly reduced since the behavior is lasting longer and the difficulty is higher. Variable schedules of reinforcement are introduced so the dog is randomly rewarded for correct behaviors, and thus will repeat them since he doesn’t know when the reward will come. Habits are now being formed.

Maintenance: In the final phase you have trained in a number of areas, built up a strong behavior, and now just need to ensure the dog has a chance to practice occasionally. Differential reinforcement is critical at this stage, so the dog only receives a reward for the best and quickest manifestations of the behavior and thus it continues to improve.
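
The “push, drop, stick” rule described under Fluency can be written down as a small decision function. The Python sketch below is only an illustration of the thresholds given above; the function name, the numeric difficulty levels and the choice to drop by exactly one level are assumptions made for the example, not part of Jean Donaldson’s method as described here.

```python
TRIALS_PER_SET = 5  # the text uses sets of five repetitions


def push_drop_stick(successes: int, level: int, lowest_level: int = 0) -> int:
    """Return the difficulty level to use for the next set of five trials.

    `level` is a hypothetical integer scale where a higher number means
    more distraction or difficulty.
    """
    if not 0 <= successes <= TRIALS_PER_SET:
        raise ValueError("successes must be between 0 and 5")
    if successes == TRIALS_PER_SET:
        return level + 1  # push: 5 out of 5, make it harder
    if successes >= 3:
        return level  # stick: 3-4 out of 5, repeat at the same level
    # drop: 2 or fewer correct; dropping by one level is an assumption,
    # the text only says to drop to a level the dog will succeed at
    return max(lowest_level, level - 1)
```

For example, a dog working at level 2 who gets 4 out of 5 correct stays at level 2, while 5 out of 5 moves him to level 3 and 2 out of 5 moves him back to level 1.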

Premack Exercises

It’s critical the dog learns that ignoring reinforcement and listening to instruction first is the path to reinforcement. Premack's Principle suggests that if a person wants to perform a given activity, the person will perform a less desirable activity to get at the more desirable activity. You must change a dog’s perception about how to obtain environmental reinforcement.

Set up a partner who tempts the dog with food in a closed hand. Have the handler recall the dog away from the food. Once the dog listens, the partner runs over and gives the reward. Increasing difficulty with an exercise like this can get a dog running past open food to reach the foodless handler time and time again. If you’re on a walk with your dog, stop and ask for a sit. He will learn that once he sits, the walk resumes, thus reinforcing the sit. Often the dog will sit immediately on the first cue after just one or two walks.

My method of teaching ‘leave it’ involves leashing the dog and tossing food out of his reach while saying ‘leave it’. Once the dog stops straining on leash at the food and looks at me, he is rewarded – sometimes by me, or sometimes by being allowed to reach the food. This teaches a loose leash towards reward, focus on me, and a “Leave it” behavior that brings his focus back on me and away from what he just saw being thrown. After a while you can remove the leash because the behavior has now become a habit. This method is much more pleasant for a dog than the old traditional style of snapping the leash when a dog went for the forbidden item.

Repetitive re-instruction with negative reinforcement

This is an idea that Ian Dunbar has used for many years. If a dog has had a high level of training yet fails to heed a command, the command is given again and the dog is restricted from gaining any reinforcement until he listens. It’s critical that a dog obey a command once it is given, but this is something that is trained and practiced, not forced through painful or scary punishment.

If a dog fails to sit at the door, repeat the instruction. Since the door will not open unless the dog listens, he will eventually comply to get what he wants. There is nothing wrong with repeating the cue if you feel the dog does in fact know the cue (based on previously sound training) and he is choosing to ignore it because he is focused on the reinforcement (open the door already!). Once he learns the door will not open until there is compliance, he will obey. Thus there is negative reinforcement through the relief of stress and frustration when the door opens (oh, I have to listen to YOU when I want things!). Similar exercises can be set up with chasing prey and playing with dogs.

When this principle is applied consistently, the dog learns to obey the command the first time in order to cut down on the delay of reinforcement.

Ian Dunbar says about this method:

“The secret to success is to never give up. The dog learns that she has to sit following a single command before being allowed to play once more. This technique is extremely effective, works surprisingly quickly, and prevents the need for physical restraint or aversive punishment.”

When a dog doesn’t listen

Before simply doling out punishment, review the above. Is the dog’s lifestyle a problem? Did you follow the 4 phases of learning correctly, or are you asking a dog that just learned stay in the house to now do the behavior on a busy street? Does the dog have a good history of Premack training and handler attention? Have you insisted on commands in the past, or given up and let the dog run off? Did you control the environment and build a habit before allowing the dog more freedom of choice?

Positive training will work for all dogs, but it depends on the skill of the trainer. A successful positive trainer does more than simply toss out cookies!

Professional athletes train every day and for many years to reach a high level. Your dog is your best friend. Take the time and effort to train him how you would like to be trained if he were holding the leash.

Sunday, December 4, 2011