Tarquin the Terrible: Punishment Effectiveness - Operant Conditioning, Positive Reinforcement, Negative and Positive Punishments

Operant conditioning (sometimes referred to as instrumental conditioning) is a method of learning that occurs through rewards and punishments for behavior. Through operant conditioning, an association is made between a behavior and a consequence for that behavior.
For example, when a lab rat presses a blue button, he receives a food pellet as a reward, but when he presses the red button he receives a mild electric shock. As a result, he learns to press the blue button but avoid the red button.

The History of Operant Conditioning

The term operant conditioning was coined by behaviorist B.F. Skinner, which is why you may occasionally hear it referred to as Skinnerian conditioning. As a behaviorist, Skinner believed that it was not really necessary to look at internal thoughts and motivations in order to explain behavior. Instead, he suggested, we should look only at the external, observable causes of human behavior.
Through the first part of the 20th century, behaviorism became a major force within psychology.

The ideas of John B. Watson dominated this school of thought early on. Watson focused on the principles of classical conditioning, once famously suggesting that he could take any person regardless of their background and train them to be anything he chose.

Where the early behaviorists had focused their interests on associative learning, Skinner was more interested in how the consequences of people's actions influenced their behavior.
Skinner used the term operant to refer to any "active behavior that operates upon the environment to generate consequences" (1953). In other words, Skinner's theory explained how we acquire the range of learned behaviors we exhibit each and every day.
His theory was heavily influenced by the work of psychologist Edward Thorndike, who had proposed what he called the law of effect. According to this principle, actions that are followed by desirable outcomes are more likely to be repeated while those followed by undesirable outcomes are less likely to be repeated.
Operant conditioning relies on a fairly simple premise: actions that are followed by reinforcement will be strengthened and more likely to occur again in the future. If you tell a funny story in class and everybody laughs, you will probably be more likely to tell that story again in the future.
Conversely, actions that result in punishment or undesirable consequences will be weakened and less likely to occur again in the future. If you tell the same story again in another class but nobody laughs this time, you will be less likely to repeat the story again in the future.
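One way to picture this premise is a toy model in which each consequence nudges the likelihood of repeating an action up or down. The sketch below is only an illustration: the function name `update_tendency`, the starting value of 0.5 and the step size of 0.1 are assumptions made for the example, not figures from Skinner or Thorndike.

```python
def update_tendency(tendency, consequence, step=0.1):
    """Return the new likelihood (0.0 to 1.0) of repeating an action.

    consequence: "reinforced" strengthens the action, "punished" weakens it,
    and anything else leaves it unchanged. The step size is an arbitrary
    illustrative value.
    """
    if consequence == "reinforced":
        tendency += step
    elif consequence == "punished":
        tendency -= step
    # Keep the value inside the valid probability range.
    return min(1.0, max(0.0, tendency))


# The funny story: the first class laughs, the second class does not.
story = 0.5                                   # hypothetical starting likelihood
after_laughter = update_tendency(story, "reinforced")
after_silence = update_tendency(after_laughter, "punished")
print(after_laughter > story, after_silence < after_laughter)
```

Real learning is not this tidy, but the sketch captures the direction of change the premise describes: desirable outcomes raise the likelihood of a repeat, and undesirable outcomes lower it.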

Types of Behaviors

Skinner distinguished between two different types of behaviors: respondent behaviors and operant behaviors. Respondent behaviors are those that occur automatically and reflexively, such as pulling your hand back from a hot stove or jerking your leg when the doctor taps on your knee. You don't have to learn these behaviors; they simply occur automatically and involuntarily.
Operant behaviors, on the other hand, are those under our conscious control. Some may occur spontaneously and others purposely, but it is the consequences of these actions that then influence whether or not they occur again in the future. Our actions on the environment and the consequences of those actions make up an important part of the learning process.
While classical conditioning could account for respondent behaviors, Skinner realized that it could not account for a great deal of learning. Instead, Skinner suggested that operant conditioning held far greater importance.
Skinner had invented many different devices during his boyhood and he put these skills to work during his studies on operant conditioning. He created a device known as an operant conditioning chamber, most often referred to today as a Skinner box. The chamber was essentially a box that could hold a small animal such as a rat or pigeon. The box also contained a bar or key that the animal could press in order to receive a reward.
In order to track responses, Skinner also developed a device known as a cumulative recorder. The device recorded responses as an upward movement of a line so that response rates could be read by looking at the slope of the line.
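The idea of reading a response rate from the slope of a cumulative line can be sketched in a few lines of code. The press counts below are hypothetical, and the function names `cumulative_record` and `slope` are made up for this illustration; this is not a model of Skinner's actual apparatus.

```python
from itertools import accumulate


def cumulative_record(responses_per_minute):
    """Running total of responses, like the rising line on the recorder."""
    return list(accumulate(responses_per_minute))


def slope(record, start, end):
    """Average response rate between two points on the record: rise over run."""
    return (record[end] - record[start]) / (end - start)


# Hypothetical session: lever presses counted in each one-minute interval.
presses = [1, 2, 2, 8, 9, 10]
record = cumulative_record(presses)
print(record)               # the cumulative line never goes down
print(slope(record, 0, 2))  # shallow slope: slow responding early on
print(slope(record, 2, 5))  # steep slope: fast responding later
```

A steeper stretch of the line means more responses per unit of time, which is why the slope can be read directly as a response rate.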

Components of Operant Conditioning

Some key concepts in operant conditioning:

Reinforcement

Reinforcement is any event that strengthens or increases the behavior it follows. There are two kinds of reinforcers:

Positive reinforcers are favorable events or outcomes that are presented after the behavior. In situations that reflect positive reinforcement, a response or behavior is strengthened by the addition of something, such as praise or a direct reward.
Negative reinforcers involve the removal of an unfavorable event or outcome after the display of a behavior. In these situations, a response is strengthened by the removal of something considered unpleasant.

In both of these cases of reinforcement, the behavior increases.

Punishment

Punishment, on the other hand, is any consequence that causes a decrease in the behavior it follows, whether an adverse event is presented or a favorable one is removed. There are two kinds of punishment:

Positive punishment, sometimes referred to as punishment by application, involves the presentation of an unfavorable event or outcome in order to weaken the response it follows.
Negative punishment, also known as punishment by removal, occurs when a favorable event or outcome is removed after a behavior occurs.

In both of these cases of punishment, the behavior decreases.
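The four terms above come down to two questions: was something added or removed after the behavior, and did the behavior then increase or decrease? As a rough memory aid, that two-by-two grid can be written as a small lookup. The function name `classify_consequence` and its string labels are assumptions made for this sketch.

```python
def classify_consequence(stimulus_change, behavior_change):
    """Name the operant procedure from two observations.

    stimulus_change: "added" or "removed" (what happened after the behavior)
    behavior_change: "increases" or "decreases" (the later effect on behavior)
    """
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behavior_change]
    return f"{sign} {kind}"


print(classify_consequence("added", "increases"))    # praise after a response
print(classify_consequence("removed", "increases"))  # something unpleasant is taken away
print(classify_consequence("added", "decreases"))    # an unfavorable event is presented
print(classify_consequence("removed", "decreases"))  # a favorable event is taken away
```

Note that "positive" and "negative" describe only whether a stimulus is added or removed; whether the procedure counts as reinforcement or punishment depends entirely on what happens to the behavior afterwards.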

Reinforcement Schedules

Skinner also found that when and how often behaviors were reinforced played a role in the speed and strength of acquisition. He identified several different schedules of reinforcement:

Continuous reinforcement involves delivering a reinforcement every time a response occurs. Learning tends to occur relatively quickly, yet the response rate is quite low. Extinction also occurs very quickly once reinforcement is halted.
Fixed-ratio schedules are a type of partial reinforcement. Responses are reinforced only after a specific number of responses have occurred. This typically leads to a fairly steady response rate.
Fixed-interval schedules are another form of partial reinforcement. Reinforcement occurs only after a certain interval of time has elapsed. Response rates remain fairly steady and start to increase as the reinforcement time draws near, but slow immediately after the reinforcement has been delivered.
Variable-ratio schedules are also a type of partial reinforcement. They involve reinforcing behavior after a varied number of responses. This leads to both a high response rate and slow extinction rates.
Variable-interval schedules are the final form of partial reinforcement Skinner described. This schedule involves delivering reinforcement after a variable amount of time has elapsed. This also tends to lead to a fast response rate and slow extinction rate.
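The five schedules differ only in the rule that decides whether a given response earns a reinforcer, which makes them easy to sketch as code. The example below is a simplified illustration under stated assumptions: the helper names `ratio_schedule` and `interval_schedule`, the ratio of 5, the 60-second interval and the random ranges are all made up for the example, and interval schedules are modelled as reinforcing the first response made after the waiting time has elapsed.

```python
import random


def ratio_schedule(next_requirement):
    """Reinforce once the required number of responses has been made."""
    state = {"count": 0, "required": next_requirement()}

    def respond(now):
        state["count"] += 1
        if state["count"] >= state["required"]:
            # Reinforcer delivered: reset the count and draw the next requirement.
            state["count"], state["required"] = 0, next_requirement()
            return True
        return False

    return respond


def interval_schedule(next_wait):
    """Reinforce the first response made after the waiting time has elapsed."""
    state = {"ready_at": next_wait()}

    def respond(now):
        if now >= state["ready_at"]:
            # Reinforcer delivered: start timing the next interval from now.
            state["ready_at"] = now + next_wait()
            return True
        return False

    return respond


rng = random.Random(0)  # seeded so the variable schedules are repeatable

continuous = ratio_schedule(lambda: 1)                       # every response
fixed_ratio_5 = ratio_schedule(lambda: 5)                    # every 5th response
variable_ratio_5 = ratio_schedule(lambda: rng.randint(1, 9)) # about every 5th
fixed_interval_60 = interval_schedule(lambda: 60.0)          # after 60 seconds
variable_interval_60 = interval_schedule(lambda: rng.uniform(0.0, 120.0))

# One response per second for 20 seconds on the fixed-ratio schedule.
rewards = [t for t in range(1, 21) if fixed_ratio_5(t)]
print(rewards)
```

Each schedule is a function that takes the current time of a response and reports whether that response is reinforced. The ratio schedules ignore the clock and count responses, while the interval schedules ignore the count and watch the clock.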

Examples of Operant Conditioning

We can find examples of operant conditioning at work all around us. Consider the case of children completing homework to earn a reward from a parent or teacher, or employees finishing projects to receive praise or promotions.
In these examples, the promise or possibility of rewards causes an increase in behavior, but operant conditioning can also be used to decrease a behavior. The removal of a desirable outcome or the application of a negative outcome can be used to decrease or prevent undesirable behaviors. For example, a child may be told they will lose recess privileges if they talk out of turn in class. This potential for punishment may lead to a decrease in disruptive behaviors.

The Differences Between Classical Conditioning and Operant Conditioning

Classical and operant conditioning are two important concepts central to behavioral psychology. While both result in learning, the processes are quite different. In order to understand how each of these behavior modification techniques can be used, it is also essential to understand how classical conditioning and operant conditioning differ from one another.
Let's start by looking at some of the most basic differences.

Classical Conditioning

First described by Ivan Pavlov, a Russian physiologist
Involves placing a neutral signal before a reflex
Focuses on involuntary, automatic behaviors

Operant Conditioning

First described by B. F. Skinner, an American psychologist
Involves applying reinforcement or punishment after a behavior
Focuses on strengthening or weakening voluntary behaviors

How Classical Conditioning Works

Even if you are not a psychology student, you have probably at least heard about Pavlov's dogs. In his famous experiment, Ivan Pavlov noticed dogs began to salivate in response to a tone after the sound had been repeatedly paired with the presentation of food.

Pavlov quickly realized that this was a learned response and set out to further investigate the conditioning process.

Classical conditioning involves pairing a previously neutral stimulus (such as the sound of a bell) with an unconditioned stimulus (the taste of food). This unconditioned stimulus naturally and automatically triggers salivating as a response to the food, which is known as the unconditioned response. After associating the neutral stimulus and the unconditioned stimulus, the sound of the bell alone will start to evoke salivating as a response. The sound of the bell is now known as the conditioned stimulus and salivating in response to the bell is known as the conditioned response.

How Operant Conditioning Works

Operant conditioning focuses on using either reinforcement or punishment to increase or decrease a behavior. Through this process, an association is formed between the behavior and the consequences for that behavior. For example, imagine that a trainer is trying to teach a dog to fetch a ball. When the dog successfully chases and picks up the ball, the dog receives praise as a reward.

When the animal fails to retrieve the ball, the trainer withholds the praise. Eventually, the dog forms an association between his behavior of fetching the ball and receiving the desired reward.

The Differences Between Classical and Operant Conditioning

One of the simplest ways to remember the differences between classical and operant conditioning is to focus on whether the behavior is involuntary or voluntary. Classical conditioning involves making an association between an involuntary response and a stimulus, while operant conditioning is about making an association between a voluntary behavior and a consequence.
In operant conditioning, the learner is also rewarded with incentives, while classical conditioning involves no such enticements. Also remember that classical conditioning is passive on the part of the learner, while operant conditioning requires the learner to actively participate and perform some type of action in order to be rewarded or punished.
Today, both classical and operant conditioning are utilized for a variety of purposes by teachers, parents, psychologists, animal trainers and many others. In animal training, a trainer might utilize classical conditioning by repeatedly pairing the sound of a clicker with the taste of food. Eventually, the sound of the clicker alone will begin to produce the same response that the taste of food would.
In a classroom setting, a teacher might utilize operant conditioning by offering tokens as rewards for good behavior. Students can then turn in these tokens to receive some type of reward such as a treat or extra play time.

"Positive reinforcement is the most important and most widely applied principle of behaviour analysis"

- Cooper, Heron and Heward (2007, p.257)

What is Reinforcement?

Miltenberger (2008, p. 73) states that ‘reinforcement is the process in which a behaviour is strengthened by the immediate consequence that reliably follows its occurrence’. To “strengthen” a behaviour is to make it occur more frequently, as clarified by Michael (2004, p. 258), who states that ‘when a type of behaviour is followed by reinforcement there will be an increased future frequency of that type of behaviour’.
This basically means that if you engage in a certain behaviour, and this behaviour gets you something that you wanted, then you are more likely to engage in that same behaviour again when you want the same outcome in the future.
For example, when you want to turn on your television you will press the “ON” button. Before you pressed this button the TV was off but you wanted it on and so after pressing the button you got what you wanted. Therefore, in future, when you want the TV on you will press the ON button again and so positive reinforcement has occurred.
You won’t press the VOLUME button because when you have done this in the past it didn’t turn on your TV. Pressing the VOLUME button when you want the TV to turn on therefore means positive reinforcement does not occur.
Note though that making a behaviour occur more frequently is not the only “strengthening” that can occur. The duration, latency, magnitude and/or topography of behaviours can also be strengthened (Cooper et al., 2007).

What’s the “Positive” in Positive Reinforcement?

The term “positive” is used in conjunction with reinforcement to denote a specific form of reinforcement. It does not mean something “good” but instead the term positive relates more to the mathematical term of “adding” or “addition”.
This is because positive reinforcement is the addition of something as a result of a behaviour after you have engaged in this behaviour. Before you engaged in the behaviour, what you wanted was not present but after you engaged in the behaviour what you wanted is present.

[Figure: The "positive" in positive reinforcement is when something that was not present becomes present after engaging in a behaviour.]

Positive Reinforcement Example

Positive Reinforcement Does Not Occur: Johnny comes running into his mother after being outside in the hot sun playing with his friends. He exclaims “I’m really thirsty! Can I have some coke please Mam?” His mother says “No Johnny, now run along back outside to your friends”. Johnny isn’t very happy with this and he decides there’s no point in asking for coke the next time he wants some because he didn’t get any this time.
In this example, although Johnny asked for coke he was not given it. Therefore even after engaging in the behaviour of requesting coke, positive reinforcement did not occur. Additionally, because he was not given coke by asking for it he has decided not to ask next time and so there will not be an increase in the future frequency of that behaviour.
Positive Reinforcement Does Occur: Johnny comes running into his mother after being outside in the hot sun playing with his friends. He exclaims “I’m really thirsty! Can I have some coke please Mam?” His mother says “Of course you can Johnny!” and promptly gets a bottle of coke from the refrigerator and pours him a glass. He gulps it down and decides that the next time he wants some coke he’ll make sure to ask again.
In this example, Johnny had no coke but wanted some, and his behaviour (asking for coke) led to him getting what he wanted. Johnny’s request for coke was positively reinforced by him being given some. By being given what he wanted he is also more likely to ask this question again at a later time when he is thirsty and so there will be an increased future frequency of that behaviour.

[Figure: Johnny's request for a drink of coke is positively reinforced by him being given some (coke is added).]

Reinforce the Behaviour…not the Person

In the example above where positive reinforcement did occur, it is important to note that it was Johnny’s behaviour that was reinforced and not Johnny himself. It would be incorrect to say “Johnny was positively reinforced with coke” because it was his request (behaviour) for coke that was reinforced.
Instead you would say “Johnny’s request for coke was positively reinforced”. In the words of Cooper et al (2007, p. 258) ‘behaviours are reinforced, not people.’

Educational Example of Positive Reinforcement

A teacher is working through a discrimination programme where she places photographs of common fruits on the desk in front of her student, Brian, and then asks him to point to specific ones. For example, the teacher will place one photo of an apple and one photo of an orange on the desk and then say “point to apple”, and Brian must point to the apple.
The teachers have found that Brian is only getting 2 out of 10 discriminations correct. As a way to try and increase his correct discriminations (his behaviour) they have decided to use a token economy as a way of providing positive reinforcement to Brian for responding correctly.
For every correct response the teacher will give Brian a token. These tokens are like the student’s version of money where he can earn them for completing his work and then use them to buy things that he wants such as fun activities, breaks from school work or sweets. The more correct responses he makes the more tokens he earns and so there will (or should) be an increased future frequency of correct responding because more tokens means more fun activities.

[Figure: Using a token economy to deliver positive reinforcement for responding correctly should increase the number of correct responses.]

Positive Reinforcement Does Occur: on the first day using the token economy, Brian gets 6 out of 10 correct; on the second day he gets 8 out of 10 and on the third day he gets 10 out of 10. Remember, he is given tokens when he responds correctly; these tokens are “added” to the amount he has after he responds correctly and because his correct responses have increased it can be said that positive reinforcement is occurring.
The tokens act as a form of positive reinforcement because before pointing to the correct photo he wouldn’t have a token and then after pointing to the correct photo he gets one. From looking at how his correct responding increased, it could be said that Brian wanted to earn the tokens because earning them leads to him being able to trade them for a reinforcing activity and therefore he is more likely to continue to respond correctly when using this token economy.
Positive Reinforcement Does Not Occur: on the first day using the token economy, Brian gets 3 out of 10 correct; on the second day he gets 2 out of 10 and on the third day he gets 2 out of 10 again. In this case, positive reinforcement has not occurred because his responding has not increased. Even though he is being given tokens for correct responding and can trade the tokens for fun activities, his correct responding has not increased and therefore positive reinforcement is not occurring.
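The two scenarios above can be checked against Brian's starting point of 2 out of 10 with a few lines of code. This is a deliberately crude sketch: the function name `reinforcement_is_occurring` and the rule it applies (the most recent score must be above the pre-token baseline) are assumptions chosen for illustration, not a standard measurement procedure.

```python
def reinforcement_is_occurring(baseline_correct, daily_correct):
    """Crude check: is the most recent score above the pre-token baseline?"""
    return daily_correct[-1] > baseline_correct


BASELINE = 2  # Brian's correct responses out of 10 before tokens were introduced

print(reinforcement_is_occurring(BASELINE, [6, 8, 10]))  # first scenario
print(reinforcement_is_occurring(BASELINE, [3, 2, 2]))   # second scenario
```

The point of the sketch is that handing over tokens is not what defines positive reinforcement. The test is whether correct responding actually goes up once the tokens are in place.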

What Is Negative Punishment and What Is Positive Punishment?

Negative punishment is an important concept in B. F. Skinner's theory of operant conditioning. In behavioral psychology, the goal of punishment is to decrease the behavior that precedes it. In the case of negative punishment, it involves taking something good or desirable away to reduce the occurrence of a particular behavior.
One of the easiest ways to remember this concept is to note that in behavioral terms, positive means adding something while negative means taking something away. For this reason, negative punishment is often referred to as punishment by removal.

Examples of Negative Punishment

After a boy gets into a fight with his sister over who gets to play with a new toy, their mother simply takes the toy away.
A teenage girl stays out for an hour past her curfew, so her parents ground her for a week.
A third-grade boy yells at another student during class, so his teacher takes away "good behavior" tokens that can be redeemed for prizes.

Can you identify the examples of negative punishment? Losing access to a toy, being grounded and losing reward tokens are all examples of negative punishment. In each case, something good is being taken away as a result of the individual's undesirable behavior.

The Effects of Negative Punishment

While negative punishment can be highly effective, Skinner and other researchers have suggested that a number of different factors can influence its success.
Negative punishment is most effective when:

It immediately follows a response
It is applied consistently

Consider this example: a teenage girl has a driver's license, but it does not allow her to drive at night.

However, she drives at night several times a week without facing any consequences. One evening while she is driving to the mall with a friend, she is pulled over and issued a ticket. As a result, she receives a notice in the mail a week later informing her that her driving privileges have been revoked for 30 days. Once she regains her license, she goes back to driving at night even though she has six more months before she is legally allowed to drive during evening and nighttime hours.

As you might have guessed, losing her license is the negative punishment in this example. So why would she continue to engage in the behavior even though it led to punishment? Because the punishment was inconsistently applied (she drove at night many times without facing punishment) and because the punishment was not applied immediately (her driving privileges were not revoked until a week after she was caught), the negative punishment was not effective at curtailing her behavior.
Another major problem with punishment is that while it might reduce the unwanted behavior, it does not provide any information or instruction on more appropriate reactions.

B.F. Skinner also noted that once the punishment is withdrawn, the behavior is very likely to return.

Positive punishment is a concept used in B. F. Skinner's theory of operant conditioning. The goal of punishment is to decrease the behavior that it follows. In the case of positive punishment, it involves presenting an unfavorable outcome or event following an undesirable behavior.
The concept of positive punishment can be difficult to remember, especially because it seems like a contradiction. How can punishment be positive? The easiest way to remember this concept is to note that it involves an aversive stimulus that is added to the situation. For this reason, positive punishment is sometimes referred to as punishment by application.

Examples of Positive Punishment

You wear your favorite baseball cap to class but are reprimanded by your instructor for violating your school's dress code.
Because you're late to work one morning, you drive over the speed limit through a school zone. As a result, you get pulled over by a police officer and receive a ticket.
Your cell phone rings in the middle of a class lecture, and you are scolded by your teacher for not turning your phone off before class.

Can you identify the examples of positive punishment? The teacher reprimanding you for breaking the dress code, the officer issuing the speeding ticket and the teacher scolding you for not turning off your cell phone are all examples of positive punishment. They represent aversive stimuli that are meant to decrease the behavior that they follow.
In all of the examples above, the positive punishment is purposely administered by another person. However, positive punishment can also occur as a natural consequence of a behavior.

Touching a hot stove or a sharp object can cause painful injuries that serve as natural positive punishers for the behaviors.

Spanking as Positive Punishment

While positive punishment can be effective in some situations, B.F. Skinner noted that its use must be weighed against any potential negative effects. One of the best-known examples of positive punishment is spanking. Defined as striking a child across the buttocks with an open hand, this form of discipline is reportedly used by approximately 75 percent of parents in the United States.
Some researchers have suggested that mild, occasional spanking is not harmful, especially when used in conjunction with other forms of discipline. However, in one large meta-analysis of previous research, psychologist Elizabeth Gershoff found that spanking was associated with poor parent-child relationships as well as with increases in antisocial behavior, delinquency and aggressiveness. More recent studies that controlled for a variety of confounding variables also found similar results.

Punishment is a term used in operant conditioning to refer to any change that occurs after a behavior that reduces the likelihood that that behavior will occur again in the future. While positive and negative reinforcement are used to increase behaviors, punishment is focused on reducing or eliminating unwanted behaviors.
Punishment is often mistakenly confused with negative reinforcement. Remember, reinforcement always increases the chances that a behavior will occur and punishment always decreases the chances that a behavior will occur.

Types of Punishment

Behaviorist B. F. Skinner, the psychologist who first described operant conditioning, identified two different kinds of aversive stimuli that can be used as punishment.

Positive Punishment: This type of punishment is also known as "punishment by application." Positive punishment involves presenting an aversive stimulus after a behavior has occurred. For example, when a student talks out of turn in the middle of class, the teacher might scold the child for interrupting her.

Negative Punishment: This type of punishment is also known as "punishment by removal." Negative punishment involves taking away a desirable stimulus after a behavior has occurred. For example, when the student from the previous example talks out of turn again, the teacher promptly tells the child that he will have to miss recess because of his behavior.

Is Punishment Effective?

While punishment can be effective in some cases, you can probably think of a few examples of when punishment does not reduce a behavior. Prison is one example. After being sent to jail for a crime, people often continue committing crimes once they are released from prison.

Why is it that punishment seems to work in some instances, but not in others? Researchers have found a number of factors that contribute to how effective punishment is in different situations. First, punishment is more likely to lead to a reduction in behavior if it immediately follows the behavior. Prison sentences often occur long after the crime has been committed, which may help explain why sending people to jail does not always lead to a reduction in criminal behavior.
Second, punishment achieves greater results when it is consistently applied. It can be difficult to administer a punishment every single time a behavior occurs. For example, people often continue to drive over the speed limit even after receiving a speeding ticket. Why? Because the behavior is inconsistently punished.
Punishment also has some notable drawbacks. First, any behavior changes that result from punishment are often temporary.
"Punished behavior is likely to reappear after the punitive consequences are withdrawn," Skinner explained in his book About Behaviorism.
Perhaps the greatest drawback is the fact that punishment does not actually offer any information about more appropriate or desired behaviors.

While subjects might be learning to not perform certain actions, they are not really learning anything about what they should be doing.

Another thing to consider about punishment is that it can have unintended and undesirable consequences. For example, while approximately 75 percent of parents in the United States report spanking their children on occasion, researchers have found that this type of physical punishment may lead to antisocial behavior, aggressiveness and delinquency among children. For this reason, Skinner and other psychologists suggest that any potential short-term gains from using punishment as a behavior modification tool need to be weighed against the potential long-term consequences.

Thursday, 10 March 2016