OPERANT CONDITIONING -- learning in terms of rewards & punishments
Generally, operant conditioning works according to THE LAW OF EFFECT, which simply states that what FOLLOWS a behavior can influence the likelihood that the behavior will occur again.
B. F. SKINNER studied operant conditioning by using a SKINNER BOX -- a simplified environment in which a rat presses a lever to dispense food pellets (rewards). The point here is to get at the PRINCIPLES of operant conditioning.
SHAPING -- generally, the best way to get a creature to learn a complex behavior (X) is first to reward a behavior that approaches X, then to reward only a behavior that gets a little closer to X, and so on until X itself is performed.
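The step-by-step logic of shaping can be sketched as a toy simulation. This is only an illustration, not a model from the notes: the behavior is assumed to be a single number, and the function name `shape` and its parameters (`target`, `start`, `step`, `trials`, `seed`) are hypothetical.

```python
import random


def shape(target, start=0.0, step=1.0, trials=200, seed=0):
    """Toy model of shaping by successive approximations.

    Assumption: behavior is a number and X is the number `target`.
    The learner emits a behavior near its current habit; the trainer
    rewards it only if it is closer to X than the current habit, and
    a rewarded behavior becomes the new habit.
    """
    rng = random.Random(seed)
    habit = start
    history = [habit]  # habits after each reward, starting point first
    for _ in range(trials):
        # Natural variation around what the learner currently does
        behavior = habit + rng.uniform(-step, step)
        # Reward only a closer approximation of the target behavior
        if abs(target - behavior) < abs(target - habit):
            habit = behavior
            history.append(habit)
    return history


history = shape(target=10.0)
print(f"rewards given: {len(history) - 1}, final habit: {history[-1]:.2f}")
```

Because only closer approximations are rewarded, each entry in `history` is nearer to the target than the one before it.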
First, here are some handy ways to distinguish classical from operant conditioning:
Classical conditioning involves RESPONDENT BEHAVIOR (an automatic response to a stimulus). Also, the new stimulus (CS) is presented BEFORE the behavior.
Operant conditioning involves OPERANT BEHAVIOR (a behavior that tries to operate on the environment to get rewards & avoid punishments). Also, the new stimulus (e.g., a reward) is presented AFTER the behavior.
Technically, rewards are called REINFORCERS, which by definition INCREASE the likelihood of the behaviors they follow. There are 2 basic kinds:
POSITIVE REINFORCER -- a pleasurable stimulus presented after a behavior.
NEGATIVE REINFORCER -- removal of an aversive stimulus
after a behavior. (n.b. this also INCREASES the likelihood of the behavior).
A reinforcer can also be either a
PRIMARY REINFORCER -- inherently satisfying (e.g., candy)
SECONDARY REINFORCER -- learned (e.g., money, grades)
Technically, PUNISHMENT is any consequence that follows a behavior and, by definition, DECREASES the likelihood of that behavior. There are 2 basic kinds:
POSITIVE PUNISHMENT -- an aversive stimulus presented after a behavior.
NEGATIVE PUNISHMENT -- removal of a pleasant stimulus after a behavior. (n.b. this also DECREASES the likelihood of the behavior).
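The four kinds of consequences form a 2 x 2 grid: a stimulus is either presented or removed, and it is either pleasant or aversive. A minimal lookup sketch of that grid follows; the function names and string labels are hypothetical, chosen only for illustration.

```python
def classify_consequence(stimulus_change, stimulus_type):
    """Name the consequence that follows a behavior.

    stimulus_change: "presented" or "removed"
    stimulus_type:   "pleasant" or "aversive"
    """
    table = {
        ("presented", "pleasant"): "positive reinforcement",
        ("removed", "aversive"): "negative reinforcement",
        ("presented", "aversive"): "positive punishment",
        ("removed", "pleasant"): "negative punishment",
    }
    return table[(stimulus_change, stimulus_type)]


def effect_on_behavior(consequence):
    """Reinforcement increases the behavior; punishment decreases it."""
    return "increases" if consequence.endswith("reinforcement") else "decreases"


for change in ("presented", "removed"):
    for kind in ("pleasant", "aversive"):
        name = classify_consequence(change, kind)
        print(f"{kind} stimulus {change}: {name} ({effect_on_behavior(name)} behavior)")
```

Note that "positive" and "negative" refer only to whether a stimulus is presented or removed, not to whether the outcome is good or bad.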
REINFORCEMENT SCHEDULES -- different ways of passing out reinforcers produce different patterns of learning.
CONTINUOUS REINFORCEMENT -- giving a reinforcer every time the behavior occurs -- produces fast learning, but also fast extinction.
PARTIAL REINFORCEMENT -- giving a reinforcer only some of the time -- produces slower learning, but also more resistance to extinction.
There are four basic SCHEDULES OF PARTIAL REINFORCEMENT:
FIXED RATIO: a reinforcer after a fixed # of behaviors. Fastest rate of learning.
VARIABLE RATIO: a reinforcer after an unpredictable, varying # of behaviors (e.g., gambling).
FIXED INTERVAL: a reinforcer available only after time intervals of predetermined duration. Stop-start pattern of learning, with faster responding as the end of each interval approaches (e.g., checking the mailbox).
VARIABLE INTERVAL: a reinforcer available only after unpredictable, varying intervals. Slowest learning, with slow, steady responding.
Graphically: plotting # of responses (vertical axis) against time (horizontal axis), the fixed-ratio and variable-ratio curves sit highest, the fixed-interval curve falls below them, and the variable-interval curve is lowest.
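The delivery rules behind the four partial schedules can be sketched in code. This is a rough illustration under stated assumptions, not a standard implementation: the class names, the `respond(t)` method, the helper `count_reinforcers`, and the choice of random distributions for the variable schedules are all hypothetical. It models only when a reinforcer is delivered, not how the learner's response rate changes.

```python
import random


class FixedRatio:
    """Reinforce every n-th response."""
    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after a varying number of responses (average: mean_n)."""
    def __init__(self, mean_n, rng):
        self.mean_n = mean_n
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumed distribution: uniform on 1 .. 2*mean_n - 1
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """Reinforce the first response made once `interval` time units
    have passed since the last reinforcer."""
    def __init__(self, interval):
        self.interval = interval
        self.last = 0.0

    def respond(self, t):
        if t - self.last >= self.interval:
            self.last = t
            return True
        return False


class VariableInterval:
    """Like FixedInterval, but the required wait varies unpredictably."""
    def __init__(self, mean_interval, rng):
        self.mean_interval = mean_interval
        self.rng = rng
        self.last = 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumed distribution: uniform on 0 .. 2*mean_interval
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, t):
        if t - self.last >= self.wait:
            self.last = t
            self.wait = self._draw()
            return True
        return False


def count_reinforcers(schedule, n_responses=20):
    """One response per time unit at t = 1, 2, ..., n_responses."""
    return sum(schedule.respond(t) for t in range(1, n_responses + 1))


rng = random.Random(0)
for schedule in (FixedRatio(5), VariableRatio(5, rng),
                 FixedInterval(5), VariableInterval(5, rng)):
    print(type(schedule).__name__, count_reinforcers(schedule))
```

With one response per time unit, the ratio and interval rules pay out about equally often. They differ when the learner responds faster: extra responses earn more reinforcers on a ratio schedule but not on an interval schedule.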
Generally, learning via operant conditioning follows the same scheme for GENERALIZATION, DISCRIMINATION, ACQUISITION & EXTINCTION as we saw in classical conditioning.
----------------------------------------------------------------------------------------
COGNITIVE LEARNING
Generally, COGNITIONS enter into how we learn & behave.
Creatures form COGNITIVE MAPS (mental representations)
based on their simple experience, even if not reinforced.
BANDURA’S SOCIAL-COGNITIVE LEARNING
(from watching, imitating & modeling)
OBSERVATIONAL LEARNING, a.k.a. "modeling" -- learning by watching and imitating others.
BOBO DOLL EXPERIMENT -- classic demonstration of observational learning
4 primary factors: Attention, Memory, Imitation, Motivation
Also, BIOLOGICAL PREDISPOSITIONS enter into learning
Certain creatures are biologically prepared to learn certain behaviors more easily than other behaviors.
Application: BEHAVIOR MODIFICATION -- the use of behavioral principles to modify behavior -- usually to eliminate problems.
Treatment of autism -- using operant conditioning to shape acceptable behaviors (looking in the eye, speaking, etc.).
Biofeedback -- using awareness of physiological responses
(usually via machines) to produce relaxation responses.