Operant Variability: A Behavioral and Pharmacological Analysis

by

Erin Fae Cotton

A dissertation submitted to the Graduate Faculty of Auburn University in partial fulfillment of the requirements for the Degree of Doctor of Philosophy

Auburn, Alabama
December 12, 2011

Keywords: operant variability, amphetamine, rats, control, intermittent

Copyright 2011 by Erin Fae Cotton

Approved by

M. Christopher Newland, Chair, Alumni Professor of Psychology
Alejandro A. Lazarte, Associate Professor of Psychology
Jeffrey Katz, Alumni Professor of Psychology
Jennifer M. Gillis-Mattson, Associate Professor of Psychology

Abstract

Behavioral variability is often demonstrated in the laboratory setting by imposing a multiple schedule of reinforcement: one component requires the animal to vary sequences of responding and the other component serves as a control. An important determinant of behavioral variability is the type of reinforcement schedule present. When reinforcement is infrequent or is discontinued, as during extinction, behavior begins to vary. The extent to which intermittent reinforcement produces variability can change based on reinforcer density and schedule type. Behavioral variability also increases when it is directly reinforced. This operant variability is robust and is unaffected by interventions.

The purpose of Experiment 1 was to examine parametrically the effects of intermittent reinforcement, arranged at different rates (both rich and lean), on operant variability within a session. Another purpose was to assess differences, if any, between interval schedules and ratio schedules of reinforcement on variability either when it is required or permitted. Long Evans rats were trained under a multiple VARY 8:4 FR 4 schedule. In the VARY 8:4 component, all four-response sequences that differed from the previous 8 were reinforced. In the FR 4 component, all four-response sequences were reinforced.
There was much higher variability (based on an entropy measure) in the VARY 8:4 than in the FR 4 component in the FR 1 condition. All intermittent reinforcement schedules, both the VI and the VR contingencies, increased entropy in the FR 4 component but did not affect behavior in the VARY 8:4 component. Variability in the unit FR 4 component reflected the prevailing schedule and returned to baseline rapidly upon imposition of the baseline schedule. Intermittent reinforcement blunted the differences between the VARY 8:4 and FR 4 components regardless of the schedule parameters.

The purpose of Experiment 2 was to examine intermittent reinforcement in the control condition, the utility of the timeout for non-criterion sequences in the vary condition, and the importance of similar reinforcement rates between the two components. The effects of d-amphetamine on behavioral variability under these conditions were also examined. Long Evans rats were trained to press two levers in multiple conditions. In the VARY 8:4 component, any four-response sequence distributed between these levers that differed from the previous 8 sequences was eligible for reinforcement. Rats were then divided into two groups, depending on the control condition used. Two different control components were introduced. The first was a simple FR 4 procedure where every four-response sequence was reinforced. The other procedure was a Yoked FR 4 where inter-reinforcer intervals were equated between the VARY 8:4 component and the FR 4 component. To examine the role of timeouts for errors, each group was exposed to an ABA design including and removing timeouts in the VARY 8:4 component. The effects of d-amphetamine were assessed during the first two phases. Finally, to equate the rate at which sequences were reinforced in both the VARY 8:4 and FR 4 components, a Variable Ratio 3 schedule was imposed in the FR 4 component only (i.e., mult FR 1 (VARY 8:4) VR 3 (FR 4)).
In both the Yoke and Non-Yoke conditions, there was much greater variability in the VARY 8:4 than in the FR 4 component. The inclusion of a timeout for non-criterion sequences had no effect on entropy in either the Yoke or the No-Yoke group. The number of errors made during the VARY 8:4 component decreased only in the Yoke group. The effects of d-amphetamine on variability were not influenced by the timeout. The highest dose of d-amphetamine decreased response rates, increased entropy in the FR 4 component, and decreased or had no effect on entropy in the VARY 8:4 component, suggesting that effects may be baseline dependent. Entropy in the FR 4 component increased as a result of the presence of the VR 3 schedule. Intermittent reinforcement blunted the differences between the VARY 8:4 and FR 4 components.

Acknowledgments

The author would like to thank Dr. Christopher Newland for his patience and guidance throughout her graduate career. He was helpful on this project by leading her through the writing process. She would also like to thank her committee members, Drs. Alejandro Lazarte, Jeffrey Katz, and Jennifer Gillis, for their constructive comments on this dissertation project. Many thanks are owed to Dr. Thadeus Roppel, the outside reader, for his continued interest and attention to this dissertation. Finally, the author wishes to thank her husband, parents, Dr. George Flowers, and the people at OSF Saint Francis Medical Center in Peoria, IL, because without these people, she would not have made it this far.

Table of Contents

Abstract
Acknowledgments
List of Tables
List of Figures
Chapter 1: Introduction
    The Response Unit
    Prior Research on Behavioral Variability
    Methods and Measurement of Behavioral Variability
    A Behavioral Mechanism?
    Behavioral Variability and Choice
    A Neurochemical Correlate?
    Evidence that Dopamine Plays a Role in Behavioral Variability
    The Control Procedure to Use in Variability Experiments
    Timeout and Behavioral Variability
    References
Chapter 2: A Parametric Examination of Intermittent Reinforcement on Behavioral Variability
    Abstract
    Introduction
    Method
    Results
    Discussion
    References
    Figures
Chapter 3: Operant Variability: A Behavioral and Pharmacological Analysis
    Abstract
    Introduction
    Method
    Results
    Discussion
    References
    Tables
    Figures

List of Tables

Chapter 3, Table 1 Experimental Design
List of Figures

Chapter 2, Figure 1 Rat 712 Entropy Levels
Chapter 2, Figure 2 Rat 727 Entropy Levels
Chapter 2, Figure 3 Mean Entropy Levels for Group 1
Chapter 2, Figure 4 Mean Total Responses for Group 1
Chapter 2, Figure 5 Mean Entropy Levels for Group 2
Chapter 2, Figure 6 Mean Total Responses for Group 2
Chapter 3, Figure 1 An Example Session
Chapter 3, Figure 2 Dose-Response Functions for Entropy in Phase 1
Chapter 3, Figure 3 Dose-Response Functions for Response Rates in Phase 1
Chapter 3, Figure 4 Dose-Response Functions for Entropy in Phase 2
Chapter 3, Figure 5 Dose-Response Functions for Response Rates in Phase 2
Chapter 3, Figure 6 Dose-Response Functions for Entropy in Phase 4
Chapter 3, Figure 7 Entropy Across All Phases
Chapter 3, Figure 8 VARY Errors Across All Phases
Chapter 3, Figure 9 Response Rates Across All Phases
Chapter 3, Figure 10 Overall Reinforcement Rates in Phases 1 and 2
CHAPTER 1
Introduction

Variability in behavior is often seen as a hindrance to achieving an understanding of the underlying causes of the behavior and to controlling its appearance in laboratory or applied settings. Great efforts may be undertaken to keep variability at a minimum. Frequently, variation is attributed to "problems" in methodology, often at the expense of achieving a full understanding of its source (Johnston & Pennypacker, 1993). Variability in behavior, however, is functional and it appears to be a fundamental property of behavior. In the process of acquiring new behavior, variation in responding must occur. Already established behavior can become highly complex and far removed from the initially learned behavior. In order for this process to take place, variation in responding is also necessary (Catania, 1998). Variability can also be useful in problem solving. Arnesen (2000) trained rats to interact with objects (e.g., a soup can) in different ways. When faced with novel objects, the rats were more likely to interact with them than rats that were not previously trained to be variable in their interactions. These variations led to reinforcement.

Some recent studies have shown not only that variability is functional but also that it is shapeable (Grunow & Neuringer, 2002; Page & Neuringer, 1985). This is where the term "operant" variability comes from. An operant implies that the behavior is modified by its consequences. Variability refers to a level of unpredictability or dispersion across a set of responses. One approach that has emerged to study operant variability is the "lag procedure" (Page & Neuringer, 1985), where the present sequence of responses must differ from a predetermined number of previous sequences. In this procedure, a reinforcer is given only if the current sequence differs from previous sequences. If a sequence is repeated, then no reinforcement will be given.
The end result is a high number of different sequences being produced at nearly random levels. Control procedures, however, have varied across different laboratories or even studies within the same laboratory. The different types of control procedures will be discussed below.

The Response Unit

A collection of responses that meets a criterion is considered a response unit. Zeiler (1977) distinguished between formal and conditionable response units. The formal response unit is that which is explicitly paired with presentation of some stimulus, which could be reinforcing or even punishing; therefore the formal response unit is defined methodologically. Conditionable response units, or operants, increase if paired with the presentation of the reinforcer and, more generally, are influenced by their consequences. This distinction is important because both formal and conditionable response units are clearly defined, but the former may not be influenced by the reinforcement contingencies. Conditionable response units, however, are modifiable and can be manipulated by the researcher through operant conditioning.

When training new behavior, one must often define the individual conditionable response units that comprise the target behavior and then specify a shaping procedure that chains those units together. For example, in training a four-response chain (the formal response unit), each individual response must be established first. Each sequential response is signaled by some arbitrary stimulus, and only the terminal link results in reinforcement. Eventually the response unit emerges as a set of four responses rather than only an individual response. The four-response chain may become a simple, cohesive, integrated behavioral unit that is modifiable, and therefore conditionable, through different reinforcement contingencies in the same way that a simple unit, like a lever press, is modifiable.
Just as a single instance of behavior (e.g., lever-pressing) can be defined as a response unit, so too can sequences of responses.

Prior Research on Behavioral Variability

Schwartz (1982b) argued that the highly stereotyped response patterns that result from reinforcement are incompatible with the acquisition of new behavior because variation in one's behavior is necessary for learning. Reinforcement creates a narrowly defined response that is separable and distinct from non-reinforced responses. Once the behavior has been well established, according to Schwartz, this low variability can interfere with the shaping of new behavior. Schwartz (1982a) attempted to train response variation in pigeons in order to prevent the narrowing/stereotypic effects of reinforcement on behavior. In that study, a pattern of eight responses on two keys had to be different from the pattern of eight responses just performed. Each response moved a light on a 5x5 matrix of lights located on the side wall of the chamber. If the bird moved the light, through key-pecking, from the top left corner to the bottom right corner without repeating a previously pecked sequence or "moving off" the matrix, both of which resulted in a timeout, it received access to grain. The pigeons could "move off" the matrix by responding more than four times on one of the keys. Therefore, besides varying behavior from the previous sequence, there was also the response requirement that no more than four pecks could occur on either key in the sequence. In an effort to increase variability and unpredictability, Schwartz introduced a contingency such that a response sequence had to differ from the previous one. Only one pigeon of four demonstrated an increase in its variable responding. As will be discussed, these results are replicable, and peculiar to the requirement that the pigeon "not move off the grid."
Schwartz, however, concluded that response variability cannot be reinforced and that stereotypies in behavior will arise no matter what is being reinforced. In fact, Schwartz's definition follows from Zeiler's definitions of response units. By definition, an operant response unit is strengthened, narrowed, and made less variable by reinforcement. Perhaps, however, one difficulty in reinforcing variability lies in the definition of the operant unit to be reinforced. In order to confront this issue, it is important to determine whether there is an issue to confront, i.e., if "variability" can be reinforced.

Many studies have shown that behavioral variability meets the definition of an operant, which is that it can be manipulated and selected (Pryor, Haag, & O'Reilly, 1969; Schoenfeld, Harris, & Farmer, 1966; Blough, 1966; Bryant & Church, 1974; Schwartz, 1980, 1982a), i.e., it is sensitive to its consequences. Pryor et al. (1969) reinforced porpoises' novel responses or, in other words, behavior such as jumping, breaching, or flipping that had not been observed by the trainers. As the experiment progressed, the animals performed other movements that had not been previously trained (e.g., lying on their right side, turning upside down), suggesting that behavioral variability can be established through the reinforcement of novel behaviors.

Blough (1966) used a schedule that reinforced least frequent interresponse times in pigeons' key-pecking. This schedule, known as the "LF schedule", produces interresponse time (IRT) variability through the reinforcement of least frequent IRTs. Interresponse times were sorted into bins (e.g., 0-1 second, 1-2 seconds, etc.) and key-pecking would only be reinforced if its IRT fell within the bin that had previously contained the fewest IRTs. The behavior of the pigeons closely matched that of a stochastic generator and the pigeons' IRTs varied with bin changes.
This is a stronger demonstration that variability can be sensitive to reinforcement contingencies, although the exact response unit is uncertain. Bryant and Church (1974) reinforced alternations in responding (pressing a lever that was different from that of the previous trial) 75% of the time and "stay" lever presses (staying on the same lever as the previous trial) 25% of the time. The percentage of alternation on the two levers matched the reinforcement contingency. Alternate responding reached asymptote around the probability of 49%. Results were consistent with that of a stochastic generator; as alternations were reinforced, the probability of them occurring increased, whereas when they were not reinforced, their occurrence decreased.

The results above show that variability in animals can be similar to that of a stochastic generator and that variability occurs as a result of reinforcement. However, the previous studies do not provide enough evidence that behavioral variability is an operant. Some difficulties with Schwartz's experiments have already been discussed. In the Pryor et al. (1969) study, the researchers were active observers of the porpoises but could not see every behavior that was being emitted by the animals. For instance, variability in eye-rolling and high-pitched calls was not reinforced because it was difficult for the observers to record such events. Therefore, behavior was limited to easily viewable movements that were determined to be novel by the trainer, who also was a behaving organism. Although much more objective and reliable, the findings of Blough (1966) and Bryant and Church (1974) come with uncertainties, too. Variability effects observed in interresponse times may be highly limited and circumscribed. The variation in these IRTs was likely due to the schedule effects set in place because there appeared to be a pattern in the responses.
Shorter IRTs tended to occur after other short IRTs and longer IRTs tended to occur after other long IRTs. Also, if there were relatively few IRTs in a particular bin, it is possible that the same IRTs from that bin would be reinforced repeatedly. In Bryant and Church (1974), the schedule requirement could likewise have reinforced switching behavior rather than variable or random behavior.

Variability in responding has also been seen in extinction procedures. Antonitis (1951) trained rats to nose poke anywhere on a 50 cm strip. Even though the location of the nose poke was not specific for reinforcement to occur, the pokes tended to be very close to one another. However, when the reinforcer was withheld, the rats began nose-poking in various spots along the strip. The absence of reinforcement increased variability in responses. Other studies have seen similar increases in variable responding when the response is under extinction (Eckerman & Lanson, 1969; Stokes, 1995; Neuringer, Kornell, & Olufs, 2001). Overall, there is evidence that variability is at least a byproduct of reinforcement contingencies, such as extinction, and may even be a conditionable unit.

Page and Neuringer (1985) sought to determine whether "variability" is an operant, i.e., whether it can be influenced by its consequences, and, especially, to identify the conditions under which operant variability arises. In the first experiment, a sequence of eight responses, distributed between two keys, had to vary from either the one (Lag 1) or five (Lag 5) sequences prior to that just performed. Unlike Schwartz's experiment, there was no constraint on the number of presses on each key. Thus, there was only one possible error to make: repeating a sequence that was among the previous one or five sequences made. This "variability condition" was compared to a "variability-plus-constraint condition".
In the "variability-plus-constraint condition", exactly four responses on both the left and right keys were required, and the sequence had to be different from the last sequence performed. This condition resembled Schwartz's, except that there was no light that moved through a 5x5 array. In the "variability condition", around 90% of the sequences produced met the reinforcer criterion. However, when the constraint was in place, only 42% of sequences met criterion, as compared with Schwartz (1982a), who reported 36%. A key comparison between Page and Neuringer (1985) and Schwartz (1982a) can be found in the measurement of variability per se. One measure used was the percentage of reinforced trials per session, an indicator of how often the pigeons met the criteria. The second measure was the percentage of different sequences per session, that is, whether the current sequence was different from all the previous sequences in the session. Interestingly, in both the variability and variability-plus-constraint conditions, the pigeons of Page and Neuringer (1985) performed similarly to a random number generator that produced eight-response sequences.

Page and Neuringer's first experiment differed from Schwartz's in several important respects. In Experiment two, conditions for variability alone and variability-plus-constraint were compared with each other again so as to replicate the conditions in Schwartz's previous study more directly (Schwartz, 1982a). The replication resembled Experiment one, but there was no interpeck interval, reinforcement was 4 seconds access to grain, and all trials were followed by a 0.5 second intertrial interval. Once again, the only difference between the two conditions was the absence of a "no more than four pecks" constraint in the variability-alone condition. Pigeons behaved more variably (the measurement of variability will be described in detail below) under the variability-alone condition than under the variability-plus-constraint condition.
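The two contingencies compared above can be made concrete with a minimal Python sketch. It is an illustration only, not the software used in any of the cited studies. The function names, the "L"/"R" coding of key pecks, and the choice to compare each sequence against all recent sequences (whether or not they were reinforced) are assumptions introduced here.

```python
from collections import deque
from itertools import product


def meets_lag(sequence, history, lag):
    """Return True if `sequence` differs from each of the last `lag` sequences."""
    if lag <= 0:
        return True
    return sequence not in list(history)[-lag:]


def meets_constraint(sequence, max_per_key=4):
    """Return True if neither key is pecked more than `max_per_key` times.

    Assumption: responses are coded "L" (left key) and "R" (right key).
    """
    return sequence.count("L") <= max_per_key and sequence.count("R") <= max_per_key


def run_session(sequences, lag, constrained=False):
    """Score each sequence in order; True means it meets the criterion.

    Assumption: every emitted sequence enters the lookback window,
    whether or not it met the criterion.
    """
    history = deque(maxlen=max(lag, 1))
    outcomes = []
    for seq in sequences:
        ok = meets_lag(seq, history, lag)
        if constrained:
            ok = ok and meets_constraint(seq)
        outcomes.append(ok)
        history.append(seq)
    return outcomes


# Number of distinct eight-response sequences on two keys,
# with and without the "no more than four pecks per key" constraint.
all_sequences = ["".join(p) for p in product("LR", repeat=8)]
balanced = [s for s in all_sequences if meets_constraint(s)]
print(len(all_sequences), len(balanced))  # 256 70
```

With eight responses, "no more than four pecks on either key" is equivalent to exactly four pecks on each key, which is why the constrained set is so much smaller than the unconstrained one.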
The results support the hypothesis that variability may be hindered by schedule and/or contingency requirements such as the "no more than four pecks on the same key in a sequence" constraint. There are simply more sequences that can be performed when there is no constraint, 256, as opposed to when the constraint is in place, 70. In other words, the odds of producing a response sequence different from the previous one are greater when there is no constraint in place, and that is exactly what happened. Page and Neuringer went on to show that variability is an operant and introduced a technique for studying it in the laboratory.

In further experiments, Page and Neuringer (1985) identified some necessary and sufficient conditions for variability to emerge as well as potential determinants of variability. They addressed directly such questions as how many different sequences can be performed, whether memory plays a role, and whether variability can come under control of a stimulus, and they showed that, once established, variability has the properties of a reinforceable dimension of behavior that can be measured and predicted. Page and Neuringer (1985) systematically increased the lag (or lookback window) to determine whether variability remained high over long lags. Only at a lag 50 requirement did performance deteriorate; at this requirement the number of criterion sequences dropped to around 67% of all sequences. At a lag of 50, a sequence of eight responses had to be different from the previous 50 sequences in order for reinforcement to become available. Interestingly, a random number generator performed similarly to the pigeon at a lag of 50. It seems that increasing the lag decreases the opportunity for reinforcement.
However, the percentage of reinforced sequences (reinforced sequences/total sequences) as compared to non-reinforced sequences is still quite high, since the pigeon must peck an 8-response sequence that is unlike the past 50 sequences already completed.

It is possible, if unlikely, that the animals were memorizing previously produced sequences and reproducing a different one. The hypothesis that memory of the previously performed sequences was responsible for the random responding was examined in another experiment. If memory is important, then one cannot say that the pigeons are behaving randomly, but rather, that the production of a specific sequence is driven directly by the fact that that sequence was not produced recently. This, of course, is incompatible with randomness, in which the probability of any one sequence occurring is equal to that of any other sequence occurring. If the bird remembers the previous sequences, then reducing the number of responses required from eight responses to four should increase accuracy. Increasing the number of responses should be more difficult for the pigeon's variable responding because there are more responses to remember. The results were incompatible with a "memory" hypothesis. When the sequence length was shortened, fewer, not more, criterion sequences occurred. This is compatible with a "randomness" interpretation since in a chain of 8 responses there are more possible sequences (2^8 = 256) than there are in a chain of 4 (2^4 = 16). In other words, the longer the chain of responses, the higher the probability of pecking a different sequence.

If a dimension of behavior is considered an operant, the probability of its occurrence can be brought under the control of a stimulus. The final experiment of Page and Neuringer (1985) was designed to test this very notion of behavior. The pigeons' behavior came under the control of blue-colored keys in the variability component and red-colored keys in the stereotypy component.
When the key lights were reversed, so that red signified that variable responding and blue that stereotypic responding was necessary for reinforcement, responding reversed, too. Based on their observations that operant variability can come under the control of specific stimuli, they concluded that it is a reinforceable dimension of behavior.

Methods and Measurement of Behavioral Variability

There are many different methods that can be used to generate behavioral variability (Neuringer, 2002). Novel response procedures, which include a change in contingencies, tend to reinforce variability because making a new response is usually beneficial towards the end goal of reinforcement. For example, in a radial arm maze the rat's varying behavior is reinforced if the behavior is going down an arm of the maze that it did not previously visit. Therefore, in this procedure the contingency requirement is about variation in responding. The lag procedure seen in Page and Neuringer (1985) requires that a number of different variations of a sequence must occur before reinforcement is given. Increasing the lag (or lookback window) to a high number (e.g., 50) results in a decrease in the number of reinforced trials. However, the percentage of reinforced trials still remained high at 67% and stayed in sync with the results of a random number generator. The more sequences that must be produced, the more likely sequences will be repeated. Another method is that of reinforcing least frequent occurrences of a response, as seen in Blough (1966) and Schoenfeld et al. (1966), in which interresponse times that were the least frequent were reinforced. Variability in responding increased for the pigeons because they had to emit consecutive responses that had differing interresponse intervals. Threshold procedures involving reinforcing responses that are below a particular relative frequency can also demonstrate variability.
Along the same lines, there is the method of frequency dependence, in which there is a connection between the reinforcement rate and response frequency. The more frequently a response occurs, the lower the probability that it will be reinforced. To explain it another way, infrequent responses are more likely to be reinforced.

Five dependent measures have been used as markers of response variability: the number of reinforcers obtained in the variability condition, the distribution of different sequences emitted, the frequency of dominant sequences (Schwartz, 1982a), interresponse time distributions (for the least-frequent IRT procedure), and the entropy (a measure of variability in responding). Different sequences occur when behavioral variability is reinforced (Grunow & Neuringer, 2002). The frequency of dominant sequences increases with shorter lags (e.g., a lag 1 only requires that the subject alternate responding between two different sequences) and decreases with longer lags and the contingency for high variability (Page & Neuringer, 1985; Grunow & Neuringer, 2002). Another dependent measure briefly mentioned before is interresponse times. Neuringer (1991) showed that the longer the interresponse time, the more operant variability occurred. However, repetition of a single sequence tended to decrease with longer pauses between each response. An overall response measure for variability in a session is commonly denoted by the U value. This value, based on information theory, is a measure of the entropy or stochastic generation of responding (Miller & Frick, 1949). When the U value approaches 1.0, sequence frequency is approximately equal. When one specific sequence occurs more often than others, the U value approaches 0.0.

A Behavioral Mechanism?

A final issue that deserves some attention is whether variability arises due to a response pattern that is actually reinforced.
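As a brief aside on measurement, the U value defined in the preceding section can be sketched in Python. The text does not give the formula, so the normalization used here (Shannon entropy of the relative sequence frequencies divided by the maximum entropy, log2 of the number of possible sequences) is an assumption based on the form commonly used in this literature. The function name `u_value` and its parameters are hypothetical.

```python
import math
from collections import Counter


def u_value(sequences, n_possible):
    """Normalized entropy of the sequences emitted in a session.

    Assumed form: U = sum(p_i * log2(1 / p_i)) / log2(n_possible),
    where p_i is the relative frequency of sequence i and
    n_possible is the number of sequences that could be emitted.
    """
    if not sequences:
        raise ValueError("at least one sequence is required")
    if n_possible < 2:
        raise ValueError("n_possible must be at least 2")
    counts = Counter(sequences)
    total = sum(counts.values())
    entropy = sum((c / total) * math.log2(total / c) for c in counts.values())
    return entropy / math.log2(n_possible)
```

Under this definition, a session in which every one of the 16 possible four-response sequences occurs equally often gives U = 1.0, and a session consisting of a single repeated sequence gives U = 0.0, matching the two limiting cases described in the text.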
Thus, the formal response unit might be the lag-n requirement but the conditionable response unit might be something quite different. Machado (1997) questioned whether the lag procedure, applied to the distribution of responding between two response levers, actually reinforced variability, per se, or whether the increase in variability is the outcome of the reinforcement of some other response class. In other words, is variability a result of the schedule requirement presented to the animal, or is it the result of a changeover requirement in the schedule (e.g., the animal must change levers/keys n times)? Machado studied this by explicitly reinforcing changeovers and asking whether "variability" emerges. In Experiments 1 and 2 of the Machado (1997) study, pigeons' key-switching behavior was reinforced only if they changed from one key to the other at least once (Experiment 1) or three to four times (Experiment 2). In Experiment 1, animals performed 30 different sequences (out of 256 possible) on average even though they only had to produce a sequence that had at least one changeover in it (Group 1) or more than one changeover in it (Group 2). In Experiment 2, many birds produced fewer different sequences than in the first experiment, some as few as 20 different sequences. In Machado's final experiment, a replication of Page and Neuringer's (1985) study, sequence variability was reinforced rather than switching. Eight-response sequences that differed from the previous 25 sequences were reinforced. Criterion sequences made up a proportion of roughly 0.64 to 0.74 of the sequences the pigeons performed. In his concluding statements, Machado argued that even though the pigeons did in fact behave more variably when response variability was reinforced than when switching was reinforced, responding was similar across all experiments. Therefore, it seemed that the reinforcement of key-switching contributed to variability.
Machado identified three characteristics that were consistent between the two procedures: 1) the location of the first peck was usually on the same key throughout the session; 2) the probability of switching from the first key increased as the sequence progressed; and 3) the probability of switching to the initially preferred key decreased or remained constant as the sequence progressed. These characteristics were present in all experiments regardless of whether the formal contingencies specified response switching or response variation, suggesting that they stem from the same mechanism. Variability in behavior is greater, however, when explicitly reinforced than it is when only switching is reinforced. So while direct reinforcement of switching is important, it appears to be insufficient in accounting for Neuringer's results.

Behavioral Variability and Choice

Neuringer has argued that variability, per se, is an operant. Machado has argued that a major component of this is switching, a response class that can be defined with precision. Either way, if they are operants then they should be susceptible to the matching relation. When two different response classes are reinforced at different rates, their relative rate of occurrence approximately matches relative reinforcement rates, a phenomenon known as the strict matching law (Herrnstein, 1970). Because response rates do not always strictly match reinforcer rates, an alternative formulation, called the generalized matching relation, often provides superior fits to the data, at least when only two response alternatives are available (Baum, 1974; Davison & McCarthy, 1988). To further strengthen the basis behind variability as an operant and its adaptability, Neuringer (1992) connected the two dimensions of behavior: choice and variability.
The purpose of this study was to see if the relative frequency of varying and repeating response sequences was influenced by the relative rate of reinforcement for varying response sequences. In other words, will such responding follow the matching law, as do other forms of responding? Neuringer (1992, Experiment 1a) placed six pigeons in standard operant chambers, and the reinforcer was three seconds of access to grain. Each trial consisted of a sequence of four responses. Whether a sequence was reinforced depended on which contingency, VARY or REPEAT, happened to be in place. Under the VARY contingency, reinforcement was contingent on the pigeon pecking a sequence of four responses that differed from the three previous sequences. Under the REPEAT contingency, reinforcement was provided if the pigeon pecked a sequence that replicated any of the previous three sequences. Whether a trial required a VARY or REPEAT contingency was selected at random by the computer. Each requirement was reinforced with a pre-determined probability that varied across conditions. For example, in one condition, 20% of VARY sequences were reinforced during "VARY" trials while 80% of REPEAT sequences were reinforced during "REPEAT" trials. In other words, there were two response classes (VARY and REPEAT) reinforced at different, and independent, rates. There was no discriminative stimulus to signal which requirement was in place at the time. The question was whether the rate at which a response class was produced reflected the relative rate at which that class was reinforced. Neuringer (1992) reported that the percentage of variable sequences increased as a function of the relative reinforcement obtained. In other words, reinforcement increased the number of such sequences in a manner that was consistent with the matching law. Graphical presentation shows slight undermatching and a small bias for repeating sequences.
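The two matching formulations referred to above can be written out as follows. This is a sketch in conventional notation, not taken from the original text: the symbols B_V and B_R (numbers of VARY and REPEAT sequences emitted), R_V and R_R (reinforcers obtained for each class), the sensitivity parameter a, and the bias parameter b are all assumed labels.

```latex
% Strict matching (Herrnstein, 1970): relative behavior equals relative reinforcement
\frac{B_V}{B_V + B_R} = \frac{R_V}{R_V + R_R}

% Generalized matching (Baum, 1974): a is sensitivity, b is bias
\log\!\left(\frac{B_V}{B_R}\right) = a \,\log\!\left(\frac{R_V}{R_R}\right) + \log b
```

In this notation, strict matching corresponds to a = 1 and b = 1. Undermatching corresponds to a < 1, and, with the ratio written as VARY over REPEAT, a bias for repeating sequences corresponds to b < 1.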
Conclusions that can be drawn from this study are: 1) vary and repeat sequences are sensitive to reinforcement contingencies and 2) the percentages of performed sequences did not strictly match the percentages of reinforcers obtained. Neuringer discusses three reasons for his interest in animals' choosing to respond variably over repeating: a) it supports adaptive action in an environment that is not predictable; b) if variation in behavior can actually come under operant control, it may change the very nature of one's ability to predict and control behavior; and c) it may shed some light on problem solving and learning methods in which variability is usually a necessity. From his conclusions, behaving variably seems to be an adaptive dimension of behavior that animals and humans can choose to perform.

A Neurochemical Correlate?

Even as some behavioral mechanisms have been identified, the neurochemical correlates of operant variability are poorly understood. Because of dopamine's involvement in reinforcement, choice, and stereotypy, it can be hypothesized that it is involved in operant variability, as well. While this is a reasonable hypothesis, it creates a potential conflict, because excess dopamine is also associated with response stereotypy (Ward, Bailey, & Odum, 2006), which is the opposite of variability. Dopamine is a catecholamine neurotransmitter that is a biological precursor to epinephrine and norepinephrine. There are two families of dopamine receptors and three (or by some counts, two) pathways. Before discussing dopamine's involvement with reinforced behavior, the different receptor subtypes and the respective dopamine pathways in the brain must be covered. There are three major dopamine pathways, and all begin with cell bodies located in midbrain regions. The nigrostriatal tract has cell bodies located in the substantia nigra, and its fibers ascend to the caudate-putamen (i.e., the striatum). This pathway is associated with the control of movement.
Damage to it results in Parkinsonian-type effects (e.g., tremors, rigidity) (Meltzer & Stahl, 1976). The mesolimbic pathway is the second dopamine pathway; it begins in the ventral tegmental area, and its fibers ascend into areas of the limbic system. This pathway is responsible for reinforced behavior and is involved in substance abuse (Meltzer & Stahl, 1976). The third pathway is the mesocortical pathway, which also begins in the ventral tegmental area but instead ascends to areas in the cerebral cortex. The mesocortical pathway is concerned with motivation, problem solving, impulsivity, choice, and reinforcement learning (Meltzer & Stahl, 1976). How dopamine acts depends on the receptor it acts upon. One action of dopamine is to change the synthesis of cyclic AMP (adenosine monophosphate) in post-synaptic neurons, but whether it increases or decreases cAMP depends on the post-synaptic receptor. The two major subtypes of dopamine receptors can be distinguished by their actions on cAMP. These are the D1 family (D1 and D5) and the D2 family (D2, D3, and D4). Activation of D1-like receptors increases the synthesis of cyclic AMP through the stimulation of adenylyl cyclase and is excitatory post-synaptically. Classically, these receptors have been associated with cardiovascular and motor function. Unlike the D1-like receptors, activation of the D2-like receptor subtypes inhibits the stimulation of adenylyl cyclase and therefore decreases the rate at which cyclic AMP is synthesized. This results in inhibition of the post-synaptic neuron (Missale, Nash, Robinson, Jaber, & Caron, 1998). Moreover, activation of D2-like receptors also increases the likelihood that potassium ion channels will open, causing hyperpolarization and promoting further inhibition (Missale et al., 1998). Physiological functions of the D2-like receptors differ depending on where the receptors are located in the synapse.
The location of the D2-like receptors also influences the actions of drugs that target D2-specific receptors, either through their activation or deactivation. Pre-synaptically, the receptors act as autoreceptors, and their activation decreases dopamine release into the synapse and diminishes D2 activity. Deactivation of these autoreceptors, through the administration of a D2 antagonist, will result in enhancement of dopamine release from the pre-synaptic neuron (Westerink & de Vries, 1989). Post-synaptically, activation of D2-like receptors results in increased D2 function (and post-synaptic inhibition), whereas antagonism of these receptors results in decreased D2 function. To complicate matters, D2 activity, while inhibitory on the neuron, inevitably results in increased locomotor activity and arousal (Missale et al., 1998). D2-like receptors are located in the striatum, the olfactory tubercle, the core of the nucleus accumbens, and the pituitary gland, where the hormone prolactin is produced. Activating the D2 receptors in the pituitary gland inhibits the release of prolactin. They are also found in the frontal cortex, although in lower concentrations than D1 receptors. Unlike the D1-like receptors, which are located on the periphery of the synapse, the D2 receptors form a dense layer within the synapse (Schultz, 1998). Increases in dopamine concentration would, therefore, result in saturation and activation of these receptors. Once levels of dopamine return to normal, the D2 receptors remain slightly activated. Schultz (1998) noted that D2-like receptors are also sensitive, if not primarily sensitive, to changes in reinforcement contingencies. Their firing changes when there is an "unexpected" reinforcer or the absence of a reinforcer. When a response class has been reinforced, and has become well-established because of that history, the D2-like receptors in the striatum contribute to the perseveration of that response class (Kurylo, 2004).
Evidence that Dopamine Plays a Role in Behavioral Variability

In this section, we review the effects of drugs, especially dopamine agonists, on behavioral variability. These studies were undertaken to examine hypotheses pertaining to dopamine's involvement in attention-deficit disorders, behavioral variability, and how dopamine does or does not affect variability that is already present. Much of the existing research on behavioral variability involves the developmental disorder known as attention-deficit/hyperactivity disorder (ADHD). Patients with this disorder tend to behave impulsively and exhibit a high degree of inattention. Their attention span is very short and they tend to be easily distracted by their surroundings (American Psychiatric Association, 2000). Patients with ADHD show a decrease in focused attention span and an increase in variable behavior. It is therefore reasonable to hypothesize that drugs that help manage the clinical signs might also have effects on variability. Tripp and Wickens (2008) report that children with ADHD show "altered reinforcement processing" because of a dopamine "transfer deficit." During the beginning stages of learning, in primates, a burst of dopamine is associated with presentation of a reinforcer (Schultz, 1998). Over time, this burst transfers to predictive stimuli that come before the reinforcer, even in the absence of said reinforcer, a process easily recognized as conditioned reinforcement. Tripp and Wickens (2008) argue that this transfer does not occur in those diagnosed with ADHD. Because the stimuli that predict reinforcement do not acquire functional control, the individual with ADHD is functionally undergoing a delay of reinforcement or extinction when primary reinforcers are not delivered. Psychomotor stimulants such as methylphenidate are used to treat the symptoms of ADHD. These drugs promote dopamine activity by inhibiting the dopamine transporters that take up excess dopamine from the synapse.
There is an increase in dopamine transporters (DATs) in the brains of ADHD patients compared with controls (Dougherty et al., 1999; Krause, Dresel, Krause, Kung, & Tatsch, 2000). An excess of DATs in the brain means that dopamine is cleared rapidly. By blocking the DATs, the stimulants keep dopamine in the synapse longer. Promoting dopamine activity will therefore, according to the dopamine transfer theory, increase the likelihood that dopamine dynamics are linked to reward prediction. Blockade of dopamine reuptake by drugs such as cocaine or amphetamine increases the levels of dopamine in the synapse. With the added bursts of dopamine after the delivery of a primary reward and the presentation of the predictive stimuli, the concentration of dopamine will be much greater than without the administration of the drugs. This is thought to produce a stronger, or broader, association with the predictive stimuli that may be salient to the animal. The reward value will also strengthen with an increased dopamine signal. A dopamine agonist will enhance dopamine release after cues associated with the prediction of the reinforcer. The predictive reward theory was derived by studying a single response unit, lever pressing, but it may also apply to more complex response units such as response variability. This would suggest that response variability may be affected by an increase in dopamine because, as the lag requirement of a session increases, the predictive capability of the reward presentation declines. Other cues (e.g., length of the sequence, time until reward delivery) may interfere with the variation in sequences that is required in the variability procedure. The following two studies were designed to examine variability in a model of ADHD, and the effects of a psychomotor stimulant on this variability.
Mook, Jeffrey, and Neuringer (1993) studied the spontaneously hypertensive rat (SHR), a strain that has become one animal model for ADHD. These rats exhibit higher locomotor activity levels, more risk-seeking behaviors, and more variable behavior, and are more likely to approach novel objects, than the Wistar-Kyoto (WKY) strain, the background strain from which SHRs are bred (Van den Buuse & de Jong, 1989). Both sets of rats, SHRs and WKYs, were exposed to a radial arm maze task (Experiment 1) and a key-pressing task distributed between two keys (Experiment 2). In Experiment 2, only variability was measured. Overall, the SHRs responded more variably than the WKYs in both conditions. This tendency to vary appeared as a learning deficit for the SHRs when repetition was required, whether it be to re-enter an arm in the maze or to repeat a sequence of key responses. When reinforcement was contingent on entering only a subset of arms or producing a specific subset of lever responses, the WKYs were more accurate than the SHRs in restricting responding to that particular subset. However, the SHRs were more likely to vary within the subset in both the radial arm maze and key-pressing, thereby receiving more reinforcement than their counterparts during the VARY components. Mook and Neuringer (1994) examined the effects of d-amphetamine on behavior under contingencies that reinforced variable or invariable responding in SHR and WKY rats, using lever-pressing on two levers. As in the previous study, the SHRs tended to respond more variably than the WKYs. d-Amphetamine increased variability for both groups when compared with those injected with saline. In Experiment 2, reduced variability was required in that only a subset of four out of sixteen sequences could be performed and the sequence had to be different from the previously performed sequence (Lag 1) for reinforcer delivery.
In the second experiment, the SHRs that were administered amphetamine behaved similarly to the WKYs during control conditions in that they were accurate in repeating within a subset of lever presses when the repetition contingency was in place. This result is consistent with the effects of methylphenidate or other stimulants, which lead those diagnosed with ADHD to perform similarly to children who are not diagnosed with the disorder. The fact that these drugs promote dopamine activity suggests the presence of a link between dopamine and response variation. An increase of dopamine in the synapse could be associated with a decrease in the variability of responding. Ward, Bailey, and Odum (2006) administered d-amphetamine to pigeons and measured the effects on variability in a MULT REPEAT (RRLL) VARY (Lag 10) procedure. Here, the "control" or "repeat" component reinforced the production of a specific sequence (RRLL). In the VARY component, the animals were required to emit a sequence of four responses that differed from the previous ten. There was an increase in variability, as depicted by U-values, in the REPEAT component and no effect on the VARY component except at the highest dose. Our laboratory investigated the role of dopamine and its respective receptor subtypes in operant variability (Pesek-Cotton et al., 2011). d-Amphetamine (a non-specific dopamine agonist), quinpirole (a D2 receptor agonist), and SKF 38393 (a D1 receptor agonist) were administered to rats under a MULT VARY 8:4 FR 4 schedule. In this procedure, the VARY 8:4 component required the animal to complete a sequence of four responses that differed from the previous eight sequences. The FR 4 component only required the animal to complete a four-response sequence without the variability contingency.
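The VARY 8:4 criterion and the U-value measure used in these studies can be illustrated with a short sketch. It is a minimal illustration under stated assumptions, not the software used in any of the cited experiments: the function names are hypothetical, sequences are written as strings of "L" and "R", every emitted sequence (criterion or not) is assumed to enter the lookback window, and U is assumed to be the entropy of the relative sequence frequencies divided by its maximum for 16 possible four-response sequences.

```python
import math
from collections import Counter, deque
from itertools import product


def make_vary_checker(lag=8):
    """Return a function that applies a lag criterion to each new sequence."""
    recent = deque(maxlen=lag)  # holds only the last `lag` sequences emitted

    def meets_criterion(sequence):
        ok = sequence not in recent   # criterion: differs from every sequence in the window
        recent.append(sequence)       # assumption: all emitted sequences enter the window
        return ok

    return meets_criterion


def u_value(sequences, n_possible=16):
    """Normalized entropy of sequence frequencies (assumed form of the U value)."""
    if not sequences:
        raise ValueError("need at least one sequence")
    counts = Counter(sequences)
    total = len(sequences)
    # Sum of p * log2(1/p) over the sequences that actually occurred
    entropy = sum((c / total) * math.log2(total / c) for c in counts.values())
    return entropy / math.log2(n_possible)


# Usage: a lag of 8 stands in for VARY 8:4; a lag of 0 accepts every sequence, as in FR 4
all_sequences = ["".join(p) for p in product("LR", repeat=4)]
vary = make_vary_checker(lag=8)
results = [vary(s) for s in ["LLRR", "LLRR", "LRLR"]]
stereotyped_u = u_value(["LLLL"] * 20)
uniform_u = u_value(all_sequences)
```

Under these assumptions, a repeated sequence fails the criterion while a new one meets it, a session consisting of a single repeated sequence gives a U value of 0, and a session in which all 16 sequences occur equally often gives a U value of 1.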
d-Amphetamine increased variability when it was low (as in the FR 4 component) but had no effect when it was high (as in the VARY 8:4 component). Quinpirole increased variability in the FR 4 component only, whereas SKF 38393 had no effect on levels of variability in the VARY 8:4 component. Thus, increased variability in the control component was linked to D2 activity, while the D1 agonist had little effect, even when it affected response rate. The results of this study suggest that dopamine is involved in behavioral variability and that its effects appear to be dopamine receptor-subtype specific.

The Control Procedure to Use in Variability Experiments

The selection of an appropriate control procedure in experiments on reinforced variability is not a straightforward task. Several have been reported in the literature, and the control procedure used in the present studies differs somewhat from those that have been reported previously. The purpose of a control procedure is to have a comparable component in which variability is not directly reinforced but all other aspects of the experiment are similar. Ideally, it differs from the variability component on only a single dimension. In one type of control procedure, a specific sequence of responses is required for reinforcement. In two studies using rats (McElroy & Neuringer, 1990; Cohen, Neuringer, & Rhodes, 1990), a specific four-response sequence (LLRR) was required for reinforcement, while another study trained pigeons to peck a specific four-response sequence (RRLL) as one component of a multiple schedule (Odum, Ward, Barnes, & Burke, 2006). A REPEAT component serves as a control only in that it has the same number of responses in a sequence as the VARY component. The criteria for each component are different (i.e., repeat vs. vary) and the non-criterion sequences also differ from one another.
In the REPEAT component, non-criterion sequences are those that do not repeat, and in the VARY component, non-criterion sequences are those that do repeat. This type of control procedure takes the animals a relatively long time to acquire, which immediately raises concern about its utility as a control for variable responding, which is acquired quickly. In the examples given, the number of one-hour sessions needed for training of the repeat component ranged from 12 to 18 (Odum et al., 2006) to 30 (Cohen et al., 1990), and as high as 62 (McElroy & Neuringer, 1990). More important, this rigid requirement of a specific sequence does not permit a predilection for repetition to appear. Drug effects that may impair accuracy by simply producing, for example, RRRL or RLLL responses only permit one to draw conclusions about accuracy or tendencies to change over rather than about variability or repetition. Drug-induced disruption of a specific response sequence could be related to the disruption of a chain of responses (Thompson & Moerschbaecher, 1979) or the location of a changeover in a chain (Laties, 1972; Laties et al., 1981) rather than to changes in variability per se. In short, such a component may actually test the ability of an animal to produce a rigidly required chain under drug conditions, rather than serve as a control for variable responding. This is problematic as a control procedure because, as stated above, it differs from the variability component on several dimensions, including the criteria for each component (repeat vs. vary) and the non-criterion sequences (not repeating vs. not varying). Another pair of approaches entails yoking reinforcers in the VARY component to those received in the control component either (1) when an explicit sequence is required (e.g., Page & Neuringer, 1985) or (2) to any four-response sequence in which the probability of reinforcement is fixed (e.g., Ward et al., 2008).
This equates the number and timing of reinforcers per sequence between the two components, but the number of reinforcers per criterion sequence is still quite different. Perhaps more important, the first approach still compares the VARY component with a pre-defined sequence rather than with a four-response sequence that is free to vary. The problem with the second approach is that it reinforces every criterion response in the VARY component but reinforces criterion responses intermittently in the FR 4 component and, as shown in our laboratory, intermittent reinforcement changes the structure of the response sequence by making it more variable. Thus both yoked control procedures differ from the VARY component on at least two dimensions, including the number of reinforcers per criterion sequence (FR 1 reinforcement in the VARY component and intermittent reinforcement in the control component) and the criteria for sequences for each component (repeat vs. vary). A third type of control procedure used in conjunction with variability procedures is a fixed ratio of a specified number of responses distributed in any way between the left and right levers. When investigating the difference between repetitive and variable responding in rats that were administered amphetamine, Mook and Neuringer (1994) employed an FR 4 requirement across two levers as the control component. Training for the FR 4 only took four half-hour sessions. The variability component required a four-response sequence on the two levers that differed from the previous four sequences emitted (lag 4). Hunziker, Saldana, and Neuringer (1996) also trained an FR 4 requirement across two levers, which took seven 45-minute sessions to train. This procedure is similar to the experiments described in the present dissertation except that at least one response had to occur on each of the two levers or, stated differently, at least one "changeover" response was required. Two advantages of the FR 4 control procedure can be noted.
One is practical. Initial training and the establishment of a baseline occur more rapidly if the animal selects its own preferred sequences than if an arbitrary sequence is required of all animals. In fact, this may be more than practical because it makes the behavioral history of the control and experimental conditions more similar. Second, by producing large differences in variability, this control procedure can detect both increases and decreases in variability resulting from drug effects. If only one specific sequence is required, then the drug could only increase variability. The use of an FR 4 as a control procedure bypasses a problem that arises when an arbitrary chain is pre-selected, a problem that may be especially acute if the control response sequence is relatively un-preferred or difficult to execute. As noted in studies of repeated acquisition, behavior chains can vary in difficulty (Wright & Paule, 2007), and difficult chains are often those that require greater travel or changeovers among response devices. In addition, which chains are more difficult can differ across animals. We (Pesek-Cotton et al., 2011) noted that there were individual differences in which chains were produced during the FR 4 component in a previous experiment, but the preferred chains entailed zero or one changeover. Moreover, the preferred sequence varied across animals and even across sessions for a specific animal. As stated above, the control response sequence used in many studies involves one or more changeovers and these sometimes require extensive training to establish, so they can be viewed as difficult chains. Even a one-changeover chain like RRLL can require training to produce, since RLLL and RRRL, which would likely occur, would be errors. In our previous Experiment 1, every criterion sequence was reinforced and the number of reinforcers per component was held constant, but reinforcer delivery per unit of time was free to vary.
The discrepancies noted in d-amphetamine's effects between the two components may have been due to differences in the overall reinforcer rate rather than to the VARY 8:4 contingency itself. Animals required more time and produced more sequences to obtain the 10 reinforcers in the VARY 8:4 component because non-criterion sequences could, and did, occur. During the VARY 8:4 component, an average of 2.5 sequences occurred for every correct, reinforced sequence. In contrast, every four-response sequence was reinforced in the FR 4 component. This had the intended advantage of producing a large difference in variability between the components and equating the relationship between the criterion response class (four-response sequence vs. variable four-response sequence) and reinforcer delivery. This occurred, however, at the expense of producing different overall rates of reinforcement per unit time and per four-response sequence. Thus the FR 4 procedure differs from the VARY component in that reinforcement rates tend to be higher in the FR 4 component than in the VARY component. This issue was addressed in Experiment 2 of Pesek-Cotton et al. (2011) by delivering primary reinforcers for a criterion response sequence in both components under a variable interval 60 sec (VI 60 sec) schedule. This approach held constant the number of reinforcers per unit time (about 1/min) and per criterion response sequence. The use of an intermittent schedule attenuated the difference in reinforced variability between the two components. Variability during the VARY 8:4 component was relatively unaffected, as compared with Experiment 1. The difficulty with this approach is that it increased variability in the FR 4 condition, perhaps by greatly decreasing the overall reinforcement rate. As a result, the separation in variability between the two components is narrower, and only decreased variability is likely to appear as a drug effect.
The intermediate to high levels of variability that occurred in both components were resistant to d-amphetamine. While it is unclear at present why variability under the intermittent schedule was so resistant, this discrepancy does suggest that a schedule in which criterion responses are reinforced at a high rate is better suited to examining drug effects.

Timeout and Behavioral Variability

In many studies on response variability, a non-criterion sequence is defined as a sequence that matches one of the previous sequences when variability is required (Page & Neuringer, 1985; Pesek-Cotton et al., 2011; Ward et al., 2006). Non-criterion sequences result in a timeout from reinforcement in the current trial. Because variability can and does increase under extinction (Antonitis, 1951; Eckerman & Lanson, 1969; Stokes, 1995; Neuringer, Kornell, & Olufs, 2001), the presence of a timeout, per se, may create a situation where responding becomes more variable. The timeout may be producing variability that could either promote the likelihood of a reinforced trial in a "vary" component or prevent a reinforced trial in a "repeat" component. Therefore, rather than serving as a form of punishment for incorrect responses, the timeout is actually promoting responding. In a procedure in which the control component can be any four-response sequence, timeouts do not occur because there are no non-criterion sequences. This was the approach used in the experiment by Pesek-Cotton and colleagues (2011). One issue that needs to be addressed is the timeout in the VARY 8:4 component. No timeouts occurred in the FR 4 component because any four-response sequence during this component resulted in a reinforcer. Timeouts did occur in the VARY 8:4 component, however, when there was an incorrect sequence. This distinction produced overall differences in rate of reinforcement between the two components.
In addition to producing variable responding, the effectiveness of the timeout may be altered by drug effects. The behavioral effects of d-amphetamine can depend on the context under which punishment has occurred. Behavior of control animals is generally suppressed when a shock is presented with the reinforcer (e.g., food, water). Despite the rate-dependency typically produced by d-amphetamine (i.e., low rates of behavior are increased and high rates of behavior are decreased), low rates of behavior due to punishment are either unaffected or further suppressed (Foree, Moretz, & McMillan, 1973). However, if an animal has been previously exposed to a shock-avoidance/postponement procedure in which the animal can control the occurrence of the shock, d-amphetamine will increase responding when it is followed by shock, which seems to function as a reinforcer (McKearney & Barrett, 1975). Bacotti and McKearney (1979) exposed squirrel monkeys to a five-minute fixed interval schedule of food presentation. Every 30th response also produced an electric shock. Responding was suppressed by the presence of the shock, and d-amphetamine either suppressed responding even more or had no general effect at all. Next, the monkeys were trained on a shock-postponement schedule that required the animals to respond to avoid the shock. After this training, the animals returned to the original fixed interval schedule with shock presentation. d-Amphetamine then increased punished responding in all monkeys. In general, d-amphetamine increases low rates of reinforced behavior but decreases or leaves unaffected low rates of punished behavior. It is interesting to note that the behavioral effects of d-amphetamine can also be influenced by the contingencies set in place at the time of the behavior. The previous studies described d-amphetamine's effect on positively punished behavior. Positive punishment refers to the presentation of an aversive stimulus to decrease behavior.
In contrast, negative punishment can be defined as the removal of an appetitive stimulus to decrease behavior. To demonstrate the effects of d-amphetamine on negatively punished behavior, one can look at schedule-induced drinking under intermittent reinforcement. When a rodent is placed under an intermittent schedule, it begins a pattern of drinking during the times between availability of reinforcers (Falk, 1961). This drinking can be stopped by increasing the time to the next reinforcer in an interval schedule. Pérez-Padilla and Pellón (2003) studied the anti-punishment effects of d-amphetamine on schedule-induced (adjunctive) drinking when the drinking produced increases in the time to the next reinforcer. d-Amphetamine produced dose-dependent increases in adjunctive drinking even though this was counterproductive to obtaining a reinforcer (drinking during the interval produced delays to reinforcement). d-Amphetamine had no effects, however, on the drinking behavior of those animals that were yoked to the experimental group and received the delays regardless of their own drinking behavior. The authors conclude that d-amphetamine produces anti-punishment effects on behavior that has been previously extinguished by negative punishment. The increase in interval described above is analogous to a timeout in the variability procedure in that the time to reinforcement is extended. Since timeouts are a form of negative punishment, it may be asserted that d-amphetamine could increase responding that was previously suppressed. An increase in responding during a timeout removes the overall function that the timeout is supposed to be serving. It is difficult to predict the behavioral effects of d-amphetamine when there is a timeout present. Perhaps, as stated earlier, d-amphetamine will increase behavior that has been negatively punished. This runs counter to the effects of d-amphetamine on behavior that has been suppressed by positive punishment.
Alternatively, responding during a timeout may be unaffected by the presence of d-amphetamine. Because context is important, if an animal has been previously exposed to a timeout in the variability procedure, then it is possible that responding will increase during this timeout. This again will remove the function of the timeout.

References

American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders, text revision (DSM-IV-TR) (4th ed.). Arlington, VA: American Psychiatric Publishing, Incorporated.
Antonitis, J. J. (1951). Response variability in the white rat during conditioning, extinction, and reconditioning. Journal of Experimental Psychology, 42, 273-281.
Arnesen, E. M. (2000). Reinforcement of object manipulation increases discovery. Unpublished bachelor's thesis, Reed College.
Bacotti, A. V., & McKearney, J. W. (1979). Prior and ongoing experience as determinants of the effects of d-amphetamine and chlorpromazine on punished behavior. Journal of Pharmacology and Experimental Therapeutics, 211, 80-85.
Baum, W. M. (1974). On two types of deviation from the matching law: Bias and undermatching. Journal of the Experimental Analysis of Behavior, 22, 231-242.
Blough, D. S. (1966). The reinforcement of least frequent interresponse times. Journal of the Experimental Analysis of Behavior, 9, 581-591.
Bryant, D., & Church, R. M. G. (1974). The determinants of random choice. Animal Learning & Behavior, 2, 245-248.
Catania, A. C. (1998). Learning (4th ed.). Upper Saddle River, NJ: Prentice Hall.
Cohen, L., Neuringer, A., & Rhodes, D. (1990). Effects of ethanol on reinforced variations and repetitions by rats under a multiple schedule. Journal of the Experimental Analysis of Behavior, 54, 1-12.
Davison, M., & McCarthy, D. (1988). The matching law. Hillsdale, NJ: Lawrence Erlbaum Associates.
Dougherty, D. D., Bonab, A. A., Spencer, T. J., Rauch, S. L., Madras, B. K., & Fischman, A. J. (1999).
Dopamine transporter density in patients with attention deficit hyperactivity disorder. Lancet, 354, 2132.
Eckerman, D. A., & Lanson, R. N. (1969). Variability of response location for pigeons responding under continuous reinforcement, intermittent reinforcement, and extinction. Journal of the Experimental Analysis of Behavior, 12, 73-80.
Falk, J. L. (1961). Production of polydipsia in normal rats by intermittent food schedule. Science, 133, 195-196.
Foree, D. D., Moretz, F. H., & McMillan, D. E. (1973). Drugs and punished responding II: d-Amphetamine induced increases in punished responding. Journal of the Experimental Analysis of Behavior, 20, 291-300.
Grunow, A., & Neuringer, A. (2002). Learning to vary and varying to learn. Psychonomic Bulletin & Review, 9, 250-258.
Herrnstein, R. J. (1970). On the law of effect. Journal of the Experimental Analysis of Behavior, 13, 243-266.
Hunziker, M. H. L., Saldana, R. L., & Neuringer, A. (1996). Behavioral variability in SHR and WKY rats as a function of rearing environment and reinforcement contingency. Journal of the Experimental Analysis of Behavior, 65, 129-143.
Johnston, J. M., & Pennypacker, H. S. (1993). Strategies and tactics of behavioral research. Hillsdale, NJ: Lawrence Erlbaum Associates.
Krause, K. H., Dresel, S. H., Krause, J., Kung, H. F., & Tatsch, K. (2000). Increased striatal dopamine transporter in adult patients with attention deficit hyperactivity disorder: Effects of methylphenidate as measured by single photon emission computer tomography. Neuroscience Letters, 285, 107-110.
Kurylo, D. D. (2004). Effects of quinpirole on operant conditioning: perseveration of behavioral components. Behavioural Brain Research, 155, 117-124.
Laties, V. G. (1972). The modification of drug effects on behavior by external discriminative stimuli. Journal of Pharmacology and Experimental Therapeutics, 183, 1-13.
Laties, V. G., Wood, R. W., & Rees, D. C. (1981). Stimulus control and the effects of d-amphetamine in the rat.
Psychopharmacology, 75, 277-282.
Machado, A. (1997). Increasing the variability of response sequences in pigeons by adjusting the frequency of switching between two keys. Journal of the Experimental Analysis of Behavior, 68, 1-25.
McElroy, E., & Neuringer, A. (1990). Effects of alcohol on reinforced repetitions and reinforced variation in rats. Psychopharmacology, 102, 49-55.
McKearney, J. W., & Barrett, J. E. (1975). Punished behavior: Increases in responding after d-amphetamine. Psychopharmacologia, 41, 23-26.
Meltzer, H. Y., & Stahl, S. M. (1976). The dopamine hypothesis of schizophrenia: A review. Schizophrenia Bulletin, 2, 19-76.
Miller, G. A., & Frick, F. C. (1949). Statistical behavioristics and sequences of responses. Psychological Review, 56, 311-324.
Missale, C., Nash, S. R., Robinson, S., Jaber, M., & Caron, M. (1998). Dopamine receptors: From structure to function. Physiological Reviews, 78, 189-225.
Mook, D. M., Jeffrey, J., & Neuringer, A. (1993). Spontaneously hypertensive rats (SHR) readily learn to vary but not to repeat instrumental responses. Behavioral & Neural Biology, 59, 126-135.
Mook, D. M., & Neuringer, A. (1994). Different effects of amphetamine on reinforced variations versus repetitions in spontaneously hypertensive rats (SHR). Physiology & Behavior, 56, 939-944.
Neuringer, A. (1991). Operant variability and repetition as functions of interresponse time. Journal of Experimental Psychology: Animal Behavior Processes, 17, 3-12.
Neuringer, A. (1992). Choosing to vary and repeat. Psychological Science, 3, 246-250.
Neuringer, A. (2002). Operant variability: Evidence, functions, and theory. Psychonomic Bulletin & Review, 9, 672-705.
Neuringer, A., Kornell, N., & Olufs, M. (2001). Stability and variability in extinction. Journal of Experimental Psychology: Animal Behavior Processes, 27, 79-94.
Odum, A. L., Ward, R. D., Barnes, C. A., & Burke, K. A. (2006).
The effects of delayed reinforcement on variability and repetition of response sequences. Journal of the Experimental Analysis of Behavior, 86, 159-179.
Page, S., & Neuringer, A. (1985). Variability is an operant. Journal of Experimental Psychology: Animal Behavior Processes, 11, 429-452.
Pérez-Padilla, A., & Pellón, R. (2003). Amphetamine increases schedule-induced drinking reduced by negative punishment procedures. Psychopharmacology, 167, 123-129.
Pesek-Cotton, E. F., Johnson, J. E., & Newland, M. C. (2011). Reinforcing behavioral variability: An analysis of dopamine-receptor subtypes and intermittent reinforcement. Pharmacology, Biochemistry and Behavior, 97, 551-559.
Pryor, K. W., Haag, R., & O'Reilly, J. (1969). The creative porpoise: Training for novel behavior. Journal of the Experimental Analysis of Behavior, 12, 653-661.
Schoenfeld, W. N., Harris, A. H., & Farmer, J. (1966). Conditioning response variability. Psychological Reports, 19, 551-557.
Schultz, W. (1998). Predictive reward signal of dopamine neurons. Journal of Neurophysiology, 80.
Schwartz, B. (1980). Development of complex stereotyped behavior in pigeons. Journal of the Experimental Analysis of Behavior, 33, 153-166.
Schwartz, B. (1982a). Failure to produce variability with reinforcement. Journal of the Experimental Analysis of Behavior, 37, 171-181.
Schwartz, B. (1982b). Reinforcement-induced behavioral stereotypy: How not to teach people to discover rules. Journal of Experimental Psychology: General, 111, 23-59.
Stokes, P. D. (1995). Learned variability. Animal Learning and Behavior, 23, 164-176.
Thompson, D. M., & Moerschbaecher, J. M. (1979). An experimental analysis of the effects of d-amphetamine and cocaine on the acquisition and performance of response chains in monkeys. Journal of the Experimental Analysis of Behavior, 32, 433-44.
Tripp, G., & Wickens, J. R. (2008). Research review: Dopamine transfer deficit: A neurobiological theory of altered reinforcement mechanisms in ADHD.
Journal of Child Psychology and Psychiatry, 49, 691-704.
Van den Buuse, M., & de Jong, W. (1989). Differential effects of dopaminergic drugs on open field behavior of spontaneously hypertensive rats and normotensive Wistar-Kyoto rats. Journal of Pharmacology and Experimental Therapeutics, 248, 1189-1196.
Ward, R. D., Bailey, E. M., & Odum, A. L. (2006). Effects of d-amphetamine and ethanol on variable and repetitive key-peck sequences in pigeons. Journal of the Experimental Analysis of Behavior, 86, 285-305.
Ward, R. D., Kynaston, A. D., Bailey, E. M., & Odum, A. L. (2008). Discriminative control of variability: Effects of successive stimulus reversals. Behavioral Processes, 78, 17-24.
Westerink, B. C., & de Vries, J. B. (1989). On the mechanism of neuroleptic induced increase in striatal dopamine release: Brain dialysis provides direct evidence for mediation by autoreceptors localized on nerve terminals. Neuroscience Letters, 99, 197-202.
Wright, L. K. M., & Paule, M. G. (2007). Response sequence difficulty in an incremental repeated acquisition (learning) procedure. Behavioural Processes, 75, 81-4.
Zeiler, M. (1977). Schedules of reinforcement: the controlling variables. In Honig, W. K. & Staddon, J. E. R. (Eds.), Handbook of operant behavior (pp. 201-232). Englewood Cliffs, NJ: Prentice Hall.

CHAPTER 2

A Parametric Examination of Intermittent Reinforcement on Behavioral Variability

Abstract

When reinforcement is infrequent or is discontinued, as during extinction, behavior begins to vary. The extent to which intermittent reinforcement produces variability can change based on reinforcer density and schedule type. The purpose of the current experiment was to examine parametrically the effects of intermittency on operant variability in a session by using different rates of reinforcement (both rich and lean). Another purpose was to assess differences, if any, between interval schedules and ratio schedules of reinforcement on variability either when it is required or permitted.
Long Evans rats were trained to press two levers under two conditions. In the VARY 8:4 component, any four-response sequence distributed between these levers that differed from the previous 8 sequences was eligible for reinforcement. In the FR 4 component, all four-response sequences were eligible. These were treated as response units that were, in turn, reinforced under different overarching schedules, i.e., these were unit schedules under a second-order schedule. First the rats responded under a mult FR 1 (VARY 8:4) FR 1 (FR 4) schedule in which every criterion sequence was reinforced with a sucrose pellet. To examine the effect of intermittent reinforcement, criterion sequences were reinforced intermittently according to a Variable Interval schedule of either 10 sec or 60 sec (e.g., mult VI 60-sec (VARY 8:4) VI 60-sec (FR 4)). Finally, to equate the overall reinforcement density (defined as total four-response sequences/reinforcer), a Variable Ratio 2.5 schedule was imposed in the FR 4 component only (i.e., mult FR 1 (VARY 8:4) VR 2.5 (FR 4)). There was much higher variability (based on an entropy measure) in the VARY 8:4 than in the FR 4 component in the FR 1 condition. All intermittent reinforcement schedules, both the VI and the VR contingencies, increased entropy in the FR 4 component but did not affect behavior in the VARY 8:4 component. Variability in the unit FR 4 component reflected the prevailing schedule and returned to baseline rapidly upon imposition of the baseline schedule. Intermittent reinforcement blunted the differences between the VARY 8:4 and FR 4 components regardless of the schedule parameters.

A Parametric Examination of Intermittent Reinforcement on Behavioral Variability

Introduction

The reinforcement of behavior is classically thought to result in repetitive responding that shows little variation since a reinforcer, by definition, strengthens or selects the response that produces it.
Variability, a functional component of behavior that is necessary for learning or behavior change to occur, might seem to be incompatible with operant conditioning. Reinforcement, however, can produce variation in behavior when variability is directly targeted, and operant variability is now a well-established response class (Neuringer, 2002). The creation of a solid method for bringing operant variability into the laboratory is an important step toward understanding this phenomenon. Environment-behavior interactions can yield operant variability and reinforcement schedules can be a determinant of variability, or the lack of it. Frequent reinforcement, for example, under certain conditions will produce stereotypies in operant responding (Schwartz, 1980). Behavior begins to vary as reinforcement becomes less frequent (Pesek-Cotton, Johnson, & Newland, 2011, Experiment 2) or is discontinued, as during extinction (Antonitis, 1951; Eckerman & Lanson, 1969; Stokes, 1995; Neuringer, Kornell, & Olufs, 2001). This type of variability may be referred to as schedule-induced variability (Souza, Abreu-Rodrigues, & Bauman, 2010). Direct reinforcement of behavioral variability, otherwise known as operant variability (Neuringer, 2002), is another determinant of its occurrence.

The extent to which intermittent reinforcement produces variability can change based on reinforcer density. Eckerman and Lanson (1969) studied pigeons' pecking responses along a narrow strip under continuous and random interval schedules of reinforcement. During the continuous reinforcement phase, pecking was concentrated on a small location and little variability was noted, but during the intermittent phase of the experiment, pecking became more variable and occurred at different positions along the strip. The authors attributed the variable pecking behavior to intermittent reinforcement's resemblance to extinction.
In addition to reinforcement density, the type of intermittent reinforcement schedule (i.e., ratio vs. interval, fixed vs. variable) influences the occurrence of variability in responding (Boren, Moerschbaecher, & Whyte, 1978). Ratio schedules produce high rates of responding because the more responses that occur, the more likely that those responses will be reinforced and because they tend to reinforce short interresponse times (IRTs) (Zeiler, 1977). Stereotypy is often engendered with ratio schedules because the short IRTs are incompatible with variability and variability will often slow responding. Under interval schedules, the link between response rate and reinforcement rate is weaker and interval schedules tend to reinforce moderate or longer interresponse times (Baum, 1993; Cole, 1994). This permits other behavior to arise and so such schedules may contribute to variability. Pesek-Cotton et al. (2011, Experiment 2) measured the effects of intermittent reinforcement (Variable Interval 60 sec, VI 60-sec) on variability when it was required and, in a control condition, when it was merely allowed. In the VARY 8:4 component, a four-response sequence was reinforced if it differed from the previous eight sequences. In the FR 4 component, any sequence of four responses would result in a reinforcer but variability was not required. High levels of variability occurred in the VARY 8:4 component whereas very low levels occurred in the FR 4 component. Then these four-response sequences were placed under an overarching intermittent schedule of reinforcement. That is, the VARY 8:4 performance was treated as a response unit that was reinforced under a Variable Interval 60-sec (VI 60-sec) schedule, in which a sequence of four responses was reinforced on average every 60 seconds if it differed from the previous eight sequences. Similarly, the FR 4 sequence was also treated as a response unit that was reinforced under a VI 60-sec
schedule in which any sequence of four responses was reinforced on average every 60 seconds. This contingency was set up to equate reinforcement rates, as measured by reinforcers/unit time, between the VARY 8:4 and FR 4 conditions. Reinforcing criterion sequences intermittently under the VI 60-sec schedule increased variability in the FR 4 component to nearly the levels seen in the VARY 8:4 component, but intermittent reinforcement had no effect on variability when the response unit was the VARY 8:4 contingency. This large increase in variability in the FR 4 component may have been due to decreased reinforcement caused by the lean reinforcement schedule. Higher response variability, as noted above, can be a result of lower reinforcement density. Perhaps using a shorter interval (i.e., a richer schedule of reinforcement) would have yielded different results. Another possibility is that the type of schedule contributed to the increased variability in the FR 4 component. Interval schedules tend to reinforce longer interresponse times, which permit the occurrence of more variability. Ratio schedules, on the other hand, permit more stereotypic responding. The purpose of the current experiment was to examine parametrically the effects of intermittency on variability when it is required or permitted in a session. This was accomplished by using different reinforcement densities (both rich and lean) and two different types of schedules (interval and ratio). Perhaps a richer schedule or a ratio schedule of reinforcement, which are still both intermittent, would not increase variability in responding.

Method

Subjects

The subjects were 16 male Long-Evans rats housed in a temperature- and humidity-controlled, AAALAC-accredited colony room that was maintained on a 12-hour light-dark cycle (lights on at 7:00 a.m.). Adult male rats were maintained at 300 grams by individualized feeding of a chow diet.
This is a weight that establishes food as an effective reinforcer while being consistent with good health. Some of an animal's caloric intake was provided during experimental sessions but supplemental feeding was conducted to maintain their body weights. The Auburn University IACUC guidelines for caloric restriction were followed.

Testing Apparatus

The experiments were conducted in 16 commercial operant chambers (Med Associates Inc. model #Med ENV 007) containing two front levers, each calibrated so that 0.20 N registered a press. A pellet dispenser was situated midway between the two front levers and the reinforcer was a 45 mg sucrose pellet (Purina Mills, Inc., St. Louis, MO). Sonalert tones (2900 and 4500 Hz, nominally) were calibrated to an amplitude of 70 dBC. A house light (28 V, 100 mA) was located midway at the top of the back wall, opposite the levers, and a light emitting diode (LED) was above each lever. Dimensions of the chambers were 12" L x 9 1/2" W x 11 1/2" H. Each chamber was surrounded by a sound-attenuating cabinet with a built-in ventilating fan that circulated air into the experimental environment and provided masking white noise. Programs for experimental procedures and data collection were written using MED-PC IV (Med Associates, Georgia, VT). Session events were recorded with 0.01-sec resolution.

Procedure

Rats were trained to execute a four-response sequence distributed between the left (L) and right (R) levers as described below. Training was accomplished as follows. Lever-pressing was autoshaped in two separate sessions using a Fixed-Time 5.5 min (FT 5.5 min) schedule in which a sucrose pellet was delivered, non-contingent on responding, during the last 30 seconds of a 5.5 minute interval. In the first autoshaping procedure, the left lever was extended, a high tone was sounded, and a pellet was delivered based on the FT 5.5 minute schedule. The lever then retracted and the cycle occurred again in the next 5.5 minutes.
A press on the lever, which could only occur during the 30 sec that it was extended, also delivered a pellet. Once 10 pellets had been delivered via lever-pressing, pellet delivery was contingent only on lever-pressing, i.e., the FT schedule was removed. This continued until there were 100 reinforced left lever presses. Then the same procedure occurred for training of the right lever. Once autoshaping was complete, or 100 reinforced responses on each lever occurred, the animal was trained under a fixed-ratio 4 (FR 4) schedule in which the animal was trained to perform a sequence of four lever presses by gradually incrementing the sequence length from one to four. Technically, this is an FR 1 (FR 4) schedule in which every four-response sequence is reinforced. During this component a consistent low tone (2900 Hz) was sounded. Training the VARY condition involved the removal of the low tone and the additional criterion that a four-response sequence must differ from the previous sequence. This was called a "Lag 1" criterion because only one previous sequence was considered. The reinforcement cycle began when the animal met the lag requirement. Criterion trials ended in a 3-sec inter-trial interval, in which both levers retracted and a pellet was delivered. Non-criterion trials ended in a 15-sec timeout during which all lights in the chamber darkened and no pellet was delivered. Once at least half of the sequences met the variability criterion, the lag was increased by one sequence for the next session. For example, once the animal met the criteria for Lag 1, a Lag 2 criterion was imposed, in which the current four-response sequence must differ from the previous two sequences executed. Then a Lag 3 was imposed, and so on. The maximal number of different possible sequences with a four-response sequence and two levers is 2^4, or 16.
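
The lag criterion described above can be illustrated with a short Python sketch. This is not the MED-PC program used in the experiment; the function names are hypothetical, and the sketch assumes that every completed sequence (criterion or not) enters the comparison history, which the text does not state explicitly.

```python
from collections import deque
from itertools import product


def all_sequences(levers="LR", length=4):
    """Enumerate every possible sequence (2**4 = 16 for two levers, four presses)."""
    return ["".join(s) for s in product(levers, repeat=length)]


def make_lag_checker(lag):
    """Return a function that applies a Lag-N criterion to successive sequences.

    Assumption: all completed sequences are stored in the history,
    whether or not they met the criterion.
    """
    history = deque(maxlen=lag)  # keeps only the most recent `lag` sequences

    def meets_criterion(sequence):
        ok = sequence not in history  # must differ from each of the last `lag`
        history.append(sequence)
        return ok

    return meets_criterion


if __name__ == "__main__":
    check = make_lag_checker(lag=8)  # final VARY 8:4 requirement
    for seq in ["LLRL", "LLRL", "RRRR"]:
        outcome = "reinforcer" if check(seq) else "timeout"
        print(seq, outcome)
```

Under a Lag 8 requirement with only 16 possible sequences, at least half of the sequence space is always eligible, which may help explain why the criterion can be met without fully random responding.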
If responding resulted in more errors than reinforcers for 5 consecutive days on a specific lag, the requirement was lowered to the previous lag that was achieved. The Lag 8 criterion constituted final performance. All animals reached the VARY 8:4 schedule within 8-10 sessions.

Baseline phases: Reinforcement of every criterion sequence. The target procedure during each baseline was a multiple schedule containing an FR 4 and a VARY 8:4 component. Technically, the VARY 8:4 and FR 4 schedules could be viewed as unit schedules of a second-order FR schedule of reinforcement, with the designation mult FR 1 (VARY 8:4) FR 1 (FR 4). In the FR 4 component, any four-response sequence was reinforced, hereafter referred to as the baseline FR 4 component. Under the VARY 8:4 schedule, a four-response sequence was reinforced if it differed from the previous eight sequences. For example, if the current sequence was LLRL then a reinforcer would be delivered if none of the previous eight sequences was LLRL. Non-criterion trials ended in a 15-sec timeout, during which all lights in the chamber darkened and no pellet was delivered. No tone sounded during the VARY 8:4 component. The VARY 8:4 and FR 4 components were presented in strict alternation as a multiple schedule. Components changed after 10 reinforcers. All training and testing sessions began with illumination of both the house and lever lights. The reinforcement cycle commenced immediately after the animal met the response requirement and began by turning off the lever lights and low tone, sounding a high tone (4500 Hz) for 0.5 sec, and delivering a 45-mg sucrose pellet. The sessions ended after one hour or after 100 reinforcers were presented, whichever occurred first. The next experimental phase was introduced after responding stabilized (approximately 3-6 sessions). There was a return to baseline on completion of each phase.

Phase I: Intermittent reinforcement of criterion sequences using time-based schedules.
Rats were divided into two separate groups of eight each that were approximately matched for response rate and variability performances as denoted by individual U-values (described below). For Group 1 animals, criterion response sequences were reinforced under a VI 10-sec schedule of reinforcement. Thus, whether a four-response sequence was eligible for reinforcement was determined using criteria described in the previous section. Therefore, these could be viewed as second-order schedules with a VARY 8:4 or an FR 4 unit schedule being reinforced under a VI 10-sec schedule. That is, completion of the unit schedule was reinforced randomly but, on average, every 10 sec. Technically, this is a mult VI 10-sec (VARY 8:4) VI 10-sec (FR 4). Hereafter multiple schedule components are referred to as the VARY 8:4 or FR 4 component. Group 2 animals were placed under an overriding VI 60-sec schedule of reinforcement in both components (mult VI 60-sec (VARY 8:4) VI 60-sec (FR 4)). This was done to compare levels of variability under both rich and lean schedules of reinforcement in which the number of reinforcers per unit of time was held constant. All criterion sequences ended in a 0.5 sec high-pitched tone which was paired with a sucrose pellet only when the schedule requirement was fulfilled. Phase I ended when responding had stabilized (approximately 18 sessions).

Phase II: Intermittent reinforcement of criterion sequences using response-based schedules. We found that approximately one of every 2.5 sequences (2 sequences of 5) met criterion during the VARY component. Therefore, in Phase II, every criterion sequence in the VARY component was reinforced but criterion sequences completed during the FR 4 component were reinforced under a VR 2.5 schedule of reinforcement (mult FR 1 (VARY 8:4) VR 2.5 (FR 4)). This was done to hold the number of criterion sequences per reinforcer equal between both components.
The VR 2.5 condition was imposed twice for replication purposes with a return to baseline in between and will be designated as VR 2.5 (a) and VR 2.5 (b). Conditioned reinforcement of criterion sequences occurred as described in Phase I.

Statistical Analyses

The dependent measures were total responses and the U-value, or entropy, an index of overall variability in the sequences produced (Page & Neuringer, 1985). The U statistic is given by the following equation:

$$U = -\sum_{i=1}^{16} \frac{p_i \log_2(p_i)}{\log_2(n)}$$

where p_i is the probability of a given sequence i, and n is the total number of sequences possible, or 2^N (here, 2^4 = 16). A U value of 1.00 signifies each sequence occurred 1/16th of the time and a U value of 0.00 signifies that only one sequence was produced. A repeated-measures analysis of variance (RMANOVA) was performed for the dependent variable for each phase (i.e., baseline, VI 10-sec, VI 60-sec, etc.), with phase and component (VARY 8:4 vs. FR 4) as the within-subjects factors. When a phase X component interaction was significant, post hoc tests were performed using paired-sample t-tests. A p value of 0.05 was considered statistically significant. Huynh-Feldt corrections to univariate tests were used. All statistical analyses were performed using SYSTAT 12 (SYSTAT Software Inc., Richmond, CA, USA).

Results

Figures 1 and 2 represent data from two individual animals, one from each group. These figures display behavior across each experimental phase.

Baseline

Baseline performance during the unit FR 4 component was markedly different from that during the unit VARY 8:4 component for all animals (black bars in Figures 3 and 5). In the VARY 8:4 component, U-values for individual subjects ranged from 0.62 to 0.93 in both Group 1 and Group 2, signifying a high degree of variability in the sequences produced.
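
For readers who wish to reproduce the entropy measure, the U-value defined in the Statistical Analyses section can be computed from a session's list of four-response sequences roughly as follows. This is a minimal Python sketch, not the analysis code used for the dissertation (analyses were run in SYSTAT); the function name and the `n_possible` parameter are hypothetical.

```python
import math
from collections import Counter


def u_value(sequences, n_possible=16):
    """Normalized entropy (U) of the observed sequences.

    sequences  : iterable of sequence labels, e.g. ["LLRL", "RRRR", ...]
    n_possible : number of possible sequences (2**4 = 16 for two levers)
    Sequences that never occurred contribute zero to the sum.
    """
    counts = Counter(sequences)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one sequence is required")
    entropy = 0.0
    for count in counts.values():
        p = count / total                # relative frequency of this sequence
        entropy -= p * math.log2(p)      # accumulates -sum(p_i * log2(p_i))
    return entropy / math.log2(n_possible)


if __name__ == "__main__":
    print(u_value(["LLLL"] * 20))                  # one sequence only
    print(u_value(["LLLL", "RRRR", "LRLR", "RLRL"] * 5))
```

Dividing by log2 of the number of possible sequences bounds U between 0 (a single repeated sequence) and 1 (all 16 sequences equally frequent), which is what allows U-values to be compared across components and phases.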
In the FR 4 component, U-values were substantially lower, ranging from 0 to 0.61 across individual subjects in both Group 1 and Group 2. Total responding was always higher in the VARY 8:4 component than in the FR 4 component for both groups during baseline (black bars in Figures 4 and 6).

Intermittent Reinforcement

Group 1. As during baseline, entropy was significantly higher in the unit VARY 8:4 component than in the unit FR 4 component (F (1, 7) = 54.14, p < .001) (see Figure 3). There was also a significant effect of phase (F (6, 42) = 11.52, p < .001) and a significant Component X Phase interaction (F (6, 42) = 11.42, p < .001) because the effect of phase occurred only in the FR 4 component. Post hoc analyses show that there were significant differences (p < .05) between each of the intermittent phases and their corresponding baselines in the FR 4 component. Total responding was higher in the unit VARY 8:4 component than in the unit FR 4 component (F (1, 7) = 8.56, p < .05) (Figure 4). There was a significant effect of phase (F (6, 42) = 19.57, p < .001) and a significant Component X Phase interaction (F (6, 42) = 54.14, p < .001), which indicated that the intermittent phases changed total responding from baseline differently in the two components. Post hoc analyses show significant decreases (p < .05) in responding in the VARY 8:4 component during the VI 10-sec schedule and increases in responding during the first VR 2.5 phase. In the FR 4 component, all intermittent phases significantly increased (p < .05) total responding from the baseline values.

Group 2. In Group 2, entropy was significantly higher in the unit VARY 8:4 component than in the unit FR 4 component (F (1, 6) = 106.93, p < .001) (Figure 5). One animal in Group 2 did not respond during all sessions of the second VR 2.5 challenge, therefore its data were removed from analysis in this section.
There was a significant effect of phase (F (6, 36) = 16.74, p < .001) and a significant Component X Phase interaction (F (6, 36) = 24.70, p < .001), showing that intermittent phases produced higher entropy values than non-intermittent phases in the FR 4 component. In the VARY 8:4 component, however, one intermittent phase (VI 60-sec) produced lower entropy values than its corresponding baseline. Post hoc analyses confirmed that the changes in entropy values in both components were significant (p < .05). For total responding (Figure 6), there was a significant effect of component (F (1, 7) = 16.95, p < .01); more responses occurred during the VARY 8:4 component than during the FR 4 component. There was a significant effect of phase (F (6, 42) = 5.52, p < .01) and a significant interaction (Component X Phase, F (6, 42) = 32.87, p < .001); intermittent phases produced a larger number of responses than baseline phases during the FR 4 component. In the VARY 8:4 component, total response values increased or decreased depending on the intermittent schedule. The VI 60-sec schedule and the second VR 2.5 schedule decreased total responding, while the first VR 2.5 phase increased total responding. Post hoc analyses confirmed that the changes in total responding in both components were significant (p < .05).

Group 1 (VI 10-sec) vs. Group 2 (VI 60-sec). During the Variable Interval phases, entropy values in both groups were significantly higher in the unit VARY 8:4 component than in the unit FR 4 component (F (1, 7) = 12.23, p < .05 for Group 1 and F (1, 7) = 44.34, p < .01 for Group 2). There were no significant differences between Groups 1 and 2 in entropy in the VARY 8:4 (F (1, 7) = 1.89, p = .21) or the FR 4 (F (1, 7) = .15, p = .71) components. Also in the interval phases, total responding was significantly higher in the VARY 8:4 component than in the FR 4 component (F (1, 7) = 7.09, p < .05 for Group 1 and F (1, 7) = 23.58, p < .01 for Group 2).
There were no significant differences between Groups 1 and 2 in total responding in the VARY 8:4 (F (1, 7) = .31, p = .59) or the FR 4 (F (1, 7) = 1.29, p = .29) components.

Group 1 (VR 2.5 (a)) vs. Group 2 (VR 2.5 (a)). During the first implementation of the variable ratio phases, entropy values in both groups were significantly higher in the unit VARY 8:4 component than in the unit FR 4 component (F (1, 7) = 38.07, p < .001 for Group 1 and F (1, 7) = 23.32, p < .01 for Group 2). There were no significant differences between Groups 1 and 2 in entropy in the VARY 8:4 (F (1, 7) = .12, p = .74) or the FR 4 (F (1, 7) = .43, p = .53) components. Also in the variable ratio phases, total responding was significantly higher in the unit VARY 8:4 component than in the unit FR 4 component (F (1, 7) = 14.84, p < .01 for Group 1 and F (1, 7) = 15.66, p < .01 for Group 2). There were no significant differences between Groups 1 and 2 in total responding in the VARY 8:4 (F (1, 7) = .32, p = .59) or the FR 4 (F (1, 7) = .33, p = .59) components.

Group 1 (VR 2.5 (b)) vs. Group 2 (VR 2.5 (b)). In the second implementation of the variable ratio phases, entropy values in both groups were significantly higher in the unit VARY 8:4 component than in the unit FR 4 component (F (1, 7) = 37.32, p < .001 for Group 1 and F (1, 6) = 11.03, p < .05 for Group 2). There were no significant differences between Groups 1 and 2 in entropy in the VARY 8:4 (F (1, 7) = 2.34, p = .17) or the FR 4 (F (1, 7) = .65, p = .45) components. Total responding was not significantly different between the unit VARY 8:4 and unit FR 4 components during this phase. There were no significant differences between Groups 1 and 2 in total responding in the VARY 8:4 (F (1, 7) = .42, p = .54) or the FR 4 (F (1, 7) = 2.29, p = .17) components.

Discussion

The present experiment compared entropy between rich and lean schedules of reinforcement in order to examine the impact of reinforcement density on operant variability.
When every criterion sequence was reinforced during baseline conditions, there was a large difference in variability (as indicated by U-values) between the VARY 8:4 and FR 4 components. U-values were high when variability was explicitly reinforced in the FR 1 (VARY 8:4) component, and were low when variability was permitted but not required, as in the FR 1 (FR 4) components used during baseline. Introducing intermittent reinforcement in the form of interval- or response-based schedules always increased variability in the unit FR 4 component. This increase was coupled with an increase in responding in that component.

Whether intermittent reinforcement was rich or lean did not affect entropy levels or total responses. Whenever reinforcement was based on time, i.e., under the VI schedules, both rich and lean schedules increased entropy to around 0.6 in the unit FR 4 component while having little effect in the unit VARY 8:4 component. Total responding decreased during the VARY 8:4 component and increased during the FR 4 component when reinforcement was based on the interval schedules. Overall, intermittent interval schedules of reinforcement, whether rich or lean, attenuated the differences in entropy levels between the VARY 8:4 and the FR 4 components.

The response-based intermittent schedule (VR 2.5 (FR 4)) tested the importance of maintaining similar rates of reinforced sequences. In the VR 2.5 phase, all criterion sequences were reinforced during the VARY 8:4 component, but on average only one of every 2.5 sequences was reinforced during the FR 4 component. This value was chosen because we found that 2 out of every 5 sequences (i.e., one per 2.5 sequences) were reinforced in the VARY 8:4 component. As with the VI schedules, when intermittent reinforcement was introduced there was a significant increase in entropy and total responses in the unit FR 4 component.
Unlike the VI schedules, responding increased in the unit VARY 8:4 component during the first implementation of the VR 2.5 phase even though this schedule was not imposed during that component. The imposition of a response-based schedule did not alter the entropy during the VARY 8:4 component for Group 2. These values were indistinguishable from the values seen for Group 1.

Building a stronger method for producing behavioral variability requires taking into account the criteria for an appropriate control procedure. All aspects of the variability component should be present in the control component, except for the actual "be variable" contingency. These include the same number of responses per sequence, the same outcomes for criterion and non-criterion responses, and similar reinforcement rates between both components. An approach that is often used entails yoking reinforcers in the control component to those received in the VARY component (e.g., Denney & Neuringer, 1998; Ward, et al., 2008). This equates the timing of reinforcers per sequence between the two components and the number of responses per sequence, but the number of reinforcers per criterion sequence is still quite different. This approach reinforces every criterion response in the VARY component but reinforces criterion responses intermittently in the FR 4 component. As shown in the present study and in others (Pesek-Cotton, et al., 2011; Eckerman & Lanson, 1969), intermittent reinforcement changes the structure of the response sequence by making it more variable. The goals of an appropriate control procedure, therefore, are that the two components differ on as few dimensions as possible, that training time be similar between the two components, that there be a large difference in variability between the two components, and that the two components be similar in the relationship between the operant and the consequence.
The present study fulfilled each of these goals during baselines, with the exception of differing reinforcement rates. When reinforcement rates were equated by implementing the intermittent schedules, however, other goals were undermined. There were no longer large differences in variability between the two components, and when the VR schedule was introduced, the relationship between the operant and the consequence was no longer similar between the two components.

The FR 1 (FR 4) component used during baselines in the present study and in Experiment 1 of Pesek-Cotton, et al. (2011) could be considered an appropriate control for behavioral variability studies despite the occurrence of different reinforcement rates. Intermittent reinforcement, regardless of reinforcement rate, whether by way of the VI schedule that equates reinforcement density or the VR schedule that equates the number of sequences per reinforcer, increased variability in the FR 4 condition, so there was a narrower separation between the two components. The components should be as distinct as possible, with large differences in variability, for any effects to be detected. Schedule type (interval vs. ratio) did not appear to be a factor in determining the level of variability that occurred in the FR 4 component. Any intermittent reinforcement phase caused an increase in variable behavior and, therefore, would not be a sufficient control in a drug administration study. Any effects seen could not be attributed unambiguously to the drug, because they might also reflect the intermittent schedule. Finally, any drug effects monitored during such an experiment could appear only as decreased variability.

The present study focused on the influences of reinforcement density and schedule type on variability that is directly reinforced or permitted. This study demonstrated both schedule-induced variability (a by-product of an existing contingency) and operant variability (directly reinforced).
Variability in the unit FR 4 component reflected the prevailing schedule and returned to baseline rapidly upon imposition of the baseline schedule. Thus, there was no influence of even recent history. Intermittent reinforcement blunted the differences between the VARY 8:4 and FR 4 components regardless of the schedule parameters.

References

Antonitis, J. J. (1951). Response variability in the white rat during conditioning, extinction, and reconditioning. Journal of Experimental Psychology, 42, 273-281.
Baum, W. M. (1993). Performances on ratio and interval schedules of reinforcement: Data and theory. Journal of the Experimental Analysis of Behavior, 59, 245-264.
Boren, J. J., Moerschbaecher, J. M., & Whyte, A. A. (1978). Variability of response location on fixed-ratio and fixed-interval schedules of reinforcement. Journal of the Experimental Analysis of Behavior, 30, 63-67.
Cole, M. R. (1994). Response-rate differences in variable-interval and variable-ratio schedules: An old problem revisited. Journal of the Experimental Analysis of Behavior, 61, 441-451.
Denney, J., & Neuringer, A. (1998). Behavioral variability is controlled by discriminative stimuli. Animal Learning & Behavior, 26, 154-162.
Eckerman, D. A., & Lanson, R. N. (1969). Variability of response location for pigeons responding under continuous reinforcement, intermittent reinforcement, and extinction. Journal of the Experimental Analysis of Behavior, 12, 73-80.
Mowrer, O. H., & Jones, H. (1945). Habit strength as a function of the pattern of reinforcement. Journal of Experimental Psychology, 35, 293-311.
Neuringer, A. (2002). Operant variability: Evidence, functions, and theory. Psychonomic Bulletin & Review, 9, 672-705.
Neuringer, A., Kornell, N., & Olufs, M. (2001). Stability and variability in extinction. Journal of Experimental Psychology: Animal Behavior Processes, 27, 79-94.
Page, S., & Neuringer, A. (1985). Variability is an operant.
Journal of Experimental Psychology: Animal Behavior Processes, 11, 429-452.
Pesek-Cotton, E. F., Johnson, J. E., & Newland, M. C. (2011). Reinforcing behavioral variability: An analysis of dopamine-receptor subtypes and intermittent reinforcement. Pharmacology, Biochemistry and Behavior, 97, 551-559.
Schwartz, B. (1980). Development of complex stereotyped behavior in pigeons. Journal of the Experimental Analysis of Behavior, 33, 153-166.
Souza, A., Abreu-Rodrigues, J., & Baumann, A. (2010). History effects on induced and operant variability. Learning & Behavior, 38, 426-437.
Stokes, P. D. (1995). Learned variability. Animal Learning & Behavior, 23, 164-176.
Ward, R. D., Kynaston, A. D., Bailey, E. M., & Odum, A. L. (2008). Discriminative control of variability: Effects of successive stimulus reversals. Behavioural Processes, 78, 17-24.
Zeiler, M. (1977). Schedules of reinforcement: The controlling variables. In Honig, W. K. & Staddon, J. E. R. (Eds.), Handbook of operant behavior (pp. 201-232). Englewood Cliffs, NJ: Prentice Hall.

Figures

Figure 1. Rat 712 entropy levels for the VARY 8:4 and FR 4 components in Group 1 across the different components. The VARY 8:4 component is indicated by the filled circles and the FR 4 component is represented by the unfilled circles.

Figure 2. Rat 727 entropy levels for the VARY 8:4 and FR 4 components in Group 2 across the different components. The VARY 8:4 component is indicated by the filled circles and the FR 4 component is represented by the unfilled circles.

[Figure 3 panels: VARY 8:4 Entropy - Group 1 (top) and FR 4 Entropy - Group 1 (bottom); x-axis: Condition (B1, VI10, B2, VR2.5(1), B3, VR2.5(2), B4); y-axis: Entropy, 0.0 to 1.0]

Figure 3. Mean entropy levels for the VARY 8:4 (top) and FR 4 (bottom) components for Group 1. The black bars represent the baseline components and the gray bars represent the different intermittent challenges.
Significant differences (p < .05) from baseline are shown with an *. Significant differences (p < .05) between the VARY 8:4 and FR 4 components are shown with a ^ in the bottom graph. Error bars = 1 S.E.M.

[Figure 4 panels: Group 1 VARY 8:4 - Total Responses (top) and Group 1 FR 4 - Total Responses (bottom); x-axis: Condition (B1, VI10, B2, VR2.5(1), B3, VR2.5(2), B4); y-axis: Total Responses, 0 to 1400]

Figure 4. Mean total response levels for the VARY 8:4 (top) and FR 4 (bottom) components for Group 1. The black bars represent the baseline components and the gray bars represent the different intermittent challenges. Significant differences (p < .05) from baseline are shown with an *. Significant differences (p < .05) between the VARY 8:4 and FR 4 components are shown with a ^ in the bottom graph. Error bars = 1 S.E.M.

[Figure 5 panels: Group 2 VARY 8:4 - Entropy (top) and Group 2 FR 4 - Entropy (bottom); x-axis: Condition (B1, VI60, B2, VR2.5(1), B3, VR2.5(2), B4); y-axis: Entropy, 0.0 to 1.0]

Figure 5. Mean entropy levels for the VARY 8:4 (top) and FR 4 (bottom) components for Group 2. The black bars represent the baseline components and the gray bars represent the different intermittent challenges. Significant differences (p < .05) from baseline are shown with an *. Significant differences (p < .05) between the VARY 8:4 and FR 4 components are shown with a ^ in the bottom graph. Error bars = 1 S.E.M.

[Figure 6 panels: Group 2 VARY 8:4 - Total Responses (top) and Group 2 FR 4 - Total Responses (bottom); x-axis: Condition (B1, VI60, B2, VR2.5(1), B3, VR2.5(2), B4); y-axis: Total Responses, 0 to 1200]

Figure 6. Mean total response levels for the VARY 8:4 (top) and FR 4 (bottom) components for Group 2.
The black bars represent the baseline components and the gray bars represent the different intermittent challenges. Significant differences (p < .05) from baseline are shown with an *. Significant differences (p < .05) between the VARY 8:4 and FR 4 components are shown with a ^ in the bottom graph. Error bars = 1 S.E.M.

CHAPTER 3

Operant Variability: A Behavioral and Pharmacological Analysis

Abstract

Behavioral variability is often demonstrated in the laboratory setting by imposing a multiple schedule of reinforcement: one component requires the animal to vary sequences of responding and the other component serves as a control. Levels of variability in both components can be influenced by the method employed in the experiment. The purpose of the current study was to examine intermittent reinforcement in the control condition, the utility of the timeout for non-criterion sequences in the vary condition, and the importance of similar reinforcement rates between the two components. The effects of d-amphetamine on behavioral variability under these conditions were also examined. The purpose of the current experiment was to determine intermittency parametrically by using different rates of reinforcement (both rich and lean) on operant variability in a session. Another purpose was to assess differences, if any, between interval schedules and ratio schedules of reinforcement on variability either when it is required or permitted. Long Evans rats were trained to press two levers under a multiple schedule. In the VARY 8:4 component, any four-response sequence distributed between these levers that differed from the previous 8 sequences was eligible for reinforcement. In the FR 4 component, any four-response sequence was reinforced. Rats were then divided into two groups, depending on the control condition used. The first was a simple FR 4 procedure where every four-response sequence was reinforced.
The other procedure was a Yoked FR 4 in which inter-reinforcer intervals were equated between the VARY 8:4 component and the FR 4 component. To examine the role of timeouts for errors, each group was exposed to an ABA design in which a timeout was imposed for non-criterion sequences in the VARY 8:4 component during the "B" phase. The effects of d-amphetamine were assessed during the first two phases. Finally, to equate the number of executed sequences per reinforced sequence in both the VARY 8:4 and FR 4 components, a Variable Ratio 3 (VR 3) schedule was imposed in the FR 4 component (i.e., mult FR 1 (VARY 8:4) VR 3 (FR 4)).

In both the Yoke and Non-Yoke conditions, there was much greater variability in the VARY 8:4 than in the FR 4 component. The inclusion of a timeout for non-criterion sequences had no effect on entropy in either group. Timeout slightly decreased the number of errors made during the VARY 8:4 component in the Yoke group, but not enough to affect entropy. The highest dose of d-amphetamine decreased response rates, increased entropy in the FR 4 component, and decreased or had no effect on entropy in the VARY 8:4 component, suggesting that effects may be baseline dependent. The effects of d-amphetamine on variability were not influenced by the timeout. The VR 3 schedule increased entropy in the FR 4 component. Thus, intermittent reinforcement blunted the differences between the VARY 8:4 and FR 4 components.

Introduction

Variations in behavior allow adaptations to changes in one's environment. A determinant of behavioral variability is frequency of reinforcement. High reinforcement rates, for example, will often produce stereotypies in operant responding (Schwartz, 1980).
Behavior begins to vary as reinforcement becomes less frequent (Pesek-Cotton, Johnson, & Newland, 2011, Experiment 2) or is discontinued, as during extinction (Antonitis, 1951; Eckerman & Lanson, 1969; Stokes, 1995; Neuringer, Kornell, & Olufs, 2001). Direct reinforcement of behavioral variability, otherwise known as operant variability (Neuringer, 2002), is another determinant of its occurrence. The "Lag" procedure sets up reinforcement to occur after every criterion sequence.

Operant variability is a well-established phenomenon that has been studied inside and outside the laboratory (Neuringer, 2002). The behavior studied in laboratory investigations of operant variability is a sequence of responses that results in reinforcement when it differs from previously executed sequences. Page and Neuringer (1985) pioneered this laboratory model, producing highly variable responding in pigeons by implementing a "Lag procedure" that reinforced only those eight-response sequences that differed from previous sequences. Pesek-Cotton, et al. (2011) used a similar approach and compared it with a control condition with very little variability. In their VARY component, any sequence of four responses resulted in reinforcement if it differed from the previous eight sequences. In their control component, any sequence of four responses resulted in reinforcement and variability was permitted but not required.

In the standard "Lag" procedure for producing operant variability, criterion sequences are those that meet the "different from the previous number of sequences" contingency and these sequences are always reinforced. Non-criterion sequences (i.e., sequences that reproduce at least one of the previous N sequences) are generally followed by a timeout (Page & Neuringer, 1985; Pesek-Cotton, et al., 2011; Ward, et al., 2008). In control conditions, the use of a timeout varies depending on the type of control condition employed.
If a specific sequence is required, then a timeout follows any sequence that differs from that specific sequence (Page & Neuringer, 1985). If the probability of reinforcement is yoked from the VARY component to the control component, timeouts occur when this probability has not yet passed (Denney & Neuringer, 1998; Ward, et al., 2008). If reinforcement is based solely on a FR 4 sequence of responses, timeouts do not occur during the control condition (Pesek-Cotton, et al., 2011). The different criteria for incorrect sequences between the control and the VARY components may have influenced the presence or absence of variability when challenges were introduced. The effects that occurred in the first two procedures may have been due to the presence of the timeout rather than the contingency itself. This is particularly true in studies concerned with drug effects on variability. d-Amphetamine, for instance, affects behavior differently depending on whether that behavior is being positively punished (e.g., shock) or negatively punished (e.g., timeout). It is decreased further by the former (Foree, Moretz, & McMillan, 1973) and increased by the latter (Pérez-Padilla & Pellón, 2003). Rather than serving as a form of punishment for incorrect responses, the timeout may be promoting responding.

Reinforcement in control procedures does not always follow every sequence. Yoking procedures (Page & Neuringer, 1985; Ward, et al., 2008), for example, equate the timing of reinforcers per sequence between the two components and the number of responses per sequence, but the number of reinforcers per criterion sequence is different. This approach reinforces every criterion response in the VARY component but reinforces criterion responses intermittently in the FR 4 component. As shown in previous studies (Pesek-Cotton, et al., 2011; Eckerman & Lanson, 1969), intermittent reinforcement changes the structure of the response sequence by making it more variable.
Operant variability is a robust phenomenon that is unaffected by interventions such as ethanol administration (Ward, Bailey, & Odum, 2006), increased delay to reinforcement (Odum, et al., 2006), stimulus reversals (Ward, et al., 2008), pre-feeding (Doughty & Lattal, 2001) and administration of dopamine reuptake inhibitors, as well as D1 and D2 dopamine receptor agonists (Pesek-Cotton, et al., 2011). The control procedures used in the aforementioned studies were a REPEAT, a Yoked, or a FR 4 sequence. None of these control procedures displayed the same robustness as its corresponding VARY component. Instead, variability in these components increased when challenges were introduced.

A multiple schedule that includes both a VARY and a FR 4 control contingency was arranged in order to accomplish four objectives. The first was to determine the importance of including a timeout in the VARY component for non-criterion sequences. This was done by implementing an ABA design in which the first phase of the experiment did not include a timeout, the second phase did include it, and during the third phase it was removed again. The second objective was to complete a parametric investigation of two different control procedures. The first control was a sequence of responses (equal to the required sequence length for the variability sessions) that was reinforced continuously. In this contingency, reinforcement for criterion sequences was similar but at the expense of differing reinforcement rates between the two components. The second control was similar to the first with the exception that inter-reinforcer intervals were yoked to those in the variability contingency. The second contingency produced similar reinforcement rates and similar consequences for criterion sequences in both components. The third objective was to evaluate the effects of d-amphetamine on operant variability under the above-mentioned conditions (i.e., timeouts vs. no timeouts and yoked vs.
not yoked). The final phase of the experiment was the implementation of a Variable Ratio (VR) schedule of reinforcement in the FR 4 component only. This was done to equate reinforced sequences between the VARY and FR 4 components and to see the effects, if any, of intermittent reinforcement on behavior during the FR 4 component.

Method

Subjects

The subjects were 20 male Long-Evans rats housed in a temperature- and humidity-controlled, AAALAC-accredited colony room that was maintained on a 12-hour light-dark cycle (lights on at 7:00 a.m.). Adult male rats were maintained at 300 grams by individualized feeding of a chow diet. This is a weight that we have found to establish food as an effective reinforcer in the experiment while being consistent with good health. Some of an animal's caloric intake was provided in experimental sessions but supplemental feeding was conducted to maintain body weights. The Auburn University IACUC guidelines for caloric restriction were followed. The rats were weighed at least three times weekly and fed to maintain stable body weights.

Apparatus

The experiments were conducted in 16 commercially purchased operant chambers (Med Associates Inc. model #Med ENV 007) containing two retractable front levers, each calibrated so that 0.20 N registered a press. A pellet dispenser was situated midway between the two front levers and the reinforcer was a 45 mg sucrose pellet (Purina Mills, Inc., St. Louis, MO). Sonalert tones (2900 and 4500 Hz, nominally) were calibrated to an amplitude of 70 dBC. A house light (28 V, 100 mA) was located midway at the top of the back wall, opposite the levers, and a light emitting diode (LED) was above each lever. Dimensions of the chamber were 12 in. L x 9.5 in. W x 11.5 in. H. Each chamber was surrounded by a sound-attenuating cabinet with a built-in ventilating fan that circulated air into the experimental environment and provided masking white noise.
Programs for experimental procedures and data collection were written using MED-PC IV (Med-Associates, Georgia, VT). Session events were recorded with 0.01-s resolution.

Procedure

Rats were trained to execute a four-response sequence on the two levers, hereafter designated as Left (L) and Right (R), as described below. Lever-pressing was autoshaped in two separate sessions using a Fixed-Time 5.5 min (FT 5.5 min) schedule in which a sucrose pellet was delivered, non-contingent on responding, during the last 30 seconds of a 5.5-minute interval. In the first autoshaping procedure, the left lever was extended, a high tone was sounded, and a pellet was delivered based on the FT 5.5 min schedule. The lever then retracted and the cycle occurred again in the next 5.5 minutes. A press on the lever, which could only occur during the 30 s that it was extended, also delivered a pellet. Once 10 reinforced lever-presses occurred, pellet delivery became contingent only on lever-pressing, i.e., the FT schedule was removed, and the lever was always available. This continued until there were 100 reinforced left lever presses. Then the same procedure occurred for training of the right lever.

Once autoshaping was complete (i.e., 100 reinforced responses had occurred on each lever), the animal was trained to perform a sequence of four lever presses by gradually incrementing the sequence length from one to four. This was a fixed-ratio 4 (FR 4) schedule in which every four-response sequence was reinforced. During this component a consistent low tone (2900 Hz) sounded.

Training the VARY condition involved the removal of the low tone and the additional criterion that a four-response sequence must differ from the previous sequence. This was called a "Lag 1" criterion because only one previous sequence was considered. The reinforcement cycle began when the animal met the lag requirement.
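The lag criterion can be sketched in a few lines of Python. This is an illustration only, not the MED-PC program used in the study: the function names (`meets_lag`, `run_trials`), the coding of sequences as strings such as "LLRR", and the choice to let every completed sequence, criterion or not, enter the comparison window are all assumptions.

```python
from collections import deque


def meets_lag(sequence, history, lag):
    """Return True if `sequence` differs from each of the last `lag` sequences."""
    recent = list(history)[-lag:] if lag > 0 else []
    return all(sequence != previous for previous in recent)


def run_trials(sequences, lag):
    """Classify each four-response sequence as criterion (True) or not (False)."""
    history = deque(maxlen=max(lag, 1))  # keeps only the most recent sequences
    outcomes = []
    for sequence in sequences:
        outcomes.append(meets_lag(sequence, history, lag))
        # Assumption: every completed sequence joins the history,
        # whether or not it met the criterion.
        history.append(sequence)
    return outcomes


# Hypothetical session fragment evaluated under a Lag 1 criterion.
outcomes = run_trials(["LLRR", "LLRR", "RLRL", "LLRR"], lag=1)
```

Under Lag 1 only the immediate repetition (the second "LLRR") fails the criterion. Raising `lag` to 2 would also reject the final "LLRR", because it reproduces a sequence that is still inside the comparison window.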
Both criterion and non-criterion trials ended in a 3-second inter-trial interval (ITI), in which both levers retracted and a pellet was delivered only for criterion sequences. Once at least half of the sequences met the variability criterion, the lag was increased by one sequence for the next session. For example, once the animal met the criteria for Lag 1, a Lag 2 criterion was imposed, in which the current four-response sequence must differ from the previous two sequences executed. Then a Lag 3 was imposed, and so on. The maximal number of different possible sequences with a four-response sequence and two levers is 2^4, or 16. If responding resulted in more errors than reinforcers for 5 consecutive days on a specific lag, the requirement was lowered to the previous lag that was achieved. A Lag 8 criterion constituted final performance. All animals reached the VARY 8:4 schedule within 8-13 sessions. The implementation of two different multiple schedules (described below) followed training.

Two control procedures were used in the present experiment. The first group of animals (n = 10) was trained under a Mult VARY 8:4 FR 4 schedule (hereafter called the No-Yoke group). Under this schedule, components alternated between one in which a sequence of four responses had to differ from the previous eight sequences (VARY 8:4) and the other in which any sequence of four responses was reinforced (FR 4). The second group of animals (n = 10) was trained under a Mult VARY 8:4 FR 4 YOKE schedule (hereafter called the Yoke group). Under this schedule, inter-reinforcer intervals (IRIs) were recorded in the VARY 8:4 component and these were used to determine the inter-reinforcer intervals in the FR 4 component (Figure 1). Levers were retracted during all timeouts and inter-trial intervals (ITIs) in the VARY 8:4 component, and during the IRIs and ITIs in the FR 4 YOKE component. The length of the IRI was adjusted according to the IRI of the VARY 8:4 component.
For example, when the FR 4 component began, the levers extended and the yoked IRI from the VARY 8:4 component began counting down. If a four-response sequence was completed before the IRI clock reached zero, a reinforcer was given and the levers retracted until this time had finished. For the next trial, the levers were extended and the clock for the second yoked IRI began counting down again. These clock times were unique values representative of the IRIs from the VARY 8:4 component. If the animal took longer to respond in the FR 4 component than its yoked IRI time, then the first four-response sequence was reinforced, the levers retracted for the 3-s ITI, and the next trial began. This, however, was a rare circumstance because animals responded quickly in the FR 4 component. Despite these rare occasions, this yoking procedure produced identical reinforcement rates between the VARY 8:4 and FR 4 components. This yoking was also done to ensure that every criterion sequence was reinforced. This was different from the typical yoking procedure (Ward, et al., 2008) in which the levers would remain extended and the animal would be allowed to perform sequences despite not receiving reinforcers in the control condition.

In both procedures in the present study, the two components alternated after ten reinforcers had been delivered. A low tone served as a discriminative stimulus indicating that the FR 4 or FR 4 YOKE component was in effect. The onset of each session was signaled by the illumination of the house light in the chamber. The reinforcement cycle consisted of the low tone turning off (only in the FR 4 component), a high tone (4500 Hz) sounding briefly for 0.5 s, and a 45-mg sucrose pellet being delivered. The sessions lasted one hour.

To examine the role of the timeout in the VARY 8:4 component, an ABA design was set up where the timeout was first absent, then present, and then absent again.
Non-criterion trials in the VARY 8:4 component ended in a 15-s timeout, during which all lights in the chamber darkened and no pellet was delivered. This occurred in both the No-Yoke and Yoke multiple schedules, and d-amphetamine was administered in the presence and absence of the timeout. Also, it was important to address the possibility of carryover effects of the timeout from the second phase to the last phase of the procedure. If the timeout served as a (negative) punisher of repeating sequences, then one would expect these to diminish during the second phase when the timeout was presented. In the third phase, any reduction of incorrect sequences may be a direct result of history with the timeout. Therefore, a comparison of correct sequences was made for both the first and last phase of the procedure to control for prior experience.

During the initial baseline, we found that approximately every third sequence performed met criterion during the VARY 8:4 component. The final phase involved the contingency in which every criterion sequence in the VARY 8:4 component was reinforced but criterion sequences completed during the FR 4 component were reinforced under a Variable Ratio 3 (VR 3) schedule of reinforcement. This was done to hold the number of criterion sequences per reinforcer equal between both components. Here, the FR 4 schedule could be viewed as a unit schedule of a second-order VR schedule of reinforcement, with the designation Mult FR 1 (VARY 8:4) VR 3 (FR 4). This approach held the number of sequences per reinforcer constant between the two components. All animals from the No-Yoke group (n = 10) were exposed to the Mult FR 1 (VARY 8:4) VR 3 (FR 4) schedule. Timeouts were presented during this phase. For a summary of the experimental design, refer to Table 1.

Drug Administration

After animals in each group reached stable performance on measures of responding in Phases 1 and 2, acute dose-effect curves were determined for d-amphetamine (0.3-10.0 mg/kg). The dose-effect curves were determined at the end of the first two phases of the experiment (A and B; see Table 1). d-Amphetamine was dissolved in 0.9% saline solution. Injection of the saline vehicle served as a control for the drug injections. The drug was administered acutely on Tuesdays and Fridays. Mondays and Wednesdays served as non-injected control days and Thursdays served as vehicle control days. All drugs were administered intraperitoneally. For all drug sessions, each animal's performance was monitored and inspected at the end of the session. If responding decreased to more than 20% of baseline rates during a session, then a higher dose was not given. The drug was administered twice per week until the range of doses was complete. Repeated doses may have been administered as required to fill in the dose-effect curve. For example, if 3 mg/kg had no effect and 10 mg/kg produced a sizeable change in a dependent variable, then a dose of 5.6 mg/kg (approximately the geometric mean of 3 and 10) would be administered to produce a clearer picture of the dose-effect relationship. Rats were placed into the experimental chamber immediately after injection and the session began ten minutes later.

Data and Statistical Analyses

All statistical analyses were performed using SYSTAT® 11 (SYSTAT Software Inc., Richmond, CA, USA). Dose-effect curves for levels of variability were produced and a univariate repeated-measures analysis of variance (RMANOVA) was performed for each dependent variable. The highest dose was not included in analyses of entropy because some animals did not respond at that dose, and calculating entropy requires that an animal respond during a session. A paired t-test was then calculated for the remaining animals that responded at the highest dose.
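Entropy in these analyses is the U-value of Page and Neuringer (1985), which is defined with the dependent measures. The following is a minimal sketch of that calculation, not the analysis code used in the study; the function name `u_value` and the example sequences are hypothetical. It shows why a session with no completed sequences yields no entropy value, so that non-responding animals drop out of the entropy analyses.

```python
import math
from collections import Counter
from itertools import product


def u_value(sequences, n_possible=16):
    """Normalized entropy (U) of the emitted response sequences.

    sequences  -- list of sequences emitted in a session, e.g. "LRLL"
    n_possible -- number of possible sequences (2**4 = 16 for two levers)
    Returns a value between 0 and 1, or None if no sequence was emitted.
    """
    total = len(sequences)
    if total == 0:
        return None  # entropy is undefined when the animal does not respond
    counts = Counter(sequences)
    # Sum of p_i * log2(1 / p_i), which equals -sum of p_i * log2(p_i).
    entropy_bits = sum(
        (c / total) * math.log2(total / c) for c in counts.values()
    )
    return entropy_bits / math.log2(n_possible)


# Hypothetical sessions for illustration only (not data from the study).
all_sequences = ["".join(s) for s in product("LR", repeat=4)]
print(u_value(all_sequences))   # 1.0: all 16 sequences equally likely
print(u_value(["LLLL"] * 40))   # 0.0: a single sequence repeated
print(u_value([]))              # None: no responding, U cannot be computed
```

Dividing by the logarithm of the number of possible sequences bounds U between 0 and 1 regardless of how many sequences were emitted in a session.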
Two-way ANOVAs were first used to analyze drug effects on the different components (VARY 8:4 vs. FR 4) with component and drug dose as the within-subject factors. One-way ANOVAs were then used to analyze the effects of phases. Between-phase analyses examined the effect of the timeout during the VARY 8:4 component and, during Phase 4, the Variable Ratio 3 schedule of reinforcement. These statistics were based on control doses and, in the case of comparisons with Phase 4, only the No-Yoke animals were analyzed. The different interventions and doses of the drug served as within-subjects factors. F-ratios, degrees of freedom, and p-values were reported for all RMANOVAs. The dependent measures were:
1) Response rates in each component during a session. These were calculated by dividing the number of responses per component by the time available for responding (i.e., time not spent in timeouts or ITIs).
2) The percentage of sequences reinforced. This was calculated by dividing the number of reinforced sequences in each component by the total number of sequences completed in that component.
3) The U-value. The U-value is a measure of entropy, or variability in sequence structure (Page & Neuringer, 1985). The U statistic is given by the following equation:
$$U = -\sum_{i=1}^{16} \frac{p_i \log_2(p_i)}{\log_2(n)}$$
where $p_i$ is the probability of a given sequence i, and n is the total number of possible sequences. In the present study, n = 16 ($2^4$). A U value of 1.00 signifies that all possible sequences were emitted with equal probability, while a U value of 0.00 signifies that only a single sequence was produced.
5) Reinforcement rate in both components. This was calculated by dividing the total number of reinforcers earned in a component by the total time spent in that component during a session.
6) The number of sequences that did not meet the VARY 8:4 criteria.

Results

Within Phase Comparisons and Drug Effects

Phase 1 - No timeout.
Entropy was significantly higher in the VARY 8:4 component than in the FR 4 component in both the No-Yoke (F (1, 9) = 272.11, p < .001) and Yoke (F (1, 9) = 671.3, p < .001) groups (see Figure 2). There was no effect of d-amphetamine on entropy up to a dose of 1.7 mg/kg in the No-Yoke (F (4, 36) = .67, p = .61) or Yoke (F (4, 36) = .71, p = .56) groups. A separate analysis of the highest dose (3.0 mg/kg) showed a significant increase in entropy relative to the vehicle dose (t (3) = -4.85, p < .05) in the Yoke FR 4 component only. This analysis was done separately because only four animals responded at this dose in the FR 4 component. Entropy in the VARY 8:4 and FR 4 components for the Yoke group was indistinguishable from that in the VARY 8:4 and FR 4 components for the No-Yoke group. For response rates, there was a main effect of component (F (1, 9) = 7.46, p < .05) and a main effect of dose (F (5, 45) = 11.26, p < .01) in the No-Yoke group (Figure 3). The significant Component X Dose interaction (F (5, 45) = 3.29, p < .05) showed that larger decreases (p < .05) in response rate occurred in both components. In the Yoke group, there was a main effect of component (F (1, 9) = 30.27, p < .001) where higher rates of responding occurred in the FR 4 component. Response rates decreased at the highest dose (F (5, 45) = 31.67, p < .001) in both components. There was not a significant Component X Dose interaction (F (5, 45) = 1.74, p = .18). Response rates in the VARY 8:4 and FR 4 components for the Yoke group were indistinguishable from those in the VARY 8:4 and FR 4 components for the No-Yoke group. Rate-decreasing effects of d-amphetamine occurred at the highest dose for both groups during the VARY 8:4 and FR 4 components.

Phase 2 - Timeout. Entropy was significantly higher in the VARY 8:4 component than in the FR 4 component in both the No-Yoke (F (1, 9) = 241.09, p < .001) and Yoke (F (1, 9) = 132.08, p < .001) groups (Figure 4).
There was no effect of d-amphetamine on entropy up to a dose of 1.7 mg/kg in the No-Yoke (F (4, 36) = .69, p = .59) or Yoke (F (4, 36) = 2.23, p = .10) groups. A separate analysis of the 3.0 mg/kg dose showed that entropy significantly decreased relative to the vehicle dose (t (6) = 2.62, p < .05) in the Yoke VARY 8:4 component but not in the FR 4 component. This analysis was done separately because only seven animals responded at the 3.0 mg/kg dose in the FR 4 component. Entropy in the VARY 8:4 component for the Yoke group was indistinguishable from that in the VARY 8:4 component for the No-Yoke group. The same was true of the FR 4 component. There was a main effect of component on response rates in the No-Yoke (F (1, 9) = 37.12, p < .001) and Yoke (F (1, 9) = 53.15, p < .001) groups (Figure 5). There was also a main effect of dose on response rates in the No-Yoke (F (5, 45) = 25.77, p < .001) and Yoke (F (5, 45) = 23.91, p < .001) groups. In the No-Yoke group, the significant Component X Dose interaction (F (5, 45) = 8.27, p < .01) showed that larger decreases (p < .05) in total response rates occurred in the FR 4 component. In the Yoke group, the significant Component X Dose interaction (F (5, 45) = 11.74, p < .01) showed that larger decreases (p < .05) in response rates occurred in the FR 4 component. Response rates in the VARY 8:4 and FR 4 components for the Yoke group were indistinguishable from those in the VARY 8:4 and FR 4 components for the No-Yoke group. Rate-decreasing effects of d-amphetamine occurred at the highest dose for both groups during the VARY 8:4 and FR 4 components.

Phase 4 - Timeout (VR 3). Three of the ten No-Yoke animals were not included in these analyses because their performance did not stabilize during this phase. For entropy, there was a significant effect of component (F (1, 6) = 30.04, p < .001) and a significant effect of dose (F (2, 12) = 5.01, p < .05) (Figure 6, left panel).
A Component X Dose interaction (F (2, 12) = 12.05, p < .01) showed that the 1.7 mg/kg dose increased entropy as compared with the vehicle in the FR 4 component. Inspection of Figure 6 indicates that this is because entropy in the vehicle session for that condition was unusually low; the 1.7 mg/kg condition did not differ from the non-injected control. For response rates, there was a main effect of component (F (1, 6) = 26.87, p < .01) where higher rates were observed in the FR 4 component (Figure 6, right panel). There was also a main effect of dose (F (3, 18) = 7.43, p < .05) where higher doses of d-amphetamine decreased response rates. There was not, however, a significant Component X Dose interaction (F (3, 18) = .68, p = .53).

Comparisons Across Phases

Entropy. In the No-Yoke group, there was an effect of phase on entropy for both the VARY 8:4 (F (3, 27) = 5.62, p < .01) and FR 4 (F (3, 27) = 65.29, p < .001) components (Figure 7). In the VARY 8:4 component, entropy in Phase 3 was significantly higher (p < .05) than that in Phase 1. In the FR 4 component, entropy was significantly higher in Phase 4 than in any of the previous three phases (p < .001). For the Yoke group, there was no effect of phase on entropy in either the VARY 8:4 (F (2, 18) = 2.86, p = .09) or the FR 4 (F (2, 18) = 2.27, p = .13) component. A comparison of entropy in the FR 4 component between the No-Yoke and Yoke groups yielded a difference during Phase 2 only. Entropy in the FR 4 component was higher in the Yoke group than in the No-Yoke group only during this phase (t (9) = 2.47, p < .05).

VARY 8:4 Errors. For the No-Yoke group, there was not a significant effect of phase on errors (non-criterion sequences) (F (3, 27) = 2.86, p = .06) (Figure 8). For the Yoke group, the presence of the timeout in Phase 2 decreased the number of errors made in the VARY 8:4 component (F (2, 18) = 5.74, p < .05) when compared with those in Phase 1.

Response Rates.
For the No-Yoke group, response rates varied across phases in the VARY 8:4 component (F (3, 27) = 21.44, p < .001) but were unaffected in the FR 4 component (F (3, 27) = 3.74, p = .06) (Figure 9). In the VARY 8:4 component, rates of responding were significantly lower in Phases 2 and 4 than in Phase 1 (p < .05). For the Yoke group, there was a main effect of phase for both the VARY 8:4 (F (2, 18) = 15.55, p < .001) and FR 4 (F (2, 18) = 4.42, p < .05) components. Response rates in the VARY 8:4 component decreased in Phase 2 (p < .05), when timeouts were delivered for non-criterion sequences, compared with response rates in Phase 1 for that same component.

Discussion

Behavioral variability was examined using a Mult VARY 8:4 FR 4 schedule. Two control procedures were implemented to determine the importance of reinforcement rate on variability. One maintained similar rates of reinforcement in both components (Yoke) while the other (No-Yoke) had no such constraint. In both the Yoke and No-Yoke conditions, there was much greater variability in the VARY 8:4 than in the FR 4 component. The overall findings showed that entropy levels were comparable between the two conditions. Rates of responding were always higher in the FR 4 component than in the VARY 8:4 component. Also, these rates tended to be higher in the No-Yoke condition. To investigate the importance of a timeout for incorrect sequences during the VARY 8:4 component, an ABA design was used. The inclusion of a timeout for non-criterion sequences had no effect on entropy in either the Yoke or the No-Yoke group during Phase 2. The presence of the timeout did, however, decrease response rates. Doses of d-amphetamine were then given during three of the phases. The effects of d-amphetamine on variability were not influenced by the timeout.
The highest dose of d-amphetamine decreased response rates, increased entropy in the FR 4 component, and either decreased or had no effect on entropy in the VARY 8:4 component, suggesting that d-amphetamine's effects may depend on baseline levels of entropy. Finally, in the fourth phase, reinforcement density was examined by implementing a VR 3 schedule in the FR 4 component only. Entropy in the FR 4 component increased as a result of the presence of the VR 3 schedule. Intermittent reinforcement blunted the differences between the VARY 8:4 and FR 4 components.

Reinforcement Rate

The No-Yoke procedure was replicated from the Pesek-Cotton et al. (2011) study in which, in the control condition, only a fixed number of responses resulted in reinforcement. The Yoke procedure was designed so that inter-reinforcer intervals were held constant between the VARY 8:4 and FR 4 components. This yoking procedure differed from the yoking procedure used in some studies of operant variability (Denney & Neuringer, 1998; Ward et al., 2008). In those studies, the probability of reinforced sequences in the control condition was yoked to the percentage of reinforced trials in the previous vary component. For example, if 60% of trials were reinforced in the vary component, then 6 out of every 10 trials were reinforced in the control component. A timeout was given for the remaining sequences, which were not reinforced despite their similarity to the reinforced sequences. The use of a different type of yoking procedure in the present experiment helped to resolve this issue of intermittent reinforcement that derives from linking reinforcement probabilities between the components. The current yoke procedure manipulated reinforcement rate but not at the expense of also manipulating intermittent reinforcement. For the reinforcement rates in both the No-Yoke and Yoke conditions, please refer to Figure 10.
The implementation of the Variable Ratio 3 schedule in the FR 4 component helped confirm the effects of intermittent reinforcement on behavioral variability. The VR 3 value was chosen because we found that approximately one third of the sequences in the VARY 8:4 component met criterion and were therefore reinforced. When intermittent reinforcement was introduced, variability sharply increased in the FR 4 component. Variability levels had, in previous phases, remained low for the most part. The effects of intermittent reinforcement reported in the present study clarify why higher levels of variability exist in control conditions in several studies (Denney & Neuringer, 1998; Ward et al., 2008) that yoke reinforcement probabilities between VARY and control conditions. Allowing responding before the probability passed created situations where reinforcement was not occurring. In the present experiment, the extension of the levers in the FR 4 component only after the yoked inter-reinforcer interval had passed meant that every criterion sequence was reinforced in the Yoke group. We were able to maintain a large distinction in entropy values between the VARY 8:4 and FR 4 components, as in Pesek-Cotton et al. (2011). Entropy remained significantly higher in the VARY 8:4 component than in the FR 4 component during all phases in both the Yoke and No-Yoke groups. In Ward et al. (2008), average entropy values for the VARY component and the YOKE component were 0.78 and 0.63, respectively. In the present procedure, the average entropy values ranged from 0.70 to 0.75 in the VARY component and from 0.22 to 0.26 in the Yoked FR 4 component. Reinforcing every criterion sequence leads to this separation.

Timeout

The second goal of the experiment was to examine the impact of a timeout contingent on incorrect sequences in the VARY 8:4 component. This issue was important because drug effects may depend on whether negative or positive punishment is present.
The inclusion of a timeout for non-criterion sequences had no effect on entropy in either the Yoke or the No-Yoke group during Phase 2. A comparison of entropy and errors in Phase 1 (No Timeout) with those in Phase 3 (which also did not include a timeout but followed experience with one) indicated that a history with the timeout may have altered behavior. Entropy was higher in Phase 3 than in Phase 1 even though the two phases were similar. This effect was seen only in the No-Yoke group. There appeared to be an upward trend in variability throughout the first three phases, which could account for the effect seen in Phase 3 when compared with Phase 1. The number of errors made during the VARY 8:4 component decreased only in the Yoke group, but not enough to affect entropy. Finally, response rates in the VARY 8:4 component decreased in the presence of the timeout in both Phases 2 and 4. This was perhaps due to lower levels of responding caused by the implementation of the timeout.

Drug Effects

The third goal of the current experiment was to evaluate and replicate d-amphetamine's effect on variable behavior. Unlike in Pesek-Cotton et al. (2011), effects of lower doses of d-amphetamine on entropy levels in the FR 4 component were not found. A large difference between the two components, VARY 8:4 and FR 4, remained throughout most of the dose-effect curve. However, in the few animals that continued to respond at the highest dose (3.0 mg/kg), a large increase in variability occurred in both the Yoke and No-Yoke groups in the FR 4 component for Phase 1. During Phase 2, a decrease in variability occurred in the VARY 8:4 component for the Yoke group only. The highest dose of d-amphetamine decreased response rates in both the No-Yoke and Yoke groups during the VARY 8:4 and FR 4 components. The effects of d-amphetamine on variability were not influenced by the timeout.
The examination of drug effects using a within-subjects design, as done here, increases power and reduces the number of animals used in a study. Such a design does result in the administration of multiple doses of a drug to the same animal. The present study was designed to minimize the likelihood that sensitization or tolerance from repeated dosing could occur. It has been shown that behavioral sensitization may occur in the form of increased stereotypies with chronic treatment of d-amphetamine (Anagnostaras & Robinson, 1996; Salomon et al., 2006; Yetnikoff & Arvanitogiannis, 2005). However, the doses used in those studies were close or equal to our highest doses and they were administered consecutively over as many as 42 sessions. In the present study, at least two days elapsed between drug administrations and each dose was administered only once. Approximately 3 weeks (15 sessions) elapsed between Phases 1 and 2, and the acute dosing regimen was similar between the two. Thus, sensitization or tolerance to d-amphetamine was not likely to be a significant confound in the present study.

Summary

In summary, the present study examined behavioral variability under a multiple schedule in which it was either required or allowed. In addition, two control procedures were implemented: a modified Yoke procedure in which inter-reinforcer intervals were equated between the VARY 8:4 and FR 4 components and an FR 4 procedure in which inter-reinforcer intervals were not yoked. Both control procedures produced a large separation of behavioral variability between the two components. The effects of having a timeout present for non-criterion sequences in the VARY 8:4 component were also examined. The timeout acted as a punisher that reduced non-criterion sequences. d-Amphetamine's effects on variability appeared to be separate from the presence or absence of the timeout. The pharmacologic action of d-amphetamine on behavioral variability appeared to be baseline dependent.
When baseline levels of variability were low, the highest dose of d-amphetamine increased variability, suggesting dopamine's involvement in behavioral variability. When baseline levels of variability were high, the highest dose of the drug appeared to decrease operant variability. Finally, the inclusion of a VR 3 schedule in the control FR 4 component blunted the differences between the VARY 8:4 and FR 4 components. Since intermittent reinforcement increases behavioral variability, its use in control conditions should be carefully monitored.

References

Anagnostaras, S. G., & Robinson, T. E. (1996). Sensitization to the psychomotor stimulant effects of amphetamine: Modulation by associative learning. Behavioral Neuroscience, 110, 1397-414.
Antonitis, J. J. (1951). Response variability in the white rat during conditioning, extinction, and reconditioning. Journal of Experimental Psychology, 42, 273-281.
Denney, J., & Neuringer, A. (1998). Behavioral variability is controlled by discriminative stimuli. Animal Learning & Behavior, 26, 154-162.
Doughty, A. H., & Lattal, K. A. (2001). Resistance to change of operant variation and repetition. Journal of the Experimental Analysis of Behavior, 76, 195-215.
Eckerman, D. A., & Lanson, R. N. (1969). Variability of response location for pigeons responding under continuous reinforcement, intermittent reinforcement, and extinction. Journal of the Experimental Analysis of Behavior, 12, 73-80.
Foree, D. D., Moretz, F. H., & McMillan, D. E. (1973). Drugs and punished responding II: d-Amphetamine induced increases in punished responding. Journal of the Experimental Analysis of Behavior, 20, 291-300.
Neuringer, A. (2002). Operant variability: Evidence, functions, and theory. Psychonomic Bulletin & Review, 9, 672-705.
Neuringer, A., Kornell, N., & Olufs, M. (2001). Stability and variability in extinction. Journal of Experimental Psychology: Animal Behavior Processes, 27, 79-94.
Odum, A. L., Ward, R. D., Barnes, C. A., & Burke, K. A.
(2006). The effects of delayed reinforcement on variability and repetition of response sequences. Journal of the Experimental Analysis of Behavior, 86, 159-179.
Page, S., & Neuringer, A. (1985). Variability is an operant. Journal of Experimental Psychology: Animal Behavior Processes, 11, 429-452.
Pérez-Padilla, A., & Pellón, R. (2003). Amphetamine increases schedule-induced drinking reduced by negative punishment procedures. Psychopharmacology, 167, 123-129.
Pesek-Cotton, E. F., Johnson, J. E., & Newland, M. C. (2011). Reinforcing behavioral variability: An analysis of dopamine-receptor subtypes and intermittent reinforcement. Pharmacology, Biochemistry and Behavior, 97, 551-559.
Salomon, L., Lanteri, C., Glowinski, J., & Tassin, J. (2006). Behavioral sensitization to amphetamine results from an uncoupling between noradrenergic and serotonergic neurons. Proceedings of the National Academy of Sciences, 103, 7476-81.
Schwartz, B. (1980). Development of complex stereotyped behavior in pigeons. Journal of the Experimental Analysis of Behavior, 33, 153-166.
Stokes, P. D. (1995). Learned variability. Animal Learning and Behavior, 23, 164-176.
Ward, R. D., Bailey, E. M., & Odum, A. L. (2006). Effects of d-amphetamine and ethanol on variable and repetitive key-peck sequences in pigeons. Journal of the Experimental Analysis of Behavior, 86, 285-305.
Ward, R. D., Kynaston, A. D., Bailey, E. M., & Odum, A. L. (2008). Discriminative control of variability: Effects of successive stimulus reversals. Behavioural Processes, 78, 17-24.
Yetnikoff, L., & Arvanitogiannis, A. (2005). A role for affect in context-dependent sensitization to amphetamine. Behavioral Neuroscience, 119, 1678-81.

Tables

Experimental Design, Phases 1-4

Group                              Phase 1        Phase 2     Phase 3        Phase 4
                                   (No Timeout)   (Timeout)   (No Timeout)   (Timeout)
No-Yoke¹: VARY 8:4 FR 4 (n = 10)   Amph           Amph        No drug        FR 1 (VARY 8:4) VR 3 (FR 4)
Yoke²: VARY 8:4 FR 4 (n = 10)      Amph           Amph        No drug

Table 1. Experimental Design.
¹In the No-Yoke group, inter-reinforcer intervals were not yoked between the FR 4 and VARY 8:4 components.
²In the Yoke group, the inter-reinforcer intervals in the FR 4 component were yoked to those in the VARY 8:4 component.

Figures

Figure 1. An example session for the VARY 8:4 FR 4 Yoke group. In the Yoke group, inter-reinforcer intervals (IRIs) in the FR 4 component were yoked to the IRIs in the VARY 8:4 component. R+ = reinforcement. ITI = inter-trial interval.

[Figure 2 graphic omitted. Panels: No-Yoke Entropy - Phase 1; Yoke Entropy - Phase 1. X-axis: Dose (mg/kg): C, V, 0.3, 1, 1.7, 3. Y-axis: Entropy, 0.0 to 1.0. Sample sizes marked at the highest dose: No-Yoke (n = 2), (n = 2); Yoke (n = 4), (n = 6).]

Figure 2. Dose-response functions for entropy in the No-Yoke group (left panel) and in the Yoke group (right panel) under d-amphetamine for the VARY 8:4 (filled circles) and the FR 4 (unfilled circles) components during Phase 1. In the Yoke group, inter-reinforcer intervals in the FR 4 component were yoked to the inter-reinforcer intervals in the VARY 8:4 component. In the No-Yoke group, the inter-reinforcer intervals were not linked between the two components. The highest dose does not include all of the animals (as shown by the values in parentheses next to each data point) and was not included in the RMANOVA for entropy. Separate paired-sample t-tests were used to compare the highest dose (3.0 mg/kg) with the vehicle dose. Significant differences (p < .05) from vehicle are shown with an *. Error bars = 1 S.E.M.

[Figure 3 graphic omitted. Panels: No-Yoke Response Rates - Phase 1; Yoke Response Rates - Phase 1. X-axis: Dose (mg/kg): C, V, 0.3, 1, 1.7, 3. Y-axis: Response Rate (rsp/min), 0 to 140.]

Figure 3.
Dose-response functions for response rates in the No-Yoke group (left panel) and in the Yoke group (right panel) under d-amphetamine for the VARY 8:4 (filled circles) and the FR 4 (unfilled circles) components during Phase 1. In the Yoke group, inter-reinforcer intervals in the FR 4 component were yoked to the inter-reinforcer intervals in the VARY 8:4 component. In the No-Yoke group, the inter-reinforcer intervals were not linked between the two components. Significant differences (p < .05) from vehicle are shown with an *. Error bars = 1 S.E.M.

[Figure 4 graphic omitted. Panels: No-Yoke Entropy - Phase 2; Yoke Entropy - Phase 2. X-axis: Dose (mg/kg): C, V, 0.3, 1, 1.7, 3. Y-axis: Entropy, 0.0 to 1.0. Sample sizes marked at the highest dose: No-Yoke (n = 2), (n = 3); Yoke (n = 4), (n = 7).]

Figure 4. Dose-response functions for entropy in the No-Yoke group (left panel) and in the Yoke group (right panel) under d-amphetamine for the VARY 8:4 (filled circles) and the FR 4 (unfilled circles) components during Phase 2. In the Yoke group, inter-reinforcer intervals in the FR 4 component were yoked to the inter-reinforcer intervals in the VARY 8:4 component. In the No-Yoke group, the inter-reinforcer intervals were not linked between the two components. The highest dose does not include all of the animals (as shown by the values in parentheses next to each data point) and was not included in the RMANOVA for entropy. Separate paired-sample t-tests were used to compare the highest dose (3.0 mg/kg) with the vehicle dose. Significant differences (p < .05) from vehicle are shown with an *. Error bars = 1 S.E.M.

[Figure 5 graphic omitted. Panels: No-Yoke Response Rates - Phase 2; Yoke Response Rates - Phase 2. X-axis: Dose (mg/kg): C, V, 0.3, 1, 1.7, 3. Y-axis: Response Rates (rsp/min), 0 to 140.]

Figure 5.
Dose-response functions for response rates in the No-Yoke group (left panel) and in the Yoke group (right panel) under d-amphetamine for the VARY 8:4 (filled circles) and the FR 4 (unfilled circles) components during Phase 2. In the Yoke group, inter-reinforcer intervals in the FR 4 component were yoked to the inter-reinforcer intervals in the VARY 8:4 component. In the No-Yoke group, the inter-reinforcer intervals were not linked between the two components. Significant differences (p < .05) from vehicle are shown with an *. Error bars = 1 S.E.M.

[Figure 6 graphic omitted. Panels: Entropy - Phase 4; Response Rates - Phase 4. X-axis: Dose (mg/kg): C, V, 1.7, 3. Y-axes: Entropy, 0.0 to 1.0; Response Rates (rsp/min), 0 to 140. Sample sizes marked at the highest dose in the entropy panel: (n = 2), (n = 2).]

Figure 6. Dose-response functions for entropy (left panel) and response rates (right panel) in the No-Yoke group under d-amphetamine for the VARY 8:4 (filled circles) and the FR 4 (unfilled circles) components during Phase 4. The highest dose in the left panel does not include all of the animals (as shown by the values in parentheses next to each data point) and was not included in the RMANOVA for entropy. Separate paired-sample t-tests were used to compare the highest dose (3.0 mg/kg) with the vehicle dose. Significant differences (p < .05) from vehicle are shown with an *. Error bars = 1 S.E.M.

Figure 7. Bar chart displaying entropy across all phases for the VARY 8:4 and the FR 4 components. In the Yoke group, inter-reinforcer intervals in the FR 4 component were yoked to the inter-reinforcer intervals in the VARY 8:4 component. In the No-Yoke group, the inter-reinforcer intervals were not linked between the two components. Significant differences (p < .05) from Phase 1 are shown with an *. Significant differences (p < .05) from the No-Yoke group are shown with a ^. Error bars = 1 S.E.M.
[Figure 8 graphic omitted. Title: Vary Errors by Phase in Vary Component. X-axis: Phase (NO TIMEOUT1, TIMEOUT, NO TIMEOUT2, TIMEOUT2). Y-axis: Vary Errors, 0 to 140. Legend: No-Yoke, Yoke.]

Figure 8. Bar chart displaying errors in the VARY 8:4 component across all phases in the No-Yoke (black bars) and Yoke (gray bars) groups. In the Yoke group, inter-reinforcer intervals in the FR 4 component were yoked to the inter-reinforcer intervals in the VARY 8:4 component. In the No-Yoke group, the inter-reinforcer intervals were not linked between the two components. Significant differences (p < .05) from Phase 1 are shown with an *. Error bars = 1 S.E.M.

[Figure 9 graphic omitted. Title: Response Rates Across Phases. X-axis: Groups Across Phases (VARY 8:4 No-Yoke, VARY 8:4 Yoke, FR 4 No-Yoke, FR 4 Yoke). Y-axis: Response Rates (rsp/min), 0 to 140. Legend: Phase 1, Phase 2, Phase 3, Phase 4.]

Figure 9. Bar chart displaying response rates in the groups across all phases for the VARY 8:4 and the FR 4 components. In the Yoke group, inter-reinforcer intervals in the FR 4 component were yoked to the inter-reinforcer intervals in the VARY 8:4 component. In the No-Yoke group, the inter-reinforcer intervals were not linked between the two components. Significant differences (p < .05) from Phase 1 are shown with an *. Error bars = 1 S.E.M.

[Figure 10 graphic omitted. Panels: Overall Reinforcement Rate in No Timeout Phase (groups No-Yoke 1, Yoke 1); Overall Reinforcement Rate in Timeout Phase (groups No-Yoke 2, Yoke 2). Y-axis: Reinforcement Rate (rf/min), 0 to 14. Legend: Vary 8:4, FR 4.]

Figure 10. Bar charts displaying overall reinforcement rates in Phase 1 (left panel) and Phase 2 (right panel) in the No-Yoke and Yoke groups. The VARY 8:4 component is represented by black bars and the FR 4 component by gray bars. In the Yoke group, inter-reinforcer intervals in the FR 4 component were yoked to the inter-reinforcer intervals in the VARY 8:4 component.
In the No-Yoke group the inter-reinforcer intervals were not linked between the two components. Error bars = 1 S.E.M.