THE ROLE OF DOPAMINE RECEPTOR SUBTYPES IN REINFORCED VARIABILITY

Except where reference is made to the work of others, the work described in this thesis is my own or was done in collaboration with my advisory committee. This thesis does not include proprietary or classified information.

Erin Fae Pesek

Certificate of Approval:

Alejandro A. Lazarte, Assistant Professor, Psychology
M. Christopher Newland (Chair), Alumni Professor, Psychology
Jennifer M. Gillis-Mattson, Assistant Professor, Psychology
George T. Flowers, Interim Dean, Graduate School

A Thesis Submitted to the Graduate Faculty of Auburn University in Partial Fulfillment of the Requirements for the Degree of Master of Science

Auburn, Alabama
August 9, 2008

Permission is granted to Auburn University to make copies of this thesis at its discretion, upon the request of individuals or institutions and at their expense. The author reserves all publication rights.

VITA

Erin Fae Pesek, daughter of John Henry Pesek and M. Catherine Pesek, was born October 10, 1981, in Tyndall, South Dakota. She graduated from East Peoria Community High School in 2000. She attended Illinois Central College in East Peoria, Illinois for two years and Bradley University in Peoria, Illinois for two years, where she graduated summa cum laude with a Bachelor of Science degree in Psychology. She entered Graduate School in Auburn University's Department of Psychology in August 2004.

THESIS ABSTRACT

THE ROLE OF DOPAMINE RECEPTOR SUBTYPES IN REINFORCED VARIABILITY

Erin Fae Pesek

Master of Science, August 9, 2008
(B.S., Bradley University, 2004)

67 Typed Pages

Directed by M. Christopher Newland

Variability in behavior can be measured and reinforced. However, the underlying neurological mechanisms of such variability are unknown. Dopamine is associated with reinforcement, but its role in reinforced variability is also unclear. d-Amphetamine has been proposed to induce both variability and stereotypy, so this drug was used as a probe to examine behavior under each of these conditions. The present study sought to examine how specific dopamine receptors, namely D1 and D2, may also influence variable responding. In Experiment 1, Long Evans rats were trained under a multiple VARY 8:4 FR 4 schedule. In the VARY 8:4 component, all four-response sequences that differed from the previous 8 sequences were reinforced. In the FR 4 component, all four-response sequences were reinforced. Discrimination between the two components was evidenced by high entropy (variability) in the VARY 8:4 component and low entropy in the FR 4 component. The effects of amphetamine on entropy depended on the baseline value of this measure of variability. When entropy was high (VARY 8:4 component), the effect was to reduce entropy, making behavior less variable. When variability was low and responding was more stereotyped, as during the FR 4 component, amphetamine increased variability. SKF 38393 administration decreased entropy only at doses that produced large reductions in response rate. There were dose-related increases in variability in the FR 4 component for the quinpirole group.
Entropy values in the FR 4 component approached those in the VARY 8:4 component at higher doses of quinpirole. The present study suggests an association between dopamine and response variability, as evidenced by the effects of d-amphetamine. It appears that these effects of amphetamine may have occurred because of its action on the dopamine D2 receptor. Experiment 2 used a variable-interval schedule (VI 60-s) to determine the effects of intermittent reinforcement on behavior in both the VARY 8:4 and FR 4 components. Discrimination between the two components disappeared, as evidenced by high entropy (variability) in both the VARY 8:4 and the FR 4 components. Administration of d-amphetamine had no effect on behavior in either component.

ACKNOWLEDGEMENTS

The author would like to thank Dr. Christopher Newland for his patience and guidance. She would also like to thank her committee members, Drs. Alejandro Lazarte and Jennifer Gillis, for their constructive comments.

Style manual or journal used: Publication Manual of the American Psychological Association (5th ed.)

Computer software used: Microsoft Word XP, Microsoft Excel XP, Rs Series for Windows, Med-PC, SigmaPlot 8.0, and Systat 10.2

TABLE OF CONTENTS

LIST OF FIGURES

CHAPTER 1: INTRODUCTION
    The Response Unit
    Prior Research on Behavior Variability
    Parameters and Methods of Behavior Variability
    A Behavioral Mechanism
    Behavior Variability and Choice
    Dopamine in the Response-Reinforcer Relationship
    The Potential Role of Dopamine in Behavior Variability
    The Control Procedure
    References

CHAPTER 2: EXPERIMENTS
    Abstract
    Introduction
    Experiment 1
    Method
    Results
    Discussion
    Experiment 2
    Method
    Results
    Discussion
    General Discussion
    References
    Figures & Tables

LIST OF FIGURES

Figure 1. Histograms for RAT 624 displaying entropy and frequency of responding for each of the 16 possible sequences in the VARY 8:4 (left panel) and the FR 4 (right panel) components on a non-injected control day. The number of changeover responses required for each sequence is represented by the dotted lines (i.e., from zero up to three changeovers required to complete the sequences).

Figure 2. Total responses (left panel) and entropy (right panel) for different doses of d-amphetamine for the VARY 8:4 (filled triangles) and the FR 4 (unfilled circles) components. Error bars = 1 S.E.M.

Figure 3. Dose-response functions for total responses (left panel) and entropy (right panel) under SKF 38393 for the VARY 8:4 (filled triangles) and the FR 4 (unfilled circles) components. Error bars = 1 S.E.M.

Figure 4. Dose-response functions for total responses (top panel) and entropy (lower panel) under quinpirole for the VARY 8:4 and the FR 4 components. Triangles represent the VARY 8:4 component, with filled being the low-dose group and unfilled being the high-dose group. Circles represent the FR 4 component, with filled being the low-dose group and unfilled being the high-dose group. Error bars = 1 S.E.M.

Figure 5. Histograms for RAT 616 displaying entropy and frequency of responding for each of the 16 possible sequences in the VARY 8:4 (top panel) and FR 4 (bottom panel) components when administered 0.17 mg/kg of quinpirole. The number of changeover responses required for each sequence is represented by the dotted lines (i.e., from zero up to three changeovers required to complete the sequences).

Figure 6. Total responses (left panel) and entropy (right panel) for different doses of d-amphetamine for the VARY 8:4 (filled triangles) and the FR 4 (unfilled circles) components under a VI 60-s schedule. Error bars = 1 S.E.M.

INTRODUCTION

The Role of Dopamine Receptor Subtypes in Reinforced Variability

Variability in behavior is often seen as a hindrance to achieving an understanding of the underlying causes of the behavior and to controlling its appearance in laboratory or applied settings. Great efforts are often undertaken to keep variability at a minimum. Frequently, variation is attributed to "problems" in methodology, often at the expense of achieving a full understanding of its source. Variability in behavior, however, is functional, and it may even be a fundamental property of behavior. Behavior change requires successive approximations toward a final goal, and without variation, this learning would be impossible.
In the process of learning a new behavior, variation in responding must occur (e.g., shaping). Some recent studies have shown not only that variability is functional but also that it is shapeable (Grunow & Neuringer, 2002; Page & Neuringer, 1985) and sensitive to drugs that act on dopamine neurotransmitter systems (Mook, Jeffrey, & Neuringer, 1993; Mook & Neuringer, 1994). The present studies have two objectives. The first is to compare operant variability in rats with previous studies using pigeons (Page & Neuringer, 1985; Odum et al., 2006). This will be accomplished in a within-subject approach using a multiple-schedule procedure that permits a direct comparison of a reinforced-variability procedure against a control in which a similar response sequence is required but without the variability contingency. The second objective is to determine the role of dopamine and of dopamine-receptor subtypes in operant variability. This will be accomplished by systemic administration of drugs selected to target specific receptor systems. The drugs to be administered are SKF 38393, quinpirole, and d-amphetamine. d-Amphetamine is a dopamine reuptake inhibitor, so it elevates dopamine activity at all dopamine receptor subtypes. SKF 38393 and quinpirole are, respectively, dopamine D1-receptor and D2-receptor agonists that act directly at their respective receptor subtypes. These drugs were chosen because they act directly on the dopamine system and therefore may influence the ability to respond variably, or not, under a multiple schedule that demands exactly that.

The Response Unit

Zeiler (1977) distinguished between formal and conditionable response units. The formal response unit is that which is explicitly paired with presentation of a (presumably reinforcing) stimulus; the formal response unit is therefore defined methodologically. Conditionable response units, or operants, increase if paired with the presentation of the reinforcer and, more generally, are influenced by their consequences. This distinction is important because both formal and conditionable response units are unambiguous, but the former may not be influenced by the reinforcement contingencies. Conditionable response units, however, are modifiable and can be manipulated by the researcher through operant conditioning. When training new behavior, one must often define the individual conditionable response units that comprise the target behavior and then specify a shaping procedure that chains those units together. For example, when establishing high-rate lever-pressing under a ratio schedule, it is usually assumed that the specified ratio of lever-presses is the response unit, but some have argued that it is the inter-response time (IRT) that is reinforced (Zeiler, 1977). If IRTs change with reinforced bouts of responding (the formal response unit), then they may be the conditionable response unit rather than the lever-presses. With each presentation of a reinforcer, the likelihood of the reinforced unit (a specific IRT) increases, until eventually reinforcement follows only short IRTs. By the time training has ended, the response unit is highly stereotyped and continuously reinforced.

Prior Research on Behavior Variability

Just as a single instance of behavior (e.g., lever-pressing) can be defined as a response unit, so too can sequences of responses.
Schwartz (1982b) argued that the highly stereotyped response patterns that result from reinforcement are incompatible with the acquisition of new behavior because variation in one's behavior is necessary for learning. Reinforcement will create a narrowly defined response that is separable and distinct from non-reinforced responses. Once the behavior has been trained, according to Schwartz, this training can interfere with the shaping of new behavior. Schwartz (1982a) attempted to train response variation in pigeons in order to prevent the narrowing, stereotypic effects of reinforcement on behavior. In that study, a pattern of eight responses on two keys had to be different from the previous pattern of eight responses just performed. Each response moved a light on a 5 x 5 matrix of lights located on the side wall of the chamber. If the bird moved the light, through key-pecking, from the top left corner to the bottom right corner without repeating a previously pecked sequence or moving off the matrix (either of which resulted in a time-out), it received access to grain. The pigeons could "move off" the matrix by responding more than four times on one of the keys. Therefore, besides varying behavior from the previous sequence, there was also the response requirement that no more than four pecks could occur on one key within a sequence. In an effort to increase variability and unpredictability, Schwartz introduced a contingency such that a response sequence had to differ from the previous one. Only one pigeon of four demonstrated an increase in its variable responding. As will be discussed, these results are replicable, and peculiar to the requirement that the pigeon not "move off" the grid. Schwartz, however, concluded that response variability cannot be reinforced and that stereotypies in behavior will arise no matter what is being reinforced. In fact, Schwartz's conclusion follows from Zeiler's definitions of response classes: by definition, an operant response class is strengthened, narrowed, and made less variable by reinforcement. Perhaps, however, one difficulty in reinforcing variability lies in the definition of the operant class to be reinforced. In order to confront this issue, it is important to determine whether there is an issue to confront, i.e., whether "variability" can be reinforced. Many studies have shown that behavior variability appears to be an operant, in that it can be manipulated and selected (Pryor, Haag, & O'Reilly, 1969; Schoenfeld, Harris, & Farmer, 1966; Blough, 1966; Bryant & Church, 1974; Schwartz, 1980, 1982a); i.e., it is sensitive to its consequences. Pryor et al. (1969) reinforced porpoises' novel behaviors, that is, behaviors such as jumping, breaching, or flipping that had not previously been observed by the trainers. As the experiment progressed, the animals came to perform other movements that had not been previously trained (e.g., lying on their right side, turning upside down), demonstrating that behavior variability can be established through the reinforcement of novel behaviors. Others, such as Blough (1966) and Bryant and Church (1974), reinforced the least frequent interresponse times and lever alternation, respectively. Such studies provide evidence that variation in behavior is an operant that can be controlled. Blough (1966) used a schedule that reinforced the least frequent interresponse times in pigeons' key-pecking. This schedule, known as the "LF schedule," produces interresponse-time (IRT) variability through the reinforcement of the least frequent IRTs.
Interresponse times were sorted into bins (e.g., 0-1 s, 1-2 s, etc.), and key-pecking was reinforced only if its IRT fell within the bin that had previously contained the fewest IRTs. The behavior of the pigeons closely matched that of a stochastic generator, and the pigeons' IRTs varied with bin changes. Bryant and Church (1974) reinforced less likely response sequences (in the form of lever-pressing by rats) 75% of the time and more likely response sequences 25% of the time. The percentage of alternation on the two levers matched the reinforcement contingency, and alternate responding reached asymptote around a probability of 49%. Results were consistent with those of a stochastic generator: as alternations were reinforced, the probability of their occurring increased, whereas when they were not reinforced, their occurrence decreased. The findings above are inconclusive. Some difficulties with Schwartz's experiments have already been discussed. In the Pryor et al. (1969) study, the researchers were active observers of the porpoises but could not see every behavior that was being emitted by the animals. For instance, variability in eye-rolling and high-pitched calls was not reinforced because it was difficult for the observers to record such events. Therefore, behavior was limited to easily viewable movements that were determined to be novel by the trainer, who also was a behaving organism. Although much more objective and reliable, the findings of Blough (1966) and Bryant and Church (1974) come with uncertainties, too. Variability effects observed in interresponse times may be highly limited and circumscribed. The variation in these IRTs was likely due to the schedule effects set in place, because there appeared to be a pattern in the responses: shorter IRTs tended to occur after other short IRTs, and longer IRTs tended to occur after other long IRTs. Also, if there were relatively few IRTs in a particular bin, it is possible that the same IRTs from that bin would be reinforced repeatedly. Moreover, the variation seen was in IRTs, not necessarily in response forms or choices. Reinforcement of switching behavior, due to the schedule requirement, as in Bryant and Church (1974), could also account for the responding, as opposed to the animals behaving variably or randomly. Variability in responding has also been seen in extinction procedures. Antonitis (1951) trained rats to nose-poke anywhere along a 50-cm strip. Even though the location of the nose-poke did not determine whether reinforcement occurred, the pokes tended to be very close to one another. However, when the reinforcer was withheld, the rats began nose-poking in various spots along the strip. It appears that when behavior that worked in the past no longer produced the reinforcer, the rat's behavior changed, and it is this change that may result in reinforcement again. Other studies have seen similar increases in variable responding when the response is under extinction (Eckerman & Lanson, 1969; Stokes, 1995; Neuringer, Kornell, & Olufs, 2001). However, extinction procedures do not provide evidence that variability can be directly reinforced rather than arising as a by-product of schedule requirements. Page and Neuringer (1985) sought to determine whether "variability" is an operant that can be influenced by its consequences and, especially, to identify the conditions under which variability is sensitive to those consequences.
In the first experiment, a sequence of eight responses had to vary from either the one (Lag 1) or five (Lag 5) sequences prior to the one just performed. Unlike Schwartz's experiment, there was no constraint on the number of presses on each key. Thus, there was only one possible error to make: repeating a sequence that had occurred within the last one or five sequences. This "variability condition" was compared with a "variability-plus-constraint condition." In the variability-plus-constraint condition, exactly four responses on both the left and right keys were required, and the sequence had to be different from the last sequence performed. This condition resembled Schwartz's, except that there was no light that moved through a 5 x 5 array. In the variability condition, around 90% of the sequences produced met the reinforcer criterion. However, when the constraint was in place, only 42% of sequences met criterion. A key comparison between the two studies can be found in the measurement of variability per se and in comparing the pigeons with a random number generator on these measures. One measure used was the percentage of reinforced trials per session, an indicator of how often the pigeons met the criteria. The second measure was the percentage of different sequences per session, i.e., whether the current sequence differed from all the previous sequences in the session. Interestingly, in both the variability and variability-plus-constraint conditions, the pigeons of Page and Neuringer (1985) performed similarly to a random number generator that produced 8-response sequences. Page and Neuringer's first experiment differed from Schwartz's in several potentially important respects. In experiment two, conditions for variability alone and variability-plus-constraint were compared with each other again, so as to replicate the conditions of Schwartz's previous study more directly (Schwartz, 1982a). The replication resembled experiment one, but there was no interpeck interval, reinforcement was 4 seconds of access to grain, and all trials were followed by a 0.5-second intertrial interval. Once again, the only difference between the two conditions was the absence of a "no more than four pecks" constraint in the variability-alone condition. Pigeons behaved more variably (the measurement of variability will be described in detail below) under the variability-alone condition than under the variability-plus-constraint condition. The results support the hypothesis that variability may be hindered by schedule and/or contingency requirements such as the "no more than four pecks on the same key in a sequence" constraint. There are simply more possible sequences that can be performed when there is no constraint (2^8 = 256) than when the constraint is in place (the 70 sequences containing exactly four pecks on each key). In other words, the odds of producing a response sequence different from the previous ones are greater when there is no constraint in place, and that is exactly what happened. Page and Neuringer showed that variability is an operant and introduced a technique for studying it in the laboratory. In further experiments, Page and Neuringer (1985) identified some necessary and sufficient conditions for variability to emerge, as well as potential determinants.
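The lag-n contingency just described is simple to state precisely. Below is a minimal sketch in Python (illustrative only, not the apparatus-control code actually used by Page and Neuringer; function names are the editor's), showing the criterion test and the kind of random-generator comparison reported above:

    import random

    def meets_lag_criterion(sequence, history, lag):
        # Page & Neuringer's lag-n rule: a sequence earns reinforcement
        # only if it differs from each of the last `lag` sequences.
        return sequence not in history[-lag:]

    def percent_reinforced_random(lag=5, n_trials=10000, length=8):
        # Estimate the percent of criterion sequences produced by a
        # stochastic generator pecking L or R with equal probability.
        history, reinforced = [], 0
        for _ in range(n_trials):
            seq = "".join(random.choice("LR") for _ in range(length))
            if meets_lag_criterion(seq, history, lag):
                reinforced += 1
            history.append(seq)
        return 100.0 * reinforced / n_trials

    # With 2^8 = 256 possible sequences, a random responder meets even a
    # lag-5 criterion on the large majority of trials.
    print(percent_reinforced_random())

Under these assumptions, the simulated random responder meets the criterion on well over 90% of trials, which is the benchmark against which the pigeons' performance was compared.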
Page and Neuringer addressed directly such questions as how many different sequences can be performed, whether memory plays a role, and whether variability can come under the control of a stimulus, and they showed that, once established, variability has the properties of a reinforceable dimension of behavior that can be measured and predicted. Page and Neuringer (1985) systematically increased the lag (or lookback window) to determine whether variability remained high over long lags. Only at a lag 50 requirement did performance deteriorate; at this requirement the number of criterion sequences dropped to around 67% of all sequences. At a lag of 50, a sequence of eight responses had to be different from the previous 50 sequences in order for reinforcement to become available. Interestingly, a random number generator performed similarly to the pigeon at a lag of 50. Increasing the lag decreases the opportunity for reinforcement, because the pigeon must peck an 8-response sequence unlike the past 50 sequences already completed; even so, the percentage of reinforced sequences (reinforced sequences/total sequences) remained quite high. The hypothesis that memory of the previously performed sequences was responsible for the random responding was examined in another experiment. If memory is important, then one cannot say that the pigeons are behaving randomly but, rather, that the production of a specific sequence is driven directly by the fact that that sequence was not produced recently. This, of course, is the opposite of randomness, in which the probability of any one sequence occurring is equal to that of any other sequence occurring. If the bird remembers the previous sequences, then reducing the number of responses required from eight to four should increase accuracy, and increasing the number of responses should make variable responding more difficult because there are more responses to remember. However, when the sequence length was shortened, fewer, not more, criterion sequences occurred. This is compatible with a "randomness" interpretation, since in a chain of 8 responses there are more possible sequences (2^8 = 256) than in a chain of 4 (2^4 = 16). In other words, the longer the chain of responses, the higher the probability of pecking a different sequence. If a dimension of behavior is considered an operant, it should come under the control of a particular stimulus. The final experiment of Page and Neuringer (1985) was designed to test this very notion. The pigeons' behavior came under the control of blue-colored keys in the variability component and red-colored keys in the stereotypy component. When the key lights were reversed, so that red signified variable responding and blue meant stereotypic responding was necessary for reinforcement, responding reversed, too. This was the final step in showing that behavior variability can in fact be reinforced, which means that it is a dimension of behavior that should be considered when testing schedule effects, drug effects, response magnitude, rate of responding, inter-response times, and the like.

Parameters and Methods of Behavior Variability

Five different dependent measures have been used as markers of response variation. Three are the number of reinforcers obtained in the variability condition, the number of different sequences emitted, and the frequency of dominant sequences (Schwartz, 1982a).
Two others are interresponse times and the entropy (randomness) value of overall performance in the variability condition. Different sequences occur when behavior variability is reinforced (Grunow & Neuringer, 2002). The frequency of dominant sequences increases with shorter lags (e.g., a lag 1 requires only that the subject alternate between two different sequences) and decreases with longer lags and a contingency for high variability (Page & Neuringer, 1985; Grunow & Neuringer, 2002). Another dependent measure, briefly mentioned before, is the interresponse time. Neuringer (1991) showed that the longer the interresponse time, the more operant variability occurred; however, repetition of a single sequence tended to decrease with longer pauses between responses. An overall measure of variability in a session is commonly denoted by the U value. This value, based on information theory, is a measure of the entropy or stochastic character of responding (Miller & Frick, 1949); formally, U = -Σ p_i log2(p_i) / log2(n), where p_i is the relative frequency of sequence i and n is the number of possible sequences. The U value approaches 1.0 when the frequencies of sequences are approximately equal, and it approaches 0.0 when one specific sequence occurs more often than the others. There are many different methods that can be used to measure and demonstrate behavior variability (Neuringer, 2002). Novel-response procedures, which include a change in contingencies, tend to reinforce variability because making a new response is usually beneficial toward the end goal of reinforcement. For example, in a radial arm maze, the rat's varying behavior is reinforced if it goes down an arm of the maze that it did not previously visit; in this procedure, the contingency requirement concerns variation in responding. The lag procedure seen in Page and Neuringer (1985) requires that a number of different variations of a sequence occur before reinforcement is given. Increasing the lag, or lookback window, to a high number (e.g., 50) results in a decrease in the number of reinforced trials; even so, the percentage of reinforced trials remained high, at 67%. Another method is that of reinforcing the least frequent occurrences of a response, as in Blough (1966) and Schoenfeld et al. (1966), in which the least frequent interresponse times were reinforced. Variability in responding increased for the pigeons because they had to emit consecutive responses with differing interresponse intervals. Threshold procedures, which reinforce responses that fall below a particular relative frequency, can also demonstrate variability. Along the same lines, there is the method of frequency dependence, in which reinforcement rate is tied to response frequency: the more frequent a response, the lower its probability of reinforcement.

A Behavioral Mechanism

A final issue that deserves some attention is whether variability is secondary to a different response pattern. Thus, the formal response unit might be the lag-n requirement, but the conditionable response unit might be something quite different. Machado (1997) questioned whether the variability requirements of a schedule are directly reinforced or whether they are a byproduct of another process. In other words, is variability a result of the schedule requirement presented to the animal, or is it the result of a changeover requirement (e.g., the animal must change levers/keys n times) in the schedule? Machado studied this by explicitly reinforcing changeovers and asking whether "variability" emerges.
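Machado's switching requirement can likewise be stated exactly. A minimal sketch (in Python, with illustrative names chosen by the editor; sequences are assumed to be recorded as strings of L and R):

    def changeovers(sequence):
        # Count the switches between keys within a response sequence;
        # e.g., "LLRRLLLR" contains three changeovers.
        return sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)

    # Machado's experiment one reinforced sequences with at least one
    # changeover; experiment two required roughly three to four.
    assert changeovers("LLRRLLLR") == 3
    assert changeovers("LLLLLLLL") == 0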
In experiments one and two of the Machado (1997) study, pigeons' key-switching behavior was reinforced only if they changed from one key to the other at least once (experiment one) or three to four times (experiment two). In experiment one, animals performed 30 different sequences (out of 256 possible) on average, even though they only had to produce a sequence that had at least one changeover in it (Group 1) or more than one changeover in it (Group 2). In experiment two, many birds produced fewer different sequences than in the first experiment, some as low as 20 different sequences. In Machado's final experiment, a replication of Page and Neuringer's (1985) study, sequence variability was reinforced rather than switching: sequences of eight responses that differed from the previous 25 sequences were reinforced. The proportion of criterion sequences emitted by the pigeons was around .64 to .74 of the sequences performed. In his concluding statements, Machado argued that even though the pigeons did in fact behave more variably when response variation was reinforced than when switching was reinforced, there was a similarity between responding in all experiments; the two types of schedules therefore seem to draw upon the same processes. Machado identified three characteristics that were consistent between the two procedures: 1) the location of the first peck was usually on the same key throughout the session; 2) the probability of switching from the first key increased as the sequence progressed; and 3) the probability of switching to the initially preferred key decreased or remained constant as the sequence progressed. These characteristics were present in all experiments, regardless of whether they were switching-based or variation-based, suggesting that the same mechanism is engaged in both. Variability in behavior is greater when explicitly reinforced than when only switching is reinforced. So while direct reinforcement of switching is important, it appears insufficient to account for Neuringer's results.

Behavior Variability and Choice

Response variation is a measurable aspect of behavior, and there are specific response classes that can be identified sufficiently to reinforce. These facts imply that variation is an operant and, if so, that it should perform like other operants in choice situations. When two different response classes are reinforced at different rates, their relative rate of occurrence approximately matches relative reinforcement rates, a phenomenon known as the strict matching law (Herrnstein, 1970). Because response rates do not always strictly match reinforcer rates, an alternative formulation, called the generalized matching relation, often provides superior fits to the data, at least when only two response alternatives are available (Baum, 1974; Davison & McCarthy, 1988). To further strengthen the basis of behavior-variability research and its adaptability, Neuringer (1992) connected two dimensions of behavior: choice and variability. The purpose of that study was to see whether the relative occurrence of varying and repeating response sequences was influenced by the relative rate of reinforcement for varying. In other words, does such responding follow the matching law as other forms of responding do?
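In standard notation (a summary of the relations cited above, not notation taken from the thesis), with $B_1$ and $B_2$ the response rates on two alternatives and $R_1$ and $R_2$ the reinforcer rates they produce, strict matching is

$$\frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2},$$

and the generalized matching relation (Baum, 1974) is

$$\log\left(\frac{B_1}{B_2}\right) = a \log\left(\frac{R_1}{R_2}\right) + \log b,$$

where $a$ is sensitivity (values below 1 indicate undermatching) and $b$ is bias.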
Neuringer (1992, experiment 1a) placed six pigeons in standard operant chambers, and the reinforcer to be obtained was three seconds of access to grain. The trials consisted of a sequence of four responses. Whether a sequence was reinforced depended on which condition, VARY or REPEAT, happened to be in place. In the VARY condition, reinforcement was contingent on the pigeon pecking a sequence of four responses that was different from the three previous sequences. In the REPEAT condition, reinforcement was provided if the pigeon pecked a sequence that replicated any of the past three sequences. Whether a trial carried the VARY or REPEAT contingency depended on random selection by the computer. Each requirement was reinforced with a pre-determined probability that varied across conditions. For example, in one condition, 20% of criterion sequences were reinforced during VARY trials while 80% of criterion sequences were reinforced during REPEAT trials. There was no discriminative stimulus to signal which requirement was in place at the time. Neuringer (1992) reported that the percentage of variable sequences increased as a function of the percentage of reinforcement obtained for varying. Graphical presentation showed slight undermatching and a small bias toward repeating sequences. The conclusions that can be drawn from this study are: 1) "vary" sequences are sensitive to reinforcement contingencies, and 2) the proportions of sequences performed did not strictly match the proportions of reinforcers obtained. Neuringer discusses three reasons for his interest in animals' choosing to respond variably over repeating: a) it supports adaptive action in an environment that is not predictable; b) it may change the very nature of one's ability to predict and control behavior if an animal can choose to vary its behavior in unpredictable ways; and c) it may shed some light on problem solving and learning, in which variability is usually a necessity. From his conclusions, behaving variably seems to be an adaptive dimension of behavior that animals and humans can choose to perform.

Dopamine in the Response-Reinforcer Relationship

The neurochemical basis of operant variability is poorly understood, but the behavioral mechanisms of choice and reinforcement appear to be important to it. Because of dopamine's involvement in reinforcement, choice, and stereotypy, it can be hypothesized that it is involved in operant variability as well. Dopamine is a catecholamine neurotransmitter that is a biological precursor to epinephrine and norepinephrine. There are two families of dopamine receptors and two or three (depending on how they are counted) pathways. Before discussing dopamine's involvement with reinforced behavior, the receptor subtypes and the respective dopamine pathways in the brain must be covered. How dopamine acts depends on the receptor it acts upon. One action of dopamine is to change the synthesis of cyclic AMP (adenosine monophosphate) in post-synaptic neurons, but whether it increases or decreases cAMP depends on the post-synaptic receptor. The two major subtypes of dopamine receptors can be distinguished by their actions on cAMP. These are the D1 family (D1 and D5) and the D2 family (D2, D3, and D4). D1-like receptors activate cyclic AMP synthesis through the stimulation of adenylyl cyclase and are excitatory post-synaptically. Classically, these receptors have been associated with cardiovascular and motor function.
The D1-like receptors are located along and around the periphery of dopaminergic synapses in the striatum, nucleus accumbens, and olfactory tubercle. Because of their location away from the synapse, the D1-like receptors may not be activated during bursts in dopamine concentration unless they are located near the release sites. More recently, the D1-like receptor subtypes have been linked to appetitive behaviors or other behavior said to be predictive of reward delivery (Schultz, 1998). This usually refers to reinforced behavior, but there is a nuance in this description: this receptor family is also activated by stimuli (and behavior can be a stimulus) that predict reinforcer delivery. Because D1 activation produces alerting behavior, selective agonists of the D1-like receptors produce a search for reinforcers (Kurylo, 2004) or self-stimulation that may interrupt operant behavior. Wachtel, Brooderson, and White (1992) reported that rats administered SKF 38393 (a dopamine D1 agonist) tended to engage in self-grooming behaviors that displaced previously trained operant behavior. Therefore, it appears that an increase in dopamine activity at the level of the D1 receptor may hinder or interfere with the operant contingencies in place. However, animals treated with a low dose of SKF 38393 displayed locomotor activity across a wider span of locations, rather than in the same locations, in an open-field experiment (Eilam, Clements, & Szechtman, 1991). The wider area of locomotion might be viewed as greater variability in behavior, although it is not clear that this is operant behavior, since no specific contingency is apparent. Unlike the D1-like receptors, activated D2-like receptor subtypes inhibit the stimulation of adenylyl cyclase and therefore decrease the rate at which cyclic AMP is synthesized. This results in inhibition of the post-synaptic neuron (Missale et al., 1998). Moreover, activation of D2-like receptors also increases the likelihood that potassium ion channels will open, causing hyperpolarization and promoting further inhibition (Missale et al., 1998). The physiological functions of the D2-like receptors differ depending on where the receptors are located in the synapse. Pre-synaptically, the receptors act as autoreceptors: their activation decreases dopamine release into the synapse and thus diminishes dopaminergic activity. Stimulation of the post-synaptic D2-like receptors, however, results in increased D2 function (and post-synaptic inhibition). To complicate matters, D2 activity, while inhibitory at the level of the neuron, nevertheless results in increased locomotor activity and arousal (Missale et al., 1998). In the open-field experiment of Eilam, Clements, and Szechtman (1991), rats administered quinpirole (a dopamine D2 agonist) moved in the same areas repeatedly. Also, Szechtman, Sulis, and Eilam (1998) injected rats with quinpirole and noted that the animals tended to engage in "checking" behaviors. These behaviors included returning quickly, in a ritualistic manner, to areas already visited, with fewer stops along the way. The authors compared these types of behaviors with those of a person diagnosed with obsessive-compulsive disorder. It is possible to suggest that an increase of dopamine activity at the level of the D2 receptor subtype will result in stereotypic behavior that may interfere with a contingency requiring variable responding.
According to Kurylo (2004), alerting behaviors produce the release of dopamine, which activates the D1-like receptors; once the behaviors have become learned and the reward occurs predictably, dopamine levels decrease through the activation of the D2-like receptors. The D2-like receptors are located in the striatum, the olfactory tubercle, the core of the nucleus accumbens, and the pituitary gland, where the hormone prolactin is produced. Activating the D2 receptors in the pituitary gland inhibits the release of prolactin. Unlike the D1-like receptors, which are located on the periphery of the synapse, the D2 receptors form a dense layer within the synapse. Increases in dopamine concentration therefore result in saturation and activation of these receptors, and once levels of dopamine return to normal, the D2 receptors remain slightly activated. Schultz (1998) ascribed to the D2-like receptors the role of mediating reinforced behavior through the maintenance of focused responding to highly predictable reinforcers. When a response class has been reinforced, and has become well established because of that history, the D2-like receptors contribute to the perseveration of that response class (Kurylo, 2004). Kurylo (2004) administered quinpirole (a dopamine D2 agonist) to rats in an experiment measuring its perseverative effects. Water-deprived rats were trained to (1) nose-poke in a funnel and (2) approach a drinking spout in a well for the reinforcer, water. Once responding was established, an extinction procedure was imposed in which water was no longer available after the nose-poke in the funnel. Animals in the lower-dose group (0.08 mg/kg) and the control animals had lower rates of responding and did not complete the chain of behaviors learned in training; they remained either at the funnel or at the well, which suggests extinction of the previously reinforced response chain. However, rats administered the highest dose of quinpirole (0.60 mg/kg) continued to perform the operant response chain of a nose-poke in the funnel followed by the approach to the well. It appears that administration of a higher dose of quinpirole resulted in perseveration of the response chain; the animals continued to perform as they did in training. Therefore, quinpirole may block the effects of extinction through reduced sensitivity to changes in reinforcer availability. There are three major dopamine pathways. The nigrostriatal tract has cell bodies located in the midbrain; these cell bodies surround the substantia nigra and ascend to the caudate-putamen and the globus pallidus. The nigrostriatal tract is associated with the control of movement, and damage to it results in Parkinsonian-type effects (e.g., tremors, rigidity, etc.). The mesolimbic dopamine pathway begins in the ventral tegmental area and ascends into areas of the limbic system; this pathway is involved in reinforced behavior and in substance abuse. The third pathway is the mesocortical pathway, which also begins in the ventral tegmental area but instead ascends to areas of the cerebral cortex. The mesocortical pathway is concerned with motivation, problem solving, impulsivity, choice, and reinforcement learning (Meyer & Quenzer, 2005).

The Potential Role of Dopamine in Behavior Variability

There is little research on the neurochemical underpinnings of behavioral variability.
What research there is usually involves the developmental disorder known as attention-deficit/hyperactivity disorder (ADHD). Patients with this disorder tend to behave impulsively and exhibit a high degree of inattention; their attention span is very short, and they are easily distracted by their surroundings (American Psychiatric Association, 2000). Because patients with ADHD lack a focused attention span, their behavior tends to be variable, so drugs used to counteract symptoms of the disorder should also affect response variation. Psychomotor stimulants such as methylphenidate are used to treat the symptoms of ADHD. These drugs promote dopamine activity by inhibiting the dopamine transporters that take up excess dopamine from the synapse. Dopamine transporter (DAT) density is increased in the brains of ADHD patients compared with controls (Dougherty et al., 1999; Krause et al., 2000). Mook, Jeffrey, and Neuringer (1993) studied the spontaneously hypertensive rat (SHR), which has become one animal model of ADHD. These rats exhibit higher locomotor activity levels, more risk-seeking behaviors, more variable behavior, and a greater likelihood of approaching novel objects than the Wistar-Kyoto (WKY) strain, the background strain from which SHRs are bred. Both sets of rats, SHRs and WKYs, were exposed to a radial arm maze task and to a schedule of response variation similar to that of Page and Neuringer (1985, experiment one). Overall, the SHRs responded more variably than the WKYs. When reinforcement was contingent on entering only a subset of arms or producing a specific subset of lever responses, the WKYs were more accurate in responding within that particular subset. However, the SHRs were more likely to vary among the subset, thereby receiving more reinforcement than their counterparts. The SHRs showed learning deficits only when repetition was required, whether re-entering an arm in the maze or repeating a sequence of lever responses. Mook and Neuringer (1994) examined the effects of d-amphetamine on behavior under contingencies that reinforced variable or invariable responding in SHR and WKY rats. In experiment one, a variability procedure was in place, and reinforcement was contingent on producing a sequence of four lever-press responses that differed from the previous four sequences performed (Lag 4). SHRs behaved significantly more variably than the WKY rats when variability was directly reinforced. Small but peculiar effects of amphetamine were seen in both the SHRs and the WKY rats, in that both groups behaved more variably than those injected with saline. In experiment two, "repetition" was measured: only a subset of four out of sixteen sequences could be performed, and the sequence had to be different from the previously performed sequence (Lag 1) for reinforcer delivery. In this second experiment, the SHRs administered amphetamine behaved similarly to the WKYs under control conditions, in that they were accurate in repeating among the subset of lever-press sequences when the repetition contingency was in place. This result parallels the effects of methylphenidate and other stimulants, which lead children diagnosed with ADHD to perform similarly to children without the disorder. The fact that these drugs promote dopamine activity suggests a link between dopamine and response variation.
An increase of dopamine in the synapse could be associated with a decrease in the variability of responding. Explanations for this phenomenon may be related to the theory of the predictive reward signal of dopamine. Historically, the mesolimbic and mesocortical dopamine pathways have been linked to reinforcer delivery. More recently, Schultz (1998) argued that dopamine is involved in changes in the predictability of reinforcer delivery. Early in training there is a burst of dopamine in nerve terminals in the striatum, nucleus accumbens, and frontal cortex. This burst is associated with stimuli that precede reinforcers and thereby become conditioned reinforcers, as well as with reinforcer delivery itself. After repeated, reliable pairings between stimuli and the reinforcer, the dopamine burst generally declines; dopamine is still released during primary reinforcement, but to a lesser degree. When the reward is presented unexpectedly, without the predictive stimulus, there is a burst in dopamine. Also, if the reinforcer fails to occur after the presentation of the predictive stimulus, an inhibition of dopamine release occurs at the time the reward would normally have been presented. This suggests that dopamine neurons track the occurrence and timing of reward presentation and that the behavior of dopaminergic neurons is highly sensitive to "surprises." It follows that unpredictability of reward presentation may be important in the dopamine system. Blockade of dopamine reuptake by drugs such as cocaine or amphetamine increases the level of dopamine in the synapse. With the added bursts of dopamine after the delivery of a primary reward and the presentation of the predictive stimuli, the concentration of dopamine will be much greater than without the drug. A stronger, or broader, association will therefore form between the reward and the predictive stimuli, along with any other stimuli that happen to be salient to the animal, and the reward value will also strengthen with an increased dopamine signal. A dopamine agonist will enhance the dopamine release that follows the many cues that may or may not be useful in predicting the reinforcer. This suggests that response variability may be reduced by an increase in dopamine: as the lag requirement of a session increases, the predictive capability of the reward presentation declines, and other cues (e.g., length of the sequence, time until reward delivery, lag requirement, etc.) may interfere with the variation in sequences that the variability procedure requires. A reduction of dopamine in the synapse will either enhance variability in responding or have no effect, because once the relationship between the predictive stimuli and the reward has been established during training, an increase in the dopamine signal is not necessarily needed for the responding to occur; the predictive stimulus still occurs before reward presentation. However, it is difficult to predict what will happen. If dopamine promotes, or "stamps in," reinforced responding, then it will promote the repetition of a response sequence; this is consistent with the definition of a reinforcer. If, however, response variability is an operant, then reinforcement, and dopamine, would increase its appearance. Finally, if dopamine is involved with "surprise," then how can surprises occur with random responding?
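The prediction-error account sketched above is often summarized (in notation not used in the thesis) by saying that the phasic dopamine response is proportional to

$$\delta_t = r_t - \hat{r}_t,$$

where $r_t$ is the reward received at time $t$ and $\hat{r}_t$ is the reward predicted from the preceding stimuli: bursts correspond to $\delta_t > 0$ (unexpected reward), baseline firing to $\delta_t \approx 0$ (fully predicted reward), and pauses to $\delta_t < 0$ (omitted reward).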
The Control Procedure

The selection of an appropriate control procedure in experiments on reinforced variability is not straightforward. Several have been reported in the literature, and the control procedure used in the present studies differs somewhat from those reported previously. In one type of control procedure, a specific sequence of responses is required for reinforcement. In two studies using rats (McElroy & Neuringer, 1990; Cohen, Neuringer, & Rhodes, 1990), a specific four-response sequence (LLRR) was required for reinforcement, while another study trained pigeons to peck a specific four-response sequence (RRLL) as one component of a multiple schedule (Odum et al., 2006). This type of control procedure takes the animals a relatively long time to learn: the number of one-hour sessions needed to train the repeat component was 12-18 (Odum et al., 2006), 30 (Cohen et al., 1990), and as high as 62 (McElroy & Neuringer, 1990). More important, this rigid requirement of a specific sequence does not permit a predilection for repetition to appear. Drug effects that impair accuracy by producing, for example, RRRL or RLLL sequences permit one to draw conclusions only about accuracy or tendencies to change over, not about variability or repetition. Another type of control procedure might be an FR 4 on a single lever. This was considered and rejected because consistency between the two components in the response chains that could occur was preferred; the rats had access to two levers in both components. Drug effects may not impair or enhance responding on a single lever except with respect to rate of responding, and because fixed-ratio response rates on a single lever are not what the present study wishes to examine, this option was not used. It is, however, of interest to see whether variability increases or decreases in both components, and with only a single lever, the rats cannot behave variably. A third type of control procedure used in conjunction with variability procedures is a fixed ratio of a desired number of responses distributed in any way between the left and right levers. When investigating the difference between repetitive and variable responding in rats administered amphetamine, Mook and Neuringer (1994) employed an FR 4 requirement across two levers as the control component; training the FR 4 took only four half-hour sessions. (The variability-component requirement was an FR 4 on the two levers that had to differ from the previous four sequences emitted, i.e., Lag 4.) Hunziker, Saldana, and Neuringer (1996) also trained an FR 4 requirement across two levers, which took seven 45-minute sessions. Because of the shortened training times and the simplified requirement, the FR 4 with no specific contingencies attached was implemented as the control for the present study. This approach is appealing because the same number of responses is required, and animals are able to change levers at any time without violating the reinforcement criterion, so variability can either increase or decrease in this arrangement. Thus, a broader range of non-specific drug effects can be captured more directly.

References

American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders, text revision (DSM-IV-TR) (4th ed.). Arlington, VA: American Psychiatric Publishing.

Antonitis, J. J. (1951).
Response variability in the white rat during conditioning, extinction, and reconditioning. Journal of Experimental Psychology, 42, 273-281.

Arnsten, A., Cai, J., Steere, J., & Goldman-Rakic, P. (1995). Dopamine D2 receptor mechanisms contribute to age-related cognitive decline: The effects of quinpirole on memory and motor performance in monkeys. Journal of Neuroscience, 15, 3429-3439.

Baum, W. M. (1974). On two types of deviation from the matching law: Bias and undermatching. Journal of the Experimental Analysis of Behavior, 22, 231-242.

Blough, D. S. (1966). The reinforcement of least frequent interresponse times. Journal of the Experimental Analysis of Behavior, 9, 581-591.

Bryant, D., & Church, R. M. (1974). The determinants of random choice. Animal Learning & Behavior, 2, 245-248.

Cohen, L., Neuringer, A., & Rhodes, D. (1990). Effects of ethanol on reinforced variations and repetitions by rats under a multiple schedule. Journal of the Experimental Analysis of Behavior, 54, 1-12.

Davison, M., & McCarthy, D. (1988). The matching law. Hillsdale, NJ: Lawrence Erlbaum Associates.

Dougherty, D. D., Bonab, A. A., Spencer, T. J., Rauch, S. L., Madras, B. K., & Fischman, A. J. (1999). Dopamine transporter density in patients with attention deficit hyperactivity disorder. Lancet, 354, 2132.

Eckerman, D. A., & Lanson, R. N. (1969). Variability of response location for pigeons responding under continuous reinforcement, intermittent reinforcement, and extinction. Journal of the Experimental Analysis of Behavior, 12, 73-80.

Eilam, D., Clements, K. V., & Szechtman, H. (1991). Differential effects of D1 and D2 dopamine agonists on stereotyped locomotion in rats. Behavioural Brain Research, 45, 117-124.

Grunow, A., & Neuringer, A. (2002). Learning to vary and varying to learn. Psychonomic Bulletin & Review, 9, 250-258.

Herrnstein, R. J. (1970). On the law of effect. Journal of the Experimental Analysis of Behavior, 13, 243-266.

Hunziker, M. H. L., Saldana, R. L., & Neuringer, A. (1996). Behavioral variability in SHR and WKY rats as a function of rearing environment and reinforcement contingency. Journal of the Experimental Analysis of Behavior, 65, 129-143.

Krause, K. H., Dresel, S. H., Krause, J., Kung, H. F., & Tatsch, K. (2000). Increased striatal dopamine transporter in adult patients with attention deficit hyperactivity disorder: Effects of methylphenidate as measured by single photon emission computed tomography. Neuroscience Letters, 285, 107-110.

Kurylo, D. D. (2004). Effects of quinpirole on operant conditioning: Perseveration of behavioral components. Behavioural Brain Research, 155, 117-124.

Machado, A. (1997). Increasing the variability of response sequences in pigeons by adjusting the frequency of switching between two keys. Journal of the Experimental Analysis of Behavior, 68, 1-25.

McElroy, E., & Neuringer, A. (1990). Effects of alcohol on reinforced repetitions and reinforced variation in rats. Psychopharmacology, 102, 49-55.

Meyer, J. S., & Quenzer, L. F. (2005). Psychopharmacology: Drugs, the brain, and behavior. Sunderland, MA: Sinauer Associates.

Miller, G. A., & Frick, F. C. (1949). Statistical behavioristics and sequences of responses. Psychological Review, 56, 311-324.

Missale, C., Nash, S. R., Robinson, S., Jaber, M., & Caron, M. (1998). Dopamine receptors: From structure to function. Physiological Reviews, 78, 189-225.

Mook, D. M., Jeffrey, J., & Neuringer, A. (1993).
Mook, D. M., & Neuringer, A. (1994). Different effects of amphetamine on reinforced variations versus repetitions in spontaneously hypertensive rats (SHR). Physiology & Behavior, 56, 939-944.

Neuringer, A. (1991). Operant variability and repetition as functions of interresponse time. Journal of Experimental Psychology: Animal Behavior Processes, 17, 3-12.

Neuringer, A. (1992). Choosing to vary and repeat. Psychological Science, 3, 246-250.

Neuringer, A. (2002). Operant variability: Evidence, functions, and theory. Psychonomic Bulletin & Review, 9, 672-705.

Neuringer, A., Kornell, N., & Olufs, M. (2001). Stability and variability in extinction. Journal of Experimental Psychology: Animal Behavior Processes, 27, 79-94.

Odum, A. L., Ward, R. D., Barnes, C. A., & Burke, K. A. (2006). The effects of delayed reinforcement on variability and repetition of response sequences. Journal of the Experimental Analysis of Behavior, 86, 159-179.

Page, S., & Neuringer, A. (1985). Variability is an operant. Journal of Experimental Psychology: Animal Behavior Processes, 11, 429-452.

Pryor, K. W., Haag, R., & O'Reilly, J. (1969). The creative porpoise: Training for novel behavior. Journal of the Experimental Analysis of Behavior, 12, 653-661.

Schoenfeld, W. N., Harris, A. H., & Farmer, J. (1966). Conditioning response variability. Psychological Reports, 19, 551-557.

Schultz, W. (1998). Predictive reward signal of dopamine neurons. Journal of Neurophysiology, 80, 1-27.

Schwartz, B. (1980). Development of complex stereotyped behavior in pigeons. Journal of the Experimental Analysis of Behavior, 33, 153-166.

Schwartz, B. (1982a). Failure to produce variability with reinforcement. Journal of the Experimental Analysis of Behavior, 37, 171-181.

Schwartz, B. (1982b). Reinforcement-induced behavioral stereotypy: How not to teach people to discover rules. Journal of Experimental Psychology: General, 111, 23-59.

Stokes, P. D. (1995). Learned variability. Animal Learning and Behavior, 23, 164-176.

Szechtman, H., Sulis, W., & Eilam, D. (1998). Quinpirole induces compulsive checking behavior in rats: A potential animal model of obsessive-compulsive disorder. Behavioral Neuroscience, 112, 1475-1485.

Wachtel, S. R., Brooderson, R. J., & White, F. J. (1992). Parametric and pharmacological analyses of enhanced grooming response elicited by the D1 receptor agonist SKF38393 in the rat. Psychopharmacology, 109, 41-48.

Zeiler, M. (1977). Schedules of reinforcement: The controlling variables. In W. K. Honig & J. E. R. Staddon (Eds.), Handbook of operant behavior (pp. 201-232). Englewood Cliffs, NJ: Prentice Hall.

EXPERIMENTS

Abstract

Variability in behavior can be measured and reinforced, but its behavioral and pharmacological mechanisms are poorly understood. Dopamine agonists such as d amphetamine have been shown to induce both variability and stereotypy, so this drug and more selective dopamine agonists were used to examine the behavioral pharmacology of variability. In Experiment 1, Long Evans rats were trained under a multiple VARY 8:4 FR 4 schedule. In the VARY 8:4 component, all four-response sequences that differed from the previous 8 were reinforced, and variability (measured by entropy) was high. In the FR 4 component, all four-response sequences were reinforced and variability was low.
d Amphetamine (0.3 to 3.0 mg/kg) reduced the high variability in the VARY 8:4 component and increased the low variability in the FR 4 component. The D2 agonist quinpirole (0.01 to 0.3 mg/kg) increased variability in the FR 4 component to levels approaching those in the VARY 8:4 component. The D1 agonist SKF 38393 (1 to 17 mg/kg) decreased variability at high doses. In Experiment 2 the overall reinforcement rate was held constant by reinforcing correct sequences under a variable interval 60" (VI 60") schedule of reinforcement. This intermittent schedule elevated variability in the FR 4 component and attenuated both the difference between the two components and the differential effects of d amphetamine.

Overall, amphetamine's effects on entropy were baseline-dependent, with high levels being decreased and low levels increased. Reduced variability was associated with D1 activation, and increased variability was associated with D2 activation. Establishing a behavioral baseline in which the target response, variable sequencing, is reinforced frequently produces the wide range of variability required to detect specific drug effects.

INTRODUCTION

The Role of Dopamine Receptor Subtypes and Intermittent Reinforcement in Reinforced Variability

Variability in behavior is sensitive to its consequences; that is, it is an operant (Blough, 1966; Bryant & Church, 1974; Pryor, Haag, & O'Reilly, 1969; Schoenfeld, Harris, & Farmer, 1966; Schwartz, 1980, 1982). This response property is also sensitive to disruptors such as pre-feeding and non-contingent food (Doughty & Lattal, 2001), extinction (Eckerman & Lanson, 1969; Stokes, 1995; Neuringer, Kornell, & Olufs, 2001), and drugs that act on dopaminergic (Mook & Neuringer, 1994) or GABAergic (Cohen, Neuringer, & Rhodes, 1990; McElroy & Neuringer, 1990) neurotransmitter systems.

Mook and Neuringer (1994) examined the effects of daily administration of d amphetamine on behavior under contingencies that reinforced variable or sequence-specific (less variable) responding in spontaneously hypertensive rats (SHR) and their background strain, the Wistar-Kyoto rat (WKY). Under baseline conditions, the SHR rats performed more poorly (i.e., more variably) than the WKY rats in the sequence-specific condition but better under the "variability" condition. Daily d amphetamine eliminated this performance difference between the two strains by increasing variability in the WKYs when variable responding was reinforced and increasing repetition in the SHRs when repetitive responding was reinforced. This suggests that a dopamine re-uptake inhibitor can influence operant variability and that the direction of the effect depends upon the reinforcement contingencies or baseline levels of variability. Interestingly, genetic differences in baseline rates could be overcome by pharmacological or behavioral interventions.

A component of variable responding is switching to a different response device. Acute and chronic amphetamine administration was examined using an FCN-8 procedure, in which a sequence of 8 responses on one lever, followed by a single response on a second lever, was reinforced (Laties, Wood, & Rees, 1981; Rees, Wood, & Laties, 1985). This schedule involves repetition on one lever and then switching to a second lever for reinforcement. Acute administration of d amphetamine produced premature switching unless a stimulus signaled the completion of the 8-response sequence. Daily administration of d amphetamine attenuated this performance decrement (Rees, Wood, & Laties, 1987).
The acute studies indicate that a dopamine agonist increases switching and, unlike the Mook and Neuringer observation, does not increase repetition even when repetition is reinforced, except when behavioral tolerance was allowed to develop. Evenden and Robbins (1983) examined the effects of d amphetamine on response switching and repetition in four different contexts. Like Mook and Neuringer, they found that the effects depended not only on drug dose but also on the response context and the baseline probabilities of the responses. Lower doses increased switching behavior while higher ones increased perseverative behavior. When the response cost was similar for the two choices, "repeat" or "switch," amphetamine increased switching between the two. Switching had a higher probability of occurrence than repetition in the control conditions, and this effect was further enhanced when amphetamine was administered: switching increased and repetition decreased.

The present study was designed to determine the role of dopamine-receptor subtypes in operant variability by systemic administration of drugs selected to target specific receptor systems. d Amphetamine is a dopamine re-uptake inhibitor, so it promotes activity at all dopamine receptor subtypes. SKF 38393 and quinpirole are, respectively, direct dopamine D1-receptor and D2-receptor agonists. A "variability" baseline and an "FR" baseline were presented to the rats under a multiple schedule arrangement. In the VARY 8:4 component, all four-response sequences that differed from the previous 8 were reinforced. In the FR 4 component, all four-response sequences were reinforced. In Experiment 1, every criterion sequence was reinforced. In Experiment 2, reinforcement was intermittent so that the overall reinforcement rate (reinforcers per minute and, especially, reinforcers per sequence, whether criterion or not) was held approximately constant.

EXPERIMENT 1

Method

Subjects

Nineteen 5-month-old male Long-Evans rats were purchased from Harlan and housed two per cage, separated by a plastic divider, in a room on a 12-hour light-dark cycle (lights on at 6:00 A.M.). The rats had free access to water except during the experimental session and were maintained at a weight of about 300 grams with afternoon feedings (at least 1 hour after experimental sessions). The colony meets all PHS standards and is AAALAC accredited. All procedures were approved by the Auburn University Institutional Animal Care and Use Committee.

Apparatus

The experiments were conducted in 10 commercially purchased operant chambers (Med Associates Inc., model ENV-007) containing two front levers, each calibrated so that a force of about 0.20 N registers a press. A pellet dispenser is situated midway between the two front levers, and the reinforcer is a 45-mg sucrose pellet (Purina Mills, Inc., St. Louis, MO). Sonalert® tones (2900 and 4500 Hz, nominally) are calibrated to an amplitude of 70 dBC. A house light (28 V, 100 mA) is located midway at the top of the back wall, opposite the levers, and a light-emitting diode (LED) is located above each lever. Dimensions of the chamber are 12" L x 9 ½" W x 11 ½" H. The standard grid floor was covered with a secured piece of Plexiglas, which covered all but the back inch of the floor; this piece of Plexiglas was in place for the purposes of other experiments running in the laboratory. Each chamber was surrounded by a sound-attenuating cabinet with a built-in ventilating fan that circulates air into the experimental environment and provides masking white noise.
Programs for experimental procedures and data collection were written using MED-PC IV (Med Associates, Georgia, VT). Session events were recorded with 0.01" resolution.

Preliminary Training

Lever-pressing was established using autoshaping on both levers, hereafter designated as Left (L) and Right (R), as described below. The rats were then trained to execute a four-response sequence on the two levers. The onset of the session was signaled by the illumination of the house light and lever lights. A VARY N:4 component and an FR 4 component were presented in strict alternation as a multiple schedule, with the component changing after 10 reinforcers. To meet criterion in the VARY N:4 component, the current four-response sequence had to differ from each of the previous N sequences (where N ≤ 8). For example, if the sequence was LLRL, then a reinforcer would be delivered only if none of the previous N sequences was LLRL. In the FR 4 component, any four-response sequence was reinforced. The reinforcement cycle commenced immediately after the animal met the response requirement: the lever lights and low tone turned off, a high tone (4500 Hz) sounded for 0.5", and a 45-mg sucrose pellet was delivered. The session lasted one hour or until 100 reinforcers had been presented.

The FR 4 component, associated with a low tone (2900 Hz, nominally), was trained first by gradually increasing the response requirement from one to four. Training the VARY 1:4 condition, indicated by the absence of the low tone, began immediately after the animals reliably executed a four-response sequence. Initially, the four-response sequence had to differ only from the previous sequence (VARY 1:4). This is called a "lag 1" criterion because only one previous sequence is considered. The reinforcement cycle began when the animal met the lag requirement. Non-criterion trials ended in a 15-second timeout, during which all lights in the chamber darkened and no pellet was delivered. When at least half of the trials met criterion for at least 5 sequential sessions, the lag was increased by one sequence to a lag 2 criterion for the next session, making the schedule a VARY 2:4. Then a lag 3 was imposed, and so on up to a lag 8. The maximal number of different possible sequences with four responses and two levers is 2^4, or 16. If responding produced more errors than reinforcers for 5 consecutive days at a specific lag, the requirement was lowered to the previous lag that had been achieved.
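The lag contingency can be summarized compactly in code. The following Python sketch is illustrative only: the experiment itself was programmed in MED-PC, the function names are invented here, and the choice to enter non-criterion sequences into the lag window is an assumption, since the text does not specify how failed sequences are treated.

```python
from collections import deque

def make_lag_checker(lag=8):
    """Return a checker for a VARY N:4 (lag-N) criterion: the current
    four-response sequence meets criterion only if it differs from each
    of the previous `lag` sequences emitted."""
    history = deque(maxlen=lag)  # sliding window of the last `lag` sequences

    def meets_criterion(sequence):
        # `sequence` is a 4-character string of lever presses, e.g., "LLRL".
        ok = sequence not in history
        # Assumption: every emitted sequence, criterion or not, enters
        # the lag window.
        history.append(sequence)
        return ok

    return meets_criterion

# Under a lag-2 (VARY 2:4) criterion, an immediate repeat fails:
check = make_lag_checker(lag=2)
print(check("LLRL"))  # True: no history yet
print(check("LLRL"))  # False: matches the previous sequence
print(check("RRLL"))  # True: differs from both sequences in the window
print(check("LLRL"))  # False: LLRL is still within the lag-2 window
```

Under the FR 4 contingency, by contrast, the checker would simply return True for every completed four-response sequence.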
Drug Administration

d Amphetamine sulfate (0.3-3.0 mg/kg), SKF 38393 HCl (1-17 mg/kg), and quinpirole HCl (0.01-0.3 mg/kg) were dissolved in 0.9% saline solutions. All doses were measured as the salt. The saline vehicle served as a control for the drug injections. Rats were placed into the experimental chamber immediately after injection, and the session began ten minutes later. Drugs were administered i.p. on Tuesdays and Fridays; Mondays and Wednesdays served as non-injected control days, and Thursdays served as vehicle control days. Rats were divided into two groups of nine and ten, approximately matched for response rate and for variability as denoted by individual U-values (described below). The first group received injections of d amphetamine and then the lower range of quinpirole doses (0.01, 0.03, and 0.056 mg/kg); the second group received injections of SKF 38393 and then the higher range of quinpirole doses (0.1, 0.17, and 0.3 mg/kg). The use of two groups for the quinpirole dosing made it possible to span a wide range of doses in a reasonable amount of time. The broad range was necessary because low doses of quinpirole stimulate autoreceptors while higher doses stimulate post-synaptic receptors (Arnsten, Cai, Steere, & Goldman-Rakic, 1995).

The determination of acute dose-effect curves began after behavior stabilized and showed stimulus control in both the VARY 8:4 and FR 4 conditions. Each animal's performance during the drug session was inspected at the end of the session. If a dose decreased overall response rates to below 20% of baseline rates, then a higher dose was not given.

Statistical Analyses

All statistical analyses were performed using SYSTAT® 11 (SYSTAT Software Inc., Richmond, CA, USA). The dependent measures were (1) the total number of responses performed in each component at each dose, and (2) the U-value, or entropy, an index of variability in the sequences produced. The U-value is an index of overall sequence variability that was introduced by Page and Neuringer (1985). The U statistic is given by

\[ U = \frac{-\sum_{i=1}^{16} p_i \log_2 (p_i)}{\log_2 n} \]

where p_i is the probability of a given sequence i and n is the total possible number of sequences. In the present study, n = 16 (2^4). A U-value of 1.00 signifies that all possible sequences were emitted equally often, while a U-value of 0.00 signifies that only a single sequence was produced.

A multivariate repeated-measures analysis of variance (RMANOVA) was performed for each dependent variable. For each drug, dose served as the within-subjects factor. Dose by component (VARY 8:4 or FR 4) interactions were also calculated. Post hoc tests were performed using Dunnett's comparisons against a single control. P values greater than .10 were considered non-significant and are not reported.
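The U-value computation is a direct transcription of the equation above. The sketch below is for illustration only (the analyses themselves were run in SYSTAT); the function name and the example data are invented here.

```python
import itertools
import math
from collections import Counter

def u_value(sequences, n_possible=16):
    """U value of Page and Neuringer (1985):
    U = -sum(p_i * log2(p_i)) / log2(n_possible),
    where p_i is the observed probability of sequence i."""
    counts = Counter(sequences)
    total = len(sequences)
    # log2(total/c) equals -log2(p_i), so this sum is Shannon entropy in bits.
    entropy = sum((c / total) * math.log2(total / c) for c in counts.values())
    return entropy / math.log2(n_possible)

# A stereotyped performance yields U near 0; an equiprobable one yields 1.
print(u_value(["LLLL"] * 50))                 # 0.0: a single sequence
print(u_value(["LLLL"] * 45 + ["LLLR"] * 5))  # ~0.12: mostly one sequence
all_16 = ["".join(s) for s in itertools.product("LR", repeat=4)]
print(u_value(all_16 * 10))                   # 1.0: all 16 equally often
```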
Results

Baseline

Baseline performances were markedly different between the two components for most animals. Figure 1 shows histograms displaying the entropy and frequency of responding for each of the 16 possible sequences in both components on a non-injected control day. In the VARY 8:4 component, responses were distributed between the two levers and U-values ranged from .80 to .91, signifying a high degree of variability in the sequences produced. In the FR 4 component, U-values were distinctly lower, ranging from 0 to 0.38; U-values were close to 0.00 when only one or two sequences occurred.

d Amphetamine

There was a main effect of dose, F(5, 40) = 3.02, p < .05, and of component, F(1, 8) = 26.35, p < .001, on overall response rate. Post hoc analyses showed a dose-related decrease only at the highest dose (Fig. 2). There was a main effect of component on entropy, F(1, 6) = 32.99, p < .001, and a marginally significant effect of dose on entropy, F(5, 30) = 2.15, p = .09 (Fig. 2). The interaction between dose and component on entropy was significant, F(5, 30) = 9.07, p < .001. A biphasic effect of amphetamine on entropy occurred in the FR 4 component, with moderate doses increasing entropy and the highest dose (3.0 mg/kg) decreasing it.

SKF 38393

There was a main effect of dose, F(5, 45) = 18.24, p < .001, of component, F(1, 9) = 51.50, p < .001, and an interaction, F(5, 45) = 7.45, p < .001, on overall responding (Fig. 3). Increasing doses of SKF 38393 decreased responding in both components. For entropy, there was a main effect of dose, F(5, 40) = 2.83, p < .05, and of component, F(1, 8) = 35.52, p < .001; the interaction was not significant. Entropy was reduced in both components at the highest dose, which also reduced response rate to below 20% of control rates (Fig. 3).

Quinpirole

Low dose group (0.01-0.056 mg/kg). In the VARY 8:4 component, low doses of quinpirole produced a biphasic effect on total responding, with slight increases at low doses and decreases at higher doses. There was a main effect of dose, F(4, 28) = 11.96, p < .001, of component, F(1, 7) = 24.66, p < .01, and an interaction, F(4, 28) = 6.59, p < .001, on overall responding (Fig. 4). For entropy, there was a main effect of dose, F(4, 24) = 4.15, p < .01, of component, F(1, 6) = 49.45, p < .001, and a marginally significant interaction, F(4, 24) = 2.57, p = .06.

High dose group (0.1-0.3 mg/kg). In the high dose group there was a main effect of dose, F(4, 36) = 14.99, p < .001, of component, F(1, 9) = 40.44, p < .001, and an interaction, F(4, 36) = 3.03, p < .05, on overall responding, with dose-related decreases occurring especially in the VARY 8:4 component (Fig. 4). For entropy, there was a main effect of dose, F(4, 32) = 11.63, p < .001, of component, F(1, 8) = 63.93, p < .001, and an interaction, F(4, 32) = 10.71, p < .001 (Fig. 4). In the FR 4 component, entropy increased in a dose-related fashion, approaching that of the VARY 8:4 component at higher doses, while entropy in the VARY 8:4 component was unaffected by quinpirole. Figure 5 presents the sequences produced in both the VARY 8:4 and FR 4 components at the 0.17 mg/kg dose; the distinction between entropy in the two components has disappeared.

Discussion

Behavioral variability, as indicated by the entropy score, differed between the two components. When every variable sequence was reinforced, variability increased as compared with a component in which any sequence was reinforced. The preferred sequence differed across animals, but most showed a preference for one or two sequences in the FR 4 component.

Amphetamine's effects on operant variability depended upon the baseline value. Under the VARY 8:4 component, variability was high and amphetamine reduced it. When variability was low and responding was more stereotyped, as during the FR 4 component, d amphetamine increased variability. Quinpirole increased variability in the FR 4 component but had no effect in the VARY 8:4 component. Like amphetamine, SKF 38393 decreased variability, but only at the highest dose, which also produced large reductions in response rate; SKF 38393 decreased response rate in both the VARY and the FR components at 3 mg/kg and higher, doses that otherwise had no effect on variability in either component. Taken together, the fact that rate and variability can be affected separately shows that these are independent properties of behavior.

In the present experiment, the reinforcer count per criterion sequence was held constant between the two components, but reinforcer delivery per unit of time (or per sequence produced) was not controlled. The discrepancies noted in the drug effects between the two components may therefore have been due to the overall reinforcer rate rather than to the vary contingency itself. This issue was addressed in Experiment 2.

EXPERIMENT 2

In Experiment 1, response variability was examined in a VARY 8:4 and an FR 4 component. In each component the reinforcer count was held constant at ten and every criterion sequence was reinforced, but reinforcer delivery per unit of time was not controlled.
Animals required more time and more sequences to obtain the ten reinforcers in the VARY 8:4 component because non-criterion sequences could, and did, occur. In contrast, every four-response sequence was reinforced in the FR 4 component. This arrangement had the advantage of producing a large difference in variability between the components, but it did so at the expense of different overall rates of reinforcement per unit time. To address this issue, the primary reinforcement rate and its intermittency were held constant for both components by issuing reinforcers on a variable interval 60" (VI 60") schedule.

Method

Subjects

Eighteen male Long-Evans rats from Experiment 1 were used in Experiment 2.

Apparatus

The same chambers from the first experiment were used in Experiment 2.

Preliminary Training

This experiment began after Experiment 1. A VI 60" schedule was imposed so that, on average, a criterion sequence was reinforced once every 60", but the actual interreinforcer interval varied unpredictably from 3.1 to 198 seconds. Criterion sequences that occurred before the interval had elapsed were followed by the high tone only, and the interval continued to run during these times. The components alternated after 10 reinforcers were earned in each. The session lasted one hour. Other details, including drug administration, were as in Experiment 1.

The determination of acute dose-effect curves began after behavior stabilized and showed stimulus control in both the VARY 8:4 and FR 4 conditions. Each animal's performance during the drug session was inspected at the end of the session. If a dose decreased overall response rates to below 20% of baseline rates, then a higher dose was not given. Only d amphetamine was used; the similarity in entropy under the FR and VARY baselines suggested that further investigation with the other drugs was not warranted.
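How the VI 60" schedule gates reinforcement of criterion sequences can be sketched as follows. The thesis reports only the 3.1-198 s range around the 60-s mean, so the truncated exponential draw and the function names below are illustrative assumptions, not the Med-PC implementation.

```python
import random

def draw_vi_intervals(mean_s=60.0, n=10, lo=3.1, hi=198.0, seed=2):
    """Draw illustrative inter-reinforcer intervals for a VI 60" schedule.
    The truncated exponential is an assumption; VI schedules are often
    programmed instead from a fixed list of intervals."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        t = rng.expovariate(1.0 / mean_s)
        if lo <= t <= hi:
            draws.append(round(t, 1))
    return draws

def consequence(interval_elapsed, sequence_is_criterion):
    """A criterion sequence produces the pellet (with the tone) only once
    the current interval has elapsed; before that it produces the tone
    alone, and the interval keeps running."""
    if not sequence_is_criterion:
        return "timeout"  # VARY component only; FR 4 has no timeouts
    return "pellet + tone" if interval_elapsed else "tone only"

print(draw_vi_intervals())
print(consequence(interval_elapsed=False, sequence_is_criterion=True))  # tone only
print(consequence(interval_elapsed=True, sequence_is_criterion=True))   # pellet + tone
```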
Results

Baseline

In both the VARY 8:4 and FR 4 components, responses were distributed across the two levers and U-values remained stable and high, around .86-.87, signifying a high degree of variability in the sequences produced in both components.

d Amphetamine

There was a main effect of dose, F(5, 75) = 3.68, p < .01, and of component, F(1, 15) = 123.08, p < .001, on overall response rate, but no interaction. A dose-related decrease in response rates occurred in both components. For entropy, there was a main effect of dose, F(5, 75) = 2.39, p < .05, and of component, F(1, 15) = 55.22, p < .001. There was a statistical interaction, F(5, 75) = 3.70, p < .01, but the magnitude of the effect was very small (Fig. 6). Overall, there was a small dose-related decrease in entropy in the VARY 8:4 component and non-monotonic changes in entropy in the FR 4 component.

Discussion

During baseline, when the VI 60" was in place, responding in the FR 4 component became more variable than in Experiment 1, while that in the VARY 8:4 component was unchanged, so the difference in variability between the two components diminished, although it remained statistically significant. The attenuated difference was likely due to the close link between criterion responses and reinforcement in Experiment 1 and the weaker link in Experiment 2. In Experiment 1, the reinforcer was delivered after every criterion sequence of four responses. In Experiment 2, the reinforcer would sometimes follow the completion of a four-response sequence, but most sequences were not reinforced, although correct sequences were followed by a brief tone. This difference was especially dramatic in the FR 4 component, since every sequence was reinforced in Experiment 1 while in Experiment 2 reinforcement intermittency was more similar to that in the VARY 8:4 component. The level of variability in the VARY 8:4 component remained unchanged with the addition of the VI 60" schedule.

Experiment 2 provides further support for the idea that amphetamine's effects on operant variability depend upon the baseline value. Under both the VARY 8:4 and the FR 4 components, variability was high during baseline and amphetamine reduced it. The addition of reinforcer intermittency did not appear to influence amphetamine's effects.

General Discussion

The present report replicates and extends previous reports that behavioral disruptors, including ethanol (Cohen, Neuringer, & Rhodes, 1990; McElroy & Neuringer, 1990), pre-feeding, and non-contingent food (Doughty & Lattal, 2001), all increase variability under conditions in which variability is low. In those studies, a variability contingency was compared with one in which a specific response sequence was reinforced. In the present study, a variability contingency was compared with one in which any four-response sequence was reinforced, so stereotypy, the execution of a limited number of response sequences, was permitted but not required.

Amphetamine's effects on operant variability depended upon the baseline value. This could be interpreted as a pharmacologically specific effect of the drug or as non-specific disruption. For example, cocaine is a behavioral stimulant that, like d amphetamine, is a dopamine re-uptake inhibitor with rate-dependent effects on baseline response rate. Howell, Byrd, and Marr (1986) found that white noise and cocaine have similar disrupting effects on behavior under an FI schedule of reinforcement, suggesting either that cocaine's effects were nonspecific, i.e., that it acted as a generic disruptor, or that such disruptions are mediated by dopamine receptors. In the reinforced variability task, the effects of food satiation and noncontingent food are also dependent upon baseline levels of variability (Doughty & Lattal, 2001): both of these disruptors increased variability in a component in which the same response sequence (LRLR) was reinforced while leaving operant variability intact.

Evidence that amphetamine's effects are pharmacologically specific, rather than the result of nonspecific disruption, comes from the effects of drugs that act at D1 and D2 receptor subtypes. Quinpirole (a D2 agonist) increased variability in the FR 4 component even as it reduced response rates. Several studies (Arnsten et al., 1995; Eilam et al., 1992; Einat & Szechtman, 1995) reported that lower doses of quinpirole (e.g., .0001-.01 mg/kg) reduce levels of locomotion, a "behavioral quieting," while higher doses (e.g., .05-.1 mg/kg) produce hyperactivity in the form of increased locomotion and bouts of responding. In the present study, quinpirole produced non-monotonic changes in response rates in the VARY 8:4 component, with a 15-20% increase at the lowest dose and decreases at higher doses. This suggests that the aforementioned, and apparently paradoxical, effects of quinpirole reflect two functional domains. The "quieting" at low doses may be related to the rate-reducing effects, while the "hyperactivity" and increased bouts of responding may draw from similar, but currently unknown, mechanisms that promote variable responding.
There were no quinpirole-related changes in response rate in the FR 4 component across a 20-fold range of doses, but variability increased in a dose-related fashion beginning at 0.03 mg/kg, a dose 10 times lower than the dose that reduced response rate in that component. d Amphetamine's effect on variability in the FR 4 component was thus reproduced by quinpirole, but not by SKF 38393. In the VARY 8:4 component, however, variability was unaffected by quinpirole even at doses that lowered response rate. These data suggest that promoting activity at D2 receptors increases variability (reduces stereotypy) under conditions in which variability is low but has no effect when variability is high. It therefore appears that the effects of d amphetamine may have occurred, in part, because of its action on dopamine D2 receptors.

In contrast, the SKF 38393 results suggest that D1-receptor stimulation reduces variability (increases stereotypy) and otherwise produces only reductions in response rate. One effect of d amphetamine, decreased variability when variability is high (in the VARY 8:4 component), was reproduced by SKF 38393, albeit at doses higher than those that reduced response rate. D1 agonists like SKF 38393 produce self-grooming or other repetitive movements that are unrelated to the reinforced behavior (Bratcher et al., 2005; Eilam et al., 1992; Wachtel, Brooderson, & White, 1992), behavior that is incompatible with lever-pressing and that therefore could be related to the lowered response rates and the diminished variability/increased stereotypy noted in the present report. Also, D1 activation may be linked to a general search for reinforcers rather than to sensitivity to the reinforcers present (Kurylo, 2004); in other words, SKF 38393 produces behavioral plasticity in a changing environment. In the present experiment, the decrease in response rate may have been due to more self-grooming produced by increasing doses of the drug, while the absence of a change in entropy in the two components may reflect the adaptation produced by the drug to the different contexts. Kurylo (2004) and Schultz (1998) attribute these different behavioral actions to neural mechanisms: D1 receptors mediate the immediate effects of dopamine released into the synapse, activating behavior quickly, while D2 receptors maintain dopamine levels and link changes in the environment to the presence or absence of reinforcers. On this account, SKF 38393 disrupted response rate, but not entropy, by inducing grooming, whereas quinpirole altered entropy because D2-receptor activation is sensitive to changes in the environment.

A mechanistic issue that deserves attention is whether variability is the response class being reinforced or whether it is the outcome of the indirect reinforcement of switching. One could view the procedure as comprising four operants: repetitive responding on each lever, changing from the left lever to the right, and changing from the right lever to the left. Machado (1997) questioned whether variability is the operant and sought to synthesize variable responding by directly reinforcing switching. Machado argued that even though reinforcing switching increased variability, variability did not reach the levels obtained when it was reinforced directly. It seems plausible that reinforced switching is a component of variable behavior, but not necessarily the only contingency operating.
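Machado's point can be illustrated with a small simulation, constructed for this discussion rather than taken from Machado's procedure: four-response sequences generated by a fixed per-response probability of switching levers reach the maximal U value only when switching and repeating are equally likely.

```python
import math
import random

def u_from_switch_probability(p_switch, n_trials=20000, seed=1):
    """Generate four-response sequences from a fixed probability of
    switching levers on each response and return their U value."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_trials):
        lever = rng.choice("LR")   # first response of the sequence
        seq = lever
        for _ in range(3):         # three more responses
            if rng.random() < p_switch:
                lever = "L" if lever == "R" else "R"
            seq += lever
        counts[seq] = counts.get(seq, 0) + 1
    entropy = sum((c / n_trials) * math.log2(n_trials / c)
                  for c in counts.values())
    return entropy / math.log2(16)

for p in (0.2, 0.5, 0.8):
    print(p, round(u_from_switch_probability(p), 2))
# Only p_switch = 0.5 approaches U = 1.0; a strong bias toward switching
# (or repeating) concentrates responding on a few sequences, such as
# LRLR and RLRL at high switch probabilities, and lowers U.
```

On this account, a drug that pushes switching probabilities toward indifference would raise entropy in the FR 4 component without the vary contingency operating.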
This proposed mechanism is relevant because amphetamine increases switching even when switching is not directly reinforced. The effects of amphetamine on variable responding in the FR 4 component may be linked to the findings of both Evenden and Robbins (1983) and Laties (1972). As Machado noted, reinforcing variable responding and reinforcing switching both go hand-in-hand with increased variability. It is therefore possible that amphetamine produced the effects that it did in the FR 4 component by increasing switching.

The use of the FR 4 component as a control differs from others used in operant variability research, in which the non-vary component is either a specific sequence that must be repeated (McElroy & Neuringer, 1990) or a component in which reinforcers in the VARY component are yoked to those in a REPEAT component (Page & Neuringer, 1985). The specific implementation of the FR 4 component was chosen because we wanted the conditions of the multiple schedule to be as similar as possible to each other with the exception of the vary contingency. Specifically, we wanted to compare behavior when variability was required with behavior when variability was merely allowed. In the latter case, an animal could, and did, produce a stereotypic sequence. Each animal produced a different preferred sequence, but the presence of one or a few highly preferred sequences was seen in all animals.

Several advantages of this control procedure can be noted. One is practical: initial training and the establishment of a baseline occur more rapidly if the animal is permitted to select its preferred sequences than if the same specific sequence is required of all animals. Another advantage is that this control procedure can detect either increased or decreased variability resulting from drug effects; if only one specific sequence were required, then a drug could only increase variability. Also, with drug administration there is the likelihood that response rate will be disrupted at higher doses, causing fewer reinforcers to be earned in one or both components. If fewer reinforcers were delivered in, for instance, the VARY component and these were then yoked to the FR 4 component, response rate would most likely decrease in the FR 4 component as well, and whether the drug or the yoking caused the decrease would then be in question. Finally, drug-induced disruption of a specific response sequence could reflect the disruption of a chain of responses (Thompson & Moerschbaecher, 1979) rather than increased variability.

The VARY 8:4 and FR 4 components differed in ways that could influence drug effects. One issue concerns the timeout in the VARY component. Because any four-response sequence in the FR 4 component produced a reinforcer, no timeouts occurred in that component; timeouts did occur in the VARY component following incorrect sequences. How this would influence drug effects is not clear, but the distinction should be noted. Another issue was addressed in Experiment 2: in Experiment 1, the VARY and FR components differed in reinforcement rate, measured as reinforcers per unit time, even though they were similar in the number of reinforcers per correct response sequence.
The decision to hold the number of reinforcers per criterion sequence constant was deliberate and was made (1) to ensure that variability in the two components would be substantially different and (2) to establish two conditions in which each operant class (here defined as a criterion sequence) was reinforced. We could not hold reinforcers per correct response unit and reinforcers per unit time constant across both components simultaneously. To address this issue, reinforcement rate was held constant between the two components by implementing a VI 60" schedule in Experiment 2. As suspected, intermittency increased variability in the FR 4 component, making it nearly indistinguishable from the VARY 8:4 component. For the comparison of drug effects, however, it is necessary to have two components that are easily distinguishable.

References

Arnsten, A., Cai, J., Steere, J., & Goldman-Rakic, P. (1995). Dopamine D2 receptor mechanisms contribute to age-related cognitive decline: The effects of quinpirole on memory and motor performance in monkeys. Journal of Neuroscience, 15, 3429-3439.

Blough, D. S. (1966). The reinforcement of least frequent interresponse times. Journal of the Experimental Analysis of Behavior, 9, 581-591.

Bryant, D., & Church, R. M. G. (1974). The determinants of random choice. Animal Learning & Behavior, 2, 245-248.

Cohen, L., Neuringer, A., & Rhodes, D. (1990). Effects of ethanol on reinforced variations and repetitions by rats under a multiple schedule. Journal of the Experimental Analysis of Behavior, 54, 1-12.

Doughty, A. H., & Lattal, K. A. (2001). Resistance to change of operant variation and repetition. Journal of the Experimental Analysis of Behavior, 76, 195-215.

Eckerman, D. A., & Lanson, R. N. (1969). Variability of response location for pigeons responding under continuous reinforcement, intermittent reinforcement, and extinction. Journal of the Experimental Analysis of Behavior, 12, 73-80.

Eilam, D., Talangbayan, H., Canaran, G., & Szechtman, H. (1992). Dopaminergic control of locomotion, mouthing, snout contact, and grooming: Opposing roles of D1 and D2 receptors. Psychopharmacology, 106, 447-454.

Einat, H., & Szechtman, H. (1995). Perseveration without hyperlocomotion in a spontaneous alternation task in rats sensitized to the dopamine agonist quinpirole. Physiology and Behavior, 57, 55-59.

Evenden, J. L., & Robbins, T. W. (1983). Increased response switching, perseveration, and perseverative switching following d-amphetamine in the rat. Psychopharmacology, 80, 67-73.

Howell, L. L., Byrd, L. D., & Marr, M. J. (1986). Similarities in the rate-altering effects of white noise and cocaine. Journal of the Experimental Analysis of Behavior, 46, 381-394.

Kurylo, D. D. (2004). Effects of quinpirole on operant conditioning: Perseveration of behavioral components. Behavioural Brain Research, 155, 117-124.

Laties, V. G. (1972). The modification of drug effects on behavior by external discriminative stimuli. Journal of Pharmacology and Experimental Therapeutics, 183, 1-13.

Laties, V. G., Wood, R. W., & Rees, D. C. (1981). Stimulus control and the effects of d-amphetamine in the rat. Psychopharmacology, 75, 277-282.

Machado, A. (1997). Increasing the variability of response sequences in pigeons by adjusting the frequency of switching between two keys. Journal of the Experimental Analysis of Behavior, 68, 1-25.

McElroy, E., & Neuringer, A. (1990). Effects of alcohol on reinforced repetitions and reinforced variation in rats. Psychopharmacology, 102, 49-55.

Mook, D. M., & Neuringer, A. (1994). Different effects of amphetamine on reinforced variations versus repetitions in spontaneously hypertensive rats (SHR). Physiology & Behavior, 56, 939-944.
Neuringer, A., Kornell, N., & Olufs, M. (2001). Stability and variability in extinction. Journal of Experimental Psychology: Animal Behavior Processes, 27, 79-94.

Nevin, J. A. (1988). Behavioral momentum and the partial reinforcement effect. Psychological Bulletin, 103, 44-56.

Nevin, J. A. (1992). An integrative model for the study of behavioral momentum. Journal of the Experimental Analysis of Behavior, 57, 301-316.

Page, S., & Neuringer, A. (1985). Variability is an operant. Journal of Experimental Psychology: Animal Behavior Processes, 11, 429-452.

Pryor, K. W., Haag, R., & O'Reilly, J. (1969). The creative porpoise: Training for novel behavior. Journal of the Experimental Analysis of Behavior, 12, 653-661.

Rees, D. C., Wood, R. W., & Laties, V. G. (1985). The roles of stimulus control and reinforcement frequency in modulating the behavioral effects of d-amphetamine in the rat. Journal of the Experimental Analysis of Behavior, 43, 243-255.

Rees, D. C., Wood, R. W., & Laties, V. G. (1987). Stimulus control and the development of behavioral tolerance to daily injections of d-amphetamine in the rat. Journal of Pharmacology and Experimental Therapeutics, 240, 65-73.

Schoenfeld, W. N. (1968). On the difference in resistance to extinction following regular and periodic reinforcement. Journal of the Experimental Analysis of Behavior, 11, 259-261.

Schoenfeld, W. N., Harris, A. H., & Farmer, J. (1966). Conditioning response variability. Psychological Reports, 19, 551-557.

Schultz, W. (1998). Predictive reward signal of dopamine neurons. Journal of Neurophysiology, 80, 1-27.

Schwartz, B. (1980). Development of complex stereotyped behavior in pigeons. Journal of the Experimental Analysis of Behavior, 33, 153-166.

Schwartz, B. (1982). Failure to produce variability with reinforcement. Journal of the Experimental Analysis of Behavior, 37, 171-181.

Stokes, P. D. (1995). Learned variability. Animal Learning and Behavior, 23, 164-176.

Thompson, D. M., & Moerschbaecher, J. M. (1979). An experimental analysis of the effects of d-amphetamine and cocaine on the acquisition and performance of response chains in monkeys. Journal of the Experimental Analysis of Behavior, 32, 433-444.

Wachtel, S. R., Brooderson, R. J., & White, F. J. (1992). Parametric and pharmacological analyses of enhanced grooming response elicited by the D1 receptor agonist SKF38393 in the rat. Psychopharmacology, 109, 41-48.

Figures

Figure 1. Histograms for RAT 624 displaying entropy and frequency of responding for each of the 16 possible sequences in the VARY 8:4 (top panel; U = .89) and the FR 4 (bottom panel; U = .10) components on a non-injected control day. The number of changeover responses required for each sequence is represented by the dotted lines (i.e., from zero up to three changeovers required to complete the sequence).

Figure 2.
Total responses (left panel) and entropy (right panel) for different doses of d amphetamine for the VARY 8:4 (filled triangles) and the FR 4 (unfilled circles) components. Error bars = 1 S.E.M.

Figure 3. Dose-response functions for total responses (left panel) and entropy (right panel) under SKF 38393 for the VARY 8:4 (filled triangles) and the FR 4 (unfilled circles) components. Error bars = 1 S.E.M.

Figure 4. Dose-response functions for total responses (top panel) and entropy (lower panel) under quinpirole for the VARY 8:4 and the FR 4 components. Triangles represent the VARY 8:4 component, with filled symbols for the low dose group and unfilled symbols for the high dose group. Circles represent the FR 4 component, with filled symbols for the low dose group and unfilled symbols for the high dose group. Error bars = 1 S.E.M.

Figure 5. Histograms for RAT 616 displaying entropy and frequency of responding for each of the 16 possible sequences in the VARY 8:4 (top panel) and FR 4 (bottom panel) components when administered 0.17 mg/kg of quinpirole (panel U values: .95 and .94). The number of changeover responses required for each sequence is represented by the dotted lines (i.e., from zero up to three changeovers required to complete the sequence).

Figure 6. Total responses (left panel) and entropy (right panel) for different doses of d amphetamine for the VARY 8:4 (filled triangles) and the FR 4 (unfilled circles) components under a VI 60" schedule. Error bars = 1 S.E.M.