DRUG EFFECTS ON BEHAVIOR IN TRANSITION: DOES CONTEXT MATTER?

Except where reference is made to the work of others, the work described in this dissertation is my own or was done in collaboration with my advisory committee. This dissertation does not include proprietary or classified information.

_____________________________________
Kelly M. Banna

Certificate of Approval:

James M. Johnston, Professor, Psychology
M. Christopher Newland (Chair), Alumni Professor, Psychology
Christopher J. Correia, Assistant Professor, Psychology
Asheber Abebe Gebrekidan, Assistant Professor, Mathematics and Statistics
Joe F. Pittman, Interim Dean, Graduate School

DRUG EFFECTS ON BEHAVIOR IN TRANSITION: DOES CONTEXT MATTER?

Kelly M. Banna

A Dissertation Submitted to the Graduate Faculty of Auburn University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

Auburn, Alabama
December 17, 2007

DRUG EFFECTS ON BEHAVIOR IN TRANSITION: DOES CONTEXT MATTER?

Kelly M. Banna

Permission is granted to Auburn University to make copies of this dissertation at its discretion, upon request of individuals or institutions and at their expense. The author reserves all publication rights.

_____________________________
Signature of Author

______________________________
Date of Graduation

VITA

Kelly Banna was born to Nazih and Patricia Banna in Youngstown, Ohio in 1977. She graduated from Boardman High School in 1995 and attended college at James Madison University in Harrisonburg, VA. There, she earned a B.S. in psychology with a minor in statistics and completed a senior honors thesis entitled, "An Evaluation of the Discriminative Stimulus Properties of Gamma-Hydroxybutyric Acid in Rats," with Dr. Sherry Serdikoff. Kelly graduated magna cum laude in 1999 with distinction. In 2005, she completed an M.S. in experimental psychology at Auburn University in Auburn, Alabama under the instruction of Dr. M. Christopher Newland. Her master's thesis was entitled, "Concurrent Schedule Behavior in the Bluegill Sunfish: An Evaluation of Choice and Learning in a Variable Environment."

DISSERTATION ABSTRACT

DRUG EFFECTS ON BEHAVIOR IN TRANSITION: DOES CONTEXT MATTER?

Kelly Marie Banna

Doctor of Philosophy, December 17, 2007
(M.S., Auburn University, 2005)
(B.S., James Madison University, 1999)

99 Typed Pages

Directed by M. Christopher Newland

The generalized matching law has been used extensively to characterize choice under concurrent schedules of reinforcement. This model uses data obtained during periods of stable responding to describe response allocation among two or more response alternatives. Much less research has been conducted on the acquisition of choice. Here, we describe a procedure to study choice and its acquisition over the course of a single, 2-hour experimental session and suggest a mathematical model for quantifying the rate and magnitude of behavior change. Further, we examined the role of external stimuli and d-amphetamine on transitional and steady-state behavior.

Rats were trained to respond under concurrent schedules of reinforcement during 2-hour experimental sessions. Each session began with a scheduled reinforcement ratio of 1:1 (baseline). Thirty minutes into the session, the reinforcement ratio either remained 1:1 or changed to one of the following (transition): 32:1, 16:1, 8:1, 4:1, 1:4, 1:8, 1:16, or 1:32.
A 10-second time out occurred between baseline and transition periods, during which responses had no programmed consequences. Subjects were divided into three experimental groups based on the degree to which this transition was signaled. For one-third of the animals, the time out was not accompanied by any changes in external stimuli. For one-third of the animals, stimulus and house lights were extinguished at the beginning of the time out and were re-illuminated at the end of the time out. For the remaining third, stimulus and house lights were also extinguished at the beginning of the time out, and all stimulus lights were re-illuminated at the end of the time out with the exception that the stimulus light above the newly-lean lever remained extinguished for the duration of the transition. In addition, subjects in all groups were tested in the 8:1 and 32:1 conditions following administration of 0.3, 0.56, and 1.0 mg/kg d-amphetamine.

Results showed that the generalized matching equation and the logistic equation described by Newland and Reile (1999) were good descriptors of choice and acquisition, respectively. d-Amphetamine increased both sensitivity to changes in the reinforcement ratio and the rate at which preference was acquired. Effects of stimulus group were only observed on the rate of changing over between response alternatives. These data demonstrate the usefulness of single-session transitions in examining the effects of drugs on choice and its acquisition, and support the hypothesis that the dopamine agonist d-amphetamine increases behavioral sensitivity to reinforcement.

ACKNOWLEDGEMENTS

Have you ever really had a teacher? One who saw you as a raw but precious thing, a jewel that, with wisdom, could be polished to a proud shine?
- Mitch Albom, Tuesdays with Morrie

The author would like to thank Dr. M. Christopher Newland for his guidance and patience in completing this dissertation, and for his support over the past eight years. He is the type of mentor that all students wish for, but very few are blessed enough to find. She would also like to thank Drs. Asheber Abebe, Christopher Correia, Martha Escobar, James Johnston, and Vishnu Suppiramaniam for serving on her dissertation committee. A special thank you to the graduate and undergraduate students who helped conduct the research described herein: Erin Pesek, Josh Johnston, Kelly Wright, Joshlyn Bush, Allison French, Audrey Massilla, Charles Smith, and Ashley Sowell. Finally, the author would like to thank T.J. Clancy for his love and support while completing this project.

Publication Manual of the American Psychological Association, 5th Ed.
Microsoft Word (2002)
Microsoft Excel (2002)
RS/1
SigmaPlot 8.0
Systat 11.0

TABLE OF CONTENTS

LIST OF TABLES...............................................................................................................x
LIST OF FIGURES............................................................................................................xi
CHAPTER 1. INTRODUCTION........................................................................................1
CHAPTER 2. THE USE OF SINGLE-SESSION TRANSITIONS TO EVALUATE CHOICE AND ITS ACQUISITION.....................................................................24
CHAPTER 3. THE EFFECTS OF DISCRIMINATIVE STIMULI AND d-AMPHETAMINE ON THE ACQUISITION OF CHOICE.............................................55
REFERENCES...................................................................................................................83

LIST OF TABLES

Table 2.1 Accumulation of reinforcers on a visit-by-visit basis 44
Table 2.2 Number of sessions included in statistical analyses 45
Table 2.3 Parameter estimates for matching functions 46
Table 3.1 F and p-values for dose effects on transition parameters 74

LIST OF FIGURES

Figure 2.1 Representative 32:1 transition 49
Figure 2.2 Response and changeover rates as a function of session segment and transition magnitude 50
Figure 2.3 Matching functions for individual animals 51
Figure 2.4 H_max, slope, and Y_max as a function of transition magnitude 52
Figure 2.5 Frequency distribution of responses per visit during a 32:1 transition for a single subject 53
Figure 2.6 Average response rate as a function of transition magnitude and session segment 54
Figure 3.1 Overall response rates as a function of amphetamine (mg/kg) and stimulus group 77
Figure 3.2 Changeover rates as a function of amphetamine (mg/kg) and stimulus group 78
Figure 3.3 Steady state parameter estimates as a function of amphetamine (mg/kg) and stimulus group 79
Figure 3.4 Representative matching functions for control, 0.3 mg/kg, and 1.0 mg/kg conditions from subject 103 80
Figure 3.5 Transition parameter estimates as a function of amphetamine (mg/kg) and stimulus group 81
Figure 3.6 Representative transitions during control (upper panel) and 1.0 mg/kg AMP sessions 82

CHAPTER 1: INTRODUCTION

Concurrent schedules of reinforcement are one of the most commonly used preparations for studying choice in animal models. Behavior under such schedules is well-characterized and has been successfully modeled using an equation known as the generalized matching relation, in which the allocation of behavior between two response alternatives is charted as a function of the ratio of reinforcers available from the two alternatives (Baum, 1974; Davison and McCarthy, 1988). More behavior tends to occur on the richer alternative and, remarkably, this relationship can be modeled very precisely using a power-law formulation.

Methodologically, these studies are usually conducted by analyzing behavior in steady state. This can require several weeks for each reinforcer ratio (i.e., the ratio of reinforcers programmed on one lever to those programmed on the other) because each ratio is maintained until behavioral stability is established. Such a design requires several months of data collection before a complete matching analysis can be completed, a time requirement that makes it difficult to study the influence of variables other than reinforcement rate on choice. For example, if two weeks are required for behavior to stabilize under a pair of concurrent schedules (a conservative estimate), it would take ten weeks to collect enough data for a complete matching function containing five reinforcer ratios. If one were interested in evaluating the effects of a particular drug, for example, it would take approximately 60 weeks to complete a full dose-effect curve comprising control, saline, and four drug doses. This comprises approximately half of a rodent's lifespan, a time commitment that makes such a study prohibitive. Therefore, the effects of drugs on concurrent schedule behavior have been largely ignored.
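For concreteness, the arithmetic behind these time estimates, using the conservative two-week stabilization figure above, is simply:

5 \text{ reinforcer ratios} \times 2 \text{ weeks per ratio} = 10 \text{ weeks per matching function}

(1 \text{ control} + 1 \text{ saline} + 4 \text{ doses}) \times 10 \text{ weeks per function} = 60 \text{ weeks}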
We (Newland and Reile, 1999; Newland, Reile, and Langston, 2004) have been examining approaches to determine matching functions within a single session. This novel procedure not only allows us to construct an entire steady-state matching function in a fraction of the time required using traditional methods, but it also allows for quantitative modeling of behavioral transitions within each experimental session. A complete matching function can be determined over the course of only a few weeks and, moreover, the acquisition of choice can be modeled in a single session. This approach also permits us to examine the effects of drugs on the acquisition of choice, which is important because the careful selection of drugs enables the study of pharmacological or neurochemical correlates of choice in behaving animals. For example, if transitional or choice behavior is altered by administration of a dopaminergic drug, it would suggest that dopaminergic systems are involved in these processes. Similarly, if dopaminergics do not affect these behaviors, it would provide evidence that choice and learning are neurologically mediated through other neurotransmitter systems.

A second area that has received little attention in concurrent schedule research is the role of exteroceptive stimuli on choice. While there is a large body of literature on stimulus control in general, literature searches have revealed only five studies on the role of stimulus factors in concurrent schedule behavior. This is important because the literature suggests that exteroceptive stimuli can make behavior resistant to disruption by some classes of drugs. The goal of the present study is to begin to fill this void in the literature devoted to concurrent schedule research. I propose to use the single-session transition procedure described by Newland and Reile (1999) to evaluate the effects of exteroceptive stimuli, and their putative interactions with drugs, on choice and learning. A detailed rationale and description of the methods are provided below.

Stimulus-Drug Interactions on Schedule-Controlled Behavior

Previous research has demonstrated that drug effects on schedule-controlled behavior can be mediated by the presence of external stimuli that are correlated with experimentally-relevant events. These effects appear to be a function of both the operant task and the drug used. For example, Laties and Weiss (1966) used a fixed interval (FI) schedule of reinforcement to evaluate the effects of stimuli that signal the passage of time on responding and how the presence of such stimuli mediated the behavioral effects of drugs. Under an FI schedule of reinforcement, a reinforcer (e.g., a food pellet) is delivered following the first response to occur after a fixed amount of time has passed. For example, under an FI 30 s schedule, a food pellet is delivered following the first response to occur after 30 s has elapsed. Responding under an FI schedule of reinforcement is characterized by low response rates early in the interval, followed by high response rates toward the end of the interval. When consecutive responses are plotted against time, this produces a characteristic "scallop" pattern known as a fixed-interval scallop.

In the Laties and Weiss (1966) study, pigeons were trained on a three-component schedule which consisted of an FI 5 minute (no-clock) component, a clocked FI 5 minute component, and a 3-minute time out that separated successive components.
The FI component was a standard FI schedule in which the first response to occur after 5 minutes was reinforced. The contingencies were identical in the clocked FI component, but various external stimuli were presented to signal the passage of time. Specifically, each of the 5 minutes comprising the interval was signaled by the presence of a unique symbol projected onto the response key. This served as a marker for the passage of time, or as a type of "clock."

Laties and Weiss (1966) evaluated the effects of stimuli and drugs on the pattern of FI responding by calculating the index of curvature (IOC) for each condition. The IOC is calculated by plotting cumulative responses as a function of time into the interval. The area (A_1) under the resulting curve is then subtracted from the area that would be predicted by a constant rate of responding (A_2), and the difference is divided by A_2 (see Fry, Kelleher, & Cook, 1960, for a more detailed description). Larger absolute values of the IOC indicate a greater degree of curvature in the response distribution, demonstrating a greater difference in response rates at the beginning and end of each interval. In this particular study, the IOC could range from -0.8 to 0.8, with an IOC of 0 indicating a steady rate of responding throughout the interval. In the no-clock condition, the values of the IOC were 0.34, 0.32, and 0.43 for each of three subjects. These values were much higher in the clock condition, which yielded corresponding values of 0.76, 0.66, and 0.8. These data indicate that the difference between terminal and initial response rates within the 5-minute intervals was greater in the clock than the no-clock condition, demonstrating a high degree of stimulus control over responding in the clock condition.

In addition to their effects on the IOC, external stimuli also attenuated drug-induced disruption of behavior by amphetamine, scopolamine, and pentobarbital. In the no-clock condition, the IOC was greatly reduced following administration of these drugs, indicating a steadier rate of responding throughout the interval when compared with non-drug conditions. However, IOCs were not altered by these drugs in the clock condition. Chlorpromazine and promazine, however, did disrupt responding in both no-clock and clock conditions, with greater impairment seen in the no-clock condition. The results of this study demonstrate that FI responding can come under the control of external stimuli and that these stimuli mediate the disruptive effects of some, but not all, drugs.

While the Laties and Weiss (1966) study demonstrated that drug effects may be attenuated by the presence of external stimuli, the results may have been spurious. It has been demonstrated (Dews, 1958) that the behavioral effects of drugs often depend on the rate at which the behavior occurs in non-drug conditions. That is, the effects of drugs are often rate dependent. In Laties and Weiss's study, response rates differed between the clock and no-clock conditions. Therefore, the differential effects of drugs on signaled and unsignaled performance in their experiment may have been a function of the different rates of responding occurring in each condition. To determine whether the results of the previous experiment were a function of stimulus control or rate-dependency, several studies were conducted using a fixed consecutive number schedule (FCN), a schedule in which response rates are not sensitive to the presence or absence of external stimuli.
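To make the index-of-curvature calculation concrete, the sketch below computes the index from a hypothetical cumulative record, following the verbal description above (the two areas are approximated with the trapezoidal rule). Fry, Kelleher, and Cook's (1960) tabular formula differs in detail, which is why the ±0.8 bounds reported for their five-ordinate version do not apply exactly to this continuous approximation; all data and names here are illustrative.

    import numpy as np

    def index_of_curvature(times, cum_responses):
        """(A2 - A1) / A2, where A1 is the area under the cumulative response
        record and A2 is the area under the straight line that a constant
        response rate would produce over the same interval."""
        t = np.asarray(times, dtype=float)
        observed = np.asarray(cum_responses, dtype=float)

        def area(y):
            # Trapezoidal-rule area under y as a function of t.
            return float(np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(t)))

        constant_rate = observed[-1] * (t - t[0]) / (t[-1] - t[0])
        a1, a2 = area(observed), area(constant_rate)
        return (a2 - a1) / a2

    # Hypothetical 5-minute fixed interval sampled once per minute: responding
    # is concentrated late in the interval, so the index is positive ("scalloped").
    minutes = [0, 1, 2, 3, 4, 5]
    cumulative_responses = [0, 1, 3, 8, 20, 40]
    print(round(index_of_curvature(minutes, cumulative_responses), 2))  # 0.48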
Under an FCN schedule of reinforcement, a reinforcer is delivered following a series of responses on two keys. Subjects are required to make a minimum number of responses on one key (the work key), followed by a single response on a second key (the reinforcement key). For example, under an FCN 10 schedule, a subject must make at least 10 responses on the work key before making a single response on the reinforcement key, at which point a reinforcer is delivered.

Laties (1972) used an FCN 8 schedule of reinforcement to evaluate the effects of six drugs on signaled and unsignaled responding in pigeons. In the signaled condition (FCN-S^D), the work and reinforcement keys were transilluminated with white and green lights, respectively. When eight responses were made on the work key, the color changed from white to red and a response on the green reinforcement key resulted in 2.7 s access to mixed grain. Unsignaled conditions (FCN) were similar except that the work key did not change to red when the response criterion was met.

Data from non-drug conditions showed that response patterns differed as a function of stimulus condition. The distribution of runs (i.e., consecutive responses) as a function of run length on the work key peaked sharply at a run length of 8 for the FCN-S^D condition, but was much flatter in the FCN condition. Also, the conditional probability of making a response on the reinforcement key given X responses on the work key (i.e., P[response on reinforcement key | X responses on the work key]) increased dramatically when X = 8 in the FCN-S^D condition but not in the FCN condition. These results suggest that subjects were much more likely to make exactly 8 responses on the work key before switching to the reinforcement key when completion of the work requirement was signaled than when it was unsignaled. Further, a greater percentage of runs ended with reinforcement in the FCN-S^D condition when compared with the FCN condition. Finally, there were no differences in response rates between the signaled and unsignaled conditions, which allows for clear interpretation of any observed drug effects.

Drug challenges were conducted in both signaled and unsignaled conditions using the following drugs: chlorpromazine, promazine, d-amphetamine, scopolamine hydrobromide, methyl scopolamine bromide, and haloperidol. In the FCN condition, premature switching (i.e., switching from the work to the reinforcement key prior to the completion of 8 work responses) increased dramatically following administration of d-amphetamine, chlorpromazine, promazine, and scopolamine. However, the effects of d-amphetamine and scopolamine were greatly attenuated in the FCN-S^D condition. Significant levels of premature switching were still observed in the FCN-S^D condition following chlorpromazine and promazine challenges. While haloperidol increased the variability of run lengths, these effects did not differ between conditions. These results support those obtained by Laties and Weiss (1966), demonstrating that the addition of an external discriminative stimulus mitigates the effects of some drugs on FCN schedules. These effects appear to be a function of drug class, at least when applied to FI and FCN schedules.
The effects of the psychomotor stimulant d-amphetamine and the acetylcholine antagonist scopolamine are mediated by the degree of external stimulus control, whereas the effects of the neuroleptics chlorpromazine, promazine, and haloperidol, which are generally characterized by their antagonistic effects at dopamine (DA) synapses, are not. To determine whether the failure of external stimuli to attenuate the effects of the DA antagonists was due to procedural nuances or the specific drugs selected, Szostak and Tombaugh (1981) evaluated the effects of pimozide on FCN responding in pigeons. While chlorpromazine, promazine, and haloperidol act at both DA and noradrenergic (NA) synapses, pimozide is a selective DA receptor blocker. The specificity of pimozide allows for a clearer characterization of the interactions between external stimuli and DA-acting drugs on responding under an FCN. The procedure was similar to that employed by Laties (1972). However, Laties employed a within-subject design, while Szostak and Tombaugh used a between-subjects design.

The differences in response patterns observed by Szostak and Tombaugh (1981) under non-drug conditions were similar to those obtained by Laties (1972). In the FCN-S^D condition, the distribution of run lengths on the work key peaked sharply at 8 responses, while distributions in the FCN condition were much flatter, and the number of reinforced trials was higher among subjects in the S^D condition. Additionally, the conditional probability of responding on the reinforcement key given a specific number of responses on the work key peaked at 8 work-key responses in the S^D condition but at much higher responses (approximately 11-16) in the no-S^D condition. These results demonstrate a high degree of stimulus control in the FCN-S^D, but not the FCN, condition.

Pimozide (0.1, 0.3, or 0.5 mg/kg) challenges yielded decreases in overall response rates in both stimulus conditions. However, drug effects on other measures were condition-specific. For example, the distribution of run lengths was disturbed to a greater degree in the FCN condition than in the FCN-S^D condition. Specifically, the number of runs consisting of less than 8 responses increased and the number of runs greater than 8 responses decreased in the FCN condition. These effects were dose-dependent. In the FCN-S^D condition, however, changes in the distribution of run lengths were manifested as increases in the number of runs greater than 8. Also, the conditional probability of switching to the reinforcement key after a particular number of responses on the work key was altered in both conditions. In the FCN condition, the probability of switching prior to 8 responses increased in a dose-dependent fashion. Changes in the S^D condition were manifested as an increase in the probability of switching after 9, 10, or 11 responses such that these probabilities were equal to that of switching after 8 responses. These changes yielded a decrease in the number of reinforced trials among subjects in the FCN, but not the FCN-S^D, condition. When compared to the results obtained by Laties and Weiss (1966) and Laties (1972), this study demonstrates that the mitigating effects of external stimuli on drug effects may depend heavily on the neurotransmitter specificity of the drugs employed.

The goal of the present experiment is to examine the interaction of external stimuli and drug effects on learning using a relatively new procedure for studying behavior in transition.
Specifically, it is designed to determine whether (a) concurrent schedule performance during steady state and transitions differs according to the amount of external stimuli presented and (b) drug effects on these behaviors are mediated by the presence of external stimuli. This will be accomplished using single-session transitions between two pairs of concurrent reinforcement schedules, which are described in detail in the next section.

Concurrent Schedules of Reinforcement as a Model of Choice and Learning

Concurrent schedule preparations are used in laboratory settings to study behavior allocation between two (or more) activities that can be reinforced at different rates. Two common arrangements are two-key and Findley procedures. Under the two-key procedure, two manipulanda are available to the subject (usually keys for pigeons and levers for rodents), and responding on each manipulandum is reinforced with some probability, usually under a variable interval (VI) schedule of reinforcement. In such a schedule, the first response to occur following a predetermined average period of time produces reinforcer (e.g., a food pellet) delivery. Each schedule in the two-key procedure is arranged on a different manipulandum, and can be adjusted independently of the other schedule. For example, imagine a situation in which a rat is responding on a two-lever concurrent schedule. Responding on the left lever is reinforced on a VI 60 s schedule (a reinforcer is delivered following the first response to occur after 60 s, on average, has elapsed). Responding on the right lever is reinforced on a VI 90 s schedule. This would be referred to as a conc VI 60 s VI 90 s schedule of reinforcement.

In the Findley (Findley, 1958) procedure, a subject is also given access to two response keys, in this case, a reinforcement (or main) key and a changeover (CO) key. Both reinforcement schedules are programmed on the main key, one of which is active at any given time. A response on the CO key changes the active schedule on the main key. This allows the subject to select which schedule of reinforcement is active on the main key. In traditional Findley procedures, each schedule is paired with a different discriminative stimulus (e.g., a key light) such that CO responses change both the stimulus and the reinforcement schedule on the main key.

Concurrent schedules can also be characterized by whether reinforcers on the two schedules are arranged dependently or independently. When a reinforcer becomes available in a dependent concurrent schedule (i.e., a reinforcer is "arranged" and a response will trigger its delivery), the timer for the other schedule stops until that reinforcer is delivered. That is, when a reinforcer is arranged on schedule A, no reinforcers will be arranged on schedule B until the reinforcer on schedule A is delivered (Stubbs and Pliskoff, 1969). No such requirement exists in an independent schedule. In this case, the timers for both schedules run continuously, and the arrangement and delivery of reinforcers on one schedule do not depend on the delivery of reinforcers on the other.

An additional feature of concurrent schedule arrangements is the changeover delay (COD). A changeover is defined by a subject switching from responding on one schedule of reinforcement to responding on the other. For example, if a rat is responding on the right lever in the two-key paradigm, the first response on the left lever designates a changeover (and vice versa).
The COD is a period of time following a changeover response during which no reinforcers are delivered, regardless of whether the animal responds. This serves to better separate the contingencies arranged on each schedule and help prevent unintentional reinforcement of alternating between the two schedules.

The Matching Relation

In 1961, Herrnstein proposed what has become known as strict matching, an equation designed to model behavioral allocation under a concurrent schedule of reinforcement. It is modeled using the following equation:

\frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2} \qquad [1.1]

where B_1 and B_2 represent the number of responses occurring on each lever and R_1 and R_2 are the number of reinforcers earned from responding on alternatives one and two, respectively. This is also known as proportional matching because the proportion of responses occurring on one lever is predicted to equal the proportion of reinforcers occurring on that lever. The quantities in Equation 1.1 can also be expressed as ratios, yielding

\frac{B_1}{B_2} = \frac{R_1}{R_2}. \qquad [1.2]

Between the years of 1968 and 1974, strict matching was replaced by a more generalized form of Equation 1.2 that appeared to be a better quantitative model of the behavior observed under concurrent reinforcement schedules. Lander and Irwin (1968) suggested that behavior is actually a power function of the reinforcement ratio. Equation 1.2 thus evolved into

\frac{B_1}{B_2} = \left(\frac{R_1}{R_2}\right)^a. \qquad [1.3]

Here, a is a fitted parameter that describes the rate of change in behavior as the reinforcement ratio is varied (this will be discussed further below). An additional parameter was added to this equation in 1974 by Baum:

\frac{B_1}{B_2} = c\left(\frac{R_1}{R_2}\right)^a. \qquad [1.4]

Equation 1.4 is referred to as the generalized matching relation and is often expressed in its logarithmic form

\log\frac{B_1}{B_2} = \log c + a\log\frac{R_1}{R_2}. \qquad [1.5]

This transformation produces a linear function with intercept log c and slope a. With respect to matching, a is a measure of sensitivity to the reinforcement ratio. When a = 1, behavior is perfectly sensitive to the reinforcement ratio. That is, every one-unit change in the log reinforcement ratio yields a one-unit change in the log behavior ratio. Values of a less than one (a < 1) demonstrate less than unit sensitivity and are indicative of what is often referred to as "undermatching." This is a situation in which less behavior occurs on the rich lever than would be predicted by strict matching. Values of a greater than one (a > 1) demonstrate greater than unit sensitivity and indicate what is called "overmatching," in which more behavior is allocated to the rich lever than is predicted by strict matching.

Log c is a measure of response bias. Response bias is observed when more behavior is allocated to one response device than would be predicted by strict matching under all reinforcement ratio conditions. When log c = 0 (c = 1), no response bias is present. That is, under conditions in which the reinforcement ratios associated with each alternative are equal, half of all responses are allocated to each response device. When log c is greater than 0 (c > 1), more behavior is allocated to the alternative represented in the numerator of the response ratio than would be predicted. When log c is less than 0 (c < 1), the opposite is true. Theoretically, bias is interpreted as indicating a preference for responding on a particular alternative that cannot be accounted for by the reinforcement ratio. Bias can be a result of many variables, such as one device requiring less force to operate than the other.
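As an illustration of how the generalized matching relation is typically fit, the following sketch estimates the sensitivity (a) and bias (log c) of Equation 1.5 by ordinary least-squares regression of log behavior ratios on log reinforcer ratios. The counts are hypothetical and are not drawn from any study discussed here.

    import numpy as np

    # Hypothetical steady-state counts from five reinforcer-ratio conditions.
    B1 = np.array([820.0, 610.0, 450.0, 300.0, 190.0])  # responses on alternative 1
    B2 = np.array([180.0, 300.0, 440.0, 620.0, 800.0])  # responses on alternative 2
    R1 = np.array([160.0, 120.0, 80.0, 40.0, 20.0])     # reinforcers from alternative 1
    R2 = np.array([20.0, 40.0, 80.0, 120.0, 160.0])     # reinforcers from alternative 2

    x = np.log10(R1 / R2)  # log reinforcer ratios
    y = np.log10(B1 / B2)  # log behavior ratios

    # Equation 1.5 is linear in log-log coordinates, so ordinary least squares
    # returns the sensitivity (slope, a) and bias (intercept, log c) directly.
    a, log_c = np.polyfit(x, y, deg=1)
    print(f"sensitivity a = {a:.2f}, bias log c = {log_c:.2f}")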
Behavior in Transition

The matching relation described above is used to describe choice during steady state responding. Steady state responding is defined as "a pattern of responding that exhibits relatively little variation in its measured dimensional quantities over a period of time" (Johnston & Pennypacker, 1993, p. 199). Under concurrent schedules of reinforcement, the dependent variable is typically the behavior ratio, B_1/B_2. Behavior is considered to have stabilized when the behavior ratio varies little over time and there is a lack of trends in the data. This is often assessed graphically, and criteria for stability are defined by the researcher. Behavior tends to approach a steady state following an extended period of responding under a particular pair of schedules.

Steady state behavior is contrasted with transitional behavior, which is behavior that occurs during the shift from one steady state to another (Johnston & Pennypacker, 1993) and is characterized by significant changes (e.g., increases or decreases) in the behavior ratio. The beginning of the transition period is defined as "the point at which programmed reinforcement contingencies change" (Newland & Reile, 1999, p. 323), while the end of the transition is marked by a return to steady state responding. Transitional responding is usually identified graphically, and is often a result of alterations in the contingencies maintaining behavior (Sidman, 1960).

Modeling transitional behavior. Like steady state behavior, transitional behavior can also be modeled quantitatively. The following logistic equation was proposed by Newland and colleagues (Newland, Yezhou, Lögdberg, & Berlin, 1994):

Y = P_0 + \frac{P_\infty}{1 + e^{-k(X - R_{half})}}. \qquad [1.6]

In this equation, Y is the proportion of responses occurring on either the left or right lever and X is the independent variable. Four parameters are fitted using non-linear least-squares regression. The lower asymptote is given by P_0, the upper asymptote is given by P_∞, k is the slope of the straight portion of the S-shaped curve, and R_half is the number of reinforcers that have been delivered when the transition is half way complete. The lower and upper asymptotes reflect the proportion of behavior occurring on one of the levers at the beginning and the end of the transition, respectively. The parameters k and R_half are rate parameters that describe the rapidity with which the transition occurs. Transitions that occur at a slow pace are reflected by small k and/or high R_half values, while the opposite describes transitions that occur at a faster rate.

Two types of independent variables have been used in this equation. The first is session number. Using session as the independent variable provides a fairly molar description of behavior, as the values of Y represent the proportion of behavior occurring on one lever over the course of an entire session. More recently, cumulative number of reinforcers has replaced session as the independent variable. This allows for a more molecular evaluation of behavior because Y values are the proportion of responses occurring on a particular lever following the delivery of each reinforcer. This allows for a reinforcer-by-reinforcer assessment of behavior allocation throughout the course of each session and facilitates comparisons across studies through the use of a common metric. Equation 1.6 has been modified slightly since its first appearance in the literature.
In 1999, Newland and Reile described the following equation in place of Equation 1.6:

\log\frac{B_1}{B_2} = \frac{Y_m}{1 + e^{-k(X - X_{half})}}. \qquad [1.7]

Two changes distinguish this model from Equation 1.6. First, the dependent variable is the log of the response ratio rather than the proportion of responses occurring on a particular lever. Second, the lower asymptote P_0 has been removed from the equation. This is compensated for by normalizing the data in such a way that P_0 = 0. In order to do this, the median (or mean) response ratio from the last 10 minutes of the baseline portion of the session is subtracted from each response ratio during the transition. Note that in Equation 1.7 the upper asymptote is denoted by Y_m rather than P_∞.
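A minimal sketch of the analysis just described, assuming simulated data: visit-level log response ratios are normalized by the baseline median, and the three free parameters of Equation 1.7 are then estimated by non-linear least squares. SciPy's curve_fit is used here simply as one available routine, not the software used in the original studies.

    import numpy as np
    from scipy.optimize import curve_fit

    def logistic_transition(x, y_m, k, x_half):
        """Equation 1.7: normalized log response ratio as a function of the
        cumulative number of reinforcers delivered since the transition began."""
        return y_m / (1.0 + np.exp(-k * (x - x_half)))

    rng = np.random.default_rng(0)

    # Simulated session: baseline visits (reinforcer ratio 1:1) followed by a
    # transition whose log response ratio climbs toward an asymptote near 1.2.
    baseline_log_ratio = rng.normal(0.05, 0.1, size=20)
    reinforcers = np.arange(1, 121)  # cumulative reinforcers during the transition
    true_curve = logistic_transition(reinforcers, 1.2, 0.15, 40.0)
    transition_log_ratio = true_curve + rng.normal(0.0, 0.1, size=reinforcers.size)

    # Normalize so the lower asymptote is zero, as described above.
    normalized = transition_log_ratio - np.median(baseline_log_ratio)

    # Non-linear least-squares estimates of the three free parameters.
    (y_m, k, x_half), _ = curve_fit(
        logistic_transition, reinforcers, normalized, p0=[1.0, 0.1, 30.0]
    )
    print(f"Y_m = {y_m:.2f}, k = {k:.3f}, X_half = {x_half:.1f}")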
Application of the logistic model to study chemical effects on behavior. In their 1999 paper, Newland and Reile suggested that Equation 1.7 could be used to model the effects of contaminant exposure on transitional behavior observed in concurrent schedule procedures. To date, two such studies have been published in the literature. In the first, squirrel monkeys were exposed to either methylmercury (MeHg) or lead (Pb) during gestation and then subjected to concurrent schedule tests between the ages of five and six years (Newland et al., 1994). Steady state responding was analyzed using least-squares linear regression of log-transformed response ratios (Equation 1.5), and behavior in transition was analyzed using least-squares non-linear regression of response proportions by applying Equation 1.6. The parameter estimates from each transition were used as dependent variables that were compared using t-tests or one-way analyses of variance.

The results of this study showed that transitional behavior was sensitive to contaminant exposure. For subjects in the control group (i.e., not exposed to MeHg or lead), behavior tracked the schedule changes and transitions were completed within three to six sessions. In contrast, response ratios of monkeys exposed to MeHg in utero demonstrated almost no changes during the first session, and the behavior of subjects exposed to high levels of lead (maternal blood lead levels greater than 40 µg/dl) changed, but shifted such that more behavior was allocated to the lean rather than the rich lever. In addition, the transitions of exposed subjects progressed more slowly, ended before response ratios reached programmed reinforcement ratios, and were more inconsistent and variable than those of control subjects (p. 9). In contrast, the transitions for subjects exposed to low doses of lead (maternal blood lead levels less than 40 µg/dl) were often similar to controls in magnitude. However, the number of reinforcers required to complete transitions was two to four times higher in exposed monkeys.

In addition to transitions, steady state behavior was also sensitive to contaminant exposure during gestation. Both intercept and slope parameters demonstrated a main effect of exposure group. For the control group, little or no bias was apparent (intercept close to zero) and behavior was characterized by overmatching (slope of matching functions greater than or equal to one). However, matching functions for exposed subjects demonstrated strong biases and undermatching (shallow slopes).

The study by Newland and colleagues (1994) was the first to use this logistic model to describe concurrent schedule behavior in transition. It provided a novel method for examining transitional behavior at a molecular level, that is, on a reinforcer-by-reinforcer basis. This enabled researchers to examine how consequences influence choice in a changing environment on a molecular level.

A similar experiment was recently published using rats as subjects (Newland, Reile, & Langston, 2004). In this study, female rats were exposed to 0, 0.5, or 6.4 ppm MeHg per day in drinking water during pregnancy. The behavior of offspring was investigated using single-session transitions at 1.7 and 2.3 years of age. Experimental sessions consisted of a 30 minute baseline phase (reinforcement ratio 1:1) followed by a 150 minute transition phase (reinforcement ratio 1:1, greater than 1:1, or less than 1:1). Equation 1.5 was fit to the last ten visits of a session to examine steady state responding, and transitions were analyzed by fitting Equation 1.7 to all data following a transition. The effects of gestational exposure to MeHg on steady state and transitional behavior were analyzed by performing ANOVAs on all parameters generated by Equations 1.5 and 1.7.

These analyses failed to detect differences in steady state responding among exposure groups in either young or old rats (i.e., no significant exposure effects on slope or intercept parameters). In addition, the magnitude (Y_m) and slope (k) terms generated by Equation 1.7 did not vary among groups. This demonstrated that the behavior of all animals reached similar asymptotes and that, once the transition began, it progressed at the same rate regardless of exposure regimen. However, significant effects of MeHg were observed for half-maximal reinforcers (X_half) at both 1.7 and 2.3 years of age (results for young rats achieved significance only after log-transforming the data). Specifically, the number of reinforcers required to complete half of a transition was higher in exposed than in control rats. This suggests that more reinforcers were required before the behavior of exposed animals began to change, indicating that learning progressed at a slower pace for these subjects.

The previous results demonstrate several things. First, they show that transitional behavior is sensitive to chemical exposure (in these cases, exposure to environmental contaminants), and that Equation 1.7 models these differences well. Second, the Newland et al. (2004) paper established that transitions can be modeled in a single experimental session. However, one factor that has not yet been investigated is the role external stimuli can play in modulating single-session transitions.

The Role of Stimulus Control in Concurrent Schedule Behavior

Baum (1974) presented the hypothesis that the slope (sensitivity) parameter in Equation 1.5 (a) may be influenced by the degree of discriminability between two concurrently presented schedules. Specifically, as a subject's ability to discriminate between the two schedules decreases, behavior will tend toward undermatching (a is less than 1.0). A number of factors may play a role in schedule discriminability. In the simplest case, two schedules differ only in the density of reinforcement associated with each. This can be accomplished by using a Findley procedure in which responding on the CO key alternates reinforcement schedules, but not stimuli, on the main key. In this case, both reinforcement schedules are associated with the same location and visual stimuli, and are differentiated only by the frequency of reinforcer delivery. Miller, Saunders, and Bourland (1980) termed this a "parallel"
schedule of reinforcement and used it to evaluate the role of stimulus disparity in concurrent schedule performance in pigeons. Using line orientation as the visual stimulus, they demonstrated that values of a decreased as differences in line orientation decreased. Specifically, when the visual stimuli associated with two reinforcement schedules were identical (0° difference), average slope terms for matching functions ranged from 0.20 to 0.30. However, when the difference in line orientation associated with each reinforcement schedule was at its greatest (45° difference), average slope terms ranged from 0.91 to 1.0. Similar results were obtained by Bourland and Miller (1981). These findings demonstrate that the slope of matching functions is partially influenced by the discriminability of external stimuli associated with each schedule in a concurrent schedule. It may even be the case that this parameter is purely a function of the degree to which the two schedules can be discriminated from one another, and that differences in rates of reinforcement are only partial contributors to this discriminability.

While most studies of stimulus control in concurrent schedule responding have focused on steady state behavior, a few have examined its role in acquisition or transitional responding. Hanna, Blackman, and Todorov (1992) used 5-hour sessions to evaluate the role of stimulus disparity on the acquisition of concurrent schedule responding in pigeons. Six pigeons were exposed to 20 pairs of concurrent schedules, which were selected from a total of five possible VI schedules: VI 72 s, VI 90 s, VI 120 s, VI 180 s, and VI 360 s. In Group 1 (n = 3), each schedule was always associated with one key color that differed from all other schedules. In Group 2 (n = 3), both keys in each schedule pair were pink (i.e., all key colors remained pink, regardless of which schedule was in effect). Each condition was presented for one 5-hour session and one 1-hour session, which were separated by one non-experimental day. Once all 20 conditions had been completed, two days of stimulus control tests were conducted by presenting each key color twice for 15 s under extinction. The stimulus conditions for the two groups were then reversed and each of the twenty VI pairs repeated, followed by two additional days of stimulus control tests. Matching functions under each stimulus condition were constructed for each subject at 1-hour intervals into the session.

Hanna and colleagues (1992) found that the slope of matching functions differed as a function of stimulus condition early in the session. Specifically, values of a were higher during the first hour of the session in the different-color condition than in the same-color condition. Slope terms did not significantly differ between hours 2 and 5, nor did they differ during the subsequent 1-hour session. This suggests that rate of acquisition of
Kr?geloh and Davison varied the conditions of their study according to the number of reinforcement ratios presented per session and the type of external stimuli associated with each schedule. In some conditions, the reinforcement ratio in each component was unsignaled. That is, stimuli presented on response keys remained constant throughout the entire session. In signaled conditions, the reinforcement ratio in each component was differentially signaled by alternating the frequency with which a red and yellow light flashed. Data were analyzed by averaging the inter-reinforcer log response ratios for the last 35 of 50 sessions of each experimental condition, and analyses were performed with group averages. Kr?geloh and Davison plotted these ratios as a function of scheduled reinforcement ratio to create ten matching functions per component. This allows for the computation of sensitivity parameters prior to each reinforcer delivery. In signaled components, log response ratios closely approximated scheduled reinforcement ratios prior to delivery of the first reinforcer and changed only slightly following the first few reinforcers before reaching an asymptote within each component. On the other hand, log response ratios were closer to indifference in unsignaled components and changed more dramatically following delivery of successive reinforcers. 23 Similarly, sensitivity values were higher in signaled than unsignaled components both before delivery of the first reinforcer and throughout each component. These results demonstrate that behavior was under control of the discriminative stimuli in the signaled component and that acquisition of choice was facilitated by the presence of such stimuli. The research reviewed here clearly demonstrates that concurrent schedule behavior is influenced by the presence of discriminative stimuli. However, this body of literature is far from complete. For example, of the four studies reviewed here, all used pigeons for subjects, only two examined acquisition of choice, and none evaluated drug effects on choice. The current experiment is designed to draw from and expand upon these experiments. Specifically, the present study has been designed to examine the effects of acute drug exposure on steady-state and transitional behavior during single- session concurrent schedule transitions in rats. 24 CHAPTER 2: THE USE OF SINGLE-SESSION TRANSITIONS TO EVALUATE CHOICE AND ITS ACQUISITION ABSTRACT The present study used single-session transitions between two pairs of concurrent schedules to evaluate choice and its acquisition. Eight female Long-Evans rats were trained to respond under concurrent schedules of reinforcement during experimental sessions that lasted two hours. The generalized matching equation (Baum, 1974) was used to model steady-state behavior at the end of each session, while transitional behavior that emerged following the change in reinforcement schedules was modeled using a logistic equation (Newland, Yezhou, L?gdberg, & Berlin, 1994). Results showed that the generalized matching and logistic equations were appropriate models for behavior generated during single-session transitions, and that changes in response rates are mainly due to increases in the number of long visits to the richer of two alternatives. The generalized matching equation (Baum, 1974) has been widely used to model steady-state behavior maintained by concurrent schedules of reinforcement. 
The logarithmic form, which is most commonly used, is given in Equation 2.1:

\log\frac{B_1}{B_2} = \log c + a\log\frac{R_1}{R_2}. \qquad [2.1]

Here, B_1/B_2 is the ratio of responses on alternative 1 to alternative 2, R_1/R_2 is the ratio of reinforcers obtained (or scheduled) on alternative 1 to alternative 2, log c is the y-intercept, and a is the slope of the linear function. Log c is a measure of response bias, or a preference for responding on one alternative that is independent of the reinforcement ratio. The slope parameter, a, is a measure of behavioral sensitivity to changes in the reinforcement ratio.

While a great deal of research has been dedicated to modeling steady-state behavior, relatively little attention has been paid to modeling transitional behavior. Transitional behavior occurs during the shift from one steady state to another (Sidman, 1960; Johnston & Pennypacker, 1993), is characterized by significant changes in the behavior ratio, B_1/B_2, and is usually observed following a change in reinforcement contingencies (Newland & Reile, 1999). Mazur and colleagues (Bailey & Mazur, 1990; Mazur, 1992, 1997; Mazur & Ratti, 1991) and Davison and Baum (Davison & Baum, 2000) have conducted some of the most recent work on behavior in transition. Using random ratios of reinforcement, Bailey and Mazur and Mazur and Ratti used hyperbolic equations to model transitional behavior in pigeons, and demonstrated that the rate of acquisition of choice is a function of the magnitude of the reinforcement ratio on two response keys. Specifically, they showed that it was the magnitude of the reinforcement ratio, and not the difference between the two ratios, that was important. More recently, Davison and Baum and colleagues have evaluated the acquisition of choice when reinforcer ratios change rapidly throughout one experimental session. This research has shown that preference for a rich alternative develops quickly, and while slope estimates fall short of those calculated from multi-session experiments, they reach 59 to 72% of asymptotic slope values after only 10 to 12 reinforcers have been delivered.

Another approach to modeling transitional behavior within a single session was proposed by Newland, Yezhou, Lögdberg, and Berlin (1994), and was modified and further described by Newland and Reile (1999) and Newland, Reile, and Langston (2004). Here, a logistic function is used to model transitional behavior, and takes the form

\log\frac{B_1}{B_2} = \frac{Y_m}{1 + e^{-k(X - X_{half})}}, \qquad [2.2]

where B_1/B_2 is the ratio of responses on alternative 1 to alternative 2, X is the number of reinforcers that have been delivered, and the free parameters X_half, k, and Y_m are the number of reinforcers that have been delivered when the transition is half-way complete, the slope of the S-shaped portion of the function, and the magnitude of the transition (i.e., change in baseline response ratio), respectively (see Figure 2.1).

Newland and colleagues (2004) applied Equation 2.2 to concurrent schedule behavior derived from a new experimental procedure in which a transition between two reinforcement ratios occurs in a single session. At the beginning of each session, behavior was reinforced under a concurrent variable interval-variable interval (conc VI VI) schedule in which the reinforcer ratio was 1:1 (baseline).
Thirty minutes into the session, the reinforcer ratio changed such that one alternative became rich (i.e., delivered reinforcers at a higher rate than the other alternative), while the overall rate of reinforcement was the same as the baseline period. This transition period lasted an additional 2 hours. Log response ratios were plotted as a function of cumulative reinforcers earned on a visit-by-visit basis, where one visit consisted of a bout of responding on one lever followed by a bout of responding on the other lever (see Table 2.1). Data were smoothed using a 9-point LOWESS algorithm, and Equation 2.2 was fit to the smoothed data. In addition to transition analyses, response ratios during the last 30 minutes of each session were plotted as a function of scheduled reinforcer ratio and Equation 2.1 was fit to the data, allowing for the analysis of steady-state behavior. While behavior did not reach the level of stability observed when behavior is given many weeks to stabilize, slope estimates approximated those obtained when response ratios are allowed to stabilize over the course of several days.

The present study was designed to further examine potential determinants of the acquisition of choice. We examined the role of the magnitude of the reinforcer ratio by charting choice from baseline, in which the reinforcer ratio from the two alternatives was 1:1, to one in which it was 4:1, 16:1, or 32:1. We also examined the role of visit duration or, more specifically, the number of responses on a visit, in driving the allocation of behavior to the two alternatives. Specifically, we examined whether preference for the richer alternative is due to a wholesale shift in the number of responses on that alternative or to an increase in the number of very long visits. The former would be characterized by a change in both the mean and mode of the distribution of visit lengths, and the latter by a change in the right tail of the distribution but with little variation in the mode. Finally, this experiment was also designed to evaluate the degree to which Equations 2.1 and 2.2 are able to describe asymptotic and transitional behavior, respectively, when transitions occur in a single session.

Method

Subjects

Subjects were 8 female Long-Evans rats obtained from Harlan Laboratories (Indianapolis, Indiana). They were housed at the Biological Research Facility at Auburn University on a 12 h-12 h dark/light cycle (lights on at 0600 h). Rats were housed in pairs in single Plexiglas cages measuring 42 x 21.5 x 20.5 cm, and containing a clear plastic divider designed to divide the cage in half diagonally. Rats were fed ad libitum for several weeks to determine asymptotic body weights. Once weights stabilized, food restriction was initiated to reduce and maintain each rat at 85-90% of their ad libitum weights.

Apparatus

Experiments were conducted in operant chambers purchased from Med Associates (Georgia, Vermont) and housed in sound-attenuating cabinets. Each chamber contained two levers on the front panel and one on the back. The levers on the front panel were retractable and were calibrated to 0.20 N. One light-emitting diode (LED) was located above each front lever, and a 20 mg sucrose pellet dispenser was located equidistantly between the two levers. A house light (28 V, 100 mA) was located near the
Experimental events were programmed and controlled by a computer using MED- PC IV software (Med Associated, Georgia, Vermont). This computer was located in a room adjacent to that which housed the operant chambers. Procedure Once body weights stabilized, rats were trained to lever press for 20 mg sucrose pellets in an overnight session using an autoshaping procedure similar to that used by Paletz, Day, Craig-Schmidt, and Newland (in press). Training on the initial lever was considered complete following 90 independent responses (i.e., responses made after autoshaping trials ended). Responding on the opposite lever was trained in a similar fashion the following evening. Concurrent schedule training commenced after all subjects successfully completed autoshaping on both levers. At the beginning of each session, the house light and LEDs above each lever were illuminated. Initially, responding was reinforced on concurrent variable interval 60 s schedules (conc VI 60s VI 60s) for the duration of a 120 minute session. This yielded an average overall reinforcement rate of 2 pellets per minute. Once stability criteria were met (i.e., the proportion of responses occurring on the left lever was between 0.4 and 0.6 for the last three of at least five consecutive sessions), single session transitions began. The overall average reinforcement rate was held constant throughout each session (2 per minute). However, 30 minutes into each 30 session, the scheduled reinforcement ratio changed from 1:1 (baseline portion) to one in which either the left or right alternative became rich (transition portion). A 10 second time out separated the baseline and transition portions of each session. During the time out, all chamber lights remained on and lever pressing had no programmed consequences. Each subject was exposed to the following transition magnitudes: 32:1, 1:32, 16:1, 1:16, 8:1, 1:8, 4:1, 1:4, and 1:1. Transition sessions occurred on Tuesdays and Fridays, and sessions in which the reinforcement ratio remained 1:1 for the duration of the 120 minute session were conducted on Mondays, Wednesdays, and Thursdays. Two sessions were run daily during the light portion of the light-dark cycle. To control for time day effects, four animals from each group were assigned to each session time. A changeover delay (COD) of 2 seconds was employed throughout each condition. Data Analysis Response and changeover (CO) rates. Response and CO rates were analyzed separately for each transition magnitude. Baseline rates were calculated for each session by dividing the total number of responses during the first 30 minutes of the session by 30, transition rates were calculated by dividing the total number of responses during the middle 60 minutes of the session by 60, and asymptotic rates were calculated by dividing the total number of responses during the last 30 minutes of each session by 30. Baseline, transition, and asymptotic response and CO rates were then averaged within each animal at each transition magnitude by combining sessions at each magnitude with its reciprocal. For example, to calculate the average baseline response rate for the 32:1 transition 31 magnitude, the baseline response rates during all 32:1 and 1:32 transitions were averaged within each subject. This was also done for transition and asymptotic response and CO rates for 16:1, 8:1, and 4:1 transitions. 
Repeated measures analyses of variance (RMANOVA) with two within-subjects variables (magnitude and session segment) were used to evaluate differences in response and CO rates.

Steady state. To evaluate the steady-state behavior that emerges at the end of the transition period, response ratios from the last 30 minutes of each session were plotted as a function of scheduled and obtained reinforcement ratios. Matching functions for each subject were constructed by fitting Equation 2.1 to these data.

Transitions. Log response ratio was plotted as a function of cumulative reinforcers on a visit-by-visit basis, where each visit consisted of one bout of responding on the right lever and one bout of responding on the left lever. Data were normalized by calculating the median log response ratio prior to the transition and subtracting this value from each data point. A 9-point LOWESS smoothing algorithm was applied to data collected during the transition and asymptotic segments of the session (i.e., the last 90 minutes of the session). Equation 2.2 was then fit to the smoothed data, yielding three parameters per subject per transition (k, X_half, and Y_m). Data were combined within each reinforcement magnitude (i.e., 4:1, 8:1, 16:1, and 32:1) by taking the median parameter estimates of the left (e.g., 32:1) and right (e.g., 1:32) transitions for each subject. These values were then averaged across animals, yielding mean k, mean X_half, and mean Y_m values for each transition magnitude. RMANOVAs with transition magnitude as a within-subjects factor were performed to test for magnitude effects on transition parameters. Because slope parameters were non-normally distributed, k values were log-transformed prior to conducting statistical analyses.

Histograms. To examine how changes in response rates over the course of a transition are influenced by changes in responding on the rich and lean levers, histograms of response bout length on each lever were constructed for the baseline, transition, and last 30 minutes of each session. This was done for each transition magnitude. Normal, Weibull, and log-linear functions were fit to these histograms to determine which type of distribution best described the data.

For all statistical analyses, a Type I error rate of 0.05 was used, and all degrees of freedom for within-subjects tests were adjusted using the Huynh-Feldt correction.

Results

Two 4:1 transitions and four each of 8:1, 16:1, and 32:1 transitions were conducted. For each magnitude, the right lever became rich in half of the sessions and the left lever became rich in the other half. Transition sessions were excluded from all statistical analyses if they met both of the following criteria: (1) Y_m estimates exceeded the transition magnitude observed at the end of the session and (2) X_half estimates exceeded the total number of reinforcers delivered during the session. Also, slope estimates greater than 1 were set equal to 1. The number of sessions included for each transition magnitude and the reasons for exclusion are presented in Table 2.2. Exclusion criteria are discussed in further detail in the Discussion.

Figure 2.2 shows response (top panel) and CO (bottom panel) rates as a function of transition magnitude and session segment. A RMANOVA with 2 within-subjects factors (magnitude X session segment) showed a significant effect of session segment on response rate (F(2, 14) = 17.674, p = 0.003, ε = 0.56).
The top panel of Figure 2.2 shows that response rates tended to decrease across the session at each reinforcement magnitude. A RMANOVA with 2 within-subjects factors (magnitude X session segment) showed significant main effects for the two independent variables and a significant interaction on CO rates (F(8, 56) = 4.583, p < .001, ε = 0.98). The bottom panel of Figure 2.2 suggests that CO rates decreased across the session at all transition magnitudes, and that this effect became more pronounced as the transition magnitude increased.

Parameter estimates from matching functions fit using either scheduled or obtained reinforcer ratios are provided in Table 2.3. Figure 2.3 shows matching functions generated with obtained reinforcer ratios for individual subjects. Slope values ranged from 0.41 to 0.71 and from 0.46 to 0.75 for functions fit with scheduled and obtained reinforcers, respectively. Both independent variables accounted for a substantial amount of variance in response ratios. Scheduled reinforcer ratios accounted for between 42 and 94% of the variance in response ratios, while obtained reinforcer ratios accounted for between 74 and 94% of the variance.

Transition parameters are plotted as a function of transition magnitude in Figure 2.4. Because only two 4:1 sessions were conducted, combined parameter estimates were strongly influenced by extreme values (because the median was always equal to the mean). Therefore, data on 4:1 parameters were excluded from all statistical analyses and figures. RMANOVAs for 8:1, 16:1, and 32:1 transitions showed no significant effects of magnitude on X_half, slope, or Y_m (all p's > 0.05).

Figure 2.5 consists of six histograms showing the distribution of response bout lengths for a single subject during a 32:1 transition. Panels on the left show data from the rich lever, while panels on the right show data from the lean lever. The top, middle, and bottom panels show bout length distributions from the baseline, transition, and last 30 minutes of the session, respectively. Histograms were constructed using a bin width of one response. Means, standard deviations, and modes from log-linear functions applied to each distribution are shown, as these functions provided the best fit to the data. Summary data for all subjects are shown in Figure 2.6. RMANOVAs for mean bout length on the rich lever (top panel, closed symbols) showed a significant interaction between magnitude and session segment (F(6, 42) = 8.336, p = 0.001, ε = 0.63), and significant main effects for magnitude (F(3, 21) = 8.063, p = 0.006, ε = 0.61) and segment (F(2, 14) = 25.955, p = 0.001, ε = 0.57). RMANOVAs for mean bout length on the lean lever (top panel, open symbols) showed a significant main effect of session segment (F(2, 14) = 37.386, p < .001, ε = 0.71). These results demonstrate that average bout lengths on the rich lever increased across the session, and that the size of this increase depended on the magnitude of the transition. Specifically, as the transition magnitude increased, so did the change in average bout length. For the lean lever, average bout lengths decreased across the session but were not affected by transition magnitude, a result that may be due to a floor effect. RMANOVAs for modal bout lengths indicated a significant magnitude X segment interaction on the rich lever (F(6, 42) = 4.028, p = 0.005, ε = 0.87) and significant main effects of session segment on both the rich (F(2, 14) = 14.815, p = 0.003, ε = 0.60) and lean levers (F(2, 14) = 23.310, p = 0.001, ε = 0.62).
The pattern of changes in modal bout lengths was similar to that observed among mean bout lengths, but there was a much greater increase in mean than in modal bout lengths on the rich lever.

Discussion

Single-session transitions were used to study the acquisition of choice in rats responding under concurrent schedules of reinforcement. The generalized matching relation (Equation 2.1) was used to model behavior from the last 30 minutes of each session, by which time behavior ratios had stabilized, even if they had not reached their ultimate steady-state value. By charting choice on a visit-by-visit basis, we were able to model the acquisition of choice with good resolution. It was evident that an acquisition phase could be described by a brief period of near stasis, or very slow change, followed by a rapid transition and, finally, an approach to an asymptote. A logistic function (Equation 2.2) was employed to model behavior during the transition phase because it provided the "S-shape" required to model this biphasic transition.

Overall Measures of Behavior

In the present study, response and CO rates were used to compare overall responding under concurrent schedules of reinforcement as a function of session segment and reinforcer ratio. A key element of these experiments was that the overall rate of reinforcement was held constant across all reinforcer ratios. Accordingly, overall response rates were relatively consistent across all reinforcer ratios employed.

Both response and CO rates decreased across the session. The decline in response rates over the course of a session has been noted previously (McSweeney & Murphy, 2000) and may reflect habituation. Of particular importance is that they remained sufficiently high that choice could be modeled using data from the last 30 minutes of each session. Statistical analyses showed that the decline in CO rate interacted with reinforcer magnitude. This may be a joint function of habituation, as with response rate, and of the well-known inverse relationship between changeover rate and reinforcer magnitude (Baum, 1974; Stubbs & Pliskoff, 1969).

Steady State/Asymptotic Behavior

The generalized matching equation was well suited for describing the terminal behavior ratios that emerged at the end of single-session transitions. Equation 2.1 accounted for a large proportion of the variance in response ratios when obtained reinforcement ratios were used as the independent variable. Bias, as indicated by Equation 2.1, was generally low and unsystematic. Slope values, said to reflect sensitivity to reinforcement (Baum, 1974) or to the difference in the reinforcement rates between the two alternatives (Davison & Nevin, 1999), were generally lower than those reported in multi-session studies. In the single-session transitions described here, slope values averaged 0.63, as compared with averages of approximately 0.88 for rats in multi-session transitions (Baum, 1979). Nonetheless, it is noteworthy that this level of sensitivity emerges over the course of a 2-hour session. In fact, Davison and Baum (2000) have shown that average slope estimates reach values between 0.52 and 0.63 within 10 reinforcers following a condition change. Taken together, these data suggest that initial preference for a newly rich lever is acquired rapidly, but that a slower rise to asymptotic ratios emerges after extended exposure.
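As a concrete illustration of how slope (sensitivity) and intercept (bias) estimates such as those in Table 2.3 can be obtained from Equation 2.1, the following sketch fits the generalized matching relation by least squares on log-transformed ratios. It assumes NumPy is available, and the reinforcer and response ratios are hypothetical values chosen only to show the computation; this is not the RS1 routine used for the analyses reported here.

```python
# Minimal sketch: fit log(B1/B2) = log c + a * log(R1/R2) by linear least squares.
import numpy as np

# Hypothetical obtained reinforcer ratios (R1/R2) and terminal response ratios
# (B1/B2) from the last 30 minutes of several single-session transitions.
reinforcer_ratios = np.array([32, 16, 8, 4, 1, 1/4, 1/8, 1/16, 1/32])
response_ratios = np.array([8.1, 5.2, 3.4, 2.1, 1.0, 0.52, 0.31, 0.21, 0.14])

x = np.log10(reinforcer_ratios)
y = np.log10(response_ratios)

# np.polyfit returns the coefficients [slope, intercept] of a first-degree fit.
a, log_c = np.polyfit(x, y, 1)

# Proportion of variance accounted for by the fit.
predicted = a * x + log_c
r2 = 1 - np.sum((y - predicted) ** 2) / np.sum((y - np.mean(y)) ** 2)

print(f"sensitivity a = {a:.2f}, bias log c = {log_c:.2f}, r2 = {r2:.2f}")
```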
For the purposes of comparison, two matching functions were fit for each subject: one using obtained reinforcer ratios as the independent variable and one using scheduled reinforcer ratios. In most cases, slope and r² values were only slightly lower for scheduled than for obtained reinforcer ratios. With one notable exception (subject 104), scheduled reinforcer ratios accounted for between 75 and 94% of the variance in response ratios, indicating that fits constructed with obtained ratios described the data only slightly better than those obtained with scheduled ratios. This is not surprising considering that scheduled and obtained ratios are discrete and continuous variables, respectively.

While obtained reinforcer ratios have traditionally been used in concurrent schedule research, scheduled ratios are sometimes useful for assessing the effects of other variables on matching, such as drug or toxicant exposure, or when an extreme bias exists. In such cases, obtained and scheduled reinforcement ratios produce very different matching functions and therefore provide different information about choice. For example, in monkeys exposed during gestation to lead or methylmercury, Equation 2.1 described response ratios well when fits were constructed using obtained reinforcement ratios, but very poorly when scheduled ratios were used, an important effect of developmental neurotoxicant exposure that would have been overlooked had researchers relied solely on obtained reinforcer ratios (Newland et al., 1994). The discrepancy occurred because monkeys exposed to high levels of lead or methylmercury often perseverated on an initially rich lever that became lean after a transition was imposed (see also Paletz et al., in press, or Reed, Paletz, & Newland, 2006). Response ratios are a joint function of the scheduled ratios and of behavior. That is, responding on an alternative is required in order for a reinforcer to be delivered. Obtained reinforcer ratios work well when there is adequate responding on both alternatives but convey different information when, as in the case of neurotoxicant exposure, there is inadequate sampling of both response alternatives. The use of scheduled ratios allows for the assessment of changes in behavior as a sole function of changes in the environmental contingencies, an effect that may appear in toxicant and drug studies. While no such manipulations were reported here, parameters are provided for comparison with possible future studies.

The Acquisition of Choice

The distributions of the number of responses during a visit across key segments of the transition (Figures 2.5 and 2.6) suggest that there are two phases to the acquisition of choice. The first phase is dominated by an increase in the right tail of the distribution for the rich alternative. This increase arises early in the transition but not immediately. Note that only the right tail of the distribution is affected, while the increases in the mode are more modest. Thus, choice is driven not by a wholesale shift in visit length, but rather by the emergence of a few very long visits on the rich lever. In fact, the number of long visits on the rich lever increases sufficiently to double the average number of responses per visit by the end of the transition phase. This early increase in behavior ratios is followed by a more gradual shift in the left end of the distribution, which describes short visit durations.
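The distinction between a shift in the right tail and a shift in the mode can be made concrete with a small numerical illustration. In the sketch below, a few very long visits are appended to a baseline-like set of visit lengths; the mean roughly triples while the mode is unchanged. All values are invented for illustration and are not data from this experiment.

```python
# Why a few very long visits move the mean of the bout-length distribution
# far more than its mode. Visit lengths below are hypothetical.
from statistics import mean, mode

baseline_visits = [1, 1, 2, 2, 2, 3, 3, 4, 5]        # mode = 2, mean ~ 2.6
transition_visits = baseline_visits + [15, 22, 30]    # add a few long rich-lever visits

for label, visits in [("baseline", baseline_visits), ("transition", transition_visits)]:
    print(f"{label}: mean = {mean(visits):.1f}, mode = {mode(visits)}")
```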
These increases in response bouts on the rich lever are accompanied by a corresponding decrease in bout length on the lean lever. The magnitude of such decreases is much lower than that associated with the increases on the rich lever, which is probably due to a floor effect (i.e., the smallest number of responses that can occur in a single visit is one). By the end of the session, the mode of the distribution on the lean lever had decreased by approximately half. Therefore, it appears as though initial behavior change is characterized by an increase in the right tail of the bout-length distribution, while long-term changes are due to a more gradual shift in the center (mode) of this distribution. It simply takes more time for the mode to shift than the mean. Additional studies are needed to confirm this hypothesis, as we do not have data regarding long-term changes in these measures.

In addition to steady-state analyses, single-session transitions provided a way to describe transitional behavior mathematically within an individual session, using parameters that reflect actual physical properties of that behavior (Newland & Reile, 1999). While statistical analyses did not show any significant differences in these measures as a function of magnitude, the trends in individual data observed in Figure 2.4 suggest that the rate of acquisition was higher for 16:1 and 32:1 transitions than for 8:1 sessions. It may be the case that larger discrepancies in reinforcement rates increase the discriminability between the two alternatives, thereby increasing the rate at which choice is acquired.

Single-Session Transitions and the Neural Determinants of Choice

The ability to study the acquisition of choice in a single session is a methodological advance over approaches that require several days to weeks, and could facilitate the study of other factors that may influence choice and its acquisition. For example, despite a huge literature on behavioral determinants of choice and the growing importance of choice in the neuroscience of the cerebral cortex and basal ganglia (e.g., Schultz, Tremblay, & Hollerman, 2000; Tremblay & Schultz, 1999), very little drug research has been conducted on potential neural or pharmacological influences on the acquisition of choice. This is likely due to the time-intensive nature of the traditional procedure. If we conservatively estimate that it takes two weeks for behavior to stabilize under each of five reinforcer ratios, it would take 10 weeks to construct a full matching function. To determine a dose-effect curve using five conditions (control, saline, and three drug doses), it would take a minimum of 50 weeks to collect the necessary data. This is nearly half of a rat's life span and introduces age as a confounding factor. Using a single-session paradigm, dose-effect curves can be evaluated over the period of a few months.

The Logistic Equation for Modeling Choice

Equation 2.2, a form of the logistic equation, was selected largely for empirical reasons. First, it provides the appropriate shape with which to describe transitions, and it produces parameters that are easily interpretable in terms of behavior. That is, Y_m, X_half, and k reflect the asymptotic behavior ratio, the number of reinforcers required for the transition to become half-way complete, and the rate of behavior change, rather than functioning as hypothetical constructs.
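For readers who wish to reproduce this kind of analysis, the sketch below shows one way to smooth normalized transition data and fit Equation 2.2 by nonlinear least squares. It assumes SciPy and statsmodels are available and uses synthetic data generated for the example; the variable names, starting values, and smoothing fraction are illustrative assumptions, not the RS1 procedure used in this study.

```python
# Minimal sketch: LOWESS-smooth normalized log response ratios, then fit the
# logistic of Equation 2.2, log(B1/B2) = Ym / (1 + exp(-k * (X - Xhalf))).
import numpy as np
from scipy.optimize import curve_fit
from statsmodels.nonparametric.smoothers_lowess import lowess

def logistic(x, ym, k, xhalf):
    return ym / (1 + np.exp(-k * (x - xhalf)))

# Hypothetical transition: cumulative reinforcers (x) and normalized log
# response ratios (y); real data would come from visit-by-visit records.
rng = np.random.default_rng(0)
x = np.arange(0, 180, 2.0)
y = logistic(x, ym=0.7, k=0.08, xhalf=30) + rng.normal(0, 0.05, x.size)

# 9-point LOWESS: frac is chosen so each local window spans about nine visits.
smoothed = lowess(y, x, frac=9 / x.size, return_sorted=True)

params, _ = curve_fit(logistic, smoothed[:, 0], smoothed[:, 1], p0=(0.5, 0.1, 20.0))
ym_hat, k_hat, xhalf_hat = params
print(f"Ym = {ym_hat:.2f}, k = {k_hat:.3f}, Xhalf = {xhalf_hat:.1f}")
```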
By using these terms, it is possible to describe concretely the course of a transition and to make meaningful comparisons between conditions. Second, the logistic model provided a good fit for the transitional behavior that occurs over the course of a single session. This equation was far superior to concave functions such as hyperbolic or exponential functions, but was indistinguishable from other S-shaped functions, such as Gompertz equations (Newland & Reile, 1999). It might be noted that both the logistic and Gompertz arise in models of population dynamics, selectionistic models that might be adapted to describe choice (Edelstein-Keshet, 1988).

While Equation 2.2 did a good job of modeling behavior in transition, there remain a few issues that require further investigation. Most of these issues involve cases in which estimated parameters either did not describe the data well or could not be interpreted meaningfully. In some cases, the slope parameter was estimated to be greater than 1. In a literal sense, this means that a transition occurred within one reinforcer or, in practice, during the course of one visit. For example, a slope value of 38 is literally interpreted as follows: during the transition, the delivery of each additional reinforcer yields an increase of 38 in the log response ratio. Alternatively, it can be shown algebraically that 1/k is the number of reinforcers that span the middle third of the transition. Therefore, a slope of 38 means that 1/38 of a reinforcer was required to span 1/3 of the transition. From a practical standpoint, this is indistinguishable from saying that the transition occurred within a single visit and, as such, is no different from a situation in which k = 1. Further, the error in the estimate of slopes of this magnitude was generally very large, which led to a lack of confidence that this number was different from 1.0. Because we cannot resolve more finely than a single visit, because the error bars showed that such slopes are indistinguishable from 1.0, and because the extreme slope values caused estimate averages to be very unstable, all slope values greater than 1 were set equal to 1 in the present study. This happened rarely, and was done for three 8:1, three 16:1, and two 32:1 transitions.

A more difficult situation arose when behavior ratios were still increasing at the end of a session and failed to reach an estimable asymptote. This was usually manifested as extremely high Y_m values, X_half estimates that exceeded the range of the data obtained, and confidence intervals that overlapped with zero. Typically, these involved cases in which there was no discernible transition or, for some reason, transitions that proceeded very slowly. Unfortunately, there is no simple or logical remedy for this issue. In this study, transitions that produced (a) magnitude estimates that were never achieved and (b) X_half estimates that exceeded the total number of reinforcers delivered were eliminated from analyses. This occurred for relatively few transitions, but appeared to happen most frequently in the 32:1 condition. It is possible that larger reinforcer ratios produce poorer fits because response ratios did not have time to completely stabilize over the course of 2 hours.

An assumption embedded in the use of the logistic model might be noted here. This equation is symmetric about the half-max value, yielding a function that is symmetric about the inflection point.
That is, the rising portion to the left of the inflection point is a mirror image of the asymptotic portion to the right of that value. The Gompertz model does not make this assumption of symmetry but, instead, produces a rapidly rising portion followed by a slower approach to the asymptote to the right of the inflection point. Conceptually, this has some appeal but was not used here for two main reasons. First, in comparisons using variance accounted for and the standard error of the regression as markers of the quality of fit, the Gompertz model was indistinguishable from the logistic model (Newland & Reile, 1999). Second, the Gompertz model is more complex and its parameters are somewhat more difficult to interpret. However, future investigations might focus on the issue of symmetry and on functional forms of the Gompertz that yield interpretable parameters.

In the present study, we have demonstrated the utility of single-session transitions in assessing both transitional and steady-state behavior generated under concurrent schedules of reinforcement. This procedure provides a method for collecting data on choice and its acquisition in relatively short periods of time, and the logistic equation used to model the transition provides idemnotic parameters that can be used to chart the course of acquisition and to make meaningful comparisons across a variety of experimental conditions.

Table 2.1
Accumulation of Reinforcers on a Visit-by-Visit Basis

                Alternative 1               Alternative 2
Visit    Responses   Reinforcers     Responses   Reinforcers     Response Ratio    Cumulative Reinforcers
1           10            3             18            4               10/18                   7
2            8            3             12            2                8/12                  12
3           15            5              9            2                15/9                  19

Table 2.2
Number of Sessions Included in Statistical Analyses

                                             Transition Magnitude
                                          4:1     8:1     16:1     32:1
Number of sessions included                14      30       31       22
Excluded for Y_m and X_half criteria        0       1        1        6
Excluded for other reasons¹                 2       1        0        4

¹ Other reasons include a failure of the statistical algorithm to estimate a function and missing sessions.

Table 2.3
Parameter Estimates for Matching Functions

              Slope              Intercept (log c)              r²
Rat       SCH      OBT           SCH       OBT             SCH      OBT
101       0.60     0.66         -0.18     -0.13            0.76     0.90
102       0.53     0.59          0.001     0.02            0.93     0.94
103       0.55     0.56          0.08      0.06            0.94     0.94
104       0.45     0.69          0.01      0.11            0.42     0.76
105       0.57     0.62          0.09      0.08            0.86     0.89
106       0.71     0.75          0.07      0.04            0.90     0.91
107       0.42     0.47          0.01      0.002           0.75     0.74
108       0.64     0.73         -0.23     -0.18            0.90     0.94

Note. SCH = functions fit with scheduled reinforcer ratios; OBT = functions fit with obtained reinforcer ratios.

FIGURE CAPTIONS

Figure 2.1. Representative 32:1 transition (from subject 106). Log response ratio is plotted as a function of cumulative reinforcers earned. Data points are LOWESS smoothed values. The vertical line indicates the beginning of the transition (x = 0). Negative values of x are reinforcers delivered prior to the transition. BL = baseline.

Figure 2.2. Response and changeover rates as a function of session segment and transition magnitude. Error bars are ± 1 SEM. BL = baseline, TR = transition, END = last 30 minutes of session.

Figure 2.3. Matching functions for individual animals. Matching equations and variance accounted for are provided in the lower right-hand corner of each panel. The dashed diagonal line represents perfect matching (i.e., B1/B2 = R1/R2).

Figure 2.4. Average X_half, slope, and Y_m as a function of transition magnitude. Closed symbols represent group means, and error bars are ± 1 SEM. Open symbols are parameter estimates for individual subjects.

Figure 2.5. Frequency distribution of responses per visit during a 32:1 transition for a single subject (rat 101).
Left panels represent responses on the rich (i.e., left) lever, while right panels represent responses on the lean (i.e., right) lever. BL = baseline, TR = transition, END = last 30 minutes of session.

Figure 2.6. Average number of responses per visit as a function of transition magnitude and session segment. Means and modes are presented in the top and bottom panels, respectively. Error bars are ± 1 SEM. Closed symbols = rich lever, open symbols = lean lever, BL = baseline, TR = transition, END = last 30 minutes of session.

[Figure 2.1 appears here: log response ratio vs. cumulative reinforcers; fitted function y = 0.68 / (1 + exp(0.07 * (30.2 - x))).]
[Figure 2.2 appears here: responses per minute and changeovers per minute by session segment (BL, TR, END) and transition magnitude (1:1, 4:1, 8:1, 16:1, 32:1).]
[Figure 2.3 appears here: matching functions for subjects 101-108, with fitted equations and r² values.]
[Figure 2.4 appears here: X_half, slope, and Y_m as a function of transition magnitude.]
[Figure 2.5 appears here: frequency distributions of responses per visit on the rich and lean levers during the BL, TR, and END segments for rat 101.]
[Figure 2.6 appears here: mean and modal responses per visit by transition magnitude and session segment.]

CHAPTER 3: THE EFFECTS OF DISCRIMINATIVE STIMULI AND d-AMPHETAMINE ON THE ACQUISITION OF CHOICE

ABSTRACT

Twenty-four female Long-Evans rats were trained to respond under concurrent schedules of reinforcement during 2-hour experimental sessions. Thirty minutes into each session, a transition occurred between the initial pair of schedules and a second pair of schedules to allow for the analysis of choice and its acquisition. Subjects were divided into three groups according to the degree to which this transition was signaled, and the effects of three doses of d-amphetamine were examined. Results revealed amphetamine effects on response rates, changeover rates, sensitivity to reinforcement, and the rate at which behavior changed following a transition. Only changeover rates were influenced by stimulus group. These data suggest that the dopamine agonist d-amphetamine increases sensitivity to reinforcement, and they support the hypothesis that dopamine plays an important role in choice and learning.

Concurrent schedules of reinforcement are one of the most commonly used preparations for studying choice in animal models. Under such a schedule, two (or more) response alternatives are available to the subject. Responding on each may be reinforced by schedules that provide similar or, more often, quite different rates of reinforcement. More behavior tends to occur on the lever providing the greater reinforcement density, known as the "richer alternative."
Behavior under such schedules is well characterized and has been successfully modeled using the generalized matching function, in which the allocation of behavior between two response alternatives is described as a function of the ratio of reinforcers delivered from the two alternatives (Baum, 1974; Davison & McCarthy, 1988). This function takes the form

\[ \log\frac{B_1}{B_2} = \log c + a\,\log\frac{R_1}{R_2} \]   [3.1]

Here, B1/B2 is the ratio of responses on one lever to responses on the other, and R1/R2 is the scheduled or obtained reinforcer ratio. Log c and a are free parameters that represent the y-intercept and slope of the function, respectively. Log c shows response bias, or a tendency to respond more frequently on the lever represented in the numerator of the response ratio, due to factors other than the reinforcer ratio. The parameter a is viewed as a measure of behavioral sensitivity to the reinforcer ratio (Baum, 1974) or to the difference between the two reinforcer ratios presented (Davison & Nevin, 1999).

A large body of literature has been devoted to the study of choice under concurrent schedules, making the matching relation among the most replicable phenomena in behavioral science, but this preparation has rarely been used to study drug effects on choice. This is likely due to the time-intensive nature of the procedure. As traditionally conducted, at least 5 reinforcer ratios are imposed in a serial fashion, and each is continued until behavior reaches a predetermined criterion for stability. Therefore, these studies can require many months and sometimes more than a year to conduct. Using such a "multi-session" procedure to study drug effects would extend this even further and could consume a significant proportion of a rodent's life.

Several studies have investigated the use of within-session transitions between two or more concurrent schedules (Bailey & Mazur, 1990; Davison & Baum, 2000; Newland, Yezhou, Lögdberg, & Berlin, 1994; Newland, Reile, & Langston, 2004; Mazur, 1992, 1997; Mazur & Ratti, 1991). Single-session methods are useful because they allow for the evaluation of the acquisition of choice, an area of concurrent schedule research that has been relatively neglected. A single-session procedure described by Newland and Reile (1999) produces both steady-state and transitional behavior and allows for the assessment of choice and acquisition in a single session. In this procedure, 2- to 2.5-hour sessions are divided into baseline and transition segments. During the baseline segment of each session, responding on the right and left levers is reinforced at the same rate (i.e., the reinforcer ratio is 1:1). Thirty minutes into the session, the reinforcer ratio changes such that either the right or left lever becomes rich (the transition segment). Behavior typically shifts from approximate indifference during the baseline segment to a preference for the richer alternative during the transitional segment. Response ratios usually stabilize within the last 30 minutes of the session, and these response ratios are then used to construct matching functions using Equation 3.1.

The acquisition of choice in single-session transitions is modeled by calculating the response ratio on a visit-by-visit basis (Newland & Reile, 1999). That is, each time the animal completes a visit on the left and the right alternative, a response ratio is calculated. These ratios are plotted over the course of the entire session as a function of cumulative reinforcers earned.
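The visit-by-visit bookkeeping can be illustrated with a few lines of Python that reproduce the worked example in Table 2.1 of the previous chapter. The tuple layout is an assumption made for the example; any record of responses and reinforcers per bout would serve.

```python
# Visit-by-visit log response ratios and cumulative reinforcers (cf. Table 2.1).
# Each visit pairs a bout on one alternative with the following bout on the other.
import math

# (responses_alt1, reinforcers_alt1, responses_alt2, reinforcers_alt2) per visit
visits = [(10, 3, 18, 4), (8, 3, 12, 2), (15, 5, 9, 2)]

cumulative_reinforcers = 0
for i, (b1, r1, b2, r2) in enumerate(visits, start=1):
    cumulative_reinforcers += r1 + r2
    log_ratio = math.log10(b1 / b2)
    print(f"visit {i}: response ratio {b1}/{b2} (log = {log_ratio:.2f}), "
          f"cumulative reinforcers = {cumulative_reinforcers}")
```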
Data are normalized by subtracting the median response ratio during the baseline portion of the session from each data point. This ensures that the lower asymptote of the logistic function is always 0. These data are smoothed using a LOWESS algorithm that permits overall trends to appear. The following logistic function is applied to the smoothed data:

\[ \log\frac{B_1}{B_2} = \frac{Y_m}{1 + e^{-k\,(X - X_{half})}} \]   [3.2]

Equation 3.2 describes an S-shaped function. The independent variable X is the number of reinforcers that have been delivered, and the dependent variable is the log of the ratio of the number of responses on the rich lever to the number of responses on the lean lever (B1/B2). The left, lower, horizontal asymptote represents the median of the response ratios before the change in reinforcement ratios. Because of normalization, it is forced to a value of zero on a log scale (representing a raw ratio of 10^0 = 1.0). The right, upper, horizontal asymptote (Y_m) estimates response ratios during the last 30 minutes of the session. Because all data are normalized with respect to baseline response ratios, Y_m represents the magnitude of the transition, or the difference between the asymptotic response ratio at the end of the session and the response ratio during the baseline portion of the session. Two parameters describe the course of the transition: k and X_half. The parameter k is the slope of the rising portion of the S-shaped function, higher values of which represent a faster transition, while X_half is the number of reinforcers that have been delivered when the transition is half-way complete. This parameter positions the rising portion horizontally. High values of X_half indicate that the transition begins after many reinforcers have been delivered (i.e., acquisition proceeds at a slow rate). The values of Y_m, k, and X_half are estimated using nonlinear regression.

Equations 3.1 and 3.2 have been used to describe the effects of developmental exposure to lead in squirrel monkeys (Newland et al., 1994) and to methylmercury in squirrel monkeys (Newland et al., 1994) and rats (Newland et al., 2004) on the acquisition and maintenance of choice. Multiple-session transitions were used in the earlier study, and single-session transitions were used in the latter study. Results showed that steady-state parameters from Equation 3.1 were affected by the highest levels of exposure. At lower exposure levels, steady-state parameters were indistinguishable from those of control animals, but differences in the rate parameters generated with Equation 3.2 (especially X_half) showed that the rate of acquisition was retarded among exposed subjects. This was indicative of a subtle impairment in acquisition while final performance remained intact following neurotoxicant exposure.

The present study was designed to apply this procedure to the study of the effects of d-amphetamine on the acquisition of choice. Because the effects of d-amphetamine are due mostly to its actions at dopaminergic synapses, we may also be able to determine the role of this neurotransmitter system in the acquisition of choice. In addition to d-amphetamine, we also examined the effects of providing an external stimulus to signal the transition. Drug effects on fixed interval (Laties & Weiss, 1966) and fixed consecutive number (Laties, 1972; Szostak & Tombaugh, 1981) schedules of reinforcement can be ameliorated by the presence of discriminative stimuli.
Further, research on concurrent schedules has demonstrated that the rate of behavioral transitions is increased when transitions are signaled (Bourland & Miller, 1981; Hanna, Blackman, & Todorov, 1992; Krägeloh & Davison, 2003; Miller, Saunders, & Bourland, 1980). We hypothesized that stimuli signaling the commencement of the transition or the location of the rich lever would modify the acquisition of choice.

Method

Subjects

Twenty-four female Long-Evans rats (Harlan Laboratories, Indianapolis, Indiana) were housed at the Biological Research Facility at Auburn University, an AAALAC-accredited facility, on a 12h-12h dark/light cycle (lights on at 0600h). Single Plexiglas "shoebox" cages measuring 42 x 21.5 x 20.5 cm housed two rats each. A clear plastic divider designed to divide the cage in half along the longer diagonal enabled individual control over feeding and provided a triangular living space with a wide region for moving about at the base and a nesting region at the apex. Asymptotic body weights were obtained by feeding subjects ad libitum for several weeks. Once body weights stabilized, a food restriction regimen was initiated to reduce and maintain each rat at 85-90% of its ad libitum weight, which ranged from 220 to 230 g.

Apparatus

Experiments were conducted in operant chambers (Med Associates, Georgia, Vermont) housed in sound-attenuating cabinets. Each chamber contained three levers: two retractable levers located on the front panel and one standard lever located on the back wall. During this study, only the front levers were active, and each was calibrated to 0.20 N. Three light-emitting diodes (LEDs) were positioned above each front lever, and a 20 mg sucrose pellet dispenser was located equidistantly between the two levers. A house light (28 V, 100 mA) was located near the ceiling of the chamber on the front panel and was aligned with the center of the pellet dispenser. Experimental events were programmed and controlled by a computer using MED-PC IV software. This computer was located in a room adjacent to the one that housed the operant chambers.

Procedure

Once body weights stabilized, an autoshaping procedure (Paletz, Day, Craig-Schmidt, & Newland, in press) was used to train lever pressing on one of the front levers. Reinforcers were 20 mg sucrose pellets. The autoshaping procedure ended after 10 responses were made, at which point responding was reinforced on a fixed ratio (FR) 1 schedule of reinforcement until 90 additional pellets were earned. The following evening, subjects were trained to respond on the opposite lever using the same procedure. Once all subjects completed autoshaping, concurrent schedule training began. Responding was initially reinforced on concurrent variable interval 60 s schedules (conc VI 60s VI 60s) in 120-minute sessions, yielding an average overall reinforcement rate of 2 pellets per minute. Single-session transitions commenced when the proportion of responses occurring on the left lever stabilized between 0.4 and 0.6 for the last three of at least five consecutive sessions.

The overall average reinforcement rate remained constant throughout each session (2 per minute) and under all conditions. However, 30 minutes into each session, one of two possibilities occurred. Either the reinforcer ratio remained 1:1 or it changed to one of the following: 32:1, 1:32, 16:1, 1:16, 8:1, 1:8, 4:1, or 1:4. When the ratio changed, this was considered to be a "transition session."
Transition sessions occurred on Tuesdays and Fridays, and on Mondays, Wednesdays, and Thursdays the reinforcement ratio remained 1:1 throughout the session. Two sessions were conducted daily during the light portion of the light-dark cycle. Four animals from each group were assigned to each session time to control for possible time-of-day effects.

Stimulus conditions. The beginning of each session was signaled by illuminating the house light and the LEDs above each lever. These stimuli remained on throughout the baseline portion of each session for all subjects. For one group (the "discriminative stimulus," or SD, group), the transition was signaled by a 10 s blackout 30 minutes into the session, during which time all lights were extinguished and lever presses had no programmed consequences. The house light was illuminated at the end of the 10 s blackout, as was the LED over the newly rich lever, and both remained lit throughout the session. This was viewed as a discriminative stimulus (SD) signaling the rich response alternative. During sessions in which the reinforcement ratio remained 1:1 throughout the session, both LEDs were illuminated following the 10 s blackout. For a second group (the "blackout," or BO, group), the transition was also signaled by a 10 s blackout. However, both LEDs and the house light were illuminated during the transition portion of the session. For the final group (the "time out," or TO, group), a 10 s time out occurred between the baseline and transition portions of the session, during which time all lights remained illuminated but lever presses had no programmed consequences. For the TO group, all stimuli remained consistent throughout the entire session.

Drug challenges. Drug challenges were conducted on Tuesdays and Fridays, and Thursdays served as saline controls for baseline sessions. Transition sessions occurred on Wednesdays, with some of these sessions serving as saline controls and some serving as no-injection controls. Non-injection baseline sessions were conducted every Monday to control for possible weekend effects (e.g., weight gain or loss). Subjects were exposed to the drug doses in ascending order, and they completed four transitions (two per magnitude, one in which the left side became rich and one in which the right side became rich) and one baseline session prior to an increase in the dose of amphetamine.

Drugs

d-Amphetamine sulfate (Sigma Chemical Co., St. Louis, MO) was dissolved in 0.9% saline and injected in a volume of 1.0 ml/kg body weight. Subjects were exposed to 0.3, 0.56, and 1.0 mg/kg body weight (calculated as the salt).

Data Analysis

Overall response and changeover rates. Overall response rates were averaged across sessions and plotted as a function of amphetamine dose for 1:1 sessions and for combined 1:32 / 32:1 transitions. Separate repeated measures analyses of variance (RMANOVA) using group as a between-subjects factor and dose as a within-subjects factor were used to detect differences in response rates at each transition magnitude. The average number of changeovers (COs) per minute was calculated for 1:1, 1:32, and 32:1 transitions. A CO occurred when an animal stopped responding on one lever and made at least one response on the other; more simply stated, a CO is made when an animal switches from the right lever to the left or from the left to the right. Separate CO rates were calculated for the first 30 minutes (BL) and the last 30 minutes (END) of each session.
The mean CO rate was calculated for each pair of reciprocal transitions (e.g., 32:1 and 1:32), yielding BL and END CO rates for the 1:1 and 32:1 magnitudes. Average CO rates were plotted as a function of amphetamine dose for each session segment at each magnitude, and RMANOVAs were used to determine effects of stimulus group and drug dose on COs per minute.

Steady state. Response ratios from the last 30 minutes of each session were plotted as a function of scheduled and obtained reinforcer ratios. Equation 3.1 was fit to these data using linear least-squares regression to evaluate the steady-state behavior that emerges at the end of each transition. Between-group differences were assessed using one-way analyses of variance (ANOVA) with stimulus group (SD, BO, and TO) as the factor of interest.

Transitions. Log response ratios were plotted as a function of cumulative reinforcers earned on a visit-by-visit basis, where each visit consisted of a bout of responding on the right lever and a bout of responding on the left lever. The median log response ratio before the transition was calculated and subtracted from each data point to normalize the data. Data collected during the last 90 minutes of each session (the transition period) were smoothed using a 9-point LOWESS smoothing algorithm. Equation 3.2 was then fit to the smoothed data using non-linear least-squares regression. Each fit produced three parameters per subject per transition (k, X_half, and Y_m). Data were then combined within each reinforcement magnitude (32:1 and 8:1) by calculating the median parameter estimates from the left (e.g., 32:1) and right (e.g., 1:32) transitions for each subject. Medians were averaged within each stimulus group, yielding mean k, mean X_half, and mean Y_m values for each stimulus condition at each transition magnitude. Stimulus group and amphetamine effects were evaluated using RMANOVAs for each parameter at each transition magnitude. Because k values were non-normally distributed, all analyses were conducted using log-transformed slope values.

Linear and non-linear least-squares regression procedures were conducted using RS1 software (Brooks Automation, Inc., Chelmsford, MA), and RMANOVAs were performed with Systat 11.0 (Systat Software, Inc., San Jose, CA). For all statistical analyses, a Type I error rate of 0.05 was used, and all degrees of freedom for within-subjects tests were adjusted using the Huynh-Feldt correction.

Results

Four 8:1 and four 32:1 transitions comprised the control conditions, while data for saline and amphetamine were derived from two transition sessions each. For both 8:1 and 32:1 transitions, the left lever became rich in half of the sessions and the right lever became rich in the other half. Transition sessions were excluded from all statistical analyses if they met both of the following criteria: (1) Y_m estimates exceeded the transition magnitude observed at the end of the session and (2) X_half estimates exceeded the total number of reinforcers delivered during the session. Saline sessions were not included in transition analyses for the 8:1 magnitude because a day-after effect was observed following a 0.56 mg/kg amphetamine session.

Response rates as a function of dose are shown in Figure 3.1. Amphetamine resulted in a modest, dose-related decrease in overall response rate during both the 1:1 (F(4, 80) = 5.36, p = .006, ε = 0.59) and 32:1 (F(4, 80) = 4.01, p = .009, ε = 0.82) transition magnitudes.
Figure 3.2 shows dose-effect curves for CO rates. Amphetamine produced a modest biphasic effect on changeover rate during the baseline portion of 1:1 (F(4, 80) = 9.832, p < .001, ε = 0.61) and 32:1 (F(4, 80) = 4.846, p = .003, ε = 0.86) sessions. There was also a significant effect of group on CO rate during baseline at both magnitudes (F(2, 20) = 4.053, p = 0.033 for 1:1 and F(2, 20) = 4.408, p = 0.026 for 32:1). During the last 30 minutes of experimental sessions, RMANOVAs indicated a significant main effect of dose on CO rate for 1:1 transitions (F(4, 80) = 2.951, p = .05, ε = 0.62) and a significant main effect of group on CO rate for 32:1 transitions (F(2, 20) = 5.567, p = 0.012). Where drug effects were present, CO rates increased slightly over control rates at the low dose of amphetamine and then decreased below control rates at the moderate and high doses.

Slopes and r² values from matching functions fit with obtained (left) and scheduled (right) reinforcer ratios are shown as a function of amphetamine dose and group in Figure 3.3. All doses of amphetamine produced a similar increase in slope (F(4, 80) = 13.005, p < .001, ε = 0.81) and r² (F(4, 76) = 9.814, p < .001, ε = 0.84) for functions constructed with obtained reinforcer ratios. There was a similar main effect of dose on slope (F(4, 80) = 5.639, p = .001, ε = 0.89) and r² (F(4, 76) = 5.891, p = .001, ε = 0.90) values for functions fit with scheduled ratios. Neither dose nor group significantly affected bias estimates when obtained or scheduled ratios were used as the independent variable. Representative matching functions for a single subject are shown in Figure 3.4.

Figure 3.5 shows dose-effect curves for transition parameters for 8:1 and 32:1 transitions. There was a main effect of dose on all parameter estimates for 32:1 transitions, and on slope and X_half for 8:1 transitions (see Table 3.1 for F and p values). Representative 32:1 transitions are shown for subject 106 during control and 1.0 mg/kg amphetamine sessions in Figure 3.6.

Discussion

Rats responded under concurrent schedules of reinforcement during experimental sessions that contained a transition between two pairs of concurrently available reinforcement schedules. Terminal response ratios were used to construct matching functions using the generalized matching equation, while behavior that spanned the transitory period was modeled using a logistic function. The results of the present study show that behavior observed under these conditions was influenced by the administration of d-amphetamine, the presence of stimuli that signaled the transition, and the magnitude of the transition.

Overall Measures of Response and CO Rate

In general, the amphetamine doses and reinforcement contingencies used here did not result in large, drug-related effects on either overall response or CO rates. These doses were well below those that substantially reduce response rates. As expected, only modest rate increases were seen, because amphetamine's rate-increasing effects are strongest when response rates are quite low, and the VI schedules used here resulted in moderate response rates. However, some interesting trends in response and CO rates can be noted. Overall response rates increased following 0.3 mg/kg of d-amphetamine and declined to control levels (or slightly lower) in a dose-dependent fashion for both 1:1 and 32:1 sessions.
It is important to note that response rates remained high enough to provide useful data for transition and terminal analyses at all doses of amphetamine. The discriminative stimulus presented at the onset of the transition did not influence overall rate (there was no statistical effect of group membership), but responding appeared to be highest among animals in the TO group and to show less variation as a function of drug dose in the 1:1 condition than in the other stimulus groups.

During sessions in which the reinforcer ratio remained 1:1 throughout, the effect of amphetamine on CO rates was similar during the baseline and terminal portions of the session: low doses increased and higher doses decreased CO rates. Further, CO rates differed as a function of group during the baseline portion of 1:1 and 32:1 sessions, and during the last 30 minutes of 32:1 transitions. Graphically, this difference is manifested as a higher rate of changing over among animals in the TO group when compared with subjects in both signaled conditions. The low CO rates in the 32:1 condition replicate other reports showing that changing over increases as the reinforcer ratio approaches 1.0, i.e., as the reinforcer rates available on the two alternatives become more alike (Baum, 1974; Stubbs & Pliskoff, 1969).

The pattern of group effects, however, is interesting. The pattern of changing over under concurrent schedules of reinforcement is sometimes described as "fix and sample" (Aparicio & Baum, 2006; Baum, Schwendiman, & Bell, 1999). That is, subjects tend to spend most of the time responding on the rich lever ("fix") and, from time to time, they make a few responses on the lean lever ("sample"). The visual trend in CO rates across groups indicated that visit durations were longer, and changing levers less frequent, when some stimulus change was associated with the transition. Changing levers was least frequent when a stimulus light signaled the richer lever. These data suggest that the presence of a stimulus associated with either a transition or the richer alternative decreases this "fix-and-sample" pattern and prolongs visit durations on the richer alternative.
For example, monkeys exposed to lead or methylmercury in utero showed disrupted patterns of responding that were evident in matching functions that utilized scheduled reinforcement ratios but not in functions that were fit with obtained ratios (Newland et al., 1994). Such deficits would not have been detected had both types of functions not been examined. In the present study, Equation 3.1 accounted for a large proportion of variance in response ratios when either obtained or scheduled reinforcer ratios were used as the independent variable (see panels B and D of Figure 4.3). Further, the quality of these fits was 71 enhanced following exposure to d-amphetamine. This was particularly evident when obtained ratios were used as the independent variable. The results from matching and transition analyses show that d-amphetamine increased both sensitivity to reinforcement and the rate at which preference was acquired. Under control conditions, the mean slope across groups was 0.62. These values increased as a function of amphetamine dose, an effect that was slightly more pronounced in the SD group and when obtained ratios were used as the independent variable. The rate of acquisition was also influenced by administration of amphetamine. At all doses of amphetamine in 8:1 sessions, slope values were higher and X half values lower than in control conditions. These effects were also present at moderate and high doses of amphetamine for 32:1 transitions (see Figure 4.6). Taken together, these results suggest that the dopamine (DA) agonist d-amphetamine enhances sensitivity to reinforcement and to changes in the reinforcer distribution. This is not surprising given that amphetamine has been shown to promote DA activity in the nucleus accumbens, which plays a significant role in reinforcement (Spanagel & Weiss, 1999; Wise, 2004), and that mounting evidence suggests that DA is intimately involved in choice, especially as reinforcement contingencies change (Montague, Hyman, & Cohen, 2004; Tremblay & Schultz, 1999). In a recent report (Bratcher, Farmer-Dougan, Dougan, Heidenreich, & Garris, 2005), selective D 1 and D 2 agonists were reported to decrease sensitivity to reinforcement as indicated by effects on the slope of matching functions. Methodological differences between that study and the current one could point to important determinants of these drug-behavior interactions. For example, Bratcher et al. did not hold the overall 72 reinforcer rate constant, producing a situation in which the reinforcement rates and reinforcer ratios co-varied. This could be important because the effects of dopaminergic drugs are known to be rate-dependent and low reinforcer rates can result in low response rates. Second, the reinforcer ratios employed in that study were small (2:1 and 4:1 vs 8:1 and 32:1 in the present study), and subjects were exposed to each reinforcer schedule for multiple sessions. Thus, transitions between schedules did not occur while subjects were under the drugs' influence. In addition, multi-session procedures like those used by Bratcher et al produce sensitivity values higher than those observed in the single session transitions (Baum, 1974; Davison & Baum, 2000). In fact, two of three subjects exposed to SKF 38393 and all three subjects exposed to quinpirole demonstrated overmatching in baseline conditions. Finally, and potentially most important, are the types of drugs used. 
Amphetamine, which was used in the present study, blocks the re-uptake of DA, thereby affecting DA receptors in both the D1 and D2 families. Bratcher and colleagues used D1- and D2-specific agonists (SKF 38393 and quinpirole, respectively) and the non-specific DA agonist apomorphine, whose relative affinity for D1 and D2 receptors varies as a function of dose. This is of particular interest because stimulation of both D1 and D2 receptors is often required for full expression of dopaminergic effects (Hodge, Samson, & Chappelle, 1997; Hodge, Samson, & Haraguchi, 1992; Ikemoto, Glazier, Murphy, & McBride, 1997).

Summary

Single-session transitions provided a useful method for studying drug effects on choice and the acquisition of preference. The parameters produced by both the generalized matching equation and the logistic equation provided in Equation 3.2 were sensitive to drug exposure, such that amphetamine increased both behavioral sensitivity to changes in reinforcement ratios and the rate at which behavior changed following a transition. These data provide further evidence that DA is involved in the acquisition and maintenance of choice.

Table 3.1
F and p Values for Dose Effects on Transition Parameters

                        8:1                               32:1
            F (df; ε)             p           F (df; ε)             p
X_half      21.1 (3, 60; 0.80)    <0.001      22.9 (4, 84; 0.66)    <0.001
Slope       11.3 (3, 60; 0.86)    <0.001      17.3 (4, 84; 0.92)    <0.001
Y_m          2.0 (3, 60; 0.75)     0.144       3.1 (4, 84; 0.75)     0.034

FIGURE CAPTIONS

Figure 3.1. Overall response rates as a function of amphetamine dose (mg/kg) and stimulus group. The upper panel is from 1:1 sessions and the lower panel is from 32:1 transitions.

Figure 3.2. Changeover rates as a function of amphetamine dose (mg/kg) and stimulus group. Upper panels are from the baseline portions and lower panels are from the last 30 minutes of each session. Left panels are from 1:1 sessions and right panels are from 32:1 transitions.

Figure 3.3. Steady-state parameter estimates as a function of amphetamine dose (mg/kg) and stimulus group. Left panels were calculated using obtained reinforcer ratios as the independent variable and right panels were calculated using scheduled reinforcer ratios.

Figure 3.4. Representative matching functions for control, 0.3 mg/kg, and 1.0 mg/kg conditions from subject 103. Panels on the left were constructed using obtained reinforcement ratios as the independent variable, while panels on the right were constructed using scheduled reinforcement ratios as the independent variable.

Figure 3.5. Transition parameter estimates as a function of amphetamine dose (mg/kg) and stimulus group.

Figure 3.6. Representative transitions during control (upper panel) and 1.0 mg/kg AMP (lower panel) sessions. Log response ratio is plotted as a function of cumulative reinforcers earned. Data points are LOWESS smoothed values. The vertical line indicates the beginning of the transition (x = 0). Negative values of x are reinforcers delivered prior to the transition.

[Figure 3.1 appears here: responses per minute as a function of amphetamine dose (C, S, 0.3, 0.56, 1.0 mg/kg) for 1:1 and 32:1 sessions.]
[Figure 3.2 appears here: changeovers per minute as a function of dose for the BL and END segments of 1:1 and 32:1 sessions, by stimulus group (TO, BO, SD).]
Table 3.1
F and p-Values for Dose Effects on Transition Parameters

                          8:1                             32:1
            F (df; ε)               p         F (df; ε)               p
X half      21.1 (3, 60; 0.80)      0.000     22.9 (4, 84; 0.66)      0.000
Slope       11.3 (3, 60; 0.86)      0.000     17.3 (4, 84; 0.92)      0.000
Y m          2.0 (3, 60; 0.75)      0.144      3.1 (4, 84; 0.75)      0.034

FIGURE CAPTIONS

Figure 3.1. Overall response rates as a function of amphetamine (mg/kg) and stimulus group. The upper panel is from 1:1 sessions and the lower panel is from 32:1 transitions.

Figure 3.2. Changeover rates as a function of amphetamine (mg/kg) and stimulus group. Upper panels are from baseline portions and lower panels are from the last 30 minutes of each session. Left panels are from 1:1 sessions and right panels are from 32:1 transitions.

Figure 3.3. Steady-state parameter estimates as a function of amphetamine (mg/kg) and stimulus group. Left panels were calculated using obtained reinforcer ratios as the independent variable and right panels were calculated using scheduled reinforcer ratios.

Figure 3.4. Representative matching functions for the control, 0.3 mg/kg, and 1.0 mg/kg conditions from subject 103. Panels on the left were constructed using obtained reinforcement ratios as the independent variable, while panels on the right were constructed using scheduled reinforcement ratios.

Figure 3.5. Transition parameter estimates as a function of amphetamine (mg/kg) and stimulus group.

Figure 3.6. Representative transitions during control (upper panel) and 1.0 mg/kg AMP sessions. Log response ratio is plotted as a function of cumulative reinforcers earned. Data points are LOWESS-smoothed values. The vertical line indicates the beginning of the transition (x = 0); negative values of x are reinforcers delivered prior to the transition.

FIGURE 3.1 [figure not reproducible in text; two panels (1:1 and 32:1) showing Responses/Minute as a function of Amphetamine (mg/kg)]

FIGURE 3.2 [figure not reproducible in text; four panels (1:1 BL, 1:1 END, 32:1 BL, 32:1 END) for the TO, BO, and SD groups showing Changeovers/Minute as a function of Amphetamine (mg/kg)]

FIGURE 3.3 [figure not reproducible in text; Slope and r2 panels for obtained and scheduled ratios, TO, BO, and SD groups, as a function of Amphetamine (mg/kg)]

FIGURE 3.4 [figure not reproducible in text; matching functions for subject 103 (obtained ratios, left panels; scheduled ratios, right panels), Response Ratio vs. Reinforcer Ratio. Fitted lines: control, y = .07 + .53x, r2 = .94 and y = .10 + .53x, r2 = .94; 0.3 mg/kg, y = .14 + .81x, r2 = .93 and y = .12 + .72x, r2 = .89; 1.0 mg/kg, y = .08 + .96x, r2 = .99 and y = .09 + .80x, r2 = .98]

FIGURE 3.5 [figure not reproducible in text; Y max, Slope, and H max panels for 8:1 and 32:1 transitions, TO, BO, and SD groups, as a function of Amphetamine (mg/kg)]

FIGURE 3.6 [figure not reproducible in text; representative transitions (control and 1.0 mg/kg d-AMP), Log Response Ratio vs. Cumulative Reinforcers, with BL and TRANSITION segments. Fitted logistic functions: y = 0.73 / (1 + exp(0.13 * (11 - x))) and y = 0.68 / (1 + exp(0.07 * (30.2 - x)))]

REFERENCES

Aparicio, C. & Baum, W. (2006). Fix and sample with rats in the dynamics of choice. Journal of the Experimental Analysis of Behavior, 86(1), 43-63.
Bailey, J. & Mazur, J. (1990). Choice behavior in transition: Development of preference for the higher probability of reinforcement. Journal of the Experimental Analysis of Behavior, 53, 409-422.
Baum, W. (1974). On two types of deviation from the matching law: Bias and undermatching. Journal of the Experimental Analysis of Behavior, 22(1), 231-242.
Baum, W. (1979). Matching, undermatching and overmatching in studies of choice. Journal of the Experimental Analysis of Behavior, 32, 269-281.
Baum, W., Schwendiman, J., & Bell, K. (1999). Choice, condition discrimination, and foraging theory. Journal of the Experimental Analysis of Behavior, 71, 355-373.
Bourland, G. & Miller, J. (1981). The role of discriminative stimuli in concurrent performances. Journal of the Experimental Analysis of Behavior, 36, 231-239.
Bratcher, N., Farmer-Dougan, V., Dougan, J., Heidenreich, B., & Garris, P. (2005). The role of dopamine in reinforcement: Changes in reinforcement sensitivity induced by D1-type, D2-type, and nonselective dopamine receptor agonists. Journal of the Experimental Analysis of Behavior, 84, 374-399.
Davison, M. & Baum, W. (2000). Choice in a variable environment: Every reinforcer counts. Journal of the Experimental Analysis of Behavior, 74, 1-24.
Davison, M. & McCarthy, D. (1988). The matching law: A research review. Hillsdale, NJ: Lawrence Erlbaum Associates.
Davison, M. & Nevin, A. (1999). Stimuli, reinforcers, and behavior: An integration. Journal of the Experimental Analysis of Behavior, 71, 439-482.
Dews, P. (1958). Analysis of effects of psychopharmacological agents in behavioral terms. Federation Proceedings, 17, 1024-1030.
Edelstein-Keshet, L. (1988). Mathematical models in biology. New York: Random House.
Findley, J. (1958). Preference and switching under concurrent scheduling. Journal of the Experimental Analysis of Behavior, 1, 123-144.
Fry, W., Kelleher, R., & Cook, L. (1960). A mathematical index of performance on fixed-interval schedules of reinforcement. Journal of the Experimental Analysis of Behavior, 3, 193-199.
Hanna, E., Blackman, D., & Todorov, J. (1992). Stimulus effects on concurrent performance in transition. Journal of the Experimental Analysis of Behavior, 58, 335-347.
Herrnstein, R. (1961). Relative and absolute strength of response as a function of frequency of reinforcement. Journal of the Experimental Analysis of Behavior, 4, 267-272.
Hodge, C. W., Samson, H. H., & Chappelle, A. M. (1997). Alcohol self-administration: Further examination of the role of dopamine receptors in the nucleus accumbens. Alcoholism: Clinical & Experimental Research, 21, 1083-1091.
Hodge, C. W., Samson, H. H., & Haraguchi, M. (1992). Microinjections of dopamine agonists in the nucleus accumbens increase ethanol-reinforced responding. Pharmacology, Biochemistry & Behavior, 43, 249-254.
Ikemoto, S., Glazier, B. S., Murphy, J. M., & McBride, W. J. (1997). Role of dopamine D1 and D2 receptors in the nucleus accumbens in mediating reward. Journal of Neuroscience, 17, 8580-8587.
Johnston, J. & Pennypacker, H. (1993). Strategies and tactics of behavioral research (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.
Krügeloh, C. & Davison, M. (2003). Concurrent schedule performance in transition: Changeover delays and signaled reinforcer ratios. Journal of the Experimental Analysis of Behavior, 79, 87-109.
Lander, D. & Irwin, R. (1968). Multiple schedules: Effects of the distribution of reinforcements between components on the distribution of responses between components. Journal of the Experimental Analysis of Behavior, 11, 517-524.
Laties, V. (1972). The modification of drug effects on behavior by external discriminative stimuli. Journal of Pharmacology and Experimental Therapeutics, 183, 1-13.
Laties, V. & Weiss, B. (1966). Influence of drugs on behavior controlled by internal and external stimuli. The Journal of Pharmacology and Experimental Therapeutics, 152(3), 388-396.
Mazur, J. (1991). Choice behavior in transition: Development of preference with ratio and interval schedules. Journal of Experimental Psychology: Animal Behavior Processes, 18(4), 364-378.
Mazur, J. (1997). Effects of rate of reinforcement and rate of change on choice behaviour in transition. The Quarterly Journal of Experimental Psychology, 50B(2), 111-128.
Mazur, J. & Ratti, T. (1991). Choice behavior in transition: Development of preference in a free-operant procedure. Animal Learning and Behavior, 19(3), 241-248.
McSweeney, F. & Murphy, E. (2000). Criticisms of the satiety hypothesis as an explanation for within-session decreases in responding. Journal of the Experimental Analysis of Behavior, 74, 347-361.
Miller, J., Saunders, S., & Bourland, G. (1980). The role of stimulus disparity in concurrently available reinforcement schedules. Animal Learning and Behavior, 8(4), 635-641.
Montague, P. R., Hyman, S., & Cohen, J. (2004). Computational roles for dopamine in behavioural control. Nature, 431, 760-767.
Newland, M. C. & Reile, P. (1999). Learning and behavior change as neurotoxic endpoints. In H. A. Tilson & G. J. Harry (Eds.), Target organ series: Neurotoxicology (pp. 311-337). New York, NY: Raven Press.
Newland, M. C., Reile, P., & Langston, J. (2004). Gestational exposure to methylmercury retards choice in transition in aging rats. Neurotoxicology and Teratology.
Newland, M. C., Yezhou, S., Lögdberg, B., & Berlin, M. (1994). Prolonged behavioral effects of in utero exposures to lead or methyl mercury: Reduced sensitivity to changes in reinforcement contingencies during behavioral transitions and in steady state. Toxicology and Applied Pharmacology, 126, 6-15.
Paletz, E., Day, J., Craig-Schmidt, M., & Newland, M. C. (in press). Gestational exposure to methylmercury and n-3 polyunsaturated fatty acids: Spatial and visual discrimination reversal effects in adult and geriatric rats. NeuroToxicology.
Reed, M. N., Paletz, E., & Newland, M. C. (2006). Gestational exposure to methylmercury and selenium: Effects on spatial discrimination reversal in adulthood. NeuroToxicology, 27(5), 721-732.
Schultz, W., Tremblay, L., & Hollerman, J. (2000). Reward processing in primate orbitofrontal cortex and basal ganglia. Cerebral Cortex, 10, 272-283.
Sidman, M. (1960). Tactics of scientific research. New York, NY: Basic Books.
Spanagel, R. & Weiss, F. (1999). The dopamine hypothesis of reward: Past and current status. Trends in Neurosciences, 22(11), 521-527.
Stubbs, D. & Pliskoff, S. (1969). Concurrent responding with fixed relative rate of reinforcement. Journal of the Experimental Analysis of Behavior, 12, 887-895.
Szostak, C. & Tombaugh, T. (1981). Use of a fixed consecutive number schedule of reinforcement to investigate the effects of pimozide on behavior controlled by internal and external stimuli. Pharmacology, Biochemistry, and Behavior, 15, 609-617.
Tremblay, L. & Schultz, W. (1999). Relative reward preference in primate orbitofrontal cortex. Nature, 398(6729), 704-708.
Wise, R. A. (2004). Dopamine, learning and motivation. Nature Reviews Neuroscience, 5, 483-494.