Mechanisms and Performance Measures in Mastery-Based Incremental Repeated Acquisition: Behavioral and Pharmacological Analyses

by

Jordan Michele Bailey

A thesis submitted to the Graduate Faculty of Auburn University in partial fulfillment of the requirements for the Degree of Master of Science

Auburn, Alabama
December 18, 2009

Keywords: backward chaining, d-amphetamine, forward chaining, repeated acquisition, lever press, rats

Approved by

M. Christopher Newland, Chair, Professor of Psychology
Martha Escobar, Professor of Psychology
Jennifer Gillis-Mattson, Professor of Psychology

Abstract

Pharmacological and behavioral determinants of learning were examined using a mastery-based incremental repeated acquisition procedure. A 60-minute session began with a one-link chain (a single lever press) that incremented to a maximum of a four-link chain using three levers: Left (L), Right (R) and Back (B). Backward (5 rats) and forward (5 rats) training procedures were used to build the chain. In pseudo-randomized presentations, a "performance" session (same chain every session) and a "learning" session (chain differed from session to session) were imposed. Some learning chains had an embedded repeated response (e.g., LRRB) and some learning chains had no such repeat (e.g., LRLB). The product of chain length and number of reinforcers, divided by total reinforcers, quantified progress during a session (progress quotient, or "PQ"). After behavior stabilized, low doses of d-amphetamine (0.01 to 3.0 mg/kg, ip) were administered. Acquisition was consistently superior for the backward training group during non-repeating learning sessions, across all but the highest doses of d-amphetamine. Very low, clinically relevant doses of d-amphetamine improved acquisition for the backward training group during repeating learning sessions. This study identifies a set of conditions under which very low doses of d-amphetamine enhance learning.
It also suggests a sensitive and valid measure of acquisition for use in studies of mastery-based incremental repeated acquisition.

Acknowledgments

The author would like to thank Dr. Christopher Newland for his guidance. She would also like to thank her committee members, Drs. Martha Escobar and Jennifer Gillis, for their constructive comments.

Table of Contents

Abstract
Acknowledgments
List of Tables
List of Figures
List of Abbreviations
Chapter 1: Introduction
    Repeated Acquisition
        Boren and Devine
        Thompson and Moerschbaecher
    Incremental Repeated Acquisition
        IRA Procedures across Different Species
            IRA Procedures: The Rat
            IRA Procedures: The Mouse
            IRA Procedures: The Pig
            IRA Procedures: The Non-Human Primate
            IRA Procedures: The Human
        Training Procedure
        Chain Definition
        d-Amphetamine
            What is a Low Dose of d-Amphetamine?
    References
    Tables
    Table Captions
Chapter 2: Experiments
    Abstract
    Introduction
    Method
    Results
    Discussion
    References
    Tables
    Table Captions
    Figures
    Figure Captions

List of Tables

Table 1
Table 1b
Table 2

List of Figures

Figure 1
Figure 2
Figure 3

List of Abbreviations

IRA    Incremental repeated acquisition
RA     Repeated acquisition
ip     Intraperitoneal
PQ     Progress quotient

Chapter 1: Introduction

Learning is a broad term with many divergent definitions, but when the goal is to bring this phenomenon into the laboratory, one specific definition is especially useful: learning refers to behavior in transition from one state to another (Cohn & Paule, 1995; Newland & Reile, 1999; Sidman, 1960). This is an important distinction because learning is a transient event, which creates the need for a specialized task that can accurately measure behavior in transition (Cohn & Paule, 1995). The repeated acquisition of a behavioral chain (RA) procedure and, later, the incremental repeated acquisition of a behavioral chain (IRA) procedure were developed to do just that. By directly measuring the acquisition of novel response sequences within a session, these procedures repeatedly measure behavior in transition. Incremental repeated acquisition is used by only a few, relatively advanced, laboratories to investigate drug, toxicant, and genetic influences over learning.
This procedure requires the subject to learn a different set of behavioral responses during each experimental session. This is accomplished by employing behavioral chains, which are essentially sequences of responses or response units. Take, for example, the case of teaching a child to tie her shoe. First she is taught to put on her sock, then to put her shoe on over her sock, and finally she must learn to tie the shoe. Each response, in turn, can be viewed as a complex response unit. For example, putting on a sock itself entails a subordinate chain. The description used here is an illustration of a "forward" chain. However, this same task can be taught using a "backward" behavioral chain. In that case, the child would begin with the sock and shoe on, first learning to tie the shoe; then the child would begin with just the sock on and learn to put on the shoe and then tie it. Finally, the child would have to put on the sock, then the shoe, and then tie the shoe. These scenarios illustrate the two distinctly different ways to train a chain, and both forward and backward chaining have been identified as training procedures to build behavioral chains for an IRA procedure. However, little research has focused on a direct comparison of forward and backward training procedures with respect to their importance to an IRA procedure. It is possible that the type of training procedure used to build the behavioral chain could affect acquisition during the IRA procedure, and thus it is a variable of interest.

The particular responses (and order of those responses) that define a behavioral chain are also likely to influence acquisition during the IRA procedure. For example, a chain could contain consecutively repeated responses or could be defined such that no consecutive repeats occur (i.e., "repeating" chains and "non-repeating" chains). Such a manipulation of chain "definition"
has not received much attention in the IRA literature but may prove to be an important variable affecting acquisition.

Finally, the dopamine system is an important neurotransmitter system that mediates both learning and reinforcer efficacy (Feldman, Meyer & Quenzer, 1997; Montague, Hyman & Cohen, 2004; Wise, 2004). d-Amphetamine is an indirect dopamine agonist and therefore can be used to examine the role dopamine may have in learning during the IRA procedure. This drug is prescribed clinically to treat problems associated with learning and attention (Towbin & Leckman, 1992). The doses of d-amphetamine used in animal studies are usually much higher than those used clinically; therefore, the clinical relevance of much of the d-amphetamine literature has been questioned (Grilly & Loveland, 2001).

The proposed studies are designed to examine different approaches to behavioral chaining in a laboratory model, using rats acquiring chains of lever presses. This will enable us to examine different approaches to training chains as well as different structures of chains in a controlled setting. Having done that, we will challenge the acquisition of these chains with d-amphetamine. We should therefore be able to characterize in detail the manner by which very low, clinically relevant doses of d-amphetamine influence learning and performance using an incremental repeated acquisition procedure. Specific questions that will be addressed are: 1) What effect will low-dose d-amphetamine administration have on the learning and performance conditions of the IRA procedure? 2) Is the approach used for building chains (forward and backward chaining procedures) important for determining drug effects? and 3) Will d-amphetamine and/or training procedure differentially affect accuracy on repeating and non-repeating chain types (i.e., chain definition)?
The resulting data will likely yield important information regarding the acquisition of new behaviors and response patterns, and will expand our understanding of how these variables affect acquisition during the IRA procedure (i.e., how they affect learning).

Repeated Acquisition

Boren (1963) was the first to describe a repeated acquisition (RA) procedure in an abstract, and shortly thereafter Boren and Devine (1968) described the procedure in greater detail. They developed this procedure to study the acquisition of conditional discriminations (Boren & Devine, 1968). Their repeated acquisition procedure required rhesus monkeys to acquire a novel response sequence in a single session, thereby making it possible to study the actual course of the acquisition of a complex behavior (Boren & Devine, 1968). This component of the procedure is considered the learning component, while a separate performance component required the animals to perform the same response sequence each time the performance component was in place. The RA procedure creates a steady state of responding across sessions as both the pattern of responding and the accuracy of acquisition stabilize (Thompson & Moerschbaecher, 1978). An important strength of the RA procedure is that it generates the acquisition of a complex response sequence in a single session, so that acute drug effects can be examined (Moerschbaecher, 1976; Thompson, 1973, 1977).

In addition to the creation of a steady state of acquisition, the repeated acquisition procedure has a number of other advantages over traditional learning tasks. Specifically, positive reinforcement, rather than aversive or stressful stimuli, is used during RA. This is advantageous for human application but also because avoidance and reinforcement may tap different behavioral and neural processes (Everitt & Trevor, 2005; Mowrer, 1947).
Also, a within-subject design is possible, so each subject serves as its own control when drugs are administered, and learning can be measured repeatedly in the same subject over an extended period of time. This is advantageous because (as with most repeated-measures designs) numerous extraneous variables are avoided, as individual differences do not present a problem. Other advantages of a within-subject design are that an exact definition of learning can be obtained (i.e., the particular response sequence) through specific experimental manipulations; the experimenter can precisely control baseline levels of accuracy (through stimulus and schedule manipulations); and a microanalysis of response patterns is possible (for example, the experimenter can analyze in great detail whether a decrease in accuracy, or increase in errors, is due to an increase in random responding or in perseverative responding) (Boren & Devine, 1968; Cohn & Paule, 1995; Evans & Wenger, 1990, 1992; Howard & Pollard, 1983; Pieper, 1976; Thompson, 1978; Wenger, Schmidt & Davisson, 2004).

In addition to the aforementioned advantages for the learning component of the IRA procedure, the performance condition also provides distinct advantages. The performance condition acts as a control for non-specific motor or sedative effects when drugs are administered. Several researchers have made considerable contributions to the initial creation and subsequent evolution of the repeated acquisition procedure; their work is discussed below.

Boren and Devine. Boren and Devine's original RA procedure is the framework from which all other RA and IRA work has emerged, so this influential study will be described first. When Boren and Devine (1968) first implemented the RA procedure, they described two main goals for their study.
The first was to examine some of the variables that led to the acquisition of response chains, as acquisition data were often omitted from published research at that time (Boren & Devine, 1968). Their second goal was to develop a procedure that would allow for adequate study of such acquisition data (Boren & Devine, 1968). For their initial experiment, they trained three food-deprived rhesus monkeys to press a particular lever in the presence of a discriminative stimulus (i.e., a shaping procedure). Then, the animals were presented with a row of levers, which appeared in groups of three. Each group of levers was equipped with three pilot lights, one above each lever. Each session, the lights above the three levers to the animal's extreme left would illuminate, signaling that a response was required on any one of those three levers. Once they responded five times on that group of levers, the lights illuminated above the next three levers and so on, moving from left to right and signaling where a response was required, until the entire four-response sequence was performed, activating the food dispenser (Boren & Devine, 1968). The correct group of levers was always signaled by the illumination of pilot lights located above those levers. Once this initial training was completed, the criteria changed so that one particular lever in each group of three levers (instead of any one of the three levers) was correct. Using these spatially grouped levers, the animals were trained to perform the same four-response sequence each day until they met criterion. This was considered the "performance" phase, because no new response sequences had to be acquired for reinforcement. Once behavior stabilized on the performance phase, learning sessions were presented every other session. During a learning session the animals were required to acquire a new 4-response sequence in order to activate the food dispenser (the learning phase).
This phase was identical to the previous performance phase in every way except that a different 4-response sequence was required each session, so that the animals still progressed from left to right in the row of levers, but the particular correct lever for each group of 3 levers changed from session to session (Boren & Devine, 1968). All 4-response sequences were carefully selected to be roughly equivalent; identical sequences were never repeated on subsequent sessions, "simple orders" were avoided, and (within a set of six sequences) each lever appeared with equal frequency between sequences (Boren & Devine, 1968). Throughout this procedure, an incorrect response (e.g., a press on an incorrect lever) resulted in a 15-sec timeout, during which the house light turned off and no response was reinforced (Boren & Devine, 1968).

Once responding stabilized on the learning component, thus constituting a full RA procedure (i.e., a procedure with both performance and learning components), Boren and Devine manipulated two variables: the length of the timeout period and the presentation of "instructional" stimuli. First, they altered the timeout period so that an incorrect response resulted in a timeout that lasted 1 sec, 15 sec, 1 min, or 4 min, or in no timeout. They found that the greatest number of incorrect responses occurred during the no-timeout condition, while any timeout from 1 sec to 4 min decreased incorrect responses significantly (Boren & Devine, 1968). Their second experiment examined the effect of "instructional" stimuli (or discriminative stimuli) on performance during the repeated acquisition procedure. In this experiment, one condition paired each link in the chain with a discrete visual stimulus (a light on above the correct lever), while in the other condition, all lights were illuminated over each group of three levers for the duration of the session, and thus no discriminative stimuli were present.
They found that the presence of discriminative stimuli did not significantly reduce the number of incorrect responses. Thus, they determined that instructional stimuli have little impact on accuracy during the RA procedure (Boren & Devine, 1968). This was the first study to measure accuracy of acquisition, and it therefore led the way for significant advances in how researchers approach measuring acquisition (i.e., learning) in the laboratory.

Thompson and Moerschbaecher. Following Boren and Devine's (1968) implementation of an RA procedure, several investigators adopted this procedure for studying learning under various conditions (Boren, Schrot & Fontes, 1979; Harting & McMillan, 1976; Moerschbaecher & Thompson, 1980; Thompson, 1973; Thompson & Moerschbaecher, 1978, 1980). Thompson and colleagues (e.g., Moerschbaecher) modified Boren and Devine's original procedure by simplifying the response operanda to three levers (from twelve) and discriminating them by color rather than by location, thus changing the procedure into a form that gave rise to the variants used today. With these modifications Thompson and Moerschbaecher were able to train a new species, pigeons, on RA, thereby expanding the procedure's utility. Aside from simplifying the response operanda and including criteria for sequence selection (discussed below), Thompson's RA procedure was very similar to Boren and Devine's (1968) RA procedure.

Thompson (1973) used this modified repeated acquisition procedure to study the effects of various drugs. This was one of the first studies to analyze the effects of drugs on learning using the RA procedure. Thompson (1973) trained six pigeons on the repeated acquisition procedure so that they first experienced a "performance" phase in which the same 4-response sequence was presented each session. After this, the "learning" phase began and a new 4-response sequence was learned each session.
Thompson maintained the strict chain selection criteria for this study, ensuring each 4-response sequence was roughly equivalent (Boren & Devine, 1968; Thompson, 1973). After behavior stabilized on both the performance and learning conditions, Thompson administered four behaviorally active drugs: phenobarbital, chlordiazepoxide, chlorpromazine and d-amphetamine. Drugs were administered intramuscularly (i.m.) 30 min prior to each session. Doses of d-amphetamine and chlorpromazine ranged from 0.5 to 8 mg/kg, while doses of phenobarbital and chlordiazepoxide ranged from 5 to 80 mg/kg (Thompson, 1973). Phenobarbital and chlordiazepoxide impaired accuracy and increased total trial time as a function of dose, with lower doses of chlordiazepoxide (~10 mg/kg) than of phenobarbital (~20-40 mg/kg) having this effect (Thompson, 1973). d-Amphetamine increased total errors and total trial time. However, with d-amphetamine, individual differences in errors were seen at the low doses. For two subjects, the 0.5 and 1 mg/kg doses of d-amphetamine had no effect on accuracy or total trial time, while the 2 mg/kg dose had no effect on accuracy. For one subject, the 0.5 and 1 mg/kg doses impaired accuracy, while the 2 mg/kg dose increased total errors and total trial time. Additionally, chlorpromazine had no effect on accuracy but did increase total trial time at larger doses (Thompson, 1973). The differential effect of d-amphetamine reveals how important it may be to begin the range of d-amphetamine doses with lower doses (i.e., lower than 0.5 mg/kg).

Following that study, Thompson and Moerschbaecher (1978) examined the effects of d-amphetamine, cocaine, and fenfluramine on accuracy of acquisition during the RA procedure. Doses of d-amphetamine and cocaine ranged from 0.3 to 10 mg/kg, and doses of fenfluramine ranged from 1 to 10 mg/kg. They found that d-amphetamine and cocaine decreased accuracy in a dose-related fashion for the 1 to 10 mg/kg doses.
However, doses of 1 mg/kg and 3 mg/kg of d-amphetamine and cocaine had no effect on total trial time (Thompson & Moerschbaecher, 1978). This study showed that the RA procedure is sensitive enough to detect changes in accuracy at doses that do not affect pausing (e.g., trial time) or produce other noticeable behavioral effects.

Thompson (1979) imposed strict criteria for the selection of response sequences, and explicitly detailed the criteria he used. While similar criteria were used in the previous studies, this paper described them in great detail. These criteria excluded sequences with consecutive repeats in the chain, sequences that used a lever in the same position for multiple sequences, and sequences that did not use each lever at least once. Although Boren and Devine (1968) described similar criteria, it was after Thompson clearly articulated these restrictions that their use became widespread. The impact of this study was relatively large, and using these or similar chain-restricting criteria essentially became a "rule" to follow when implementing the RA procedure. Further, in an attempt to determine an ideal implementation of RA, Thompson (1979) questioned whether discriminative stimuli played an important role when acquiring novel response sequences (despite Boren and Devine's (1968) finding that they do not). Thompson's (1979) primary finding was that associating each link in the sequence with a distinct stimulus greatly facilitated the acquisition of that sequence. This finding shaped future implementations of the RA procedure, as subsequent studies employed discriminative stimuli for each chain link. Thompson's (1979) simplified version of RA was widely adopted by researchers.

An important modification to the RA procedure was made by Moerschbaecher and Thompson (1980) when they transformed the procedure into a multiple schedule, with performance, learning and faded-learning components.
This allowed them to study the effect of fading out discriminative stimuli on acquisition during drug administrations and to exploit the value of the performance condition. They administered d-amphetamine (0.1 to 1.0 mg/kg), cocaine (0.1 to 3.2 mg/kg) and phencyclidine (0.01 to 0.24 mg/kg) during the RA procedure. For d-amphetamine, errors increased and responding decreased in a dose-related fashion (Moerschbaecher & Thompson, 1980). This effect occurred during both performance and learning components but was most pronounced during the learning component, with error-increasing effects seen at lower doses in the learning component than in the faded-learning or performance component. Cocaine administration resulted in a dose-related decrease in responding and increase in errors for all components of the multiple schedule, but again most drastically for the learning component (Moerschbaecher & Thompson, 1980). However, unlike with d-amphetamine, particular doses of cocaine increased errors without decreasing responding (Moerschbaecher & Thompson, 1980). Further, because effects were seen on accuracy at doses that did not affect responding, this study showed how sensitive the RA procedure is for detecting subtle changes in accuracy, even when other gross motor or behavioral effects are not seen.

Recently, RA has been applied to the study of neurotoxicants. Specifically, the effects of lead (Pb) exposure on learning were examined using an RA procedure by Cohn, Cox and Cory-Slechta (1993). They found that rats exposed to environmentally relevant levels of lead via drinking water exhibited significant decrements in accuracy during the learning but not the performance component when compared to control animals (Cohn, Cox & Cory-Slechta, 1993). This indicates the differential effect that lead exposure has on learning a new task when compared to performing a task already learned.
This effect would likely have been masked by a traditional learning procedure that does not measure acquisition. The authors discuss the implications of lead exposure for the classroom, where children are tested for both learning and performance skills (Cohn, Cox & Cory-Slechta, 1993). Interestingly, they also found significant differences in accuracy between some of the learning chains. The authors suggest that accuracy for some chains is less affected by lead exposure due to their similarity to the performance chain (Cohn, Cox & Cory-Slechta, 1993). Therefore, learning something a bit "less novel" seems to attenuate the effects of lead exposure. This provides an example of how variations in chain definition can lead to important discoveries regarding the manner in which learning is disrupted, and reinforces the notion that chain definition should not be so strictly limited (as discussed below).

The findings from the early drug studies, where parametric manipulations of the variables affecting RA were made, proved beneficial to understanding the utility of the RA procedure. This was largely because the procedure gave researchers a reliable measure of acquisition, thus allowing the measurement of behavior in transition. The early experiments manipulating timeout length and presence of discriminative stimuli, among other parameters, led to a more complete understanding of what is required to achieve maximal repeated acquisition of new responses (Boren & Devine, 1968; Moerschbaecher & Thompson, 1980; Thompson, 1970). The tradition of restricting response sequences (to be "roughly equivalent") began as a result of this early work. This tradition has helped to decrease unwanted variability in the data, but it has also stifled questions about the importance of chain definition or "type". When the type of chain used is strictly limited, the effect of chain type cannot be determined and any interaction between the structure of the chain and drugs is masked.
Potentially, this limits any conclusions about acquisition that can be drawn from the results, as chain structure (even if chain structures are "roughly equivalent") may be a confounding variable. Additionally, the starting dose of d-amphetamine (>0.1 mg/kg) used in these studies is quite high, and not a clinically relevant dose. Despite these limitations, research utilizing the RA procedure has led to a better understanding of learning as behavior in transition.

Incremental Repeated Acquisition

Although the RA procedure was useful, its inadequacies became apparent as some experimental questions evolved beyond its reach. In particular, the RA procedure restricts the experimenter's ability to detect subtle differences in response patterns (e.g., which links are the sources of errors) and limits the extent to which the experimenter can detect maximal progression (e.g., how long a chain can be trained and how rapidly a chain can be acquired). As a result of these inherent limitations of RA, the procedure was modified so that the chain length progressed within a single session, thus ameliorating many of the limitations of the original RA procedure (Pieper, 1976). This modification allowed for an analysis of maximum sequence length completed, a measure of the longest sequence possible for a participant, and a way to determine which sequence length was affected by a manipulation. This modified RA procedure is referred to as the incremental repeated acquisition (IRA) procedure. Through its implementation, the IRA procedure has provided several important contributions. Specifically, these include the growing recognition that complex performance is best built incrementally and that, with proper design, such performance can be acquired fairly quickly. However, several important innovations were needed before an IRA procedure was possible. Perhaps the most important was the advent of computers.
IRA would be virtually impossible with the electromechanical equipment used in the early days of RA. The IRA procedure maintains many of the important features of the standard RA procedure; however, by taking control of the acquisition process it introduces other important variables. Incrementing within a session adds an important dimension to the procedure because the response sequences must now be acquired through a chaining procedure during the test session. With the RA procedure, a response chain is indeed required during a test session, but the animal experiences it as a response sequence (likened to an FR4, for example) with discriminative stimuli associated with each link. A complete 4-response sequence must be made each trial (from the first to the last trial in a session) to earn reinforcement. This contrasts with an IRA procedure, where at the start of each session the response sequence is quite short but gradually lengthens as criteria are met. Therefore, initially a single lever press will result in reinforcement, but once a preset criterion is reached a 2-link response chain will be required to activate reinforcer delivery, and so on until a maximum sequence length is reached within the session.

For pharmacological studies the IRA procedure has a number of other advantages over the RA procedure. For instance, IRA allows researchers to detect subtle changes in the acquisition and response patterns for progressively more complex behavioral responses. It should be noted here that the difficulty of an IRA procedure presumably increases as the response sequence length increases throughout the session (Wright & Paule, 2007).
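The within-session incrementing described above can be illustrated with a short Python sketch. It is not taken from any of the cited procedures: the function names, the criterion of three consecutive correct completions, and the example chain LRLB are assumptions chosen only for illustration. The first function lists the sub-chains that forward and backward chaining would require at each increment, and the second tracks how the required chain length grows as a preset criterion is met.

```python
def training_steps(chain, procedure):
    """Return the sub-chain required at each increment of a session.

    Forward chaining starts with the first link and appends later links;
    backward chaining starts with the last link and prepends earlier ones.
    """
    n = len(chain)
    if procedure == "forward":
        return [chain[:k] for k in range(1, n + 1)]
    if procedure == "backward":
        return [chain[n - k:] for k in range(1, n + 1)]
    raise ValueError("procedure must be 'forward' or 'backward'")


def max_length_reached(outcomes, criterion=3, max_length=4):
    """Return the chain length required at the end of a session.

    outcomes:   one boolean per trial, True if the chain was completed correctly
    criterion:  consecutive correct completions needed to add a link
                (hypothetical value; published procedures define their own)
    max_length: longest chain allowed in the session
    """
    length, streak = 1, 0
    for correct in outcomes:
        streak = streak + 1 if correct else 0
        if streak == criterion and length < max_length:
            length += 1   # criterion met: one more link is now required
            streak = 0    # mastery must be shown again at the new length
    return length


print(training_steps("LRLB", "forward"))   # ['L', 'LR', 'LRL', 'LRLB']
print(training_steps("LRLB", "backward"))  # ['B', 'LB', 'RLB', 'LRLB']
```

Under these assumptions, both training procedures end with the same four-link chain; they differ only in which links the subject encounters first, which is the variable of interest when the two procedures are compared.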
The IRA procedure also produces stable baselines of acquisition for short response sequences in fewer sessions than was possible under the original repeated acquisition procedure, and it generates the acquisition of a complex response sequence in a single session, so that acute drug effects can be examined (Cohn & Paule, 1995; Paule & McMillan, 1984; Pieper, 1976). The IRA procedure increases the number of dependent measures available to a researcher, thereby allowing for a more comprehensive examination of the experimental question. These include maximum chain length reached and accuracy in each link of a multi-link chain (e.g., accuracy in the 1st link or in the 4th link of a 4-link chain). Similarly, response rate, total errors, and total correct responses can be measured for each link, multiple learning curves can result from a single session, and the procedure can be used in a wide range of species (Paule & Killam, 1986; Pieper, 1976; Popke et al., 2000, 2001; Weinberger & Killam, 1978; Wenger et al., 2004; Wright & Paule, 2007). Also, different types of training procedures (forward and backward chaining) can be studied as independent variables during an IRA procedure, in addition to chain definition (repeating and non-repeating) and drug administration.

IRA Procedures Across Different Species

One impressive feature of the IRA procedure is that it has been used successfully in a wide range of species. The following discussion of the IRA procedure is therefore grouped by species; such a distinction is useful because the procedure is intended to assess learning in humans using an animal model (Table 1 summarizes the procedures referred to here). The IRA-by-species sections are followed by a general discussion of the known research concerning the three specific variables mentioned previously: training procedure, chain definition, and low-dose d-amphetamine administration.
These three issues will be approached from the perspective of the broader learning literature, as little has been published examining these variables with respect to the IRA procedure.

IRA Procedures: The Rat. Paule and McMillan (1984) used an IRA procedure in rats to generate acquisition data repeatedly for at least three different response sequence lengths within a one-hour session. Each session began with a single response that could be incremented up to a 5-link chain (although data are reported only up to a 3-link chain). Backward chaining was used exclusively, and chains were selected to be roughly equivalent as described by Thompson (1970). Accuracy at each chain length was analyzed separately, resulting in accuracy values for the 1-link, 2-link, and 3-link chains. After responding stabilized on the IRA procedure, d-amphetamine, diazepam, morphine, pentobarbital, and chlorpromazine were administered to seven rats. These drugs, representing different pharmacological classes, differentially affected processes involved in performance on the IRA procedure. The lowest doses of d-amphetamine used (0.1 and 0.3 mg/kg) improved accuracy exclusively in the 3-link chain (Paule & McMillan, 1984). Because the animals were injected just before the start of the session, the authors suggested that the time course of d-amphetamine may explain why the increase in accuracy was seen only at the 3-link chain (and not at the 1- or 2-link chains). The authors do not elaborate much on this explanation; however, d-amphetamine typically takes 10-15 minutes to affect behavior, so it is likely that they were seeing the gradual effect of the drug as blood levels rose over time. Notably, this study showed that IRA could be used successfully in rats, that the procedure is sensitive enough to detect even subtle changes in accuracy between response sequences of different lengths, and that increases in accuracy could be detected.
Paule and McMillan (1986) used an IRA procedure to assess the effects on learning in rats of the potent neurotoxicant trimethyltin (TMT), a compound that damages the hippocampus (Idemudia & McMillan, 1987; O'Connell, Earley, & Leonard, 1996). The use of an IRA procedure allowed for a description of the learning deficits induced by TMT as a function of time from exposure. This was possible because the procedure was used to study learning repeatedly, so that the authors were able to chart the time course over which TMT's effects appeared. They compared the effect of exposure on the performance condition with that on the learning condition and found that TMT affected acquisition to a greater extent than it affected performance, showing that TMT selectively affects the learning of new responses (Paule & McMillan, 1986). More recently, Mayorga, Popke, Fogle, and Paule (2000) employed an operant test battery (OTB) comprising conditioned position responding (CPR), progressive ratio (PR), temporal response differentiation (TRD), and IRA tasks to assess the effects of d-amphetamine and methylphenidate in rats. Each procedure was chosen because it tapped a different function (called "brain functions" by the authors). The CPR task was used for auditory/visual/position discrimination, IRA for learning, TRD for timing, and PR for motivation. This series of operant tasks was considered a test battery that adequately characterized the effects of pharmacological agents; however, the effects on the entire battery are beyond the scope of this discussion. The specific IRA procedure implemented in that study used three retractable levers and began with a one-lever sequence that progressed to a maximum sequence length of six. The training procedure was described as backward chaining, but details were not provided. In this version of the IRA procedure, 20 correct (but not necessarily consecutive) responses resulted in a 1-min
blackout, which was followed by the presentation of the two-lever sequence. The chain incremented in this way until the allotted time elapsed or a six-lever sequence was reached. Importantly, position indicator lights signaled the animal's progress (or position) in the current response sequence, indicating the number of correct responses remaining before reinforcer delivery (Mayorga, Popke, Fogle, & Paule, 2000). At any point, a response on an incorrect lever resulted in a 2-sec timeout but did not reset the response requirement. Correct responses were followed by the illumination of the appropriate serial position indicator light, plus a correct response indicator light and a 1-sec blackout. d-Amphetamine (0.1-6.0 mg/kg) and methylphenidate (1.12-18.0 mg/kg) were administered 15 min prior to each drug session. Response rates decreased as a result of both d-amphetamine (3.0 and 6.0 mg/kg) and methylphenidate (18.0 mg/kg) administration. Likewise, accuracy was decreased at doses that either decreased or had no effect on response rate (i.e., 6.0 mg/kg d-amphetamine; 9.0 and 18.0 mg/kg methylphenidate). Across all tasks, d-amphetamine and methylphenidate produced similar effects. This operant test battery (OTB) has been used to examine the effects of various other pharmacological agents. Popke, Mayorga, Fogle, and Paule (2000) used a similar variety of tasks (i.e., CPR, PR, TRD, IRA), with the addition of a differential reinforcement of low response rates (DRL) procedure, to assess the effects of acute nicotine administration (0.3-1.0 mg/kg) in the rat. The IRA procedure was identical to that used by Mayorga, Popke, Fogle, and Paule (2000). They found no significant effects of nicotine administration on accuracy in the IRA procedure. Nicotine administration produced an inverted U-shaped curve for responding, so that subjects responded faster following a moderate dose of nicotine (0.42 mg/kg) (Popke, Mayorga, Fogle, & Paule, 2000).
Response rates increased during the CPR and PR tasks across all doses, whereas accuracy decreased for the TRD and DRL tasks. Similarly, Popke, Allen, and Paule (2000) used this operant test battery to assess the performance of rats during acute ethanol administration (0.5-3.0 g/kg). They found no significant effect of dose on accuracy or responding during the IRA procedure. Interestingly, the IRA procedure was the only procedure unaffected by ethanol administration, as performance on all other cognitive-behavioral tasks was disrupted at doses ≥ 1.5 g/kg (Popke, Allen, & Paule, 2000).

IRA Procedures: The Mouse. Recently, Wenger, Schmidt, and Davisson (2004) used an IRA procedure to assess learning in the Ts65Dn mouse, which served as a model of human Down syndrome. Unlike most reports of IRA procedures in the literature, Wenger, Schmidt, and Davisson (2004) provided an extremely detailed description of their procedure. This included a description of the backward chaining training procedure and a list of each of the six chain definitions used (none of which contained consecutive repeats). Additionally, the authors described the criteria for progressing through the chain: a single response was required for the first 5 reinforcers, a two-link chain for the next 10 reinforcers, a three-link chain for the next 30 reinforcers, and a four-link chain for the final 20 reinforcers. They found a significant difference between the performance of the Ts65Dn mice and that of control mice when the IRA procedure reached the 3- and 4-link chains, but not in the 1- and 2-link chains. The authors concluded that the Ts65Dn mice have a learning deficit that correlates with task difficulty (Wenger, Schmidt, & Davisson, 2004).

IRA Procedure: The Pig. Ferguson, Gopee, Paule, and Howard (in press) used IRA as a tool to assess the "cognitive abilities" of mini-pigs.
They found that the mini-pigs' performance on the IRA procedure improved as a function of session. Specifically, total responses, total reinforcers, overall accuracy, and response rate all increased across sessions. Further, each of the three mini-pigs reached a 4-link chain at least once. As expected, accuracies were highest in the 1-link chain and were not significantly different for the 2- or 3-link chains. The authors concluded that IRA responding in mini-pigs was comparable to that of well-trained rats (Ferguson, Gopee, Paule, & Howard, in press).

IRA Procedures: Non-Human Primates. Pieper (1976) used IRA to evaluate the effects of stimulants and depressants on learning in rhesus monkeys and great apes (i.e., chimpanzees and orangutans). The study was designed to determine whether primates could be an adequate animal model of psychopharmacological processes. The IRA procedure was used so that the effects of the drugs on learning could be assessed without a task that required extensive training prior to drug testing. It was also used to examine whether maximum sequence length, in addition to accuracy, could be used as an index of drug effects. Stimulant drugs (methamphetamine, phentermine, phendimetrazine, d-amphetamine, diethylpropion, benzphetamine) decreased maximum sequence length reached as a function of dose, whereas depressants (secobarbital, meprobamate, glutethimide, butabarbital) did not greatly affect maximum sequence length reached; accuracy, however, was not reported (Pieper, 1976). Weinberger and Killam (1978) incorporated the incrementing aspects of Pieper's (1976) procedure and Thompson's (1970) use of discriminative stimuli (rather than fading stimuli) to create a slightly different version of the IRA procedure. Specifically, in Weinberger and Killam's (1978) experiment, three seizure-prone baboons acquired a different response sequence (across 3 levers) each session.
They began each sequence at a 2-response length, incrementing by one until the animals reached a 4-response chain. New responses were always added to the beginning of a previously mastered response sequence (i.e., backward chaining). Each link (1st, 2nd, 3rd, etc.) in the chain was paired with a discrete visual stimulus. The authors note that this is an important characteristic of IRA procedures because the link, not the particular lever, is paired with a discriminative stimulus. Additionally, they followed the strict chain-type selection criteria articulated by Thompson (1970) and Boren and Devine (1968). Following steady-state responding, Weinberger and Killam (1978) administered diazepam (0.5-2.0 mg/kg) and phenobarbital (5.0-20.0 mg/kg) before each IRA session. They measured time to reach criterion for each chain, accuracy (total number of possible errorless chains divided by the actual number of chains completed), and total errors. Animals under diazepam reached criterion faster and more accurately than during the control condition, and the authors concluded that learning was most improved at the 1.0 mg/kg dose of diazepam (Weinberger & Killam, 1978). They found similar effects of phenobarbital administration, with the 10.0 mg/kg dose improving repeated acquisition of novel response sequences in the seizure-prone baboons. Importantly, this version of the IRA procedure (including the use of discriminative stimuli, strict chain selection criteria, and presumably the use of backward chaining) has proven to be the most commonly used.

IRA Procedure: The Human. In addition to the animal-model pharmacological studies conducted using the aforementioned operant test battery (OTB), Paule, Chelonis, Buffalo, Blake, and Casey (1999) correlated performance on the OTB with scores on an IQ test in children.
They found significant correlations (R = 0.532, p = 0.0001) between all three measures of IRA performance (i.e., percent completed, accuracy, and response rate) and IQ scores. They found weaker but still significant correlations between other operant test battery endpoints (e.g., color and position discrimination accuracy) and IQ scores. Some procedures, such as the "motivational task" (a progressive ratio schedule of reinforcement), were not correlated with IQ scores at all. The authors suggest that these results demonstrate the relevance of these operant measures as "metrics" of important brain functions (Paule, Chelonis, Buffalo, Blake, & Casey, 1999). They go on to suggest that, because laboratory animals can perform these same operant tasks, the test battery will be useful for studying the effects of neuroactive compounds on various aspects of cognitive functioning in animals as an adequate model of human performance (Paule, Chelonis, Buffalo, Blake, & Casey, 1999).

Training Procedure

Training procedures are a crucial component of the IRA procedure. It is only through a training procedure that response sequences can gradually increase in length throughout an IRA session. This is a fundamental aspect of any IRA procedure and is the mechanism that allows a 1-link chain (e.g., a right "R" lever press) to increment to a 2-link chain (e.g., "R-L") and so on to the desired chain length. Traditionally, two divergent chaining procedures can be used to increment a response sequence: forward chaining and backward chaining. When a backward chaining procedure is used, the link closest to reinforcement (i.e., the last link) is introduced first and subsequent links are added before (or "in front of") the previously performed sequence. When a forward chaining procedure is used, the first link of the sequence is presented first and subsequent links are added after (or "behind") the previously performed sequence (see Table 2).
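The difference between the two chaining procedures can be illustrated with a short Python sketch. The function name `training_steps` and the example chain LRLB are hypothetical choices made for illustration; the sketch shows only the order in which links would be introduced and makes no claim about the criteria used to move from one step to the next.

```python
def training_steps(target, procedure):
    """List the chains required at each step while building `target`.

    target: the final chain as a string of lever labels, e.g. "LRLB".
    procedure: "backward" adds each new link in front of the previously
        performed sequence; "forward" adds each new link behind it.
    """
    n = len(target)
    if procedure == "backward":
        # Start with the link closest to reinforcement (the last link).
        return [target[n - k:] for k in range(1, n + 1)]
    if procedure == "forward":
        # Start with the first link of the final chain.
        return [target[:k] for k in range(1, n + 1)]
    raise ValueError("procedure must be 'backward' or 'forward'")


print(training_steps("LRLB", "backward"))  # ['B', 'LB', 'RLB', 'LRLB']
print(training_steps("LRLB", "forward"))   # ['L', 'LR', 'LRL', 'LRLB']
```

Both procedures end at the same 4-link chain; they differ only in which link is trained first and therefore in which link is initially closest to reinforcer delivery.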
Weiss (1978) did not use the IRA procedure but did compare basic forward and backward chaining procedures in humans. Ten first-year psychology students were required to learn four separate six-sequence response chains (in the form of key presses) using either forward or backward chaining. On backward chaining sequences, the participants made significantly more errors than on forward chaining sequences. In fact, no participant learned any backward chain with fewer errors than they had made in learning a forward chain. Weiss used this result to suggest that the forward chaining procedure is more effective for acquiring response sequences in humans (Weiss, 1978). Weiss (1978) proposed that forward chains are more easily acquired because each response is directly reinforced at some point during training. By comparison, with backward chaining, only the last link is followed by reinforcement. Smith (1999) came to a similar conclusion using a comparable task (involving walking steps rather than key presses) that compared forward, backward, and whole-task chaining procedures in humans. The whole-task procedure resulted in significantly more errors than forward and backward chaining. Backward chaining resulted in slightly more errors than forward chaining, although this difference was not significant (Smith, 1999). As these human studies show, ambiguities remain regarding the relative effectiveness and accuracy of these chaining procedures. In contrast, the few animal studies that have directly assessed training procedures support backward chaining as the preferable training procedure (Ferster & Perrott, 1968; Millenson, 1967). Millenson (1967) suggested that backward chaining may result in higher accuracies because it strengthens the response closest to primary reinforcement first, working back from there.
This preference for backward chaining seems to have prevailed in the animal literature, as most IRA investigators who describe their training procedure describe it as backward chaining. This preference persists despite the absence of any direct evidence that backward chaining is superior to forward chaining in an IRA procedure.

Chain Definition

Chain definition refers to the sequence of responses that composes a chain; for present purposes, chain definition determines whether the chain type is repeating or non-repeating. Variations in chain type are rarely explored with respect to accuracy during an IRA procedure. As previously noted, Thompson (1970) and Boren and Devine (1968) described several restrictions to be implemented when creating chains so as to avoid a biased response sequence. A "biased" chain can be created in many ways (e.g., by using only one response, as in RRRR) and is usually described as being "easier" to acquire. The approach of restricting chain definition may have discouraged investigations into the importance of chain definition. Despite the lack of such research, investigators have overwhelmingly accepted this practice of restriction and have avoided analyzing responding under a variety of chain definitions. As one notable exception, Wright and Paule (2007) sought to quantify chain difficulty during the IRA procedure. Rats were trained to lever press a response sequence of up to six links across three levers in a standard operant chamber. The range of chain types was limited so that sequences requiring consecutive responses on the same lever (e.g., LCC, a "repeating" chain) were not used and response sequences that required all three levers to be pressed in sequential order (i.e., LCR) were excluded. These exclusion criteria yielded 16 possible six-lever response sequences. The procedure was described as building chains through backward chaining, and its details closely resembled those already described.
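The distinction between repeating and non-repeating chains, and the kind of exclusion rule just described, can be expressed as simple checks on the chain string. The Python sketch below is illustrative only: the function names are hypothetical, and reading the ordered-sequence rule as "the substring LCR may not appear" is an assumption, since the full selection rules of Wright and Paule (2007) are not reproduced here and this reading is not claimed to recover their set of 16 sequences.

```python
def has_consecutive_repeat(chain):
    """True if any lever is required twice in a row (e.g., the RR in LRRB)."""
    return any(a == b for a, b in zip(chain, chain[1:]))


def chain_type(chain):
    """Label a chain 'repeating' or 'non-repeating' in the sense used here."""
    return "repeating" if has_consecutive_repeat(chain) else "non-repeating"


def passes_exclusions(chain, ordered_sweep="LCR"):
    """Apply two exclusion rules to a chain.

    Assumption: the rule against pressing all three levers in sequential
    order is read as 'the substring given by ordered_sweep may not appear'.
    """
    return not has_consecutive_repeat(chain) and ordered_sweep not in chain


for chain in ["LRRB", "LRLB", "RRRR", "CLCRCL"]:
    print(chain, chain_type(chain), passes_exclusions(chain))
```

Checks of this kind make a restriction on chain definition explicit, so that any chain type excluded from a study can be stated rather than implied.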
Wright and Paule (2007) found a clear ordering of accuracies across the 16 six-link response chains, which they cataloged into three difficulty levels based on accuracy for each chain: "easy", "moderate", and "hard". They attributed higher accuracy to the "ease" with which the chain could be completed and concluded that the major determinant of chain ease was the spatial proximity of responses. Wright and Paule (2007) determined that responses required on adjacent levers (e.g., LC) were "easier" (i.e., resulted in more accurate responding) than responses required on non-adjacent levers (e.g., LR). This study did not fully elucidate the relationship between chain type and accuracy, primarily because of the exclusion criteria used. Many questions regarding chain definition (or response-sequence difficulty) remain. Specifically, response sequences that have consecutive repeats at any location in the chain (e.g., LCC) may result in higher accuracies and altered responding. Further, chain type has not been manipulated during drug administration, which could yield important information regarding the sensitivity of various chain types to a pharmacological agent.

d-Amphetamine

d-Amphetamine, a dopamine agonist, promotes the activity of dopamine neurotransmitter systems by increasing the release of vesicular dopamine and blocking its reuptake into the presynaptic neuron. This could be important because dopamine has been implicated as a key player in reinforcement processes (for a review see McGregor & Roberts, 1994). At moderate to high doses d-amphetamine is a psychomotor stimulant that causes stereotyped sensorimotor activation, manifested as increased locomotor behavior and various stereotypies (Segal & Kuczenski, 1994). In rodents, low to moderate doses of d-amphetamine (e.g., 0.25-1.0 mg/kg) typically correspond to an increase in locomotor activity (e.g.,
rearing, sniffing, head bobbing) that may or may not be stereotyped in nature (Rebec & Bashore, 1984). There is some controversy over what constitutes a "low" dose of d-amphetamine (Grilly & Loveland, 2001), as discussed below. Higher doses of d-amphetamine (e.g., 10 mg/kg) lead to a decrease in locomotor activity and an increase in obvious stereotypies, which are usually highly focused and confined to a small area in space (e.g., repetitive grooming, repetitive head and limb movements). The form of stereotypic behavior an animal exhibits depends on numerous variables (e.g., the environmental conditions, previous learning history, and dose). In addition to changes in locomotion, d-amphetamine influences responding during operant tasks. Responding under many different schedules of reinforcement has been examined extensively. In general, the behavioral effects of d-amphetamine depend largely upon the baseline rate of responding. When d-amphetamine is administered, low baseline rates of responding are increased, whereas high baseline rates of responding are decreased (Dews & Wenger, 1977). This effect has been described as a rate-dependent or, alternatively, a rate-homogeneity effect, since the net result is a constant response rate throughout the interval (Byrd, 1979; McKim, 1973). This interesting and complicated effect on responding during operant tasks may be due to an increased efficacy of conditioned reinforcers (Robbins, 1978). This could account for the increase in responding seen for low-rate behaviors, while the rate-decreasing effects of higher doses could be explained by the increase in incompatible, stereotyped behaviors associated with higher doses of d-amphetamine.

What is a "low" dose of d-amphetamine?

Despite extensive research on the effects of d-amphetamine, inconsistencies remain in descriptions of what constitutes a "low" dose.
This issue is important because doses below 0.1 mg/kg lie in the clinically relevant range, with doses of 0.07-0.40 mg/kg used to enhance mood and improve performance on a variety of physical and cognitive tasks (in most individuals) (Grilly & Loveland, 2001). While researchers acknowledge that d-amphetamine induces qualitatively different behavioral effects as a function of dose, authors rarely provide any rationale for choosing a particular dose range (Grilly & Loveland, 2001). In a review of low-dose d-amphetamine administration in rats, Grilly and Loveland (2001) concluded that for studies examining "learned behavior", the lowest dose range used was 0.025-4.0 mg/kg, and the lowest dose that produced a difference in responding (i.e., an increase in responding) was 0.1 mg/kg (Glick & Muller, 1971). However, that study examined lever pressing under an FR schedule, which is arguably very different from an IRA procedure. Interestingly, the lowest dose range used for a drug discrimination task was 0.032-1.0 mg/kg, and again the lowest effective dose was 0.1 mg/kg (Rosen et al., 1986). With respect to the IRA procedure, the lowest dose of d-amphetamine used in rats is 0.03 mg/kg, which resulted in no detectable difference in accuracy (Paule & McMillan, 1984). The discrepancies noted in dosing range seem to be ubiquitous across all fields examining the effects of d-amphetamine on behavior, and the study of learning (specifically, IRA) is no exception. There are inconsistencies in both the dosing range and the effects of d-amphetamine on responding in the RA and IRA procedures. Several animal studies administering a dose of 0.1 mg/kg have found no effect on accuracy in RA (Moerschbaecher, 1980; Thompson, 1980) or IRA (Mayorga et al., 2000) procedures. Other RA studies began dosing at 0.3 mg/kg (Thompson, 1978; 1983) and 0.5 mg/kg (Moerschbaecher, 1979; Thompson, 1973), finding no effect on accuracy at those doses.
However, one RA study (Harting & McMillan, 1976) and one IRA study (Paule & McMillan, 1984) revealed subtle decreases in errors at both the 0.1 mg/kg and 0.3 mg/kg doses. These discrepant findings, combined with the added sensitivity that an IRA procedure offers over an RA procedure (and the fact that these doses are higher than what are considered "clinically relevant" doses), suggest that doses of d-amphetamine should begin lower than is currently typical in the literature. Extremely low-dose administration of d-amphetamine may result in detectable differences in accuracy during an IRA procedure and may reveal interesting aspects of the acquisition of response sequences, particularly when training procedure and chain type are manipulated. For a brief review of the effects of low-dose d-amphetamine administration on behavior, see Appendix Table A1, which includes studies that have not been described in detail here.

References

Boren, J.J. & Devine, D.D. (1968). The repeated acquisition of behavioral chains. Journal of the Experimental Analysis of Behavior, 11, 651-660.
Boren, J.J. (1963). Repeated acquisition of new behavioral chains. American Psychologist, 17, 421.
Byrd, L.D. (1979). The behavioral effects of cocaine: Rate dependency or rate constancy. European Journal of Pharmacology, 56(4), 355-362.
Cohn, J. & Paule, M.G. (1995). Repeated acquisition of response sequences: The analysis of behavior in transition. Neuroscience and Biobehavioral Reviews, 19, 397-406.
Dews, P.B. & Wenger, G.R. (1977). Rate-dependency of the behavioral effects of amphetamine. In T. Thompson & P.B. Dews (Eds.), Advances in Behavioral Pharmacology, Vol. 1, 167-227.
Diamond, D.M., Fleshner, M., Ingersoll, N., & Rose, G. (1996). Psychological stress impairs spatial working memory: Relevance to electrophysiological studies of hippocampal function. Behavioral Neuroscience, 110(4), 661-672.
Diamond, D.M., Park, C.R., Heman, K.L., & Rose, G.M. (1999).
Exposing rats to a predator impairs spatial working memory in the radial arm water maze. Hippocampus, 9, 542-552.
Feldman, R.S., Meyer, J.S. & Quenzer, L.F. (1997). Principles of Neuroscience. Sunderland, Massachusetts: Sinauer Associates, Inc.
Ferguson, S.A., Gopee, N.V., Paule, M.G. & Howard, P.C. (in press). Female mini-pig performance of temporal response differentiation, incremental repeated acquisition, and progressive ratio operant tasks. Behavioural Processes, doi: 10.1016/j.beproc.2008.08.006.
Ferster, C.B. & Perrott, M.C. (1968). Behavior principles. New York: Appleton-Century-Crofts.
Glick, S.D. & Muller, R.U. (1971). Paradoxical effects of low doses of d-amphetamine in rats. Psychopharmacologia, 22, 396-402.
Grauer, E. & Kapon, Y. (1993). Wistar-Kyoto rats in the Morris water maze: Impaired working memory and hyper-reactivity to stress. Behavioral Brain Research, 59, 147-151.
Grilly, D.M. & Gowans, G.C. (1981). Effects of naltrexone, and d-amphetamine, and their interaction on the stimulus control of choice behavior in rats. Psychopharmacology, 96, 73-80.
Grilly, D.M. & Loveland, A. (2001). What is a "low dose" of d-amphetamine for inducing behavioral effects in laboratory rats? Psychopharmacology, 153, 155-169.
Harting, J. & McMillan, D.E. (1976). Effects of pentobarbital and d-amphetamine on the repeated acquisition of response sequences by pigeons. Psychopharmacology, 49(3), 245-248.
Harting, J. & McMillan, D.E. (1976). Repeated acquisition of response sequences by pigeons under chained and tandem schedules with reset and nonreset contingencies. Psychological Record, 26(3), 361-367.
Ljungberg, T. & Enquist, M. (1987). Disruptive effects of low doses of d-amphetamine on the ability of rats to organize behavior into functional sequences. Psychopharmacology, 93, 146-151.
Mabry, T.R., Gold, P.E. & McCarty, R. (1995). Age-related changes in plasma catecholamine responses to acute swim stress. Neurobiology of Learning and Memory, 63, 260-268.
Mayorga, A.J., Popke, E.J., Fogle, C.M., & Paule, M.G. (2000). Similar effects of amphetamine and methylphenidate on the performance of complex operant tasks in rats. Behavioral Brain Research, 109, 59-68.
McGregor, A. & Roberts, D.C.S. (1994). Mechanisms of abuse. In A.K. Cho (Ed.), Amphetamine and its Analogs, 243-266. San Diego: Academic Press.
McKim, W.A. (1973). The effects of scopolamine on fixed-interval behavior in the rat: A rate-dependency effect. Psychopharmacologia, 32, 255-264.
Miczek, K.A. (1973). Effects of scopolamine, amphetamine and chlordiazepoxide on punishment. Psychopharmacologia, 28, 373-389.
Millenson, J.R. (1967). Principles of behavioral analysis. New York: Macmillan.
Moerschbaecher, J.M., & Thompson, D.M. (1980). Effects of d-amphetamine, cocaine and phencyclidine on the acquisition of response sequences with and without stimulus fading. Journal of the Experimental Analysis of Behavior, 33, 369-381.
Moerschbaecher, J.M., Boren, J.J. & Schrot, J. (1978). Repeated acquisition of conditional discriminations. Journal of the Experimental Analysis of Behavior, 29(2), 252-232.
Moerschbaecher, J.M., Boren, J.J., Schrot, J. & Fontes, J.C.S. (1979). Effects of cocaine and d-amphetamine on the repeated acquisition and performance of conditional discriminations. Journal of the Experimental Analysis of Behavior, 31, 127-140.
Montague, P.R., Hyman, S.E. & Cohen, J.D. (2004). Computational roles for dopamine in behavioural control. Nature, 431(7010), 760-767.
Mowrer, O.H. (1947). On the dual nature of learning: A reinterpretation of "conditioning" and "problem-solving". Harvard Educational Review, 17, 102-148.
Newland, M.C. & Reile, P.A. (1999). Learning and behavior change as neurotoxic endpoints. In H.A. Tilson & J. Harry (Eds.), Target Organ Series: Neurotoxicology, 311-338. New York: Raven Press.
O'Connell, A.W., Earley, B., & Leonard, B.E. (1996).
The σ ligand JO 1784 prevents trimethyltin-induced behavioural and σ-receptor dysfunction in the rat. Pharmacology & Toxicology, 78(5), 296-302.
Paule, M.G., Chelonis, J.J., Buffalo, E.A., Blake, D.J., & Casey, P.H. (1999). Operant test battery performance in children: Correlation with IQ. Neurotoxicology and Teratology, 21(3), 223-230.
Paule, M.G. & Killam, E.K. (1986). Behavioral toxicity of chronic ethosuximide and sodium valproate treatment in the epileptic baboon Papio papio. Journal of Pharmacology and Experimental Therapeutics, 238, 32-38.
Paule, M.G. & McMillan, D.E. (1984). Incremental repeated acquisition in the rat: Acute effects of drugs. Pharmacology, Biochemistry and Behavior, 21, 431-439.
Pieper, W.A. (1976). Great apes and rhesus monkeys as subjects for psychopharmacological studies of stimulants and depressants. Federation Proceedings, 35, 2254-2257.
Popke, E.J., Allen, R.R., Pearson, E.C., Hammond, T.C., & Paule, M.G. (2001). Differential effects of two NMDA receptor antagonists on cognitive-behavioral performance in young non-human primates: I. Neurotoxicology and Teratology, 23, 319-332.
Popke, E.J., Allen, S.R., & Paule, M.G. (2000). Effects of acute ethanol on indices of cognitive-behavioral performance in rats. Alcohol, 20, 187-192.
Popke, E.J., Mayorga, A.J., Fogle, C.M., & Paule, M.G. (2000). Effects of acute nicotine on several operant behaviors in rats. Pharmacology, Biochemistry and Behavior, 65, 247-254.
Rebec, G.V., & Bashore, T.R. (1984). Critical issues in assessing the behavioral effects of amphetamine. Neuroscience and Biobehavioral Reviews, 8, 153-159.
Robbins, T.W. (1978). The acquisition of responding with conditioned reinforcement: Effects of pipradrol, methylphenidate, d-amphetamine, and nomifensine. Psychopharmacology, 58, 79-87.
Rosen, J.B., Young, A.M., Beuthin, F.C. & Louis-Ferdinand, R.T. (1986). Discriminative stimulus properties of amphetamine and other stimulants in lead-exposed and normal rats.
Pharmacology, Biochemistry and Behavior, 24, 211-215.
Segal, D.S. & Kuczenski, R. (1994). Behavioral pharmacology of amphetamine. In A.K. Cho & D.S. Segal (Eds.), Amphetamine and Its Analogues: Psychopharmacology, Toxicology and Abuse (pp. 115-150). San Diego, CA: Academic Press.
Seligman, M.E.P. & Johnston, J.C. (1973). A cognitive theory of avoidance learning. In F.J. McGuigan and D.B. Lumsden (Eds.), Contemporary Approaches to Conditioning and Learning. Washington: Winston and Sons, 69-110.
Sidman, M. (1960). Tactics of Scientific Research: Evaluating Experimental Data in Psychology. New York: Basic Books, Inc.
Skinner, B.F. (1938). The behavior of organisms: An experimental analysis. New York: Appleton-Century-Crofts.
Smith, G.J. (1999). Teaching a long sequence of behavior using whole task training, forward chaining, and backward chaining. Perceptual and Motor Skills, 89, 951-965.
Thompson, D.M. (1978). Stimulus control and drug effects. In D.E. Blackman & D.J. Sanger (Eds.), Contemporary research in behavioral pharmacology, 159-207. New York: Plenum.
Thompson, D.M. (1973). Repeated acquisition as a behavioral baseline for studying drug effects. Journal of Pharmacology and Experimental Therapeutics, 184, 504-514.
Thompson, D.M. (1977). Development of tolerance to the disruptive effects of cocaine on repeated acquisition and performance of response sequences. Journal of Pharmacology and Experimental Therapeutics, 203, 294-302.
Thompson, D.M., & Moerschbaecher, J.M. (1978). Operant methodology in the study of learning. Environmental Health Perspectives, 26, 77-87.
Thompson, D.M., & Moerschbaecher, J.M. (1979). An experimental analysis of the effects of d-amphetamine and cocaine on the acquisition and performance of response chains in monkeys. Journal of the Experimental Analysis of Behavior, 32, 433-444.
Thompson, D.M., & Moerschbaecher, J.M. (1980). Effects of d-amphetamine and cocaine on strained ratio behavior in a repeated acquisition task.
Journal of the Experimental Analysis of Behavior, 33, 141-148.
Thompson, D.M., Moerschbaecher, J.M. & Winsauer, P.J. (1983). Drug effects on repeated acquisition: Comparison of cumulative and non-cumulative dosing. Journal of the Experimental Analysis of Behavior, 39(1), 175-184.
Towbin, K.E. & Leckman, J.F. (1992). Attention deficit hyperactivity in childhood and adolescence. In H.L. Klawans, et al. (Eds.), Textbook of Clinical Neuropharmacology and Therapeutics, 323-333. New York: Raven Press.
Weinberger, S.B. & Killam, E.B. (1977). A comparison of the effects of chronically administered diazepam and phentobarbital on learning in the Papio papio model of epilepsy. Proceedings of the Western Pharmacology Society, 20, 173-177.
Weinberger, S.B. & Killam, E.B. (1978). Alterations in learning performance in the seizure-prone baboon: Effects of elicited seizures and chronic treatment with diazepam and phenobarbital. Epilepsia, 19, 301-316.
Weiss, K.M. (1978). A comparison of forward and backward procedures for the acquisition of response chains in humans. Journal of the Experimental Analysis of Behavior, 29, 255-259.
Wenger, G.R., Schmidt, C., & Davisson, M.T. (2004). Operant conditioning in the Ts65Dn mouse: Learning. Behavior Genetics, 34(1), 105-119.
Wise, R.A. (2004). Dopamine, learning and motivation. Nature Reviews Neuroscience, 5(6), 483-494.
Wright, L.K.M., & Paule, M.G. (2007). Response sequence difficulty in an incremental repeated acquisition (learning) procedure. Behavioural Processes, 75, 81-84.
Zaharia, M.D., Kulczycki, J., Shanks, N., Meaney, M.J. & Anisman, H. (1996). The effects of early postnatal stimulation on Morris water-maze acquisition in adult mice: Genetic and maternal factors. Psychopharmacology, 128(3), 227-239.
Tables

Table 1
Incremental Repeated Acquisition Procedures

Authors | Date | Species | Chain Definition | Training Procedure | Drug
Paule & McMillan | 1984 | Rat | "roughly equivalent" | not explicitly stated | d-amphetamine, diazepam, morphine, pentobarbital, & chlorpromazine
Paule & McMillan | 1986 | Rat | "roughly equivalent" | not explicitly stated | trimethyltin
Mayorga, Popke, Fogle & Paule | 2000 | Rat | "roughly equivalent" | backward chaining | d-amphetamine & methylphenidate
Popke, Mayorga, Fogle & Paule | 2000 | Rat | "roughly equivalent" | backward chaining | nicotine
Popke, Allen & Paule | 2000 | Rat | "roughly equivalent" | backward chaining | ethanol
Wenger, Schmidt & Davisson | 2004 | Mouse | "roughly equivalent" | backward chaining | N/A
Pieper | 1976 | Rhesus Monkey, Chimpanzee & Orangutan | "roughly equivalent" | not explicitly stated | methamphetamine, phentermine, phendimetrazine, d-amphetamine, diethylpropion, benzphetamine, secobarbital, meprobamate, glutethimide & butabarbital
Weinberger & Killam | 1978 | Baboon | "roughly equivalent" | backward chaining | diazepam & phenobarbital
Paule, Chelonis, Buffalo, Blake & Casey | 1999 | Human | "roughly equivalent" | not explicitly stated | N/A
Ferguson, Gopee, Paule & Howard | in press | Mini-pig | "roughly equivalent" | not explicitly stated | N/A

Table 2
Training Procedures

Step | Backward Chaining | Forward Chaining
Target chain | L-R-B-R | L-R-B-R
1 | R → Sucrose | L → Sucrose
2 | B-R → Sucrose | L-R → Sucrose
3 | R-B-R → Sucrose | L-R-B → Sucrose
4 | L-R-B-R → Sucrose | L-R-B-R → Sucrose

Table Captions

Table 1. Summary table of the previously described IRA literature. Note the absence of forward chaining and variation in chain definition.

Table 2. Depicts the same 4-response sequence incrementing by backward chaining (left) and by forward chaining (right).

Chapter 2: Experiments

Abstract

Pharmacological and behavioral determinants of learning were examined using a mastery-based incremental repeated acquisition procedure.
A 60-minute session began with a one-link chain (single lever-press) that incremented to a maximum of a four-link chain using three levers: Left (L), Right (R) and Back (B). Backward (5 rats) and forward (5 rats) training procedures were used to build the chain. In pseudo-randomized presentations, a "performance" session (same chain every session) and a "learning" session (chain differed from session to session) were imposed. Some learning chains had an embedded repeated response (e.g., LRRB) and some learning chains had no such repeat (e.g., LRLB). Progress during a session was quantified by summing the product of chain length and the number of reinforcers earned at that length, then dividing by total reinforcers (progress quotient, or "PQ"). After behavior stabilized, d-amphetamine (0.01 to 3.0 mg/kg, ip) was administered. Acquisition was consistently superior for the backward training group during non-repeating learning sessions, across all but the highest doses of d-amphetamine. Very low, clinically relevant doses of d-amphetamine improved acquisition for the backward training group during repeating-learning sessions. This study identifies a set of conditions under which very low doses of d-amphetamine enhance learning. It also suggests a sensitive and valid measure of acquisition for use in studies of mastery-based incremental repeated acquisition.

Introduction

Mechanisms and performance measures in mastery-based incremental repeated acquisition: Behavioral and pharmacological analyses

The repeated acquisition of behavioral chains (RA) is a procedure pioneered by Boren (1963) and Boren and Devine (1968) to study learning (i.e., behavior in transition) and complex behavior using a within-subject experimental design. Both Moerschbaecher and colleagues (e.g., 1979; 1980) and Thompson and colleagues (e.g., 1973; 1978; 1980; 1983) exploited this procedure to study drug effects on learning.
The RA procedure was modified by Pieper (1976) and Weinberger and Killam (1978) to increment to progressively longer chains within a session. This incremental repeated acquisition (IRA) procedure is used to study learning with the added benefit of manipulating the difficulty level of the task.

Repeated acquisition procedures (RA and IRA) typically comprise two components, one that requires subjects to learn a different response sequence each experimental session (the learning component), and one that requires the performance of a response sequence already learned (the performance component). In studies of drug action, the performance component serves as a control for non-specific drug effects or effects on response rate or psychomotor function. The IRA procedure has the additional requirement that pre-set criteria must be met on shorter response chains before a longer response chain is introduced; in this way, performance and acquisition on progressively more difficult response sequences can be measured within a single session. If implemented using well-designed instructional techniques, the IRA procedure could increase the length of the chain that can be acquired. In various forms, this procedure has been effective in studying toxicant and drug effects on the acquisition of new behavior in individual subjects, showing that learning is differentially affected by drugs from different pharmacological classes (e.g., Cohn & Paule, 1995; Paule & McMillan, 1984; Wright & Paule, 2007).

Different approaches can be taken to incrementing and defining the chain, but the role of these different approaches in the behavior that occurs is poorly understood. Chain type (defined as the particular structure of sequences that make up a response chain) and training procedure (defined as how the procedure increments to progressively longer response chains) are both likely to influence the speed and accuracy of acquisition.
Wright and Paule (2007) manipulated chain type by varying the number of levers used and the number of presses on adjacent levers. They found that accuracy was inversely related to the number of levers available and directly related to the adjacency of successive levers. Cohn, Cox and Cory-Slechta (1993) also reported an effect of chain type such that accuracy was better on chains that resembled the performance chain. In the present study we examined a different characteristic of a chain by comparing chains with and without consecutive repeats. We manipulated this variable not only because it has been under-explored in the literature but also because we were interested in low doses of a drug that can produce perseveration. Therefore, some chains were structured such that a tendency to perseverate might be beneficial under some conditions (i.e., backward-trained repeating chains).

While "lab lore" (and clinical lore) holds that backward chaining is preferable (Lattal & Crawford-Godbey, 1985; Pear, 2001; Smith, 1999; Weiss, 1978), it is difficult to locate many animal studies that examine this issue systematically. The training procedure is a crucial component of IRA, as it determines how the chain will progress in length throughout the session. Backward chaining (in which each link in the chain is added before the previous link) is the most widely used approach to training IRA. Forward chaining (in which each link is added behind the previous link) would be the alternative.

d-Amphetamine, a dopamine agonist, promotes the activity of dopamine neurotransmitter systems by increasing the release of vesicular dopamine and blocking its re-uptake back into the presynaptic neuron. This is important because dopamine has been implicated as a key player in reinforcement processes (McGregor & Roberts, 1994; Spanagel & Weiss, 1999; Wise, 2004). Despite extensive research on the effects of d-amphetamine, inconsistencies remain when describing what constitutes a "low" dose.
This issue is important because doses below 0.1 mg/kg lie within the clinically relevant range, with doses of 0.07-0.40 mg/kg used to enhance mood and improve performance on a variety of physical and cognitive tasks (in most individuals) (for a review see Grilly & Loveland, 2001). In some laboratory studies using animals (pigeons or monkeys), a dose of 0.1 mg/kg (Moerschbaecher & Thompson, 1980; Thompson & Moerschbaecher, 1980), 0.3 mg/kg (Thompson & Moerschbaecher, 1978; Thompson, Moerschbaecher & Winsauer, 1983), or 0.56 mg/kg (Moerschbaecher, Boren, Schrot & Fontes, 1979; Thompson, 1973) did not affect accuracy in an RA procedure. With rats in an IRA procedure, a dose of 0.1 mg/kg had no effect on accuracy (Mayorga, Popke, Fogle & Paule, 2000). However, subtle decreases in errors were reported at 0.3 mg/kg with the RA procedure (Harting & McMillan, 1976) and 0.1 mg/kg with the IRA procedure (Paule & McMillan, 1984) during some test conditions. For the purposes of the present study, doses of 0.01 to 3.0 mg/kg were examined. Doses of 0.01 to 0.1 mg/kg will be considered "low" because they are substantially lower than those typically used in the literature.

Progress through the IRA procedure was measured for each training procedure (backward or forward chaining) and chain type (non-repeating or repeating) using a "progress quotient" (PQ, defined below), which combines chain length and reinforcer count into a single overall marker of the quality of learning and performance. This measure was superior to other potential dependent variables, like accuracy or longest-chain-achieved. The latter measures were problematic because of the mastery-based criterion to increment. These issues are addressed in the discussion section.

Methods

Subjects

Ten male Long Evans rats, housed individually in a temperature- and humidity-controlled AAALAC-accredited colony room, were maintained on a 12-hour light-dark cycle (lights on at 7:00 a.m.)
with free access to water in their home cages. Weight was maintained at 300 grams (+/- 5 grams) by individualized feeding of a Purina chow diet. A portion of the animal's caloric intake was available during experimental sessions, via 45 mg Purina sucrose pellets. The study was approved by the Auburn University Animal Care and Use Committee.

Apparatus

Ten commercial operant chambers (Med Associates Inc. model #Med ENV 007) containing a lever on the back wall (designated the B lever) and a lever to the left (L lever) and right (R lever) of the pellet dispenser were used. Each chamber measured 12" L x 9.5" W x 11.5" H. Pellet dispensers were located between the two front levers and delivered 45 mg sucrose pellets (Purina Mills, Inc., St. Louis, MO). Each chamber was equipped with Sonalert tones (2900 and 4500 Hz, nominally) calibrated to an amplitude of 70 dBC for presentation of auditory stimuli. A 28-volt house light was located at the top of the back wall and illuminated each chamber. Each chamber was surrounded by a sound-attenuating cabinet with a built-in ventilating fan, which circulated air throughout the experimental chamber and provided white noise for the duration of the experimental session. Programs for experimental procedures and data collection were written using MED-PC IV (Med-Associates, Georgia, VT), and all session events were recorded with 0.01-s resolution. All programming equipment was located in a room adjacent to the testing room.

Procedure

All experimental sessions were conducted at approximately the same time each day, in the same testing room, and sessions ended after 50 reinforcers in the four-link chain were obtained or one hour passed. Rats were randomly assigned to either the forward training (n=5) or backward training (n=5) groups. For the forward training group, chains were built by adding each new link after the previously learned link(s). For example, the chain LRBR was trained as follows: L → sucrose, L → R → sucrose, L → R → B →
sucrose, L → R → B → R → sucrose. For the backward training group, chains were built by adding a new link before the previously learned link(s). For example, LRBR was trained as follows: R → sucrose, B → R → sucrose, R → B → R → sucrose, L → R → B → R → sucrose (see Table 2 for a more detailed depiction of these training procedures).

A one-link chain (single response) began each session. Chain length was increased using a pre-set mastery-based criterion. After 10 consecutive correct responses (no errors) a second link was added, forming a two-link chain. After five consecutive reinforcers (i.e., correct executions of the two-link chain) a third link was added, and after five consecutive reinforcers on the three-link chain a fourth link was added. Pressing an incorrect lever at any point in the chain resulted in a 2-s blackout period in which the house light was darkened and no response was reinforced. Following this blackout period, the current chain restarted from its first link. Each link in the chain was paired with a discrete auditory stimulus (a low tone, low pulsing tone, high tone or high pulsing tone, see Table 1).

Animals were exposed first to the performance condition, in which they were required to perform the same four-link sequence each session (LRBR for the forward group and RLRB for the backward group). Initially animals were trained on a three-link chain, but it quickly became apparent that a four-link chain was required to avoid ceiling effects. The fourth link in the performance chains appeared at the end of the chain for the forward group and at the front for the backward group so that, for both groups, it was the last link added. Learning sequences (in which the animals were required to acquire a different response sequence within the experimental session) were introduced only after an animal achieved at least 80% accuracy on the four-link performance chain for three successive days.
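The two training procedures and the mastery criterion described above can be illustrated with a minimal Python sketch. It is not the MED-PC program used in the study. The function names (`training_steps`, `run_session`) are hypothetical, and the sketch assumes that an error resets the count of consecutive reinforcers, that chain length never decreases within a session, and that the one-hour time limit can be ignored.

```python
from typing import Dict, Iterable, List

# Consecutive reinforcers required at each chain length before a link is added
# (values taken from the Procedure: 10 for one link, then 5 and 5).
CRITERIA: Dict[int, int] = {1: 10, 2: 5, 3: 5}


def training_steps(chain: str, procedure: str) -> List[str]:
    """List the chains presented while building `chain` link by link."""
    n = len(chain)
    if procedure == "forward":
        # New links are appended after the links already trained.
        return [chain[:k] for k in range(1, n + 1)]
    if procedure == "backward":
        # New links are added in front of the links already trained.
        return [chain[n - k:] for k in range(1, n + 1)]
    raise ValueError("procedure must be 'forward' or 'backward'")


def run_session(outcomes: Iterable[bool], max_len: int = 4, cap: int = 50) -> Dict[int, int]:
    """Count reinforcers earned at each chain length.

    `outcomes` holds one entry per chain attempt: True for a correctly
    completed (reinforced) chain, False for an attempt ended by an error.
    Assumption: an error resets the consecutive-reinforcer count only.
    """
    length, streak = 1, 0
    reinforcers = {i: 0 for i in range(1, max_len + 1)}
    for correct in outcomes:
        if not correct:
            streak = 0
            continue
        reinforcers[length] += 1
        streak += 1
        if length < max_len and streak >= CRITERIA[length]:
            length, streak = length + 1, 0  # mastery criterion met: add a link
        elif length == max_len and reinforcers[max_len] >= cap:
            break  # session ends after 50 reinforcers on the four-link chain
    return reinforcers


print(training_steps("LRBR", "forward"))   # ['L', 'LR', 'LRB', 'LRBR']
print(training_steps("LRBR", "backward"))  # ['R', 'BR', 'RBR', 'LRBR']
print(run_session([True] * 100))           # {1: 10, 2: 5, 3: 5, 4: 50}
```

Under these assumptions an error-free session yields 10, 5, 5 and 50 reinforcers at chain lengths one through four, which is the best case the procedure allows.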
Two types of learning chains were presented: non-repeating (traditional) and repeating chains. Repeating chains were four-link sequences that included two consecutive responses on the same lever (e.g., LLRB, BRRL, RLBB). Non-repeating chains were four-link sequences that did not have consecutive responses on the same lever (e.g., LRBR, RBLR, BLBR). All animals received equal, pseudo-randomized presentations of each learning chain type (repeating and non-repeating) and the performance chain.

Drug Administration

Drug administration commenced after behavior showed no systematic change in overall response rate or progress quotient (PQ) for ten days. All animals received d-amphetamine sulfate (Sigma-Aldrich, St. Louis, MO), dissolved in a 0.9% saline solution such that an injection volume of 0.3 ml (1 ml/kg) was maintained. The doses ranged from 0.01 to 3.0 mg/kg d-amphetamine, calculated as the salt. All doses of d-amphetamine and saline vehicle were administered intraperitoneally 10 minutes prior to each experimental session in a room adjacent to the experimental run room. Drugs were always administered on Tuesdays and Fridays. Saline vehicle was administered on Thursdays and Sundays, while Mondays and Wednesdays served as non-injected control sessions. Sessions were conducted such that one "performance", one "learning-repeating" and one "learning-non-repeating" session was conducted at each dose.

Statistical and data analysis

Two dependent measures were analyzed in the present study: overall response rate (number of responses per session) and a progress quotient (PQ):

$$\mathrm{PQ} = \frac{1 \cdot Rf_1 + 2 \cdot Rf_2 + 3 \cdot Rf_3 + 4 \cdot Rf_4}{Rf_{tot}} \qquad (1)$$

where $Rf_i$ = number of reinforcers earned on a chain of length $i$ and $Rf_{tot}$ = total number of reinforcers earned in the session. This index serves as a measure of progress and avoids many problems that are associated with using a measure of accuracy or maximum sequence length reached during a mastery-based criterion IRA procedure, such as the one used here.
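As an illustration of Equation 1, the following Python sketch computes PQ from per-length reinforcer counts. The function name `progress_quotient` and the convention of returning 0 for a session with no reinforcers are assumptions made for this example, not part of the original analysis. The first case uses the best possible session under the mastery criteria (10, 5, 5 and 50 reinforcers at chain lengths one through four), which reproduces the maximum PQ of 3.357 reported in the Discussion.

```python
from typing import Mapping


def progress_quotient(reinforcers: Mapping[int, int]) -> float:
    """Equation 1: reinforcers weighted by chain length, divided by total reinforcers.

    `reinforcers` maps chain length i to Rf_i, the reinforcers earned at that length.
    """
    total = sum(reinforcers.values())
    if total == 0:
        return 0.0  # assumption: a session with no reinforcers is scored as 0
    weighted = sum(length * count for length, count in reinforcers.items())
    return weighted / total


# Best possible session: (1*10 + 2*5 + 3*5 + 4*50) / 70 = 235 / 70
print(round(progress_quotient({1: 10, 2: 5, 3: 5, 4: 50}), 3))  # 3.357

# A hypothetical animal that never advanced beyond the two-link chain:
# (1*10 + 2*30) / 40 = 1.75
print(progress_quotient({1: 10, 2: 30, 3: 0, 4: 0}))  # 1.75
```

The second case shows how PQ separates animals that earn many reinforcers on short chains from those that reach longer chains, which a single percent-correct score would not do.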
For baseline data, two-way RMANOVAs (training procedure x chain type) and individual t-tests were used to analyze both dependent measures. For all drug data, the two dependent measures were examined using a two-way RMANOVA, with dose as a within-subject variable and training procedure (backward or forward chaining) as a between-subjects factor. This was conducted separately for each chain type: performance, learning-repeating, and learning-non-repeating. The two measures were also examined in separate analyses using a two-way RMANOVA, with dose and chain type as within-subject variables. This was conducted separately for each training procedure (backward and forward training). The Greenhouse-Geisser correction was used when necessary (i.e., GG epsilon < 0.6). All animals are included in each analysis. Planned pair-wise comparisons between saline and an active dose were conducted if there was a statistical main effect of dose. To correct for multiple comparisons and retain an overall alpha of 0.05, this value was divided by 7, the number of comparisons for any dose-effect relationship, so planned comparisons required a p value below 0.007 to be significant. Conclusions about group differences were addressed by examining graphs. All error bars represent the S.E.M., and p < .05 was the criterion for statistical significance in all other tests. Statistical analyses were conducted using SYStat (San Jose, CA).

Results

Baseline: Responding and PQ Score

RMANOVAs (training procedure x chain type) were used to analyze the baseline data. Neither chain type nor training procedure affected overall response rate during baseline (Figure 1A, all p's > .05). Training procedure did not affect PQ score (p = .330). However, there was an effect of chain type on PQ score (F(2,16) = 10.039, p = .002).
Dependent samples t-tests were used to clarify this effect, which revealed that performance PQ was higher than both repeating (t(4) = 4.843, p = .008) and non-repeating (t(4) = 4.156, p = .014) PQ scores for the forward training group. This effect was not seen for the backward training group (see Figure 1B).

Drug Phase-Total Responding

Dose. There was an overall increase in responding at 1.0 mg/kg for the performance and non-repeating learning conditions (Figure 2A and 2B). A main effect of dose was seen for the backward-trained non-repeating (F(8,32) = 10.20, p < .001) and performance (F(8,32) = 3.03, p = .012) chains as well as for the forward-trained non-repeating (F(8,32) = 14.28, p < .001) and performance (F(8,32) = 2.59, p = .027) chains. Neither backward-trained (p = .575) nor forward-trained (p = .203) repeating chains were affected by dose (Figure 2A-C).

Training Procedure. During drug administration overall response rate was not affected by training procedure for any chain type (Figure 2A-C, all p's > 0.05).

Chain Type. Within the forward training group (open squares), an interaction between chain type and dose occurred between the non-repeating and repeating chains (F(8,32) = 5.05, p < .001) and non-repeating and performance chains (F(8,32) = 5.39, p < .001). Similarly, within the backward training group (filled squares), an interaction between chain type and dose occurred between the non-repeating and repeating chains (F(8,32) = 7.15, p < .001) and non-repeating and performance chains (F(8,32) = 6.34, p < .001).

Progress Index

Dose. For all conditions, there was a main effect of dose such that low to moderate doses had no effect on PQ or increased it (Figure 3C) and higher doses always decreased it. The p values for the main effect of dose were all less than 0.05.

Training Procedure. For the performance condition, the interaction between dose and training procedure had a p value of 0.053.
For the non-repeating chains, there was a main effect of training procedure (F(1,8) = 8.67, p = .019) but no interaction with dose. For the repeating chains, there was not a significant interaction between dose and training procedure (p = .441) (see Figure 3).

Chain type. Within the forward training group, the chain types appeared to be statistically distinct from each other. The PQ was generally lower for both learning chains, non-repeating (F(1,4) = 214.86, p < .001) and repeating (F(1,4) = 23.17, p = .009), than for the performance chain. Also, scores for the non-repeating chains were lower than those for repeating (F(1,4) = 12.23, p = .025) chains. An interaction between chain type and dose was seen between the forward-trained non-repeating and performance chains (F(8,32) = 4.63, p = .034). Within the backward training group, non-repeating chains resulted in lower PQ scores than the performance (F(1,4) = 53.96, p = .002) and repeating (F(1,4) = 73.95, p = .001) chain types. However, repeating and performance chain types were statistically indistinguishable from each other (p = .844). An interaction between chain type and dose was seen between the backward-trained non-repeating and performance chains (F(8,32) = 4.71, p = .047) and non-repeating and repeating (F(8,32) = 4.41, p = .044) chain types.

Post-hoc tests and an examination of Figure 3 helped to clarify these effects. The PQ score for each dose was compared to vehicle for each training procedure/chain type combination. This revealed that low doses (0.03, 0.056, 0.1 mg/kg) of d-amphetamine improved PQ when chains containing a repeat were established using a backward training procedure (Figure 3C). For other conditions, low to moderate doses were ineffective. At the highest dose, all learning chains were disrupted by d-amphetamine but, for the performance chain, the backward-chaining procedure showed greater disruption than forward chaining.
Discussion

The current study was designed to examine how training procedure (forward or backward chaining), learning chain structure (repeating or non-repeating chain types), and experience (performance chains vs. learning chains) influenced low-dose drug effects on behavior under an incremental repeated acquisition (IRA) of behavioral chains procedure. We developed an index, which we call "PQ," that measures the quality of progression on the mastery-based procedure that we used. The results indicate 1) training procedure did not affect PQ of performance chains; 2) the effects of training procedure on learning chains depended on chain type; 3) very low, clinically relevant doses of d-amphetamine improved acquisition for one training procedure/chain type combination; and 4) differences in PQ were seen at doses during which no detectable change in overall response rates occurred.

PQ, a summary measure of progress. The procedure used in the present study differed from those previously used in studies of incremental repeated acquisition by imposing a behaviorally based mastery criterion for building the chain. Based on previous reports (Wenger, Schmidt & Davisson, 2004; Cohn, Cox & Cory-Slechta, 1993; Paule & McMillan, 1984; see also Galizio, Keith, Mansfield, & Pitts, 2003) we anticipated that a three-response chain may be the maximum attainable chain length, but it quickly became apparent that a four-response chain was required to prevent ceiling effects, and a longer chain was probably possible. In the mastery-based approach, a new link was added only after a shorter chain was executed reliably and with no errors. This presented a challenge, however, to developing an adequate marker of the quality of progression. A single overall measure of accuracy is meaningful only if the conditions under which "correct" and "incorrect" responses occur are homogeneous.
Some animals, however, performed many short chains with relatively high accuracy and others produced much longer chains but sometimes with similar or even lower accuracy. A single measure like "percent correct" does not capture the superior performance of those animals that reached longer chains. Examining accuracy as a function of chain length presents similar difficulties if the number of responses or reinforcers at each chain length is free to vary. Maximum chain length attained does not distinguish between an animal that met criterion for a four-link chain but never successfully produced one from an animal that received 50 reinforcers (the maximum possible) for a four-link chain. These difficulties might be overcome if accuracy during each chain length is plotted individually and adjusted for maximum chain length attained. However, with multiple dependent and independent variables this is overly cumbersome.

The progress quotient (PQ) couples the length of the chain attained with the number of reinforcers earned. Equation 1 weights a reinforcer for a four-link chain four times more than a reinforcer for a one-link chain. If the four-link chain was never attained then the weighting is zero for the four-link chain, so only short-chain reinforcers are counted. Thus, the PQ counts only responses that are part of correctly completed chains and emphasizes long chains. Note that if only four-link chains were included in a session then the maximum PQ possible would be 4. Similarly, if only three-link chains were included then the maximum PQ would be 3. With the mastery criteria imposed, the inclusion of all chain lengths, and the cap of fifty reinforcers for a four-link chain, the maximum PQ possible for the present study was 3.357. One potential weakness with PQ is that it is not directly sensitive to the number of errors made.
Errors are indirectly influential, however, by making it difficult to meet the mastery criterion, by contributing to the overall response rate, and by making it difficult to produce a large number of long chains during a time-limited session.

Training Procedures and Chain Structure. The near-ubiquitous use of backward training for the IRA procedure (e.g., Mayorga, Popke, Fogle & Paule, 2000; Popke, Mayorga, Fogle & Paule, 2000; Popke, Allen & Paule, 2000; Wenger, Schmidt & Davisson, 2004; Weinberger & Killam, 1978) reflects the overall preference for backward training seen in the animal literature. This may be traceable to suggestions that such an approach results in higher accuracy because 1) it strengthens the response closest to primary reinforcement first, and works back from there and 2) each link in the chain will act as a conditioned reinforcer for the preceding link in the chain (Millenson, 1967; Pear, 2001, respectively). In this way, as an animal moves through the chain it moves along a progressively strengthening reinforcement gradient and through stronger conditioned reinforcers. However, it is difficult to locate studies that directly compare forward and backward training with animals. Studies of humans have generally yielded mixed results, but sometimes have suggested that forward training is superior (Smith, 1999; Weiss, 1978).

Both experience with a chain and chain structure were important determinants of PQ. During the pre-drug baseline sessions, the performance chain supported higher PQ than learning chains for the forward training group, even as overall response rates were the same across all conditions. Specifically, for the forward training group, the PQ was greatest for the performance chain, second for repeating-learning, and lowest for non-repeating learning chains. For the backward training group, however, the repeating chains resulted in nearly identical PQ scores as the performance chain, and both were higher than the non-repeating chains.
These effects of chain type and training procedure during baseline were amplified as a result of low-dose d-amphetamine administration.

d-Amphetamine. For the performance condition, when an animal produced the same chain every session, acute doses of d-amphetamine produced the biphasic dose-effect relationship on response rate commonly seen with psychomotor stimulants. Rate increases appeared at 1.0 mg/kg, and rates decreased from this peak at 3.0 mg/kg. Higher doses, which certainly would have substantially reduced responding, were not examined. The pattern was less clear for the two learning conditions, but a tendency towards a rate increase was apparent at about 0.3 to 1.0 mg/kg.

Measures of PQ showed a different pattern of effects. As with overall response rates, the highest dose of d-amphetamine resulted in a deterioration of PQ. The doses that had no effect on response rate, however, either systematically increased (backward chaining, repeating chains) or had no overall effect (performance and non-repeating chains) on PQ. For the performance condition, training procedure did not influence d-amphetamine's effect on responding and had no detectable influence on PQ as a function of dose, although visually it appeared that forward training was slightly more resistant to the detrimental effects of high doses of d-amphetamine than was backward training. For the non-repeating chains, backward training produced higher PQ across all but the two highest doses of d-amphetamine.

For the repeating chains (in the learning condition), low doses (0.01 to 0.1 mg/kg) of d-amphetamine increased PQ when trained using backward chaining. The increase was so great that PQ was statistically indistinguishable from that of the performance chain. In fact, it was often very close to the maximum PQ possible. This improvement did not occur for the forward training group on repeating chains. This indicates that these very low doses can, under some conditions, improve learning.
These doses are much lower than those typical in the IRA literature and well below doses that changed response rates, but they are a much closer approximation of clinical dosing (Grilly & Loveland, 2001). This may correspond to the effects of low-dose amphetamine seen in clinical populations, particularly the rate-reducing effects (and behavioral improvements) seen, for example, in children with ADHD who have been administered stimulants. Also, these results indicate that the IRA procedure is sensitive enough to detect changes in PQ score at very low doses, even with no detectable change in responding.

Perhaps these results can be understood by considering the potential influence of perseveration (or perseveration-like behavior). d-Amphetamine is known to induce perseveration on behavioral tasks (Spanagel & Weiss, 1999; Wise, 2004) such that repeating a response is made more likely while under the influence of d-amphetamine. This is exactly what is required in the repeating chains. The way that backward training builds a chain may also be relevant, since it allows the animal to execute responses that have already been reinforced as it moves through the chain, which may be interpreted as perseverative in nature.

Concluding Remarks. Forward and backward approaches were equally effective during the performance condition, but the PQ was higher when backward chaining was used during the learning component, regardless of chain structure. Very low doses of d-amphetamine improved learning under some conditions. Among other things, these observations suggest that training procedure and chain structure may be of greatest importance during the acquisition of a response chain. Further, if not controlled for, these variables could be confounds in investigations of drug or toxicant effects on learning using the IRA procedure.
These results supplement those of Wright and Paule (2007) and Cohn, Cox and Cory-Slechta (1993), who also noted the importance of chain type in the IRA procedure. The current implementation of an IRA procedure addressed two important procedural manipulations that were previously unaddressed in the literature, and offered a more suitable dependent measure for a mastery-based IRA procedure.

References

Boren, J.J. (1963). Repeated acquisition of new behavioral chains. American Psychologist, 17, 421.
Boren, J.J. & Devine, D.D. (1968). The repeated acquisition of behavioral chains. Journal of the Experimental Analysis of Behavior, 11, 651-660.
Cohn, J., Cox, C. & Cory-Slechta, D.A. (1993). The effects of lead exposure on learning in a multiple repeated acquisition and performance schedule. Neurotoxicology, 14, 329-346.
Cohn, J. & Paule, M.G. (1995). Repeated acquisition of response sequences: The analysis of behavior in transition. Neuroscience and Biobehavioral Reviews, 19, 397-406.
Galizio, M., Keith, J.R., Mansfield, W. & Pitts, R.C. (2003). Repeated spatial acquisition: Effects of NMDA antagonists and morphine. Experimental and Clinical Psychopharmacology, 11, 79-90.
Grilly, D.M. & Loveland, A. (2001). What is a "low dose" of d-amphetamine for inducing behavioral effects in laboratory rats? Psychopharmacology, 153, 155-169.
Harting, J. & McMillan, D.E. (1976). Effects of pentobarbital and d-amphetamine on the repeated acquisition of response sequences by pigeons. Psychopharmacology, 49(3), 245-248.
Lattal & Crawford-Godbey (1985). Homogeneous chains, heterogeneous chains and delay of reinforcement. Journal of the Experimental Analysis of Behavior, 44(3), 337-342.
Mayorga, A.J., Popke, E.J., Fogle, C.M., & Paule, M.G. (2000). Similar effects of amphetamine and methylphenidate on the performance of complex operant tasks in rats. Behavioural Brain Research, 109, 59-68.
McGregor, A. & Roberts, D.C.S. (1994). Mechanisms of abuse. In A.K.
Cho (Ed.), Amphetamine and its Analogs, 243-266. San Diego: Academic Press.
Millenson, J.R. (1967). Principles of behavioral analysis. New York: Macmillan.
Moerschbaecher, J.M., Boren, J.J., Schrot, J. & Fontes, J.C.S. (1979). Effects of cocaine and d-amphetamine on the repeated acquisition and performance of conditional discriminations. Journal of the Experimental Analysis of Behavior, 31, 127-140.
Moerschbaecher, J.M., & Thompson, D.M. (1980). Effects of d-amphetamine, cocaine and phencyclidine on the acquisition of response sequences with and without stimulus fading. Journal of the Experimental Analysis of Behavior, 33, 369-381.
Paule, M.G. & McMillan, D.E. (1984). Incremental repeated acquisition in the rat: Acute effects of drugs. Pharmacology, Biochemistry and Behavior, 21, 431-439.
Pear, J. (2001). The Science of Learning. Ann Arbor, MI: Edwards Brothers.
Pieper, W.A. (1976). Great apes and rhesus monkeys as subjects for psychopharmacological studies of stimulants and depressants. Federation Proceedings, 35, 2254-2257.
Popke, E.J., Allen, S.R., & Paule, M.G. (2000). Effects of acute ethanol on indices of cognitive-behavioral performance in rats. Alcohol, 20, 187-192.
Popke, E.J., Mayorga, A.J., Fogle, C.M., & Paule, M.G. (2000). Effects of acute nicotine on several operant behaviors in rats. Pharmacology, Biochemistry and Behavior, 65, 247-254.
Smith, G.J. (1999). Teaching a long sequence of behavior using whole task training, forward chaining, and backward chaining. Perceptual and Motor Skills, 89, 951-965.
Spanagel, R. & Weiss, F. (1999). The dopamine hypothesis of reward: Past and current status. Trends in Neurosciences, 22, 521-527.
Thompson, D.M. (1973). Repeated acquisition as a behavioral baseline for studying drug effects. Journal of Pharmacology and Experimental Therapeutics, 184, 504-514.
Thompson, D.M., & Moerschbaecher, J.M. (1978).
Operant methodology in the study of learning. Environmental Health Perspectives, 26, 77-87.
Thompson, D.M., & Moerschbaecher, J.M. (1980). Effects of d-amphetamine and cocaine on strained ratio behavior in a repeated acquisition task. Journal of the Experimental Analysis of Behavior, 33, 141-148.
Thompson, D.M., Moerschbaecher, J.M. & Winsauer, P.J. (1983). Drug effects on repeated acquisition: Comparison of cumulative and non-cumulative dosing. Journal of the Experimental Analysis of Behavior, 39(1), 175-184.
Weinberger, S.B. & Killam, E.B. (1978). Alterations in learning performance in the seizure-prone baboon: Effects of elicited seizures and chronic treatment with diazepam and phenobarbital. Epilepsia, 19, 301-316.
Weiss, K.M. (1978). A comparison of forward and backward procedures for the acquisition of response chains in humans. Journal of the Experimental Analysis of Behavior, 29, 255-259.
Wenger, G.R., Schmidt, C., & Davisson, M.T. (2004). Operant conditioning in the Ts65Dn mouse: Learning. Behavior Genetics, 34(1), 105-119.
Wise, R.A. (2004). Dopamine, learning and motivation. Nature Reviews Neuroscience, 5(6), 483-494.
Wright, L.K.M., & Paule, M.G. (2007). Response sequence difficulty in an incremental repeated acquisition (learning) procedure. Behavioural Processes, 75, 81-84.

Tables

Table 1b
Building a Backward and Forward Chain

Backward Chaining: R-L-R-B (links in the order Link 4, Link 3, Link 2, Link 1)
High Tone : B → Sucrose
High Pulsing : R → High Tone : B → Sucrose
Low Tone : L → High Pulsing : R → High Tone : B → Sucrose
Low Pulsing : R → Low Tone : L → High Pulsing : R → High Tone : B → Sucrose

Forward Chaining: L-R-B-R (links in the order Link 1, Link 2, Link 3, Link 4)
Low Pulsing : L → Sucrose
Low Pulsing : L → Low Tone : R → Sucrose
Low Pulsing : L → Low Tone : R → High Pulsing : B → Sucrose
Low Pulsing : L → Low Tone : R → High Pulsing : B → High Tone : R → Sucrose

Table Captions

Table 1b.
Depicts the two performance sequences used, incremented by backward chaining (top panel) and by forward chaining (bottom panel).

Figures

Figure 1. [Figure: panel A, Baseline Total Responding (responses per session by chain type: Performance, Non-Repeating, Repeating); panel B, Baseline PQ (PQ by chain type). Series: Forward Chaining, Backward Chaining.]

Figure 2. [Figure: three panels (Performance, Non-Repeating Chains, Repeating Chains) showing total responses per session as a function of d-amphetamine dose in mg/kg (C, V, 0.01 to 10). Series: Backward Chaining, Forward Chaining.]

Figure 3. [Figure: three panels (Performance, Repeating Chains, Non-Repeating Chains) showing PQ as a function of d-amphetamine dose in mg/kg (C, V, 0.01 to 10). Series: Backward Chaining, Forward Chaining.]

Figure Captions

Figure 1. Baseline responses per session, grouped by chain type for each training procedure group, are plotted in panel A. PQ scores, grouped by chain type for each training procedure group, are plotted in panel B. Error bars reflect S.E.M.

Figure 2. Responses per session for each training procedure group are plotted for each of the three chain types. Error bars reflect S.E.M.; there are no significant main effects of training procedure (p > 0.05) for any chain type. An * above a data point denotes a significant difference from vehicle for the backward chaining group, a ? indicates a significant difference from vehicle for the forward chaining group (p? .007 for both groups). Control (C), Vehicle (V) and d-amphetamine sessions are shown.

Figure 3. PQ scores for each training procedure group are plotted for each of the three chain types (structured as in Figure 2). An * above a data point denotes a significant difference from vehicle for the backward chaining group, a ?
indicates a significant difference from vehicle for the forward chaining group (p? .007 for both groups). The * in the panel title denotes a significant main effect of training procedure. The dashed line denotes the maximum possible PQ score, 3.357.
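
As a rough illustration of how a PQ score such as those plotted in Figure 3 might be computed, the sketch below assumes that PQ is the reinforcer-weighted mean chain length: the sum over chain lengths of (chain length × reinforcers earned at that length), divided by total reinforcers. This is one reading of the verbal definition of PQ, not a confirmed formula. The function name `progress_quotient` and the session counts are hypothetical, and the sketch does not model the mastery criterion or the session parameters that yield the maximum of 3.357.

```python
def progress_quotient(reinforcers_by_length):
    """Return the reinforcer-weighted mean chain length for one session.

    reinforcers_by_length maps a chain length (1 to 4 links) to the number
    of reinforcers earned while the chain was at that length.
    Assumption: a session with no reinforcers is scored as 0.0.
    """
    total = sum(reinforcers_by_length.values())
    if total == 0:
        return 0.0
    weighted = sum(length * n for length, n in reinforcers_by_length.items())
    return weighted / total


# Hypothetical session: six reinforcers earned at each chain length.
session = {1: 6, 2: 6, 3: 6, 4: 6}
print(progress_quotient(session))  # (6 + 12 + 18 + 24) / 24 = 2.5
```

Under this reading, a session spent entirely on one-link chains scores 1.0, and the score approaches 4 as more of the session's reinforcers are earned on four-link chains.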